AI-Assisted Vibe Coding Development Approach

Ben Tossell · Jan 13, 2026

I've spent 3 billion tokens in four months. Every single one through a terminal, watching an agent write code I couldn't write myself.

You may class me as a 'vibe-coder'. But I think that term overlooks any kind of skill involved in the work itself. Much like 'no-code' did circa 2019 (when I started my no-code education company later acquired by Zapier).

I don't read the code. But I read the agent output religiously. And in doing so, I'm picking up a ton of knowledge around how code works, how projects work, where things fail, where they succeed.

That's my version of learning to program. The new technical class.

A few things I've actually shipped in these last few months:

Personal Site. I revamped my personal site and made it look like a terminal CLI tool and it was so much better than my previous attempt at the start of this year.

Feed. I built a simple social tracker for mentions of Factory on Twitter, posts from our subreddit, and GitHub issues. It's open-source and I've gotten 100+ stars on it with several folks cloning for themselves.

Factory Wrapped. I built the first version of our 'wrapped' product. Showed it to the team and they loved it, so they wanted to bake it into the actual product itself, which is now live. Adding new guides, rearranging things. This wouldn't technically feel like coding, but to me it is. It's still the same process.

Custom CLIs. I've created a few CLIs, like a Pylon CLI which has since been picked up by the team to help with customer support queries. I built a CLI to help users with adding tokens to their accounts. Plus a Linear and Gmail CLI.

A crypto tracker. I invested in a co that accurately predicts positive, negative or neutral signals in dynamic data (financial, weather, fitness, protein folding). So I built a tracker that automatically opens and closes short/long positions based on the predictions - kinda like a mini-hedge fund.

Droidmas. Twelve days, twelve experiments or games that touched the different themes people are talking about on Twitter—memory, context management, vibe coding, things of that nature.

An AI-directed video demo system. Effectively, I give it a prompt to create a video. It opens up ghostty, runs the commands, can open other windows like a browser, records the screen. Acts as its own director, producer and editor. The agent itself is watching what's happening during the recording and can respond as and when things happen. If there's an issue or a bug or it needs to wait for a response, it will do that. I used this to create a video that was posted by OpenAI.

A Telegram bot powered by Droid Exec so I could have my local repos synced on a VPS and just chat to my repos as a chatbot. I try to mimic the CLI experience as closely as possible, but from a messaging app (I dislike Telegram but couldn't be bothered with the arduous WhatsApp for Business setup).

And about 50 other things I'm not mentioning or have been left to die.

I use a CLI exclusively. Terminal over web interfaces, always. It's just more capable as a general agent, and I get to see it work.

I may have an idea for something, or a pain, or there's an issue with something that I feel like could be solved with code (basically everything these days). So I'll just spin up a new project in Droid (Factory's CLI).

I generally just talk to the model a couple of times to start feeding in context about what I'm trying to do, then I'll switch into spec mode to start getting a plan going on what I wanna build.

In spec mode I'll basically question a bunch of things. Like I don't understand what this is, or why would we need that over this, can't we do it this way?

I'll link docs and GitHub repos for the agent to explore.

Then I let Opus 4.5 with autonomy high just rip. I'll watch the stream and see what's happening, and when there are any errors I may jump in to question it or guide it down a different path.

I start the server, test it, give feedback and iterate.

So I kind of build ahead of myself first. I try and just build the thing. And then all of the gaps and all of the issues that I run into are the opportunities for me to learn. Is that a thing that is part of the system that I've seen across other repos that I should build up a sort of templated system to handle? Should this go into an

that actually follows me around and does the same thing on all of the other repos I'm going to be working on?

I've been spending more time trying to figure out the best

setup for myself because this is effectively like the instruction manual.

I've got a repos folder locally—that's where all my coded projects go. In that repos folder is an

that says to explicitly set up each new repo with what to do and not to do, how to do things with GitHub, how to commit, all that kind of stuff. And whether it should use my work GitHub account or my personal GitHub account.

Running tests. End-to-end tests are one of those things I never really paid attention to previously. But now I'm really keen to have end-to-end tests on everything. Given my current knowledge and capability, when I'm building and testing things there are often silly bugs that I would have caught had there been tests in the first place.
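As a rough illustration of what an end-to-end test can look like, here is a minimal sketch using only the Python standard library. It starts a real HTTP server on a free local port, makes a real request, and checks the response. The `HealthHandler` class, the `/health` route and the `run_e2e_check` helper are all hypothetical stand-ins for whatever app is actually being built, not anything from the projects above.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the app under test."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def run_e2e_check():
    """Start the server, hit it over real HTTP, return (status, payload)."""
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), HealthHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/health")
        response = conn.getresponse()
        payload = json.loads(response.read())
        conn.close()
        return response.status, payload
    finally:
        # Stop the serve_forever loop and release the socket.
        server.shutdown()
        server.server_close()
```

The point of the shape is that nothing is mocked: the request goes through the same path a browser or CLI would use, which is where the silly bugs tend to show up.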

And I often look at others'

files to see what I can borrow for my own. I'm constantly trying to improve my doc to make each and every new working session smoother.

I'm also making sure that I install the Droid GitHub app on every repo that I create. So when I'm deploying to GitHub, I make sure I'm submitting pull requests so I can have Droid review it—and I can tag Droid to make fixes itself with a custom prompt. I can trigger it from issues or from pull requests.

It lets me code from my phone, and add new things when I'm out and about. That in combination with my Telegram bot makes it really easy for me to do things when I'm not at my desk.

I also use Slack with my agent. I create a new channel for each repo and just fire off things as and when. I often spin up new channels for new ideas. Slack's a great 1-person product (+ agent(s)).

Bash commands. It really clicked for me when I'd been running the changelog process for a while: it's the same process over and over. I finally understood the 'workflow'. So I got Droid to create a slash command for it, and it's the first slash command that I've actually used properly. It runs a number of bash commands and also prompts the model to do certain things, like reading through GitHub diffs, checking what is behind a feature flag and what's not, and putting things into the right sections (new features, bug fixes, that kind of thing).
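To make the "putting things into the right sections" step concrete, here is a small sketch of how changes could be grouped for a changelog. It is not the actual slash command. It assumes commit subjects carry a `feat:` or `fix:` prefix, which is an assumption made for illustration, and the names `SECTION_BY_PREFIX`, `group_changes` and `render_changelog` are hypothetical.

```python
from collections import defaultdict

# Assumed mapping from commit prefix to changelog section (hypothetical).
SECTION_BY_PREFIX = {
    "feat": "New features",
    "fix": "Bug fixes",
}
OTHER = "Other changes"


def group_changes(subjects):
    """Group commit subject lines into changelog sections by prefix."""
    sections = defaultdict(list)
    for subject in subjects:
        prefix, sep, rest = subject.partition(":")
        # Only treat the text before ":" as a prefix if a colon was present.
        section = SECTION_BY_PREFIX.get(prefix.strip().lower()) if sep else None
        if section is None:
            sections[OTHER].append(subject.strip())
        else:
            sections[section].append(rest.strip())
    return dict(sections)


def render_changelog(sections):
    """Render grouped entries as plain text, one section at a time."""
    lines = []
    for title in ("New features", "Bug fixes", OTHER):
        entries = sections.get(title)
        if not entries:
            continue
        lines.append(title)
        lines.extend(f"- {entry}" for entry in entries)
        lines.append("")
    return "\n".join(lines).rstrip()
```

In practice the subject lines could come from something like `git log --pretty=format:%s`, with the model handling the judgment calls (feature flags, wording) that a prefix rule can't.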

From there I started getting more into bash + CLIs. I've stopped using MCPs and use the CLI versions of most things instead. Partly because MCPs take up context, but mostly because it feels simpler: I usually only need a few of the tools an MCP would include. So with Supabase, Vercel and GitHub, I'm always using the CLIs over the MCPs.

I often build my own CLIs for things. For example, I built my own Linear CLI so I could query my own issues and run everything from the terminal instead of going to the desktop or web interface.
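For a sense of how small a personal CLI like this can be, here is a sketch built on Python's `argparse`. It is not the Linear CLI itself and does not call the Linear API. The `SAMPLE_ISSUES` data, the `--assignee` and `--state` flags, and the function names are all hypothetical; a real version would swap the sample list for a call to the issue tracker.

```python
import argparse

# Hypothetical sample data standing in for whatever an issue tracker returns.
SAMPLE_ISSUES = [
    {"id": "ENG-1", "title": "Fix login redirect", "assignee": "alice", "state": "open"},
    {"id": "ENG-2", "title": "Update docs nav", "assignee": "alice", "state": "done"},
    {"id": "ENG-3", "title": "Rate limit feed poller", "assignee": "bob", "state": "open"},
]


def filter_issues(issues, assignee=None, state=None):
    """Keep issues matching every filter that was actually provided."""
    return [
        issue
        for issue in issues
        if (assignee is None or issue["assignee"] == assignee)
        and (state is None or issue["state"] == state)
    ]


def format_rows(issues):
    """One line per issue: id, state padded to 5 characters, title."""
    return [f'{issue["id"]}  {issue["state"]:<5} {issue["title"]}' for issue in issues]


def build_parser():
    parser = argparse.ArgumentParser(prog="issues", description="List issues from the terminal.")
    parser.add_argument("--assignee", help="only show issues assigned to this user")
    parser.add_argument("--state", choices=["open", "done"], help="only show issues in this state")
    return parser


def main(argv=None):
    # argv=None makes argparse read the real command line.
    args = build_parser().parse_args(argv)
    for row in format_rows(filter_issues(SAMPLE_ISSUES, args.assignee, args.state)):
        print(row)
```

Calling `main(["--state", "open"])` prints only the open issues. Because the output is plain text on stdout, an agent can run the same command and read the result without any extra tooling.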

VPS. I abstractly knew what it was—it's like another computer that is on all the time somewhere else. But until I truly needed one I didn't really know what I needed to do there, and there's still a lot I need to learn. But effectively, now when I'm running the crypto tracker, I have a ton of data that's being pulled every single minute and I need that to always stay on.

I also use the VPS for my Droid Telegram bot, and I use something called Syncthing to sync my local repos to it, so that my repos are always up to date and in the same state as I left them. So I can just pick things up on the go.

Skills. I've tried to use them a bit more. I've been using them not just as knowledge, but also with bash commands + CLIs. I've got a Gmail CLI that I can pull into any project. It's portable and lives in my root directory. So anytime I need Gmail in my system (I've got a Gmail triage system that I use), it just uses the CLI.

Not to be like everyone else on Twitter when they see Andrej Karpathy tweeting something, but this really rang true to me: there's a new programmable layer of abstraction to master.

When it was the no-code days, the abstraction layer that I was mastering was drag and drop tools like Webflow, Zapier, and Airtable—stitching them together and making it feel like real software (until you hit a limit).

But now instead of me thinking I've got to learn to write code from scratch in order to be able to do all of this, what I need to learn is actually how to work with an AI agent. How can I prompt it well? How can I make sure it's got the right context? And also how can it help me understand what we're doing, how do the pieces work together, how can I improve my own system over time?

Including all of the things like agents, subagents, prompts, context, memory, skills, hooks, etc.

I read people like Peter Steinberger, who is an actual programmer and is shipping like crazy. Seeing the simplicity of his system in his posts, where he just talks to the model, lets it do its thing, and doesn't really worry about extra slash commands, subagents, hooks or skills (although he's coming round to skills), gives me permission and confidence that I don't need some ultra complex system.

Looking at Twitter you see a lot of people really optimising or potentially over-optimising their own system. That can feel daunting for folks like me, but also that's what I think some of the beauty of this is: it's a completely customizable system, so you can make it work for you however you'd like it to work. You can have a plan mode that you create with a custom slash command that runs for twenty minutes like Kieran does, or you can just talk to the model like Peter does.

Another thing while following other engineers