concepts · youtube · 13 min
AI Agent Loops vs Human-in-the-Loop
Greg Isenberg · Jun 14, 2026
WTF Is an "AI Agent Loop"? Genius or Hype?
Channel: Greg Isenberg
Published: June 9, 2026
URL: https://www.youtube.com/watch?v=7clJ8IH784Q
S/o Coderabbit for sponsoring today's vid: https://coderabbit.link/greg
On this episode Greg sits down with Professor Ras Mic to break down agentic loops. They define what a loop is, explain why well-known builders like Boris and Peter swear by them, and stay honest about who they truly serve. Mic argues that human-in-the-loop remains the strongest setup today, and walks through the one loop he runs every day for code review using Cursor, GitHub, and Greptile.
Transcript
Everyone is talking about agentic loops, but the reality is most people don't know what they are or how to use them. In this episode, Greg brought on Professor Ras Mic to clearly explain what they are, whether they're hype or real, and how to use them. And if you stick around to the end of the episode, he shows the most concrete use case of agentic loops that you can use starting today.
Greg: Ras Mic, welcome to the pod. By the end of this episode, what are people going to learn?
Ras: You're going to understand what a loop is, you're going to understand why people are fanning out about it, and you're going to understand why it is a terrible mistake that, unless you have money to burn, you should not make. I'm also going to play the other side, and I'm going to show you a loop that I use, but the general consensus, I think, is wrong, and we're going to talk about it.
Greg: Okay, so by the end of the episode, people are going to understand what an agentic loop is and why the most well-known people in the AI industry are obsessed with it, you're going to keep it real about what we need to know and what we can avoid, and you're going to show a real use case, a real example of how to actually use an agentic loop.
Ras: Exactly.
What Is an Agentic Loop?
Ras: Let's start with diagrams. I paid like $3 to get these stick figures, so I hope people appreciate them. This is me and you, right? This is your average Joe Schmo who does not work at Anthropic or OpenAI. And this is Boris and Peter and anyone else who has unlimited access to models.
The way me and you have been working is what is called a human in the loop. You and I will prompt our computer, or better yet, our AI agent, right? Whether you're using Cursor, Claude, Codex, doesn't matter. You are prompting it yourself, right? You're telling it, "Hey, build me this landing page. Build this feature, X, Y, and Z." You are communicating with an AI agent via a prompt. And then, a result is generated, right? Usually what you and I will do is we will view this result, we will test this result, and we will keep on iterating. This is the loop where it goes back to us, right?
Let's say I'm working on an app, and this app is a to-do list app, Greg. The first thing that I'll probably want to do is I want to build up the landing page because I want to get this out to the public, so maybe they can sign up and join the waitlist. So, I'll prompt to build me a landing page, and let's say I like the landing page. Next, I'll work on authentication, and then once I'm happy with authentication, then I'll work on the back end. This is what we are used to. And to be sharp with it, this is called human in the loop. Meaning, it is the agent that's building, but it is you that is directing, governing, and allowing things to happen.
What everyone has been talking about, particularly Boris and Peter, is that they don't write prompts, they build loops. Essentially, they're building a system where the AI agent generates a result, but instead of a human being in the loop every time, the human is in the loop one time, meaning they fire off said loop. The rest of the time, it's the agent checking. The agent generates a result. That result is then fed back into the agent. The agent then looks at the result and continues to work.
In theory, this sounds cool because what essentially we're saying is, "Hey, I'm just going to have some sort of spec.md file or some prd.md or whatever.md file and this is going to be like a to-do list, a task list and this is going to give all of the information the AI agent needs to build this." Now, this sounds cool and this low-key might be the future. But here is where it goes terribly wrong.
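The difference between the two setups can be sketched in a few lines of Python. This is a minimal illustration, not anyone's actual tooling: `toy_agent`, `human_in_the_loop`, `agentic_loop`, `is_done`, and `max_steps` are all hypothetical names, and a real agent would call a model where the toy one just echoes its prompt.

```python
from typing import Callable, List

Agent = Callable[[str], str]  # hypothetical: takes a prompt, returns a result


def toy_agent(prompt: str) -> str:
    """Stand-in for a real coding agent; it only echoes the first prompt line."""
    return "built: " + prompt.splitlines()[0]


def human_in_the_loop(agent: Agent, prompts: List[str]) -> List[str]:
    """A person writes every prompt and reviews each result before the next one."""
    results = []
    for prompt in prompts:
        # The human inspects this result, then decides what to ask for next.
        results.append(agent(prompt))
    return results


def agentic_loop(
    agent: Agent,
    spec: str,
    is_done: Callable[[str], bool],
    max_steps: int = 10,  # assumption: a safety cap, not mentioned in the episode
) -> List[str]:
    """A person starts the loop once; each result is fed back to the agent."""
    results = []
    context = spec
    for _ in range(max_steps):
        result = agent(context)
        results.append(result)
        if is_done(result):
            break
        # The agent's own output becomes part of its next input.
        context = spec + "\n\nPrevious result:\n" + result
    return results


print(human_in_the_loop(toy_agent, ["landing page", "authentication", "back end"]))
print(len(agentic_loop(toy_agent, "to-do app spec", lambda r: False, max_steps=3)))
```

In the first function a person chooses every prompt; in the second, the only human input is the spec and the stopping rule, which is where the unreviewed assumptions creep in.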
The Problem with Full Agentic Loops
Let me paint this analogy. We're building a startup, right? You and me, Greg, we're building a startup. We hire a very smart developer and we tell this developer, "This is the app we want to build. These are the things that it needs." And the developer goes on and builds the entire thing without consulting us. In building that entire thing, that developer is going to have to make assumptions, right? Assumptions on how the product looks, how it's going to feel, certain architectural decisions. There's a lot of assumptions that are going to be made in the nitty-gritty. Now, you might think your plan document covers everything, but the truth of the matter is, it never does. There's always an edge case, there's always something that's missed. So, that developer's going to make a lot of assumptions, and those assumptions might not be aligned with our product vision. Now, you have a developer who's come back with a finished product, and there's a bunch of things in order, but it's not the way we want it.
In the same way, when you have this stacked PRD.md file or whatever markdown file you have, and you give it to the agent, and you run it in this loop, meaning it takes the result, treats it as feedback, and continues to generate code, what happens is you now have an agent that's going to make assumptions. And believe me, when you give the agent the floor to make assumptions, most of the time it's going to get them wrong. But not only is it going to get them wrong, it's going to burn a lot of money.
I say this with all love, but Boris and Peter come from a place where they have no token budgets, right? They can burn unlimited tokens. If I had unlimited tokens, I'd be doing the same thing, too. But this is not productive. This is great for research, and I'll actually share a loop that I use, but this idea of constructing a meta harness, where you give the agent feedback automatically, like it gets the result it's generated and it loops on it, is a catastrophe. We've tried this, right? We had Ralph loops, we had Ralph Wiggum. There's even /goal, which has been pretty popular the last couple of weeks. These are great to build prototypes. These are great to experiment with. Like, let's say you want to experiment with something, you want some minuscule tool built out, but you don't care about the nitty-gritty details, these are great. But if you care about the details and you don't have tokens to burn, this is the worst thing to be trending right now, in my humble opinion.
Greg: So, /goal is also trending at the moment. You know, is /goal a loop? Like, how should people think about /goal and a loop?
Ras: They're all the same thing. They have different names, like /goal. I know on Cursor, I think it's /loop. And then, on another tool, it's /whatever. They're all the same thing. And basically, how all of them work on a high level is you type in /goal, and then you give it some prompt, right? And then, you can also attach some markdown file. And you tell it, "Yeah, build this entire thing out. Don't stop until you're done. Don't make any mistakes." Again, these are cool, but the two issues are:
Number one: They burn a lot of tokens, right? If you are not on the $200 a month plan, this shouldn't even be a thought. Like, if you're on the $20 or I think there's a $100 a month plan, you shouldn't even think about this, right? Cause it's just going to burn your token usage.
Number two: You think your plan is good, but it's not. Because it's impossible for you as a human to contextualize every single detail about the product that you want in one document, right? Things evolve, trends change. One day liquid glass is cool, the next day we're changing how liquid we want it. It's impossible for you to fit your thoughts and exactly how you want the product to be in one document. If anyone works in service, whether you run an agency or develop software for other people and companies, we try all the time to get all the thoughts out of someone's head, and there's always something. There's always, "Oh, you missed this," or "I wanted it like this, this is what I meant." How much do you think an AI agent's going to understand you if we as humans have a hard time understanding each other, right?
This should only be used for experimentation. Let me share with you a fun cool little tool I built the other day, Greg. I was doing a talk and I wanted to build an Among Us simulator for AI models, right? Basically, it's this game where there's one bad guy, one impostor, and everyone's trying to figure out who it is. And I wanted to have like my own benchmark to find out which models are good at lying. I didn't care about the details, I didn't care about how it looked. I just wanted the simulation and the benchmark to work. So, I told it, "I want the simulation, I want this benchmark, I want it to do this, go and do it." It took about an hour and a half and it got it done. Now, there were a lot of details that I had in mind that it completely got wrong that I didn't specify in the initial instruction set. But guess what? Because I didn't care about the details, it was a great thing. I didn't spend a lot of time, I just got /goal to take care of everything.
But when you and I are trying to use AI to build something meaningful, I 100% stand on the fact that the human still needs to be in the loop. AI can replicate sauce, it can't create sauce. So, could I just have these giant loops running and then, once they're done, go in and fix things up? Sure, you can make that argument, but I hope you have money to burn. Like, that's the ultimate thing. This will burn money. It sounds cool, but it'll burn money.
Greg: What I'm hearing you say is that the loops are going to create a slot machine.
Ras: That's basically what it is. Now, I have no doubt the Borises and the Peters are building very sophisticated loops. I can almost imagine they have some sort of test suite, right? Where they write tests for the agent to run the code against so it meets a certain bar of quality. I'm sure they also have some sort of browser use capability so the agent can see the page live and can take screenshots. I'm sure they have an insane harness or meta harness around the agent so that this loop can be more successful than the average loop. But at the end of the day, the one argument I'll fight back with is this is going to burn a lot of tokens. And if you don't believe me, all you have to do is look at Peter's tweet where in one month he burned $1.3 million worth of tokens.
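A loop gated by a test suite, with a hard cap on token spend, might look roughly like the sketch below. This is a guess at the general shape only, not Boris's or Peter's harness: `run_with_tests_and_budget`, `agent_step`, `run_tests`, `token_budget`, and the fake agent and test runner used to exercise it are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class LoopOutcome:
    passed: bool
    iterations: int
    tokens_spent: int


def run_with_tests_and_budget(
    agent_step: Callable[[str], Tuple[str, int]],  # hypothetical: returns (code, tokens used)
    run_tests: Callable[[str], Tuple[bool, str]],  # hypothetical: returns (passed, failure report)
    spec: str,
    token_budget: int,
) -> LoopOutcome:
    """Keep asking the agent for code until the tests pass or the budget is gone."""
    spent = 0
    iterations = 0
    feedback = ""
    while spent < token_budget:
        code, used = agent_step(spec + feedback)
        spent += used
        iterations += 1
        passed, report = run_tests(code)
        if passed:
            return LoopOutcome(True, iterations, spent)
        # Failing tests are fed straight back to the agent, with no human review.
        feedback = "\n\nFailing tests:\n" + report
    return LoopOutcome(False, iterations, spent)


def make_fake_agent(succeeds_on: int, tokens_per_step: int):
    """Fake agent that returns passing code on its Nth call (for illustration only)."""
    calls = {"n": 0}

    def step(prompt: str) -> Tuple[str, int]:
        calls["n"] += 1
        return ("ok" if calls["n"] >= succeeds_on else "buggy"), tokens_per_step

    return step


def fake_tests(code: str) -> Tuple[bool, str]:
    """Fake test runner: only the literal string 'ok' passes."""
    return (code == "ok", "" if code == "ok" else "one test failed")


print(run_with_tests_and_budget(make_fake_agent(3, 1000), fake_tests, "spec", 10_000))
print(run_with_tests_and_budget(make_fake_agent(3, 1000), fake_tests, "spec", 2_000))
```

With the larger budget the fake agent gets its third attempt and passes; with the smaller one the loop stops after two attempts with nothing usable, which is the slot-machine outcome in miniature.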
I don't want to sound like a doomer, like an old guy saying these loops suck. There are use cases and I'll share one, Greg, if that's okay, where my code review process is a loop.
A Real Use Case: Code Review Loop
I use Cursor for the most part as my harness of choice. With Cursor, I use GitHub as my source control, basically a place where I store code and version code. Every time I push a feature, I am pushing code to GitHub. And in GitHub, I have a code review agent installed. There are many kinds. The particular one that I use is Greptile, but I know people use CodeRabbit, Microscope. They're all great. I use Greptile.
What happens is whenever I push a feature to GitHub, the code that's being pushed to GitHub is AI generated. But then I have a code review agent that reviews the AI generated code. And what's cool about Greptile is it gives me this review, right? It'll be like, "Oh, you missed this. There's this security thing. This is broken in this edge case." It's pretty good. But my favorite thing is it gives you a score. A score out of five, right? It could be two out of five, one out of five, five out of five, four out of five, whatever. It's a score out of five.
And the mental model I now have is I will not push anything to production, meaning I will not allow code to go live unless the score is greater than four out of five, right? If the score is not greater than four out of five, this code needs to be reviewed.
Now, here is where I loop. I have this skill called Grep loop, right? And basically it's a skill that tells the agent, "Oh, check GitHub, read the review, and then fix the review, and then push to GitHub." So what happens is when I see a score, let's say it got a two or a three out of five, my rule is that it has to be at least four out of five and greater. So what I'm going to do is I'm going to go back to Cursor. I'm going to write Grep loop. And then when I write Grep loop, what happens is Cursor reads the review that Greptile wrote on GitHub, and then it feeds the review back into Cursor. Cursor then makes the changes, pushes the changes to GitHub, and then waits for Greptile to do a new review. Every time you push to GitHub, Greptile does a new review. If the review still is a three out of five, guess what happens? The loop continues, and then more changes are made. And then let's say it's a four out of five. It doesn't give up. It keeps...
(capture appears truncated)
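The review loop described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the actual Grep loop skill: `review_loop`, `get_score`, `fix_and_push`, `min_score`, and `max_rounds` are hypothetical names, and no real Greptile or GitHub API is called. The transcript states the bar both as "greater than four" and as "at least four" and is cut off before the stopping behavior is fully explained, so the threshold is left as a parameter, and the round cap is an added safety assumption.

```python
from typing import Callable, List


def review_loop(
    get_score: Callable[[], int],      # hypothetical: read the latest review score (out of 5)
    fix_and_push: Callable[[], None],  # hypothetical: agent reads the review, fixes, pushes
    min_score: int = 4,                # assumption: the source is ambiguous about the exact bar
    max_rounds: int = 5,               # assumption: cap on fix rounds, not from the transcript
) -> List[int]:
    """Push fixes until the review score reaches min_score or the rounds run out."""
    scores = [get_score()]  # score of the review on the initial push
    while scores[-1] < min_score and len(scores) <= max_rounds:
        fix_and_push()              # each push triggers a fresh review
        scores.append(get_score())  # read the score of that new review
    return scores


# Simulated run: reviews come back 2, then 3, then 4 out of 5.
simulated_scores = iter([2, 3, 4])
pushes: List[str] = []
history = review_loop(lambda: next(simulated_scores), lambda: pushes.append("push"))
print(history, len(pushes))
```

Unlike an open-ended build loop, this one has a narrow task and an external score deciding when to stop, so the agent has far less room to make product assumptions.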