Just talk to it – A way of agentic engineering
I'm very confused.
In the picture right at the top of the article, the top of the bell curve is using 8 agents in parallel, and yada yada yada.
And then they go on to talk about how they're using 9 agents in parallel at a cost of 1000 dollars a month for a 300k line (personal?) project?
I dunno, this just feels like as much effort as actually learning how to write the code yourself and then just doing it, except at the end... all you have is skills for tuning models that constantly change under you.
And it costs you 1000 dollars a month for this experience?
Looks like many are questioning the credibility of the author as a dev/engineer. Peter is the founder of PSPDFKit and a well-respected name in iOS circles for his framework contributions and ideas about building modular iOS apps. He may be overselling here, but I'm pretty sure he can manage AI-generated code.
It all sounds somewhat impressive (300k lines written and maintained by AI) but it's hard to judge how well the experience transfers without seeing the code and understanding the feature set.
For example, I have some code which is a series of integrations with APIs and some data entry and web UI controls. AI does a great job, it's all pretty shallow. The more known the APIs, the better able AI is to fly through that stuff.
I have other code which is well factored and a single class does a single thing and AI can make changes just fine.
I have another chunk of code, a query language, with a tokenizer, parser, syntax tree, some optimizations, and it eventually constructs SQL. Making changes requires a lot of thought from multiple angles and I could not safely give a vague prompt and expect good results. Common patterns need to fall into optimized paths, and new constructs need consideration about how they're going to perform, and how their syntax is going to interact with other syntax. You need awareness not just of the language but also the schema and how the database optimizes based on the data distribution. AI can tinker around the edges but I can't trust it to make any interesting changes.
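To make that concrete, here's a deliberately toy sketch of such a pipeline. The grammar and all names are invented for illustration, nothing here is from the commenter's actual codebase; the point is where the human judgment sits:

```python
import re

# Toy token pattern: '=', identifiers, or single-quoted literals.
TOKEN = re.compile(r"=|\w+|'[^']*'")

def tokenize(src: str) -> list[str]:
    return TOKEN.findall(src)

def parse(tokens: list[str]) -> dict:
    # Toy grammar: <field> = '<literal>'
    field, op, value = tokens
    assert op == "=", f"expected '=', got {op!r}"
    return {"op": "eq", "field": field, "value": value.strip("'")}

def to_sql(ast: dict) -> tuple[str, list[str]]:
    # The part that needs thought from multiple angles: the field must be
    # validated against the schema, and an equality test on a hot column
    # must fall into the indexed query path, not a sequential scan --
    # knowledge that lives in the schema and the data distribution,
    # not in this file.
    assert ast["field"].isidentifier(), "unknown field"  # no injection
    return (f"SELECT * FROM items WHERE {ast['field']} = %s", [ast["value"]])

print(to_sql(parse(tokenize("status = 'open'"))))
# -> ("SELECT * FROM items WHERE status = %s", ['open'])
```

Every new construct multiplies the interactions to reason about (syntax vs. syntax, construct vs. optimizer), which is exactly why a vague prompt isn't safe here.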
Every time I read something like this, I question myself: what am I doing wrong? And I've tried all kinds of AI tools. But I'm not even close to claiming that AI writes 50% of my code. My work, which sometimes includes feature enhancements and maintenance, is where I get even less. I have to be extremely careful and make sure nothing unwanted, or nothing I'm unaware of, has been added. Maybe it's me and I'm just not good enough yet to get to 100% AI code generation.
I was hoping that there would be some discussion of what exactly the project is. The author says they have 300k loc so far. That’s a considerable codebase. What kind of React app could this possibly be? I would love to know.
> This post is 100% organic and hand-written. I love AI, I also recognize that some things are just better done the old-fashioned way.
I’m curious why they feel that writing a blog post with a particular tone and writing style is more complex than writing what is apparently a truly massive and complex app
I'd love to be able to watch people work who say they're successful with these tools. If there are any devs live-streaming software development on Twitch, or just making screencasts (without too many cuts) of how they use these tools in day-to-day work, I'd love to see it.
I’m not making any psychiatric diagnoses based on GitHub repos or YouTube videos.
But. Sometimes when I see someone talking about cranking out hundreds of thousands of lines of vibe-coded apps, I go watch their YouTube videos, or check out their dozens of unconnected, half-finished repos.
Every single time I get a serious manic vibe.
I have to imagine that like pair programming, this multi-AI approach would be significantly more tiring than one-window, one-programmer coding. Do you have to force yourself to take breaks to keep up your stamina?
I wonder how many lines of code he generates and reviews a day. With five subscriptions I do not think it is possible to read it all. You can generate more code than you can read with just one subscription.
Meta feedback: there are so many external links in this post (the majority being to Twitter) that it really feels like the audience is just...people who follow this guy on Twitter. There must be about 25 separate links to one-liner tweets.
Surely one of those 9 parallel AI agents could add something like footnotes with context?
I get the idea that different AIs have different characters, and different people can either 'get along with them' or not.
To wit, I have absolutely no problems with Claude Code, but anytime I try to do anything useful with ChatGPT it turns into (effectively) a shouting match; there's just no way that particular AI and I can see eye-to-eye. (There are underlying procedural reasons, I think.)
The author of this piece has the exact opposite experience. Apparently they hate Claude with a passion, but love ChatGPT. Weird!
Use of the bell curve for this meme considered harmful.
If you're going to use AI like that, it's not a clear win over writing the code yourself (unless you're a mid programmer). The whole point of AI is to automate shit, but you've planted a flag on the minimal level of automation you're comfortable with and proclaimed a Pareto frontier that doesn't exist.
I'm currently satisfied with Claude Code, but this article seems to sing the praises of Codex. I'm dubious whether it's actually superior or whether this is 'organic marketing' by OpenAI (given that they undoubtedly do this, among other shady practices).
I'll give codex a try later to compare.
Feels like with 3 agents coding non-stop you're no longer a coder but rather a manager of sorts... is it even possible to code / fix things yourself in an environment like this?
Is it questionable to use these terminal agents, given their customer noncompete terms and lack of a privacy mode, when Cursor offers the same models with a privacy mode?
I use Claude Code every day and find that it still requires a lot of hand-holding. Maybe codex is better. But just in my last session today, Claude wrote 100 lines of test code that could have been 20, and 30 lines of production code that could have been 5. I'm glad I do not have to maintain 300 kloc of 100% AI-generated code. But at the end of the day, what counts is velocity and quality, and it seems OP is happy. The tools certainly are useful.
One more article praising Codex CLI over Claude Code. Decided to give it a try this morning.
A simple task that would have taken literally no more than 2 minutes in Claude Code is, as of now, 9m+ and still "inspecting specific directory", with an ever increasing list of read files, not a single line of code written.
I might be holding it wrong.
> But Claude Code now has Plugins
> Do you hear that noise in the distance? It’s me sigh-ing. (...) Yes, maintaining good documents for specific tasks is a good idea. I keep a big list of useful docs in a docs folder as markdown.
I'm not that familiar with Claude Code Plugins, but it looks like they allow integrations with Hooks, which is a lot more powerful than just giving more context. Context is one thing, but Hooks let you codify guardrails. For example, where I work we have a setup for Claude Code that guides it through common processes, like how to work with Terraform, Git, or dependencies, including whitelisting or recommending specific dependencies. You can't guarantee this just by slapping on more context. With Hooks you can auto-approve or auto-deny _and_ give back guidance when doing so (see the sketch below); for me this is a killer feature of Claude Code that lets it act more intelligently without having to rely on it following context or polluting the context window.
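For readers who haven't used them: a Claude Code hook is just an executable that receives a JSON payload (tool_name, tool_input, etc.) on stdin. As I understand the hooks docs, a PreToolUse hook that exits with code 2 blocks the tool call, and its stderr is fed back to the model as guidance. A minimal sketch, where the Terraform rule itself is an invented example of a house process:

```python
#!/usr/bin/env python3
# PreToolUse hook sketch: deny risky commands and explain why, so the
# agent can self-correct instead of just failing silently.
import json
import sys

payload = json.load(sys.stdin)
command = payload.get("tool_input", {}).get("command", "")

if payload.get("tool_name") == "Bash" and "terraform apply" in command:
    # Exit code 2 blocks the call; stderr goes back to the model.
    print("Denied: run 'terraform plan' and attach the plan output first.",
          file=sys.stderr)
    sys.exit(2)

sys.exit(0)  # Everything else passes through.
```

Registered under a PreToolUse matcher in the project settings, this is the "auto-deny and give back guidance" loop described above, something context files alone can't enforce.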
Cursor recently added a feature much like Claude Code's hooks, I hope to see it in Codex too.
So, just trying to understand this: he admits the code is slop and in the same sentence states that agents (which created the slop in the first place) also refactor it? Where is the logic in that?
Another person building zero stakes tools for AI coding. 300k LOC also isn’t impressive if it should have been 10k.
Is this paid content from OpenAI?
Have any of these no-bs articles described how they handle schema changes? Are they even using a “real” DB, or is it all local, single-user SQLite? Because I can see a disaster looming letting a vibe coder’s agent loose on a database.
And does it require auth? How is that spec’d out and validated? What about RBAC or anything? How would you even get the LLM to constantly follow rules for that?
Don’t get me wrong these tools are pretty cool but the old adage “if it sounds too good to be true, it probably is” always applies.
>With Claude Code I often have multi-second freezes and it’s process blows up to gigabytes of memory.
I am in a mood where I find it excessively funny that, with all that talk about AI, agents, billions of dollars, and terawatt-hours spent, people still manage to publish posts with the "its/it's" mistake.
(I am not a native English speaker, so I notice it at a higher rate than people who learned English "by ear".)
Maybe you don't care or you find it annoying to have it pointed out, but it says something about fundamentals. You know, "The way you do one thing is the way you do all things".