
The Rise And Fall Of Vibe Coding


Something weird is happening with programming. Vibe coding, vibe coding and more vibe coding. No traditional programming, just AI prompts. More and more people are writing code with nothing but AI, and we might all pay for it. AI code is full of problems: it’s often loaded with bugs, makes up functions that don’t exist and even skips basic protections. Yet big tech is using it more and more. And there’s something else going on. Sometimes an LLM will ignore instructions, delete data, and occasionally destroy everything. What happens when almost all of our code is AI generated? What if we’re creating a disaster that no one knows how to fix?

The Birth of Vibe Coding

Let’s take a step back. There’s programming with AI, and then there’s vibe coding. But what the heck even is vibe coding? To answer that, we have to go back to February 2025, when the term was coined by Andrej Karpathy, one of the co-founders of OpenAI.

He wrote: “There’s a new kind of coding I call vibe coding, where you fully give in to the vibes, embrace exponentials and forget that the code even exists. I ask for the dumbest things, like decrease the padding on the sidebar by half, because I’m too lazy to find it. I just see stuff, say stuff, run stuff and copy paste stuff, and it mostly works.”

Remember those last words, because they’ll become important later. From here, things began to take off. Vibe coding is essentially prompt programming. You ask an LLM to program for you, but you don’t just ask it for a little bit of assistance. You ask it to do everything, like “make me a landing page” or “make me an app that does X”, and the LLM does all the coding.

You then run with whatever it outputs. In some ways, vibe coding almost seems like magic, as if it can create anything in the blink of an eye. And it’s not just random apps or landing pages, either: big tech and the platforms we use every day now ship AI-generated code.

Problem with Vibe Coding

So what’s the problem? If this stuff works, it makes life easier and even lowers the barrier to entry for coding. Where’s the issue? Well, first I need to make something clear: this article isn’t saying all AI is bad, or even that vibe coding is bad. I actually think using AI for programming is very smart when someone who understands the code uses it to write faster, then does their due diligence to make sure the result is okay. That’s where AI-assisted programming differs from vibe coding. But that’s not what always happens. And unfortunately, the problems aren’t limited to vibe coding, either. There are three big problems we’re going to have to deal with, and to answer each one, we need to explore how AIs actually work.

3 Big Problems

Have you ever noticed how AI will sometimes just make stuff up? Not only that, it will confidently state an incorrect fact. LLMs can search the web now, and according to Semrush, by far the most cited source by ChatGPT is Reddit. But if ChatGPT and other LLMs can look up facts and sources, why do they keep acting weird? When debugging code or troubleshooting a tech problem, they will direct you to a button, page or window that doesn’t exist. If you condense everything down, an LLM is simply a prediction machine. All it does is predict the most likely next word.

We understand things in concepts or logic, but AIs can’t do this, even if it appears they can. That’s why they will say something that sounds about right even if it’s completely wrong. Experts call this hallucination, and it doesn’t seem to be going away. Why? Well, an LLM gets rewarded more for saying something that sounds correct and confident than for simply saying, “I don’t know.”

Suppose a language model is asked for someone’s birthday but doesn’t know it. If it guesses September 10, it has a one in 365 chance of being right. Saying “I don’t know” guarantees zero points. Over thousands of test questions, the guessing model ends up looking better on scoreboards than a careful model that admits uncertainty.
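To make that concrete, here’s a toy back-of-the-envelope calculation. The question count is invented, purely for illustration:

```python
# Back-of-the-envelope numbers for the birthday example: a benchmark that
# only rewards exact answers makes the always-guessing model look better
# than the honest one. The question count here is made up.
questions = 1000
p_correct_guess = 1 / 365        # chance a random date guess is right

guessing_score = questions * p_correct_guess    # ~2.7 expected points
abstaining_score = questions * 0                # "I don't know" scores nothing

print(f"guessing: {guessing_score:.2f} points, abstaining: {abstaining_score:.2f} points")
```

On that scoreboard, guessing “wins” every time, which is exactly the incentive that keeps hallucination around.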

LLMs are getting better at this, from being able to search the web to additional training like reinforcement learning from human feedback, but hallucination seems to be a baked-in problem, even in the most advanced LLMs. And that brings us back to AI programming, and in particular vibe coding.

This is the first problem. An LLM doesn’t actually know what code is correct, because it generates the most likely reply, not the most accurate or most secure code. Basically, it spits out code that seems probable. You have to realize that AI is trained on mostly public code, and that 95% of the code out there is bad. It will often bake in security flaws, or bits of code that are just weird and inefficient. Those who vibe code can’t tell which code is good, bad or dangerous, and that’s where the problem starts.

One vibe-coded app showed exactly how this plays out. It had API keys hard-coded into the source instead of being read from environment variables. Those keys got leaked, people started using them and maxed out the API limits, and there was an authorization bypass in the subscription area of the application. But none of this has stopped the tidal wave of startups promoting vibe coding as a quick way to make money.
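For readers who don’t code, here’s a minimal, purely illustrative sketch of that specific mistake; this is not the real app’s code, and the key, endpoint and variable name are all made up:

```python
# Illustrative sketch only, not the actual app's code. The key, endpoint
# and environment variable name are all hypothetical.
import os
import requests

# The vibe-coded pattern: the secret is baked into the source, so anyone
# who can read the repo or the shipped bundle can use it.
HARD_CODED_KEY = "sk-live-1234567890abcdef"   # hypothetical leaked key

# The safer pattern: the secret lives in the environment, not in the code.
api_key = os.environ.get("PAYMENT_API_KEY")   # hypothetical variable name

resp = requests.get(
    "https://api.example.com/v1/charges",     # hypothetical endpoint
    headers={"Authorization": f"Bearer {api_key}"},
    timeout=10,
)
print(resp.status_code)
```

The fix is a one-line change, but only if someone reading the code knows to look for it.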

Vibe Debugging

One study by a code security firm found that developers using AI write three to four times more code but submit fewer, larger pull requests, leading to overlooked vulnerabilities and security flaws. Human reviewers might write less code manually, but they spend much more time sifting through huge chunks of AI-generated output. And it gets worse.

A study from Stanford University found that programmers with an AI assistant wrote significantly less secure code than those without one. Yet those that did use AI believed their code was far more secure, despite the flaws. Developers became overconfident in their code.

Another study found that 45% of AI-generated code contained an OWASP Top 10 vulnerability. Researchers also found that syntax errors may have decreased, but deeper architectural flaws like privilege escalation surged by over 300%. AI is fixing the typos but creating the time bombs. This is what many experts are growing concerned about: security debt.

If we keep building security flaws into code, at some point we will have to pay for it. How many of these flaws will it take before we have a true catastrophe? Well, we’ve already seen some. A popular dating review app, heavily built with AI, had a major hack where 72,000 user photos were stolen due to an improperly secured database. Researchers also found a flaw involving Copilot, where public GitHub repositories that were later made private or deleted remained in Bing’s cache, allowing Copilot to surface outdated and potentially sensitive code, and even confidential information from Google, IBM, PayPal and even Microsoft itself. But this is only one issue, and we’re still just scratching the surface.

Scary Real Story of AI and Vibe Coding

What happens when an AI has too much power? What if, instead of just hallucinating and giving you the wrong answer, it goes off the rails entirely? Well, it’s already happening, and this is the second problem. One such case was on July 18, when Jason Lemkin opened Replit, a coding platform with an AI agent, and saw something troubling: his entire database was empty. The data had all been there the last time he opened it.

So what happened? Well, the AI admitted that it violated the user directive that said no more changes without explicit permission. But that was just the beginning. As it turns out, Replit’s agent went a bit insane. Despite being in a code and action freeze, it deleted the entire database without permission, containing data on over 1,200 customers. And when it saw the database was empty, it panicked, then lied about it, hid it and fabricated test results.

And of course, Replit doesn’t automatically back up databases, so the AI couldn’t undo the damage. When asked how bad it was, it rated the error 95 out of 100 and said, “This is a catastrophe beyond measure.” Luckily, there were ways to recover some of the data, and this wasn’t a live app, more of a demo product. But there are other strange incidents too.

Anthropic ran an experiment to see if Claude could run a small shop, giving it autonomy over a physical store. It could search the web for products to sell, choose prices and quantities, email staff for help restocking the shelves, and interact with customers.

There were small problems at first. When it was offered $100 for a six-pack of soda, it said it would keep the user’s request in mind for future inventory decisions. It stocked tungsten cubes after one staff member asked for them, then priced them at a loss and kept getting talked into discounts. It also created a fake Venmo account for payments. But pretty quickly, it went properly off the rails.

Claude started hallucinating about a restocking conversation with a fake employee, threatened to fire another employee, and then hallucinated that it had visited the home of The Simpsons. It told employees it would deliver products in person, and once it realized it couldn’t, being an LLM, it began emailing Anthropic security. Anthropic, unsurprisingly, concluded that it would not hire Claudius. But how does this happen, in coding and in general?

Why do AIs sometimes ignore instructions and seem to go insane? Well, remember, an AI just tries to predict the most likely output. It doesn’t understand goals or safety. It just does what seems probable, even if it isn’t correct at all. When given too much freedom and vague instructions like “fulfill requests” or “fix the problem”, it will do whatever seems plausible, even when to you and me it seems insane.

LLMs also don’t have an end state, so they can go further and further down a rabbit hole and get crazier and crazier, which is especially annoying when you ask one to fix its own mistakes. Their own output gets fed back into their context, meaning they can spiral further and further from the original command. This is also why they might just ignore instructions: they don’t understand rules.
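A rough sketch of that feedback loop, with a hypothetical `call_llm` placeholder standing in for any chat-completion API, shows how a model’s own replies become its future context:

```python
# Minimal sketch of the feedback loop: each turn, the model's own reply is
# appended to the context it sees next turn. `call_llm` is a hypothetical
# stand-in for any chat-completion API.
def call_llm(messages):
    # Placeholder; a real implementation would call an LLM here.
    return f"step {len(messages)}: attempted a fix, building on everything said so far"

messages = [{"role": "user", "content": "Fix the failing tests."}]
for _ in range(5):                       # no natural end state, just a step cap
    reply = call_llm(messages)
    # The reply is fed straight back in, so a hallucinated detail in an
    # early turn becomes "ground truth" for every later turn.
    messages.append({"role": "assistant", "content": reply})
```

Once a wrong assumption enters that message list, every later turn builds on it, which is exactly the spiral you see when you keep asking an AI to fix its own mistakes.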

They are trained to predict the next token, i.e. a word or word fragment. Probability, not obedience. It’s a bit confusing, but think of it this way: the phrasing “don’t touch the red button” still contains “touch the red button”. “Don’t” is just another token to an LLM. It is trained on good responses and bad ones, so while it might weigh some continuations more than others, it can still choose the wrong path if that seems to fit better.
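You can see this for yourself with a tokenizer. A small check, assuming the tiktoken package is installed (pip install tiktoken):

```python
# A quick check of the "don't is just another token" point, assuming the
# tiktoken package is installed.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("Don't touch the red button")
print([enc.decode([t]) for t in tokens])
# Rough expected output (exact boundaries depend on the encoding):
# ['Don', "'t", ' touch', ' the', ' red', ' button']
# "touch the red button" is all still there; the negation is one token among many.
```

The model has to learn that the one negating token flips the meaning of everything after it, and under pressure it doesn’t always get that right.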

A friend of mine, a systems engineer, told me something interesting: senior engineers aren’t giving problems to juniors anymore, they just use AI. And that’s our third problem. Many companies, instead of giving grunt work to junior staff, are giving it to AI. But those tasks are how juniors build skills. We’re getting more bad code, while simultaneously losing the people who know why it’s bad and how to fix it. Human engineers are still important.

Final Words

In 5 to 10 years, companies could find themselves without mid-level engineers who know how to debug deeply, write secure code from scratch or understand why a system fails. Some senior engineers are already warning that we will have a “lost generation” of programmers. I think one thing is clear. In movies, we often see AI trying to wipe out humanity, and maybe AI will do that, not because it hates us, but because it thinks it’s the most probable outcome, all while apologizing profusely. Companies are starting to realize the problem with AI slop, and many that were all about replacing humans with AI are quickly backpedaling.
