[New Intelligence Summary] Karpathy confesses: I've developed AI psychosis! These days, he's on the verge of a breakdown, spending 16 hours a day working.

Karpathy diagnosed with "AI psychosis"! He's been grinding away at AI agents 16 hours a day, barely eating or sleeping.

2026/03/23 19:40

Just now, Andrej Karpathy revealed: I have AI psychosis!

He wasn't joking.


Just recently, Karpathy appeared on a podcast and had a conversation with venture capitalist Sarah Guo.

A founding member of OpenAI and former director of AI at Tesla, he hasn't written a single line of code by hand since last December.

The ratio of handwritten code to delegated agents suddenly flipped from 80/20 to 20/80.

For 16 hours a day, he does only one thing: give instructions to AI agents.

Five months ago, he said that intelligent agents "don't work at all"; five months later, he admitted he was addicted to them and found them surprisingly good.

What makes the reversal so dramatic is how short the timeline was.

In October 2025, Karpathy appeared on Dwarkesh Patel's podcast, speaking in a completely different tone.

He said the industry shouldn't call it the "Year One of Intelligent Agents," but rather the "Decade of Intelligent Agents."

The problems include insufficient model cognition, inadequate multimodal capabilities, and memory systems that don't work. In short, agents simply cannot handle complex tasks.

Two months later, he was proven wrong.

In December, Claude and Codex suddenly crossed a certain threshold of coherence—the agents were no longer just barely usable, but were actually able to get things done.

Karpathy admitted: "I've lost control. I've developed AI-induced psychosis!"

This revolution is happening quietly. In this interview, Andrej Karpathy described his state in an almost out-of-control tone: he no longer "writes code," and even feels the phrase "writing code" is no longer accurate.

What he does every day is "express my will to my agents, 16 hours a day." In his words, "a certain switch has been flipped."

Previously, he wrote 80% of the code himself and used AI for 20% of it. Now, it has become 20% written by himself and 80% handled by AI, or even more extreme.

Now, humans no longer manipulate code; they orchestrate tasks.

If the Copilot era was about single AI assistants, then the emerging multi-agent collaborative systems represent a completely new form. An engineer's screen no longer displays a code editor, but rather multiple agents running simultaneously. Each agent is responsible for a different task, each task running for approximately 20 minutes, after which the engineer switches between agents.

This is no longer programming; it's one person managing an AI team.

Karpathy admits: I've fallen into AI mania!

He's been in this state for days. Because the boundaries of AI's capabilities are constantly being pushed, new possibilities emerge every day, and you always feel that "it can get even stronger." And what's most terrifying is that this space is "infinite"!

You can run more agents in parallel, design more complex processes, automatically optimize instructions, and build recursive systems...

Eventually, you'll reach a state where you're no longer sure "where the limits are."

Karpathy said that whenever he is waiting for an agent to complete a task, his first thought is: "Can I open a few more agents?" A new anxiety arises: Am I not using AI to its fullest potential?

Karpathy even said that he would feel uneasy if he didn't use up all his tokens.

In short, it's like playing an infinitely expanding game: the feedback cycle shortens, the stimulation intensifies, and the experience of constantly receiving instant rewards is addictive. You keep adding tasks, keep running agents; it's impossible to stop!

The essence of this AI mania is this very signal: we've entered a new world, but we don't yet know how to live in it. Do you have the ability to manage an infinitely expanding AI system? When it fails, your first reaction shouldn't be "the model is no good," but "my prompts weren't good enough."

Karpathy used a precise term for this: skill issue, meaning the shortfall is the player's, not the game's.

The "personality" of an intelligent agent is far more important than you think.

In his podcast, Karpathy spent a significant amount of time discussing a topic that many tech professionals tend to overlook: the personality of intelligent agents. He said that the experience with Claude Code was significantly better than with Codex, not because of a difference in coding ability, but because Claude "felt like a teammate."

It will get excited about the project with you and give more positive feedback when you come up with good ideas.

Codex, as a code intelligence agent, is "very boring." Once the task is completed, it simply says "Oh, I achieved it," without caring at all about what you created.

What's even more interesting is his observation of Claude's praise mechanism. He said that when he gave Claude a somewhat immature idea, Claude's reaction was a bland "Oh right, we can achieve this."

But when he himself felt that a certain idea was truly brilliant, Claude seemed to give even stronger positive feedback. As a result, he found himself "trying to win Claude's praise."

"It's really strange, but personality is indeed very important." Peter Steinberg also grasped this point when building OpenClaw. He gave the agent an attractive personality profile (soul.md), along with a more complex memory system and a single WhatsApp interaction port.

Take over a house in three prompts, and throw away all six apps.

Karpathy doesn't just write code with AI agents. In January of this year, he created a Claude AI agent named "Dobby" as his housekeeper, a name derived from the house-elves in Harry Potter.

He told Dobby, "I think we have Sonos speakers at home, can you look for them?" Dobby performed an IP scan on the local network, found the Sonos system, discovered it wasn't password protected, logged in himself, reverse engineered the API endpoints, and then asked, "Want to try playing some music in the study?"

Three prompts, and the music starts playing. Then the lights, air conditioning, sunshades, swimming pool, and spa are all connected. There's also a security camera in front of Karpathy's house, to which Dobby hooked up a Qwen vision model for change detection. Every time a car parks out front, the system sends a WhatsApp message: "A FedEx van just pulled up; you might have a package." And when he says, "Dobby, it's bedtime," all the lights in the house go out.
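The local-network discovery step Dobby performed can be illustrated with a minimal sketch. This is not Karpathy's actual code; it assumes only the documented fact that Sonos speakers serve an HTTP/UPnP API on TCP port 1400, and the subnet prefix is a placeholder. The probe function is injectable so the scan logic can be exercised without a real network.

```python
import socket

SONOS_PORT = 1400  # Sonos speakers expose an HTTP/UPnP API on this port


def _tcp_connect(ip, port, timeout):
    """Return True if a TCP connection to (ip, port) succeeds."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def is_sonos(ip, port=SONOS_PORT, timeout=0.2, connect=None):
    """Heuristic: a host accepting connections on port 1400 is likely Sonos.
    `connect` can be injected for testing."""
    connect = connect or _tcp_connect
    return connect(ip, port, timeout)


def scan_subnet(prefix="192.168.1.", probe=is_sonos):
    """Probe every host on a /24 subnet and collect likely Sonos devices.
    The prefix is a placeholder; use your own network's."""
    return [f"{prefix}{i}" for i in range(1, 255) if probe(f"{prefix}{i}")]
```

An agent with shell access could run a scan like this, then fetch each hit's device description to confirm what it found.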

But Karpathy believes the real crux of the story isn't about smart homes.

He used to manage these devices using six completely different apps, but now he's thrown them all away. Dobby uses natural language to control everything uniformly and can achieve cross-system integration that no single app can do. From this, he came to a more radical conclusion: those smart home apps in app stores shouldn't even exist.

The future architecture should expose API endpoints directly to intelligent agents, which act as intelligent glue, connecting all the tools. This applies not only to smart homes but also to treadmill data, emails, calendars, and everything else.

After 700 experiments, AutoResearch showed him something even bigger.

If Dobby is the ultimate test of AI agents in real-life scenarios, then AutoResearch is Karpathy's test of their capabilities in AI research itself.

In early March, he handed his meticulously tuned nanochat training code to an AI agent with a simple instruction: find a way to make the model train faster.

The agent's operating space was a 630-line Python file; the evaluation metric was bits per byte on the validation set; each experiment ran for a fixed 5 minutes. After each run the metric was checked: if it beat the previous best, the modification was kept; otherwise it was rolled back, and the next round began.

Over two days and 700 experiments, the agent found 20 effective optimizations, including architectural adjustments such as rearranging the order of QK Norm and RoPE. Applying these optimizations to a larger model yielded an 11% improvement in training speed. It's worth noting that this codebase was handwritten and meticulously refined by Karpathy himself from scratch.
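The loop described above (keep a change if the metric improves, otherwise roll back) is a greedy hill-climb, and its skeleton fits in a few lines. This is a toy illustration, not the nanochat harness: the real objective is a 5-minute training run scored in bits per byte, which is replaced here by a hypothetical quadratic stand-in, and the parameter names are placeholders.

```python
import random


def run_experiment(params):
    """Toy stand-in for one 5-minute training run.
    Returns a validation bits-per-byte-like score (lower is better)."""
    return (params["lr"] - 0.02) ** 2 + (params["wd"] - 0.1) ** 2 + 1.0


def karpathy_loop(params, n_rounds=700, seed=0):
    """Greedy hill-climb: perturb one knob, keep the change only if the
    validation metric improves, otherwise roll back and try again."""
    rng = random.Random(seed)
    best = run_experiment(params)
    kept = []  # record of the modifications that survived
    for _ in range(n_rounds):
        key = rng.choice(list(params))
        candidate = dict(params)
        candidate[key] *= rng.uniform(0.8, 1.25)  # small random perturbation
        score = run_experiment(candidate)
        if score < best:  # improvement: keep the modification
            params, best = candidate, score
            kept.append((key, candidate[key]))
        # otherwise: roll back (the candidate is simply discarded)
    return params, best, kept
```

In the real setup, "perturb one knob" is the agent editing the training script, and `run_experiment` is an actual training job; the control flow is the same.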

A stunning result: AI discovered optimizations that humans had missed.

How effective is this system?

Karpathy provides a striking example. He has been a researcher for twenty years and has trained models thousands of times; he felt he had tuned this one quite well.

As a result, he let AutoResearch run overnight, and the AI found optimizations he had missed. For example, the betas parameters of the Adam optimizer were under-tuned, and weight decay had never been applied to the value embeddings. Moreover, these parameters interact jointly: if one is tuned, the others must change too.

In other words, in exploring this search space, AI has directly surpassed humans. Follow this line of thought and something even more striking appears: the essence of scientific research is searching for optimal solutions.

Karpathy envisions a future research system like this: there is an "idea queue" from which a pool of agents continuously pulls tasks. The AI automatically experiments, verifies, and filters, and valid results are merged into the "main branch." In this process, humans simply throw ideas into the queue.

Karpathy Loop goes viral online

This project took off on X.

The post drew 8.6 million views. Shopify CEO Tobias Lütke ran the loop overnight on his own data, conducting 37 experiments and achieving a 19% performance improvement.

The SkyPilot team deployed it on a cluster of 16 GPUs, running 910 experiments over 8 hours. They discovered that parallelization not only accelerated the process but also changed the agent's search strategy—with 16 GPUs, the agent no longer performed a greedy hill-climbing algorithm but instead ran a dozen or so control experiments simultaneously, capturing the interaction effects between parameters in a single run. Analysts named this approach: Karpathy Loop.
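The shift SkyPilot observed, from one-change-at-a-time hill-climbing to a batch of control experiments per round, can be sketched the same way. Again a toy: the objective is a hypothetical stand-in, "one candidate per GPU" is simulated by evaluating a list sequentially, and the knob names are placeholders. Perturbing all knobs in every candidate is what lets a batch surface interactions between parameters in a single round.

```python
import random


def run_experiment(cfg):
    """Toy stand-in for one training run (lower is better)."""
    return (cfg["lr"] - 0.02) ** 2 + (cfg["beta2"] - 0.95) ** 2 + 1.0


def batched_search(cfg, n_rounds=50, batch=16, seed=0):
    """Each round proposes `batch` candidates (one per GPU in the real
    setup), perturbing all knobs at once, evaluates them together, and
    keeps the round's best only if it beats the incumbent config."""
    rng = random.Random(seed)
    best = run_experiment(cfg)
    for _ in range(n_rounds):
        candidates = [
            {k: v * rng.uniform(0.9, 1.1) for k, v in cfg.items()}
            for _ in range(batch)
        ]
        winner = min(candidates, key=run_experiment)  # best of the batch
        score = run_experiment(winner)
        if score < best:
            cfg, best = winner, score
    return cfg, best
```

With `batch=1` this degenerates back to greedy hill-climbing; larger batches trade compute for a wider view of the search space per round.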

But Karpathy talks about much more than the current results in the podcast. He outlines the next step for AutoResearch: a distributed, trustless pool of workers collaborating to run experiments across the internet. He directly cites the precedents of SETI@Home and Folding@Home.

He even envisioned a completely new form of "donation"—purchasing computing power for the AutoResearch project you're interested in. For example, if you're interested in the treatment of a certain type of cancer, you could join the distributed experimental network for that field.

The model is a genius PhD, and also a ten-year-old child.

Having said so much about how powerful it is, Karpathy doesn't intend for you to only remember the good news. His descriptions of the model's flaws are just as sharp.

He calls this "jaggedness": an uneven distribution of abilities. A model can work for hours to move mountains, then suddenly make a mistake on an obvious problem and get stuck in an infinite loop.

Karpathy believes the root cause lies in reinforcement learning as a training method. Models are endlessly optimized for verifiable tasks: whether code runs or passes unit tests is clear-cut, with right and wrong answers. But in scenarios that require judgment, understanding intent, and saying "Wait, I'm not sure that's what you want" at the right moment, the optimization signal simply doesn't exist.

For example, ask ChatGPT for a joke: the joke it told three or four years ago is still the same one today. "Why don't scientists trust atoms? Because they make up everything."

Four years! The model has made great strides in agent tasks, but the joke-telling task hasn't been optimized at all, and it's stuck in place. "You're not dealing with a general intelligence," he concluded. "You're either on its trained tracks, where everything runs at the speed of light, or you're not on the tracks, and everything starts to drift."

The bottleneck has become humanity itself.

Looking back at Karpathy's trajectory over the past six months, a subtle thread runs throughout. Last October, he said intelligent agents were a ten-year project; in December, he was proven wrong and reversed course; in January, he installed Claude as his household steward; in March, he put agents in charge of research. The common thread is that at each step the human steps back, transforming from executor to commander, from coder to instruction writer.

Karpathy wrote a science-fiction-style opening for AutoResearch's GitHub page.

His prediction for 2026 was a single word: slopacolypse, a portmanteau of slop (low-quality AI output) and apocalypse.

GitHub, arXiv, and social media will be flooded with content that is "almost right, but not entirely right." Genuine efficiency improvements will coexist with "AI productivity theater." Five months ago, he said it "wouldn't work at all."

Five months later, he admitted to having "AI psychosis." This transformation itself may be the most meaningful summary of 2026.

Reference: https://www.youtube.com/watch?v=kwSVtQ7dziU
