AI models are getting more powerful, but the data they’re trained on is getting worse, says Intuition founder Billy Luedtke.
As AI systems grow more pervasive, users are increasingly running into limitations that are hard to fix. While the models keep improving, the underlying data they are trained on does not. What is more, recursion, where AI models train on data generated by other AI, may actually be making that data worse.
To talk about the future of AI, crypto.news spoke to Billy Luedtke, founder of Intuition, a decentralized protocol focused on bringing verifiable attribution, reputation, and data ownership to AI. Luedtke explains why the current data sets for AI are fundamentally flawed and what can be done to fix it.
Crypto.news: Everyone right now is focused on AI infrastructure — GPUs, energy, data centers. Are people underestimating the importance of the trust layer in AI? Why is it important?
Billy Luedtke: 100%. People are definitely underestimating it — and it matters for several reasons.
First, we’re entering what I call a “slop-in, slop-out” era. AI is only as good as the data it consumes. But that data — especially from the open web — is largely polluted. It’s not clean. It’s not reflective of human intention. Much of it comes from gamified behavior online: likes, reviews, engagement hacks — all filtered through attention-optimized algorithms.
So when AI scrapes the internet, what it sees isn’t a holistic picture of who we are. It’s seeing people playing the platform. I don’t behave the same way on Twitter as I do in real life. None of us do. We’re optimizing for the algorithm — not expressing genuine thought.
It’s recursive, too. The platforms train us, and we feed more distorted behavior back in. That creates a feedback loop — a spiral — that distorts AI’s perception of humanity even more. We’re not teaching it what we think; we’re teaching it what we think will get likes.
Second, the average user isn’t Googling, comparing sources, or thinking critically. They’re just asking ChatGPT or another model and taking the response at face value.
That’s dangerous. If the model is opaque — a black box — and the company that controls it also controls what information you’re shown or not shown, then that’s total narrative control. It’s centralized, unaccountable, and extremely powerful.
Imagine asking Grok for the best podcast, and the answer is whoever paid Elon the most. That’s not intelligence — it’s just advertising in disguise.
CN: So how do we fix that? How do we build systems that prioritize truth and value instead of engagement?
BL: We need to flip the incentives. These systems should serve people — not institutions, not shareholders, not advertisers. That means building a new layer for the internet: identity and reputation primitives. That’s what we’re doing at Intuition.
We need verifiable attribution: who said what, when, and in what context. And we need a portable, decentralized reputation that helps determine how much we can trust any given source of data — not based on vibe, but on actual contextual track record.
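To make that concrete, here is a minimal sketch of what an attribution record with contextual reputation weighting could look like. The types, field names, and scores are hypothetical illustrations, not Intuition's actual schema.

```typescript
// Hypothetical sketch: a verifiable attribution record plus a contextual
// reputation score used to weight a piece of data before a model consumes it.
// None of these names come from the Intuition protocol itself.

interface Attestation {
  author: string;        // decentralized identifier (DID) of the claimant
  claim: string;         // the statement being made
  context: string;       // domain the claim belongs to, e.g. "medicine"
  timestamp: number;     // when it was made (Unix ms)
  signature: string;     // author's signature over the claim, for attribution
}

interface ReputationLedger {
  // Contextual track record: a score per (author, context) pair, 0..1.
  score(author: string, context: string): number;
}

// Weight a claim by the author's reputation *in that context*, so being
// credible in one domain does not transfer to another automatically.
function weightedTrust(a: Attestation, ledger: ReputationLedger): number {
  return ledger.score(a.author, a.context);
}

// Example: an in-memory ledger standing in for a decentralized reputation graph.
const demoLedger: ReputationLedger = {
  score: (author, context) =>
    author === "did:example:alice" && context === "medicine" ? 0.9 : 0.1,
};

const claim: Attestation = {
  author: "did:example:alice",
  claim: "Ibuprofen should be taken with food.",
  context: "medicine",
  timestamp: Date.now(),
  signature: "0x…", // produced by the author's key in a real system
};

console.log(weightedTrust(claim, demoLedger)); // 0.9
```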
Reddit is a perfect example. It’s one of the largest sources of training data for models. But if a user sarcastically says, “Just k*** yourself,” that can get scraped and show up in a model’s recommendation to someone asking for medical advice.
That’s horrifying — and it’s what happens when models don’t have context, attribution, or reputation weighting. We need to know: Is this person credible in medicine? Are they reputable in finance? Is this a trusted source, or just another random comment?
CN: When you talk about attribution and reputation, this data needs to be stored somewhere. How do you think about that in terms of infrastructure — especially with issues like copyright and compensation?
BL: That’s exactly what we’re solving at Intuition. Once you have verifiable attribution primitives, you know who created what data. That allows for tokenized ownership of knowledge — and with that, compensation.
So instead of your data living on Google’s servers or OpenAI’s APIs, it lives on a decentralized knowledge graph. Everyone owns what they contribute. When your data gets traversed or used in an AI output, you get a share of the value it generates.
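A rough sketch of the compensation side of that idea, assuming a simple pro-rata split. The function and weighting below are illustrative, not the protocol's actual mechanism.

```typescript
// Hypothetical sketch: when an AI answer draws on contributions from a
// knowledge graph, split the value it generates among the contributors
// proportionally to how much the answer relied on each one.

interface Contribution {
  owner: string;   // contributor's identifier
  weight: number;  // relative share of the answer attributable to this node
}

function splitRevenue(revenue: number, used: Contribution[]): Map<string, number> {
  const total = used.reduce((sum, c) => sum + c.weight, 0);
  const payouts = new Map<string, number>();
  for (const c of used) {
    const share = total > 0 ? (c.weight / total) * revenue : 0;
    payouts.set(c.owner, (payouts.get(c.owner) ?? 0) + share);
  }
  return payouts;
}

// Example: an answer worth 1.00 unit traversed three contributions.
console.log(
  splitRevenue(1.0, [
    { owner: "alice", weight: 2 },
    { owner: "bob", weight: 1 },
    { owner: "alice", weight: 1 },
  ]),
); // Map { "alice" => 0.75, "bob" => 0.25 }
```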
That matters because right now we’re digital serfs. We spend our most valuable resources — time, attention, and creativity — generating data that someone else monetizes. YouTube isn’t valuable because it hosts videos; it’s valuable because people curate it. Without likes, comments, or subscriptions, YouTube is worthless.
So we want a world where everyone can earn from the value they generate — even if you’re not an influencer or extrovert. If you’re consistently early to finding new artists, for example, your taste has value. You should be able to build a reputation around that and monetize it.
CN: But even if we get transparency, these models are still really hard to interpret. OpenAI itself can’t fully explain how its models make decisions. What happens then?
BL: Great point. We can’t fully interpret model behavior — they’re just too complex. But what we can control is the training data. That’s our lever.
I’ll give you an example: I heard about a research paper where one AI was obsessed with owls and another was great at math. The math model was trained on the owl-loving model’s outputs, but only on math-related tasks. By the end, the math AI had also started loving owls, just from absorbing the pattern from the other.
It’s crazy how subliminal and subtle these patterns are. So the only real defense is intention. We need to be deliberate about what data we feed these models. We need to “heal ourselves,” in a sense, and show up online more authentically and constructively, because AI will always reflect the values and distortions of its creators.
CN: Let’s talk business. OpenAI is burning cash. Their infrastructure is extremely expensive. How can a decentralized system like Intuition compete — financially and technically?
BL: We have two core advantages. The first is composability and coordination.
Decentralized ecosystems — especially in crypto — are incredibly good at coordination. We’ve got global, distributed teams all working on different components of the same larger problem. Instead of one company burning billions fighting the world, we’ve got hundreds of aligned contributors building interoperable tools.
It’s like a mosaic. One team works on agent reputation, another on decentralized storage, another on identity primitives — and we can stitch those together.
That’s the superpower.
The second advantage is user experience. OpenAI is locked into its moat. They can’t let you port your context from ChatGPT to Grok or Anthropic — that would erode their defensibility. But we don’t care about vendor lock-in.
In our system, you’ll be able to own your context, take it with you, and plug it into whichever agent you want. That makes for a better experience. People will choose it.
CN: What about infrastructure costs? Running large models is extremely expensive. Do you see a world where smaller models run locally?
BL: Yes, 100%. I actually think that’s where we’re headed — toward many small models running locally, connected like neurons in a distributed swarm.
Instead of one big monolithic data center, you’ve got billions of consumer devices contributing compute. If we can coordinate them — which is what crypto excels at — that becomes a superior architecture.
And this is why we’re also building agent reputation layers. Requests can be routed to the right specialized agent for the job. You don’t need one massive model to do everything. You just need a smart system for task routing — like an API layer across millions of agents.
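As a sketch of what that routing layer might do, here is a minimal, hypothetical example: pick whichever agent has the best contextual track record for the task at hand. The agent names, endpoints, and scores are made up for illustration.

```typescript
// Hypothetical sketch: route a request to whichever specialized agent has
// the best reputation for that kind of task. A real system would sit behind
// an API layer spanning many independently operated agents.

interface AgentInfo {
  endpoint: string;                    // where the agent can be reached
  reputation: Record<string, number>;  // per-domain score, 0..1
}

function routeTask(domain: string, agents: AgentInfo[]): AgentInfo | undefined {
  return agents
    .filter((a) => (a.reputation[domain] ?? 0) > 0)
    .sort((a, b) => (b.reputation[domain] ?? 0) - (a.reputation[domain] ?? 0))[0];
}

// Example: a small swarm of agents with different specialties.
const swarm: AgentInfo[] = [
  { endpoint: "https://agent-a.example", reputation: { code: 0.9, legal: 0.2 } },
  { endpoint: "https://agent-b.example", reputation: { legal: 0.8 } },
  { endpoint: "https://agent-c.example", reputation: { code: 0.6, medicine: 0.7 } },
];

console.log(routeTask("legal", swarm)?.endpoint); // https://agent-b.example
```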
CN: What about determinism? LLMs aren’t great for tasks like math, where you want exact answers. Can we combine deterministic code with AI?
BL: That’s what I want. We need to bring determinism back into the loop.
We started with symbolic reasoning — fully deterministic — and then we swung hard into deep learning, which is nondeterministic. That gave us the explosion we’re seeing now. But the future is neurosymbolic — combining the best of both.
Let the AI handle the fuzzy reasoning. But also let it trigger deterministic modules — scripts, functions, logic engines — where you need precision. Think: “Which of my friends likes this restaurant?” That should be 100% deterministic.
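A minimal sketch of that split, with hypothetical names and data: the model would parse the question, then hand the lookup to a deterministic function like this rather than generating the answer itself.

```typescript
// Hypothetical sketch: the model handles the fuzzy interpretation, then calls
// a deterministic module for the part that has one exact answer. This tool
// answers "which of my friends like this restaurant?" with plain set logic.

interface Friend {
  name: string;
  likedRestaurants: Set<string>;
}

// Deterministic module: the same inputs always produce the same output.
function friendsWhoLike(restaurant: string, friends: Friend[]): string[] {
  return friends
    .filter((f) => f.likedRestaurants.has(restaurant))
    .map((f) => f.name)
    .sort();
}

// Example: data an agent would fetch from the user's own social graph.
const friends: Friend[] = [
  { name: "Ana", likedRestaurants: new Set(["Noma", "Joe's Pizza"]) },
  { name: "Ben", likedRestaurants: new Set(["Joe's Pizza"]) },
  { name: "Cal", likedRestaurants: new Set(["Sushi Dai"]) },
];

console.log(friendsWhoLike("Joe's Pizza", friends)); // [ "Ana", "Ben" ]
```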
CN: Zooming out: we’ve seen companies integrate AI across their operations. But results have been mixed. Do you think the current generation of LLMs truly boosts productivity?
BL: Absolutely. The singularity is already here — it’s just unevenly distributed.
If you’re not using AI in your workflow, especially for code or content, you’re working at a fraction of the speed others are. The tech is real, and the efficiency gains are massive. The disruption has already happened. People just haven’t fully realized it yet.
CN: Final question. A lot of people are saying this is a bubble. Venture capital is drying up. OpenAI is burning money. Nvidia’s financing its own customers. How does this end?
BL: Yes, there’s a bubble — but the tech is real. Every bubble pops, but what’s left afterward are the foundational technologies. AI is going to be one of them. The dumb money — all those wrapper apps with no real innovation — that’s getting flushed. But deep infrastructure teams? They’ll survive.
In fact, this could go one of two ways: We get a soft correction and come back to reality, but progress continues. Or productivity gains are so immense that AI becomes a deflationary force on the economy, and GDP could grow 10x or 100x. If that happens, the spending was worth it — we level up as a society.
Either way, I’m optimistic. There’ll be chaos and job displacement, yes — but also the potential for an abundant, post-scarcity world if we build the right foundation.


