When AI “Solves” Math, Check the Hype First

When AI "Solves" Math, Check the Hype First - Professional coverage

According to MIT Technology Review, in mid-October 2025, OpenAI research scientist Sébastien Bubeck posted on X that the firm’s latest large language model, GPT-5, had been used to find solutions to 10 unsolved mathematics problems from a list of more than 1,100 known as Erdős problems. “Science acceleration via AI has officially begun,” he declared. However, Thomas Bloom, the mathematician who maintains erdosproblems.com, the website tracking these puzzles, quickly disputed the claim on X. Bloom clarified that a problem being listed as open on the site doesn’t mean no solution exists, only that he, a single person, wasn’t aware of one. It turned out GPT-5 hadn’t created new solutions at all; it had found 10 existing ones, buried in the vast literature, that Bloom hadn’t yet cataloged.

The hype trap

Here’s the thing: this is a perfect microcosm of what’s broken in AI discourse right now. A researcher at a leading lab makes a breathtaking, world-changing announcement… on social media. The immediate reaction is celebration and viral amplification. The necessary gut check—the peer review, the skepticism—comes later, as an afterthought in the replies. And by then, the narrative of a monumental breakthrough is already out there. Bubeck’s post framed it as AI pushing the boundaries of human knowledge. The reality was more mundane: AI is really, really good at searching through stuff humans have already written. Both are impressive, but they’re fundamentally different achievements. One is about creation, the other about discovery. Conflating them isn’t just sloppy; it actively misleads the public about what this technology can currently do.

The actual breakthrough

But let’s not throw the baby out with the bathwater. Because the second takeaway is arguably cooler, if you’re not blinded by the initial hype. GPT-5’s ability to trawl through “millions of mathematics papers,” as Bloom noted, and surface specific, obscure solutions that a dedicated expert hadn’t found is genuinely amazing. Think about it. The model basically did in moments what would take a human researcher weeks of literature review. As mathematician François Charton pointed out, this is where LLMs could be revolutionary for fields like math: as super-powered research assistants that can navigate the entire corpus of human knowledge. That’s a tool that accelerates science in a very real, practical way. It’s just not as sexy as “AI solves unsolved problems.” So why do we always lead with the sexier, wronger claim?

A culture problem

This incident points to a deeper cultural problem in tech, especially in the fiercely competitive AI race. There’s immense pressure to announce “firsts” and demonstrate dominance. Social media, with its reward system of likes and retweets, is the perfect engine for this kind of boosterism. It encourages the grand statement, not the nuanced caveat. The result? We get a distorted view of progress. Real, incremental advances, like a model getting better at information retrieval, get overshadowed by breathless claims that artificial general intelligence is just around the corner. It erodes trust. And it sets everyone up for a crash when the promised magic doesn’t materialize. The fix is simple, but hard: less knee-jerk posting, more thorough checking. Or, as Bloom’s crisp correction showed, sometimes the real breakthrough is just someone taking the time to say, “Wait, that’s not quite right.”
