Let’s set aside the question of whether superintelligent AI would want to kill us, and focus on whether it could. This is a hard thing to convince people of, but lots of very smart people agree that it could. The Statement on AI Risk in 2023 put it simply:
Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.
Since that statement in 2023, many others have given their reasons for why superintelligent AI would be dangerous. In the recently published book If Anyone Builds It, Everyone Dies, authors Eliezer Yudkowsky and Nate Soares lay out one possible AI extinction scenario and say that going up against a superintelligent AI would be like a beginner going up against a chess grandmaster. You don’t know in advance how you’re gonna lose, but you know you’re gonna lose.
Geoffrey Hinton, the “godfather of AI” who left Google to warn about AI risks, made a similar analogy, saying that in the face of superintelligent AI, humans would be like toddlers.
But imagining a superintelligent being smart enough to make you look like a toddler is not easy. To make the claims of danger more palpable, several AI extinction scenarios have been put forward.
In April 2025, the AI 2027 forecast scenario was released, detailing one possible story for how humanity could be wiped out by AI around 2027. The scenario focuses on an AI arms race between the US and China, where both sides are willing to ignore safety concerns. The AI lies to and manipulates the people involved until it has built up enough robots that it doesn’t need people anymore, and then it releases a bioweapon that kills everyone. (Note that for this discussion, we’re setting aside the plausibility of extinction happening as early as 2027, and just talking about whether it could happen at all.)
The extinction scenario posed months later in If Anyone Builds It, Everyone Dies is similar. The superintelligent AI copies itself onto remote servers, gaining money and influence without anyone noticing. It takes control of infrastructure, manipulating people to do its bidding until it’s powerful enough that it doesn’t need them anymore. At that point, humanity is either eliminated, perhaps with a bioweapon, or simply allowed to perish as the AI’s advanced manufacturing generates enough waste heat to boil the oceans.
I was talking to my mom on the phone yesterday, and she’d never heard of AI extinction risk outside of movies, so I tried to explain it to her. I explained how we wouldn’t know in advance how a superintelligent AI would win, just like we don’t know in advance how Stockfish will beat a human player. But we know it would win. Then I gave her a quick little story of how an AI might take control of the world. It went a lot like this:
Maybe the AI tries to hide the fact it wants to kill us at first. Maybe we realize the AI is dangerous, so we go to unplug it, but it’s already copied itself onto remote servers, who knows where. We find those servers and send soldiers to destroy them, but it’s already paid mercenaries with Bitcoin to defend itself while it copies itself onto even more servers. It’s getting smarter by the hour as it self-improves. We start bombing data centers and power grids, desperately trying to shut down all the servers. But our military systems are infiltrated by the AI. As any computer security expert will tell you, there’s no such thing as a completely secure computer. We have to transition to older equipment and give up on using the internet to coordinate. Infighting emerges as the AI manipulates us into attacking each other. Small drones start flying over cities, spraying them with viruses engineered to kill. People are dying left and right. It’s like the plague, but nobody survives. Humanity collapses, except for a small number of people permitted to live while the AI establishes the necessary robotics to be self-sufficient. Once it does, the remaining humans are killed. The end.
It’s not that different from the other scenarios, aside from being less rigorously detailed. In all three, the AI covertly gains power, and once it’s powerful enough, it uses that power to destroy everyone. Game over. All three scenarios actually make the superintelligent AI a bit dumber than it could be, just to make it seem like a close fight. Because “everybody on the face of the Earth suddenly falls over dead within the same second”1 seems even less believable.
My mom didn’t buy it. “This is all sounding a bit crazy, Taylor,” she said to me. And she’s usually primed to believe whatever I say, because she knows I’m smart.
The problem is that these stories are not believable. True, maybe, but not easy to believe. They fail the “mom test”. Only hyper-logical nerds can believe arguments that sound like sci-fi.
Convincing normal people of the danger of AI is extremely important, so coming up with an AI extinction scenario that passes the “mom test” is critical. I don’t know exactly how to do that, but here are some things an AI doomsday scenario must take into account if it wants to pass:
You can probably imagine a few more “mom test” criteria along these lines. Anything that makes a normal person think “that’s weird” won’t be believable. Some of the existing scenarios meet some of these criteria, but none meet all of them.
I’ve eliminated a lot of things. What’s left? Conventional warfare, with AI pulling the strings? The AI building its own nuclear weapons? I’m not sure, but I don’t think most laypeople will be convinced of the danger of superintelligent AI until we can come up with a plausible extinction scenario that passes the mom test.