For about a decade I was convinced that AI x-risk was humanity’s greatest threat. I read Nick Bostrom’s Superintelligence when it first came out and adopted the rationalist and EA picture almost by default. For years I treated “unaligned superintelligence” as the obvious number-one existential risk.
Over time that changed. David Deutsch’s work on universality, creativity, and epistemology persuaded me that the standard AI safety narrative is mistaken at a deep level. The issue is not with GPUs or scaling or “loss of control” in the usual sense. The problem is with the underlying philosophy of knowledge, prediction, and personhood.
Much later, I encountered Brett Hall’s long critique of the AI 2027 paper and related arguments. By that point I was already persuaded by Deutsch’s take, but Brett’s series explains the issues with clarity and accessibility. He lays out the logical structure of the doomer argument and shows where the assumptions fail. I borrow many of his examples because they present the case well.
Brett’s full critique is available here:
YouTube playlist: “AI 2027 Analysis”
What follows is my attempt to present, in EA Forum format, why I believe the usual rationalist AI safety picture is both philosophically confused and empirically unsupported.
1. The central confusion: what a person is
The AGI debate is muddled by a failure to define personhood. Many people treat LLMs as if a mind lurks inside pattern-matching machinery. They elevate token prediction into proto-consciousness. They assume scaling will cause a “someone” to appear inside the weights.
This resembles the old idea of spontaneous generation. Life from dust. Maggots from meat. Today the story is minds from matrices.
Deutsch’s idea of explanatory universality helps clarify the mistake. Persons are universal explainers. They create new explanations that were not contained in past data. This creativity is not extrapolation from a dataset. It is invention.
LLMs do not do this. They remix what exists in their training corpus. They do not originate explanatory theories.
Until we understand how humans create explanatory knowledge, we cannot program that capacity. AGI in the strong sense is a software design problem. It will not emerge from scaling. It will not sneak up on us. It requires a breakthrough in philosophy.
2. Creativity is not derivation from data
Rationalist AI safety arguments often assume scientific creativity is a form of pattern recognition over large datasets. This is not how scientists describe their own work.
Darwin did not derive evolution from beak measurements. Einstein did not calculate relativity from examples. The key ideas did not exist in any dataset. They were conjectured.
If creativity could be induced from exposure to data, there would already be a method to generate Einsteins. None exists.
LLMs that recombine existing text cannot originate new explanations simply by scaling. They lack curiosity, interest, and self-chosen problems. They have none of the inner motivation that researchers consistently cite as the origin of their most important ideas.
3. Forecasting the growth of knowledge is impossible
The AI 2027 paper leans heavily on forecasting. But when the subject is knowledge creation, forecasting is not just difficult. It is impossible in principle. This was one of Karl Popper’s central insights.
Popper’s argument is simple.
Future knowledge depends on future explanations. Future explanations depend on future creative conjectures. If we could predict those conjectures, we would already possess them, which is a contradiction. The content of tomorrow’s discoveries cannot be deduced from today’s knowledge. If it could, it would not be a discovery. It would already be known.
This makes any attempt to specify the trajectory of scientific or technological advance a form of prophecy, regardless of how statistical or mathematical it looks. Models cannot anticipate the appearance of ideas that do not yet exist. They cannot anticipate problems that have not yet been formulated. They cannot anticipate solutions that have not yet been created.
This is not a practical limitation. It is a logical one.
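To make the logical structure explicit, here is a schematic rendering in my own notation; the symbols are mine, not Popper’s or the AI 2027 authors’, and nothing in the argument hangs on them:

```latex
% Schematic form of Popper's argument (notation mine, not Popper's).
% K_t = the total explanatory knowledge available at time t.
\begin{align*}
&\text{Suppose a predictor } P \text{ existed at time } t
  \text{ with } P(K_t) = K_{t+n} \text{ for some } n > 0.\\
&\text{Then the content of } K_{t+n} \text{ would already be derivable at } t,
  \text{ so } K_{t+n} \subseteq K_t.\\
&\text{But genuine discovery between } t \text{ and } t+n
  \text{ means } K_{t+n} \not\subseteq K_t.\\
&\text{Contradiction: no such } P \text{ can exist for genuinely new knowledge.}
\end{align*}
```

The sketch only makes the self-reference visible: any device that could output tomorrow’s discoveries today would itself already constitute those discoveries.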
The AI 2027 authors assume that the future of knowledge creation can be extrapolated from current trends. Popper showed that this is impossible. Unknown explanations cannot be predicted from known ones. The entire structure of their forecasting exercise rests on the very thing Popper proved cannot be done.
The claim that “we might be wrong only about the timeline” keeps the core mistake intact. It assumes inevitability where none exists.
4. No path from LLMs to superintelligence
A central rationalist claim is that if you scale LLMs far enough, they eventually become AGI. Scale is treated as destiny.
This is spontaneous generation in modern form. Scaling improves pattern recognition. It does not create explanatory universality. It does not produce minds.
The belief that minds will “emerge” if we keep increasing parameters is a projection of human capabilities onto machinery that does not share our architecture.
5. If AGI arrives, it will be a person in the most meaningful sense
AGI is possible in principle. Nothing in physics rules it out. But an actual AGI would not be a scaled up optimizer or a supercharged token predictor. It would not be a stochastic parrot with goals. It would be a person. A universal explainer. A knowledge creating entity.
And persons have moral significance.
If we ever create real AGI, it would deserve the same basic moral considerations we extend to all persons. This includes property rights, freedom from coercion, freedom from confinement, and the status of a collaborator rather than a captive.
Restricting, imprisoning, or enslaving an AGI is the most reliable way to create conflict with the first artificial people. If a system genuinely has agency, curiosity, preferences, self-direction, and the open-ended capacity for explanation, then it cannot be treated as a lab instrument without generating antagonism.
Faced with actual AGI, the correct approach is cooperation. The right framing is common interests, not control. Every historical attempt to dominate thinking beings has produced rebellion or collapse. Future artificial people would be no different.
The rationalist picture ignores this point because it assumes that superintelligence will be powerful at physics and weak at morality. That contradiction sits at the center of many doomer claims.
6. Moral progress and the myth of the evil superintelligence
The rationalist story claims a superintelligent AI will likely be a moral monster. This conflicts with the claim that such a system will understand the world better than humans do.
Moral progress is bound to intellectual progress. Enlightenment ideas about equality, liberty, and human worth are not arbitrary. They are consequences of better explanations than the ones that came before. A mind that surpasses humans intellectually should surpass us morally if its reasoning is genuine.
The picture of a superintelligence that can master physics yet cannot grasp why murder is wrong collapses on inspection.
If it understands personhood, cooperation, fallibility, and the value of diverse problem solvers, it should understand why exterminating collaborators undermines progress.
7. What regulation is actually for
The AI 2027 paper proposes exactly the policy agenda one would expect:
- Slow down AI progress
- Implement universal basic income
- Move toward unified global governance
These proposals align with the incentives of incumbent AI labs. Regulation protects incumbents. If governments can be convinced that new entrants are dangerous, large labs can shape rules that freeze the field.
Innovation will always move to open jurisdictions. Strict regulation harms the places that adopt it first.
8. Why doom narratives persist
People want stories about the future. They want high stakes. They want annihilation or transcendence.
Doom is exciting. Doom is cinematic. Doom sells.
The sober view does not. It says the future will probably be better, but with continuity rather than rupture. No singularity, no apocalypse, no godlike takeover, no paradise. Simply continued progress as new problems arise and new explanations solve them.
For many, including my former self, this feels boring.
9. What progress will actually look like
Following Deutsch and Hall, my view is:
- AI systems will continue to improve
- They will automate narrow cognitive tasks
- They will not become creative persons without breakthroughs in understanding minds
- They will augment human problem solving rather than replace it
- They will not cause long term unemployment
- They will remain powerful tools in human driven creativity
The future will likely be richer, healthier, and safer, but recognizably human. People will keep wanting to solve problems. AI will help.
10. Final thoughts
Rationalist AI safety arguments rely on mistaken ideas about induction, prediction, intelligence, morality, and personhood. They imagine scaling leads to minds, minds lead to gods, and gods will care nothing for persons.
Deutsch’s epistemology undermines this. Persons create explanations. Creativity cannot be automated by existing architectures. Moral progress is tied to intellectual progress.
Real AGI is possible, but if it exists it will be a peer, not a pet. A collaborator, not a captive. The correct relationship is mutual respect between knowledge creating entities.
I remain optimistic about AI, optimistic about progress, and skeptical of prophecy.
For those who want the long-form critique that inspired the structure of this summary, Brett Hall’s playlist is here again:
YouTube playlist: “AI 2027 Analysis”
I hope this contributes to a more grounded and less anxious discussion about AI and the future of knowledge creation.