Pronouns: she/her or they/them.
I got interested in effective altruism back before it was called effective altruism, back before Giving What We Can had a website. Later on, I got involved in my university EA group and helped run it for a few years. Now I’m trying to figure out where effective altruism can fit into my life these days and what it means to me.
The recent anthology Essays on Longtermism, which is open access and free to read here, includes several essays with good criticisms of longtermism. You might find some of them interesting. The contributors are a mix of proponents and critics of longtermism.
This is not necessarily to disagree with any of your specific arguments or your conclusion, but I think for people who have not been extremely immersed in effective altruist discourse for years, what has been happening with effective altruism over the last 5-10 years can easily be mis-diagnosed.
In the last 5-10 years, has EA shifted significantly toward prioritizing very long-term outcomes (i.e. outcomes more than 1,000 years in the future) over relatively near-term outcomes (i.e. outcomes within the next 100 years)? My impression is no, not really.
Instead, what has happened is that a large number of people in EA have come to believe that there’s more than a 50% chance of artificial general intelligence being created within the next 20 years, with many thinking there’s more than a 50% chance of it being created within 10 years. If AGI is created, many people in EA believe there is a significant risk of human extinction (or another really, really bad outcome). "Significant risk" could mean anywhere from 10% to over 50%. People vary on that.
This is not really about the very long-term future. It’s actually about the near-term future: what happens within the next 10-20 years. It’s not a pivot from the near-term to the very long-term, it’s a pivot from global poverty and factory farming to near-term AGI. So, it’s not really about longtermism at all.
The people who are concerned about existential risk from near-term AGI don't think it's only a justified worry if you account for lives in the distant future. They think it's a justified worry even if you only account for people who are already alive right now. The shift in opinion has nothing to do with arguments about longtermism; it comes from people thinking AGI is much more likely much sooner than they previously did, and from them accepting arguments that AGI would be incredibly dangerous if created.
The pivot in EA over the last 5-10 years has also not, in my observation, been a pivot from global poverty and factory farming to existential risk in general, but a pivot specifically to existential risk from near-term AGI.
To put my cards on the table, my own personal view is:
The x-risks you discussed in your post are humans-vs.-humans risks: nuclear war, bioweapons, and humans creating AGI. These are far more complex than risks from nature. Asteroids don't respond to our space telescopes by attempting to disguise themselves to evade detection. But with anything involving humans, humans will always respond to what we do, and that response is always at least somewhat unpredictable.
I still think we should do things to reduce the risk from nuclear war and bioweapons. I'm just saying that these risks are more complex and uncertain than risks from nature, so it's harder to do the cost-effectiveness math that shows spending to reduce them is justified. However, so much in the world can't be rigorously analyzed with that kind of math, so that's not necessarily an argument against it!
As for climate change, I agree it's important, and maybe some people in EA have done some good work in this area — I don't really know — but it seems like there's already so much focus on it from so many people, many of whom are extremely competent, that it's hard to see what EA would contribute by focusing on it. By contrast, global poverty charity effectiveness wasn't a topic many people outside of international development thought about — or at least felt they could do anything about — before GiveWell and effective altruism. Moreover, there wasn't any social movement advocating for people to donate 10% of their income to help the global poor.
The Grok chart contains no numbers, which is so strange I don't think you can conclude much from it except "we used more RL than last time."
Isn't the point just that the amount of compute used for RL training is now roughly the same as the amount used for self-supervised pre-training? Because if that's true, then scaling up RL training compute by another 1,000,000x is obviously not feasible.
My main takeaway from this post is not whether RL training would continue to provide benefits if it were scaled up another 1,000,000x, but simply that the world doesn't have nearly enough GPUs, electricity, or investment capital for that to be possible.
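To make that concrete, here's the kind of back-of-envelope arithmetic I have in mind. All the numbers are round assumptions of my own (a frontier training run on the order of 1e26 FLOP, a modern GPU sustaining roughly 5e14 FLOP/s), not figures from the post:

```python
# Back-of-envelope sketch: how many GPU-years would a 1,000,000x scale-up of
# today's training compute require? All constants are rough assumptions for
# illustration, not measured figures.

current_run_flop = 1e26      # assumed order of magnitude of one frontier training run
scale_up = 1e6               # the hypothetical 1,000,000x increase
gpu_flop_per_sec = 5e14      # assumed sustained throughput of one modern GPU
seconds_per_year = 3600 * 24 * 365

gpu_years = (current_run_flop * scale_up) / (gpu_flop_per_sec * seconds_per_year)
print(f"{gpu_years:.1e} GPU-years")  # ~6.3e+09, i.e. billions of GPU-years
```

Even if those assumptions are off by an order of magnitude or two, the answer stays in the tens of millions to billions of GPU-years, which is why the GPU, electricity, and capital constraints look decisive to me.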
This is a really compelling post. This seems like the sort of post that could have a meaningful impact on the opinions of people in the finance/investment world who are thinking about AI. I would be curious to see how equity research analysts and so on would react to this post.
This is a very strong conclusion and seems very consequential if true:
This leaves us with inference-scaling as the remaining form of compute-scaling.
I was curious to see if you had a similar analysis that supports the assertion that "the scaling up of pre-training compute also stalled". Let me know if I missed something important. For the convenience of other readers, here are some pertinent quotes from your previous posts.
From "Inference Scaling Reshapes AI Governance" (February 12, 2025):
But recent reports from unnamed employees at the leading labs suggest that their attempts to scale up pre-training substantially beyond the size of GPT-4 have led to only modest gains which are insufficient to justify continuing such scaling and perhaps even insufficient to warrant public deployment of those models. A possible reason is that they are running out of high-quality training data. While the scaling laws might still be operating (given sufficient compute and data, the models would keep improving), the ability to harness them through rapid scaling of pre-training may not.
There is a lot of uncertainty about what is changing and what will come next.
One question is the rate at which pre-training will continue to scale. It may be that pre-training has topped out at a GPT-4 scale model, or it may continue increasing, but at a slower rate than before. Epoch AI suggests the compute used in LLM pre-training has been growing at about 5x per year from 2020 to 2024. It seems like that rate has now fallen, but it is not yet clear if it has gone to zero (with AI progress coming from things other than pre-training compute) or to some fraction of its previous rate.
This strongly suggests that even though there are still many more unused tokens on the indexed web (about 30x as many as are used in GPT-4 level pre-training), performance is being limited by lack of high-quality tokens. There have already been attempts to supplement the training data with synthetic data (data produced by an LLM), but if the issue is more about quality than raw quantity, then they need the best synthetic data they can get.
From "The Extreme Inefficiency of RL for Frontier Models" (September 19, 2025):
LLMs and next-token prediction pre-training were the most amazing boost to generality that the field of AI has ever seen, going a long way towards making AGI seem feasible. This self-supervised learning allowed it to imbibe not just knowledge about a single game, or even all board games, or even all games in general, but every single topic that humans have ever written about — from ancient Greek philosophy to particle physics to every facet of pop culture. While their skills in each domain have real limits, the breadth had never been seen before. However, because they are learning so heavily from human generated data they find it easier to climb towards the human range of abilities than to proceed beyond them. LLMs can surpass humans at certain tasks, but we’d typically expect at least a slow-down in the learning curve as they reach the top of the human-range and can no longer copy our best techniques — like a country shifting from fast catch-up growth to slower frontier growth.
The underlying question we're talking about here is to what extent the outlandish amount of capital being invested in AI has increased budgets for fundamental AI research. My sense is that it's an open question without a clear answer.
DeepMind has always been doing fundamental research, but I actually don't know if that has significantly increased in the last few years. For all I know, it may have even decreased after Google merged Google Brain and DeepMind and seemed to shift focus away from fundamental research and toward productization.
I don't really know, and these companies are opaque and secretive about what they're doing, but my vague impression is that ~99% of the capital invested in AI over the last three years is going toward productizing LLMs, and I'm not sure it's significantly easier to get funding for fundamental AI research now than it was three years ago. For all I know, it's harder.
My impression is from anecdotes from AI researchers. I already mentioned Andrej Karpathy saying that he wanted to do fundamental AI research at OpenAI when he re-joined in early 2023, but the company wanted him to focus on product. I got the impression he was disappointed and I think this is a reason he ultimately quit a year later. My understanding is that during his previous stint at OpenAI, he had more freedom to do exploratory research.
The Turing Award-winning researcher Richard Sutton said in an interview something along the lines of: no one wants to fund basic research, and it's hard to get money to do it. Sutton personally can get funding because of his renown, but I don't know about lesser-known researchers.
A similar sentiment was expressed by the AI researcher François Chollet here:
Now LLMs have sucked the oxygen out of the room. Everyone is just doing LLMs. I see LLMs as more of an off-ramp on the path to AGI actually. All these new resources are actually going to LLMs instead of everything else they could be going to.
If you look further into the past to like 2015 or 2016, there were like a thousand times fewer people doing AI back then. Yet the rate of progress was higher because people were exploring more directions. The world felt more open-ended. You could just go and try. You could have a cool idea, try it, and get some interesting results. There was this energy. Now everyone is very much doing some variation of the same thing.
Undoubtedly, there is an outrageous amount of money going toward LLM research that can be quickly productized, toward scaling LLM training, and towards LLM deployment. Initially, I thought this meant the AI labs would spend a lot more money on basic research. I was surprised each time I heard someone such as Karpathy, Sutton, or Chollet giving evidence in the opposite direction.
It's hard to know what's the God's honest truth and what's bluster from Anthropic, but if they honestly believe that they will create AGI in 2026 or 2027, as Dario Amodei has seemed to say, and if they believe they will achieve this mainly by scaling LLMs, then why would they invest much money in basic research that's not related to LLMs or scaling them and that, even if it succeeds, probably won't be productizable for at least 3 years? Investing in diverse basic research would be hedging their bets. Maybe they are, or maybe they're so confident that they feel they don't have to. I don't know.
This is what Epoch AI says about its estimates:
Based on our compute and cost estimates for OpenAI’s released models from Q2 2024 through Q1 2025, the majority of OpenAI’s R&D compute in 2024 was likely allocated to research, experimental training runs, or training runs for unreleased models, rather than the final, primary training runs of released models like GPT-4.5, GPT-4o, and o3.
That's kind of interesting in its own right, but I wouldn't say that money allocated toward training compute for LLMs is the same thing as money allocated to fundamental AI research, if that's what you were intending to say.
It's uncontroversial that OpenAI spends a lot on research, but I'm trying to draw a distinction between fundamental research, which, to me, connotes things that are more risky, uncertain, speculative, explorative, and may take a long time to pay off, and research that can be quickly productized.
I don't understand the details of what Epoch AI is trying to say, but I would be curious to learn.
Do unreleased models include as-yet unreleased models such as GPT-5? (The timeframe is 2024 and OpenAI didn't release GPT-5 until 2025.) Would it also include o4? (Is there still going to be an o4?) Or is it specifically models that are never intended to be released? I'm guessing it's just everything that hasn't been released yet, since I don't know how Epoch AI would have any insight into what OpenAI intends to release or not.
I'm also curious how much trial and error goes into training LLMs. Does OpenAI often abort training runs or find the results disappointing? How many partial or full training runs go into training one model? For example, what percentage of the overall cost is the $400 million estimated for the final training run of GPT-4.5? 100%? 90%? 50%? 10%?
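Just to spell out the arithmetic behind that last question, using the $400 million estimate mentioned above and my hypothetical percentages:

```python
# If the ~$400M final-run estimate were X% of the total spent on GPT-4.5
# (including experimental and aborted runs), the implied total would be $400M / X.
# The percentage values are just the hypotheticals from my question.
final_run_cost = 400e6  # dollars
for share in (1.0, 0.9, 0.5, 0.1):
    print(f"final run = {share:.0%} of total -> total ≈ ${final_run_cost / share / 1e6:,.0f}M")
```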
Overall, this estimate from Epoch AI doesn't seem to tell us much about what amount of money or compute OpenAI is allocating to fundamental research vs. R&D that can quickly be productized.
the fact that insane amounts of capital are going into 5+ competing companies providing commonly-used AI products should be strong evidence that the economics are looking good
Can you clarify what you mean by "the economics are looking good"? The economics of what are looking good for what?
I can think of a few different things this could mean, such as:
Those aren’t the only possible interpretations, but those are three I thought of.
if AGI is technically possible using something like current tech, then all the incentives and resources are in place to find the appropriate architectures.
You’re talking about research rather than scaling here, right? Do you think there is more funding for fundamental AI research now than in 2020? What about for non-LLM fundamental AI research?
The impression I get is that the vast majority of the capital is going into infrastructure (i.e. data centres) and R&D for ideas that can quickly be productized. I recall that the AI researcher/engineer Andrej Karpathy rejoined OpenAI (his previous employer) after leaving Tesla, but ended up leaving OpenAI after not too long because the company wanted him to work on product rather than on fundamental research.
As someone who doesn’t think LLMs will scale to AGI, I skipped over pretty much all of your OP as off-topic from my perspective
Okay, good to know.
I know that there are different views, but it seems like a lot of people in EA have started taking near-term AGI a lot more seriously since ChatGPT was released, and those people generally don't give the other views — the views on which LLMs aren't evidence of near-term AGI — much credence. That's why the focus on LLMs.
The other views tend to be highly abstract, theoretical, and philosophical, so to argue about them you basically have to write the whole Encyclopedia Britannica. You can't point to clear evidence from tests, studies, economic or financial indicators, or practical performance to make a case about AGI timelines within about 2,000 words.
Trying to argue those other views is not something I want to do, but I do want to argue about near-term AGI in a context where people are using LLMs as their key evidence for it.
Because my brain works that way, I'm tempted to argue about the other views as well, but I never find those kinds of discussions satisfying. It feels like by the time you get a few exchanges deep into those discussions (either me personally or people in general), it gets into "How many angels can dance on the head of a pin?" territory. For any number of sub-questions under that very abstract AGI discussion, maybe the answer is this, maybe it's that, but nobody actually knows, there's no firm evidence, there's no theoretical consensus, and in fact the theorizing is very loose and pre-paradigmatic. (This is my impression after 15-20 years observing these discussions online and occasionally participating in them.) I think my response to these ideas should be, "Yeah. Maybe. Who knows?" because I don't think there's much to say beyond that.
My claim is that, in the context of this paragraph, “extremely unlikely” (as in “<0.1%”) is way way too confident. Technological forecasting is hard, a lot can happen in seven years … I think there’s just no way to justify such an extraordinarily high confidence [conditioned on LLMs not scaling to AGI as always].
If you had said “<20%” instead of “<0.1%”, then OK sure, I would have been in close-enough agreement with you, that I wouldn’t have bothered replying.
Does that help? Sorry if I’m misunderstanding.
I didn't actually give a number for what I think are the chances of going from the conception of a new AI paradigm to a working AGI system in 7 years. I did say it's extremely unlikely, which is the same language I used for AGI within 7 years overall. Since I said I think the overall chance of AGI within 7 years is significantly less than 0.1%, it's understandable that when I call going from a new paradigm to working AGI in 7 years extremely unlikely, you might think I also mean it has a significantly less than 0.1% chance of happening, or a similar number.
The relationship between the overall chance of AGI within 7 years and the chance of AGI conditional on the right paradigm being conceived isn't clear because that depends on a third variable, which is the chance that the right paradigm has already been conceived (or soon will be) — and also how long ago it was conceived (or how soon it will be). That seems basically unknowable to me.
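To make that dependence explicit, here is a rough sketch of the structure (just to show where the third variable enters, not something I would actually try to put numbers into; "2032" is simply shorthand for roughly 7 years from now):

$$
P(\text{AGI by 2032}) = \sum_{t} P(\text{right paradigm first conceived in year } t) \times P(\text{working AGI by 2032} \mid \text{right paradigm first conceived in year } t)
$$

The first factor in each term is the third variable: how likely it is that the right paradigm has already been conceived, or will be soon, and when. Without some handle on that, the overall probability and the conditional probability can't be converted into each other.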
I haven't really thought about what number I would assign to that specific outcome: a new AI paradigm going from conception to a working AGI system within 7 years. It seems very unlikely to me. In general, I don't like the practice of just thinking up numbers to assign to things like that. It could be an okay practice if people didn't take these numbers as literally and seriously as they do. Then it wouldn't really matter. But people take these numbers really seriously and I think that's unwise, and I don't like contributing to that practice if I can help it.
I do think guessing a number is helpful when it helps convey an intuition that might otherwise be hard to express. If you just had a first date and your friend asks how it went, and you say, "It was a 7 out of 10," that isn't a rigorous scale; your friend isn't expecting that all first dates of that quality will always be given a 7 rather than a 6 or an 8, but it helps convey a sense of somewhere between bad and fantastic. I think giving a number to a probability can be helpful like that. I think it can also be helpful to compare the probability of an event, like AGI being created within 7 years, to the probability of another event, which is why I came up with the Jill Stein example. (The problem is that for this to work, your interlocutor or your audience has to share your intuitive sense of how probable the other event is.)
I don't know how you would try to rigorously estimate how long it would take to go from the right idea about AGI to a working AGI system. This depends largely on what the right idea is, which is precisely what we don't know. So, there is irreducible uncertainty here.
We can come up with points of comparison. You used LLMs from 2018 to 2025 as an example — 7 years. I brought up backpropagation in 1970 to AlexNet in 2012 as another potential point of comparison — 42 years. You could also choose the conception of connectionism in 1943 to AlphaGo beating Lee Sedol in 2016 as another comparison — 73 years. Or you could take Yann LeCun's guess of at least 12 years (and probably much more) from his position paper to human-level AI, or Richard Sutton's guess of a 25% chance of "understanding the mind" (I'm still not sure if that implies the ability to build AGI) within 8 years of publishing the Alberta Plan for AI Research. Who knows which of these points of comparison is most apt? Maybe none of them are particularly apt. Who knows.
The other thing I tried was considering the computation required for AGI in comparison to the human brain. This is almost as fraught as the above. We don't know for sure how much computation the human brain uses. We don't know at all whether AGI will require as much computation, or much less, or much more. Who knows?
In principle, almost anything could happen at almost any time, even if it goes against how we thought the world works, and this is uncomfortable, but it's true. (I don't just mean with AI, I mean with everything. Volcanoes, aliens, physics, cosmology, the fabric of society — everything.)
What to do in the face of that uncertainty is a discussion that I think belongs in and under another post. For example, if we assume at least for the sake of argument that we have no idea which of several various ideas for building AGI will turn out to be correct, such as program synthesis, LeCun's energy-based models, the Alberta Plan, Numenta's Thousand Brains approach, whole brain emulation, and so on — and also if we have no idea whether all of these ideas will turn out to be the wrong ones — is there a strongly defensible course of action for preparing for AGI? Is there, indeed, a strongly defensible case for why AGI would be dangerous?
I worry that such a discussion would quickly get into the "How many angels can dance on the head of a pin?" territory I said I don't like. But I would be impressed if someone could make a strong case for some course of action that makes sense even under a high level of irreducible uncertainty about which theoretical ideas will underpin the design of AGI and about when it will ultimately arrive.
I imagine this would be hard to do, however. For example, suppose Scenario A is: the MIRI worldview on AI alignment is correct, there will be a hard takeoff, and AGI will be designed with a combination of deep learning and symbolic AI. Suppose Scenario B is: the MIRI worldview is false, whole brain emulation is the fastest possible path to AGI, and it will slowly scale up from a mouse brain emulation around 2065 to a human brain emulation around 2125,[1] and gradually from 2125 to 2165 it (or, more accurately, they) will become like AlphaGo for everything — a world champion at all tasks. Is there any strongly defensible course of action that makes sense if we don't know whether Scenario A or Scenario B is true (or many other possible scenarios I could describe) and if we can't even cogently assign probabilities to these scenarios? That sounds like a very tall order.
It's especially a tall order if part of the required defense is arguing why the proposed course of action wouldn't backfire and make things worse.
Who’s to say that this “AI paradigm beyond LLMs” hasn’t already been discovered ten years ago or more? There are a zillion speculative non-LLM AI paradigms that have been under development for years or decades. Nobody has heard of them because they’re not doing impressive things yet. That doesn’t mean that there hasn’t already been a lot of development progress.
Yeah, maybe. Who knows?
2065 for a mouse brain and 2125 for a human brain are real guesses from an expert survey:
Zeleznikow-Johnston A, Kendziorra EF, McKenzie AT (2025) What are memories made of? A survey of neuroscientists on the structural basis of long-term memory. PLoS One 20(6): e0326920. https://doi.org/10.1371/journal.pone.0326920
I'm really curious what people think about this, so I posted it as a question here. Hopefully I'll get some responses.
What part do you think is uncertain? Do you think RL training could become orders of magnitude more compute efficient?