This post has two parts: first, a (non-exhaustive) survey of some indications that the AI industry might be in a financial bubble. Second, an analysis that combines these indications with technical considerations relevant to whether the AI industry is in a bubble.
An economic bubble (also called a speculative bubble or a financial bubble) is a period when current asset prices greatly exceed their intrinsic valuation, that is, the valuation justified by the underlying long-term fundamentals. Bubbles can be caused by overly optimistic projections about the scale and sustainability of growth (e.g. dot-com bubble), and/or by the belief that intrinsic valuation is no longer relevant when making an investment (e.g. Tulip mania). They have appeared in most asset classes, including equities (e.g. Roaring Twenties), commodities (e.g. Uranium bubble), real estate (e.g. 2000s US housing bubble), and even esoteric assets (e.g. Cryptocurrency bubble). Bubbles usually form as a result of excess liquidity in markets, a shift in investor psychology, or both.
If we were to operationalize the concept of an AI bubble and whether such a bubble has popped, we could look at several financial measures, including investment in datacentre construction. Probably the most straightforward operationalization would be to say that the bubble has popped if the stock prices of a set of public companies, such as Nvidia, Microsoft, and Google, drop by at least a certain percentage for at least a certain amount of time.
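As a rough sketch of what that stock-price operationalization could look like in code (the 30% threshold, the 90-trading-day window, and the requirement that every stock in the basket stay below its peak are all illustrative assumptions, not claims from this post):

```python
# Illustrative sketch: say the bubble has "popped" if every stock in a basket stays
# at least `drop` below its running peak for `window` consecutive trading days.
# The threshold and window are placeholder choices, not proposals from this post.

def bubble_popped(prices: dict[str, list[float]], drop: float = 0.30, window: int = 90) -> bool:
    def longest_drawdown_streak(series: list[float]) -> int:
        peak, streak, longest = float("-inf"), 0, 0
        for price in series:
            peak = max(peak, price)
            streak = streak + 1 if price <= peak * (1 - drop) else 0
            longest = max(longest, streak)
        return longest

    return all(longest_drawdown_streak(s) >= window for s in prices.values())

# Hypothetical usage with daily closing prices:
# bubble_popped({"NVDA": [...], "MSFT": [...], "GOOGL": [...]})
```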
We could also defer the question to judges and say that an AI bubble has popped if the Wall Street Journal, the Financial Times, Bloomberg, the Economist, and the New York Times all say so. Or we could rely on a credible survey of people with relevant financial expertise, saying a bubble has popped if at least, say, 90% of respondents agree that it has.
Opinions on a potential AI bubble
Among financial experts, opinion on whether AI is a bubble is still mixed (as, seemingly, it would have to be before a bubble has actually popped, given the nature of financial markets). My personal, subjective impression is that opinion has started to tip more toward thinking there's a bubble in recent months. There is some survey evidence for this, too.
Bank of America (BofA) conducted a survey of fund managers in October 2025:
BofA Global Research's monthly fund manager survey revealed that 54% of investors said they thought AI stocks were in a bubble, compared with 38% who did not believe that a bubble exists.
In September 2025, just a month earlier, the same survey found that 41% of investors thought AI stocks were in a bubble.
What follows are the opinions of a few prominent individuals with relevant expertise.
Jared Bernstein, the chair of Joe Biden's Council of Economic Advisers and formerly chief economist to Biden when he was vice president, wrote an op-ed in the New York Times in October 2025 arguing the AI industry is in a bubble:
We believe it’s time to call the third bubble of our century: the A.I. bubble.
While no one can be certain, we believe this is more likely the case than not.
Jim Covello, Head of Global Equity Research at Goldman Sachs, had strong words about the current AI investment boom in June 2024 (emphasis added by me):
Many people seem to believe that AI will be the most important technological invention of their lifetime, but I don’t agree given the extent to which the internet, cell phones, and laptops have fundamentally transformed our daily lives, enabling us to do things never before possible, like make calls, compute and shop from anywhere. Currently, AI has shown the most promise in making existing processes—like coding—more efficient, although estimates of even these efficiency improvements have declined, and the cost of utilizing the technology to solve tasks is much higher than existing methods. For example, we’ve found that AI can update historical data in our company models more quickly than doing so manually, but at six times the cost.
More broadly, people generally substantially overestimate what the technology is capable of today. In our experience, even basic summarization tasks often yield illegible and nonsensical results. This is not a matter of just some tweaks being required here and there; despite its expensive price tag, the technology is nowhere near where it needs to be in order to be useful for even such basic tasks. And I struggle to believe that the technology will ever achieve the cognitive reasoning required to substantially augment or replace human interactions. Humans add the most value to complex tasks by identifying and understanding outliers and nuance in a way that it is difficult to imagine a model trained on historical data would ever be able to do.
And:
The idea that the transformative potential of the internet and smartphones wasn’t understood early on is false. I was a semiconductor analyst when smartphones were first introduced and sat through literally hundreds of presentations in the early 2000s about the future of the smartphone and its functionality, with much of it playing out just as the industry had expected. One example was the integration of GPS into smartphones, which wasn’t yet ready for prime time but was predicted to replace the clunky GPS systems commonly found in rental cars at the time. The roadmap on what other technologies would eventually be able to do also existed at their inception. No comparable roadmap exists today. AI bulls seem to just trust that use cases will proliferate as the technology evolves. But eighteen months after the introduction of generative AI to the world, not one truly transformative—let alone cost-effective—application has been found.
And:
The big tech companies have no choice but to engage in the AI arms race right now given the hype around the space and FOMO, so the massive spend on the AI buildout will continue. This is not the first time a tech hype cycle has resulted in spending on technologies that don’t pan out in the end; virtual reality, the metaverse, and blockchain are prime examples of technologies that saw substantial spend but have few—if any—real world applications today. And companies outside of the tech sector also face intense investor pressure to pursue AI strategies even though these strategies have yet to yield results. Some investors have accepted that it may take time for these strategies to pay off, but others aren’t buying that argument. Case in point: Salesforce, where AI spend is substantial, recently suffered the biggest daily decline in its stock price since the mid-2000s after its Q2 results showed little revenue boost despite this spend.
And:
I place low odds on AI-related revenue expansion because I don't think the technology is, or will likely be, smart enough to make employees smarter. Even one of the most plausible use cases of AI, improving search functionality, is much more likely to enable employees to find information faster than enable them to find better information.
And:
Over-building things the world doesn’t have use for, or is not ready for, typically ends badly. The NASDAQ declined around 70% between the highs of the dot-com boom and the founding of Uber. The bursting of today’s AI bubble may not prove as problematic as the bursting of the dot-com bubble simply because many companies spending money today are better capitalized than the companies spending money back then. But if AI technology ends up having fewer use cases and lower adoption than consensus currently expects, it’s hard to imagine that won’t be problematic for many companies spending on the technology today.
And, finally:
How long investors will remain satisfied with the mantra that “if you build it, they will come” remains an open question. The more time that passes without significant AI applications, the more challenging the AI story will become. And my guess is that if important use cases don’t start to become more apparent in the next 12-18 months, investor enthusiasm may begin to fade. But the more important area to watch is corporate profitability. Sustained corporate profitability will allow sustained experimentation with negative ROI projects. As long as corporate profits remain robust, these experiments will keep running. So, I don’t expect companies to scale back spending on AI infrastructure and strategies until we enter a tougher part of the economic cycle, which we don’t expect anytime soon. That said, spending on these experiments will likely be the one of the first things to go if and when corporate profitability starts to decline.
Jeremy Grantham, a prominent asset manager who is known for publicly identifying past bubbles, said in March 2024 he thinks AI is in a bubble (and reiterated this opinion on a podcast in May 2025):
And many such revolutions are in the end often as transformative as those early investors could see and sometimes even more so – but only after a substantial period of disappointment during which the initial bubble bursts. Thus, as the most remarkable example of the tech bubble, Amazon led the speculative market, rising 21 times from the beginning of 1998 to its 1999 peak, only to decline by an almost inconceivable 92% from 2000 to 2002, before inheriting half the retail world!
So it is likely to be with the current AI bubble.
This month (November 2025), the hedge fund manager Michael Burry — who is known for predicting the subprime mortgage crisis, a story made widely-known in the book The Big Short and the subsequent film adaptation — called AI investment a "bubble" and harshly criticized the way Meta and Oracle are accounting for the depreciation of their GPUs and datacentres.
Suggestive evidence: circular financing
One piece of evidence for an AI bubble is the amount of circular financing that has occurred amongst the companies involved. Per the New York Times:
Many of the deals OpenAI has struck — with chipmakers, cloud computing companies and others — are strangely circular. OpenAI receives billions from tech companies before sending those billions back to the same companies to pay for computing power and other services.
For example:
From 2019 through 2023, Microsoft was OpenAI’s primary investor. The tech giant pumped more than $13 billion into the start-up. Then OpenAI funneled most of those billions back into Microsoft, buying cloud computing power needed to fuel the development of new A.I. technologies.
Bloomberg has also covered circular deals, which are often even more complex than the Microsoft and OpenAI example above, as illustrated by a diagram in its reporting.
Direct evidence: small effects on productivity and profitability
To me, what's more compelling is the evidence that the business customers of AI companies such as OpenAI, Anthropic, and Microsoft are not seeing much benefit from AI products in terms of employee productivity or financial outcomes such as profitability.
The consulting firm McKinsey has found that, as of 2025, the vast majority of companies have seen no significant increase in revenue or reduction in costs due to generative AI:
McKinsey research shows that while 80 percent of companies report using the latest generation of AI, the same percentage have seen no significant gains in topline or bottom-line performance. AI tools that help with general tasks can make employees more productive, but the small time savings they create often don't lead to noticeable financial benefits.
The MIT Media Lab released a report in August 2025 that had similar findings:
Despite the rush to integrate powerful new models, about 5% of AI pilot programs achieve rapid revenue acceleration; the vast majority stall, delivering little to no measurable impact on P&L [profit and loss]. The research—based on 150 interviews with leaders, a survey of 350 employees, and an analysis of 300 public AI deployments—paints a clear divide between success stories and stalled projects.
Many businesses are giving up on using AI, as reported by the New York Times:
But the percentage of companies abandoning most of their A.I. pilot projects soared to 42 percent by the end of 2024, up from 17 percent the previous year, according to a survey of more than 1,000 technology and business managers by S&P Global, a data and analytics firm.
An October 2025 report by the Boston Consulting Group said much the same, via Business Insider:
According to a new report from Boston Consulting Group, only 5% of companies in its 2025 study of more than 1,250 global firms are seeing real returns on AI.
Meanwhile, 60% of companies have seen little to no benefit, reporting only minimal increases in revenue and cost savings despite making substantial investments.
One of the few published scientific studies of the effect of generative AI on worker productivity in a real world context found mixed results. The study looked at the use of large language models (LLMs) at a call centre providing customer support. It found:
Second, the impact of AI assistance varies widely among agents. Less skilled and less experienced workers improve significantly across all productivity measures, including a 30% increase in the number of issues resolved per hour. The AI tool also helps newer agents to move more quickly down the experience curve: treated agents with two months of tenure perform just as well as untreated agents with more than six months of tenure. In contrast, AI has little effect on the productivity of higher-skilled or more experienced workers. Indeed, we find evidence that AI assistance leads to a small decrease in the quality of conversations conducted by the most skilled agents.
Coding stands out as one area where there may be more significant productivity gains. Here's the abstract of a paper published in August 2025:
This study evaluates the effect of generative AI on software developer productivity via randomized controlled trials at Microsoft, Accenture, and an anonymous Fortune 100 company. These field experiments, run by the companies as part of their ordinary course of business, provided a random subset of developers with access to an AI-based coding assistant suggesting intelligent code completions. Though each experiment is noisy and results vary across experiments, when data is combined across three experiments and 4,867 developers, our analysis reveals a 26.08% increase (SE: 10.3%) in completed tasks among developers using the AI tool. Notably, less experienced developers had higher adoption rates and greater productivity gains.
Interestingly, while that study found a positive impact on coding productivity from AI coding assistants, a study from the non-profit Model Evaluation & Threat Research (METR) found a negative impact on productivity. Here's that abstract:
Despite widespread adoption, the impact of AI tools on software development in the wild remains understudied. We conduct a randomized controlled trial (RCT) to understand how AI tools at the February–June 2025 frontier affect the productivity of experienced open-source developers. 16 developers with moderate AI experience complete 246 tasks in mature projects on which they have an average of 5 years of prior experience. Each task is randomly assigned to allow or disallow usage of early-2025 AI tools. When AI tools are allowed, developers primarily use Cursor Pro, a popular code editor, and Claude 3.5/3.7 Sonnet. Before starting tasks, developers forecast that allowing AI will reduce completion time by 24%. After completing the study, developers estimate that allowing AI reduced completion time by 20%. Surprisingly, we find that allowing AI actually increases completion time by 19%—AI tooling slowed developers down. This slowdown also contradicts predictions from experts in economics (39% shorter) and ML (38% shorter). To understand this result, we collect and evaluate evidence for 21 properties of our setting that a priori could contribute to the observed slowdown effect—for example, the size and quality standards of projects, or prior developer experience with AI tooling. Although the influence of experimental artifacts cannot be entirely ruled out, the robustness of the slowdown effect across our analyses suggests it is unlikely to primarily be a function of our experimental design.
Analysis: can AI companies catch up to expectations?
Financial expectations for AI companies are extremely high: valuations implicitly assume a lot of growth, and current AI capabilities and applications do not support them. To avoid a bubble popping, either new capabilities will have to be developed (enabling new applications or better performance on existing ones), or existing capabilities will have to find new applications. Moreover, whatever emerges will have to be significant enough to generate commensurate financial results.
New applications in the absence of new capabilities seem unlikely to be significant enough. Many actors have been strongly incentivized to find applications for LLMs for around three years; if current capabilities supported transformative, highly profitable applications, it seems likely someone would have found them by now.
A third option is for companies like OpenAI to introduce ads to monetize their free users, but given the costs of training and running AI models, and given the comparatively small revenue per user generated by ad-based companies such as Facebook, this doesn't look like a way to avoid the bubble popping.
That leaves improvements in capabilities. The primary route AI companies are counting on for improving capabilities is scaling: scaling model training (which occurs before a model is released) and scaling the compute used for model inference (which occurs every time a model is used).
There are two forms of model training that can be scaled. The first is self-supervised pre-training. Scaling self-supervised pre-training requires an increase in both compute and data.
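To see why data becomes a binding constraint, here is a back-of-the-envelope sketch using two widely cited rules of thumb that are not taken from this post's sources: training compute is roughly 6 x parameters x tokens, and compute-optimal training (the "Chinchilla" heuristic) uses roughly 20 tokens per parameter.

```python
# Back-of-the-envelope: more pre-training compute demands more training data.
# Assumptions (rules of thumb, not figures from this post): C ~ 6*N*D FLOPs,
# and compute-optimal training uses D ~ 20*N tokens.

def compute_optimal_allocation(total_flops: float) -> tuple[float, float]:
    """Return (parameters N, training tokens D) for a given compute budget."""
    n_params = (total_flops / 120) ** 0.5   # from C = 6*N*D and D = 20*N  =>  C = 120*N^2
    n_tokens = 20 * n_params
    return n_params, n_tokens

for flops in (1e24, 1e25, 1e26):            # successive 10x compute scale-ups
    n, d = compute_optimal_allocation(flops)
    print(f"C = {flops:.0e} FLOPs -> ~{n:.1e} params, ~{d:.1e} tokens")

# Each 10x of compute calls for roughly 3x more tokens, which is why a fixed
# stock of usable text eventually caps the scaling of pre-training.
```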
The non-profit Epoch AI predicts that LLMs will run out of data to train on in 2028. This would mean, unless something significant changes, the scaling of self-supervised pre-training would have to stop at that point.
In any case, there are already signs that self-supervised pre-training has hit a point of steeply diminishing returns. The highly respected machine learning researcher Ilya Sutskever, formerly the chief scientist at OpenAI, has made strong public statements about the effective end of self-supervised pre-training. Via Reuters:
Ilya Sutskever, co-founder of AI labs Safe Superintelligence (SSI) and OpenAI, told Reuters recently that results from scaling up pre-training - the phase of training an AI model that uses a vast amount of unlabeled data to understand language patterns and structures - have plateaued.
Sutskever is widely credited as an early advocate of achieving massive leaps in generative AI advancement through the use of more data and computing power in pre-training, which eventually created ChatGPT. Sutskever left OpenAI earlier this year to found SSI.
“The 2010s were the age of scaling, now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing,” Sutskever said. “Scaling the right thing matters more now than ever.”
Sutskever declined to share more details on how his team is addressing the issue, other than saying SSI is working on an alternative approach to scaling up pre-training.
Reuters notes in the same article:
Behind the scenes, researchers at major AI labs have been running into delays and disappointing outcomes in the race to release a large language model that outperforms OpenAI’s GPT-4 model, which is nearly two years old, according to three sources familiar with private matters.
Anthropic's CEO Dario Amodei has also seemed to acknowledge an end to the era of scaling up self-supervised pre-training. In a January 2025 blog post, Amodei wrote:
Every once in a while, the underlying thing that is being scaled changes a bit, or a new type of scaling is added to the training process. From 2020-2023, the main thing being scaled was pretrained models: models trained on increasing amounts of internet text with a tiny bit of other training on top. In 2024, the idea of using reinforcement learning (RL) to train models to generate chains of thought has become a new focus of scaling.
That brings us to scaling up reinforcement learning to train LLMs. The problem with this approach is described by the philosopher (and co-founder of effective altruism) Toby Ord:
Jones (2021) and EpochAI both estimate that you need to scale-up inference by roughly 1,000x to reach the same capability you’d get from a 100x scale-up of training. And since the evidence from o1 and o3 suggests we need about twice as many orders of magnitude of RL-scaling compared with inference-scaling, this implies we need something like a 1,000,000x scale-up of total RL compute to give a boost similar to a GPT level.
This is breathtakingly inefficient scaling. But it fits with the extreme information inefficiency of RL training, which (compared to next-token-prediction) receives less than a ten-thousandth as much information to learn from per FLOP of training compute.
Yet despite the poor scaling behaviour, RL training has so far been a good deal. This is solely because the scaling of RL compute began from such a small base compared with the massive amount of pre-training compute invested in today’s models.
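To make the quoted ratios concrete, here is the arithmetic as I read it, restated in a few lines (the exponents are my paraphrase of Ord's figures, not additional data):

```python
# Restating the quoted ratios (paraphrase, not new data):
# ~100x (2 OOM) more pre-training compute ~ one GPT-level boost
# ~1,000x (3 OOM) more inference compute ~ the same boost
# RL-scaling needs about twice as many orders of magnitude as inference-scaling

ooms_inference_per_boost = 3              # 1,000x inference
rl_to_inference_oom_ratio = 2             # "about twice as many orders of magnitude"
ooms_rl_per_boost = ooms_inference_per_boost * rl_to_inference_oom_ratio

print(f"RL compute scale-up for one GPT-level boost: ~{10 ** ooms_rl_per_boost:,}x")
# -> ~1,000,000x, matching the figure in the quote
```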
Ord explains why he sees a very limited amount of remaining runway for scaling reinforcement learning-based training of LLMs:
But this changes dramatically once RL-training reaches and then exceeds the size of the pre-training compute. In July 2025, xAI’s Grok 4 launch video included a chart suggesting that they had reached this level (the chart showed pre-training compute in white and RL-training compute in orange).
Scaling RL by another 10x beyond this point increases the total training compute by 5.5x, and beyond that it is basically the full 10x increase to all training costs. So this is the point where the fact that they get much less for a 10x scale-up of RL compute compared with 10x scale-ups in pre-training or inference really bites. I estimate that at the time of writing (Oct 2025), we’ve already seen something like a 1,000,000x scale-up in RL training and it required ≤2x the total training cost. But the next 1,000,000x scale-up would require 1,000,000x the total training cost, which is not possible in the foreseeable future.
Grok 4 was trained on 200,000 GPUs located in xAI’s vast Colossus datacenter. To achieve the equivalent of a GPT-level jump through RL would (according to the rough scaling relationships above) require 1,000,000x the total training compute. To put that in perspective, it would require replacing every GPU in their datacenter with 5 entirely new datacenters of the same size, then using 5 years worth of the entire world’s electricity production to train the model. So it looks infeasible for further scaling of RL-training compute to give even a single GPT-level boost.
I don’t think OpenAI, Google, or Anthropic have quite reached the point where RL training compute matches the pre-training compute. But they are probably not far off. So while we may see another jump in reasoning ability beyond GPT-5 by scaling RL training a further 10x, I think that is the end of the line for cheap RL-scaling.
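The "5.5x" figure in the quote follows from simple arithmetic: once RL compute has caught up to pre-training compute, a further 10x of RL takes the total from 2 units to 11 units, a 5.5x increase, and beyond that point RL dominates the bill entirely. A minimal sketch, normalising pre-training compute to 1 (the inputs are just the scenarios mentioned in the quotes):

```python
# Total training compute = pre-training + RL, with pre-training normalised to 1.

def total_growth(rl_share_now: float, rl_scaleup: float) -> float:
    """Factor by which *total* training compute grows when RL compute alone is scaled up."""
    return (1 + rl_share_now * rl_scaleup) / (1 + rl_share_now)

print(total_growth(1e-6, 1e6))   # ~2.0: the past ~1,000,000x of RL cost <=2x the total
print(total_growth(1.0, 10))     # 5.5: the next 10x of RL, starting from parity
print(total_growth(10.0, 10))    # ~9.2: beyond parity, RL scale-ups cost nearly their full 10x
```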
If self-supervised pre-training has reached its end and if reinforcement learning-based training is not far off from its own end, then all that's left is scaling inference. Ord notes the economic problem:
This leaves us with inference-scaling as the remaining form of compute-scaling. RL helped enable inference-scaling via longer chain of thought and, when it comes to LLMs, that may be its most important legacy. But inference-scaling has very different dynamics to scaling up the training compute. For one thing, it scales up the flow of ongoing costs instead of scaling the one-off training cost.
The fixed cost of training a model can be amortized over every single instance where that model is used. By contrast, scaling up inference increases the marginal cost of every LLM query. This raises obvious difficulties, especially given that inference is already quite expensive.
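A minimal cost sketch of that asymmetry, with all dollar figures as placeholder assumptions rather than estimates from this post's sources:

```python
# Amortised training cost vs. marginal inference cost per query.
# All numbers are illustrative placeholders.

TRAINING_COST = 1e9                  # one-off cost of a training run, in dollars (assumed)
QUERIES_SERVED = 1e11                # lifetime queries the model serves (assumed)
INFERENCE_COST_PER_QUERY = 0.01      # marginal compute cost per query, in dollars (assumed)

def cost_per_query(inference_scaleup: float) -> float:
    """Training cost amortises away; scaling inference multiplies the marginal cost."""
    amortised_training = TRAINING_COST / QUERIES_SERVED         # independent of the scale-up
    marginal_inference = INFERENCE_COST_PER_QUERY * inference_scaleup
    return amortised_training + marginal_inference

for k in (1, 10, 100):
    print(f"{k:>3}x inference compute -> ~${cost_per_query(k):.2f} per query")
# 1x -> ~$0.02, 10x -> ~$0.11, 100x -> ~$1.01: the marginal term dominates quickly.
```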
Finally, there is no guarantee that, even if scaling could continue, it would make LLMs profitable or productive enough to justify current valuations. Some problems with LLMs may be more fundamental and not solvable through scaling, such as:
A lack of continual learning or online learning, which would allow a model to learn new things over time, as humans do constantly, rather than only learning new things when the AI company does a new training run for a new version of the model
Extreme data inefficiency compared to humans, requiring perhaps as many as 1,000 examples to learn a concept that humans can learn with only one or two examples
Very poor generalization compared to humans, meaning the bounds of what LLMs can grasp do not extend far beyond what's in their training data
For tasks that rely on imitation learning from humans (which is the primary form of learning for LLMs), there may not be sufficient examples of humans performing a task for LLMs to effectively learn from (this is especially true for so-called "agentic" use cases that require computer use rather than just operating in the domain of text)
For computer vision tasks (perhaps including computer use), learning from video remains an open research problem with very little success to date, in contrast to LLMs learning from text data, which is a mature solution to an easier problem that benefits from the inherent structure in text
These are problems that can't be solved with scaling. Many human work tasks seem to require some combination of continual learning, human-like data efficiency, and much better generalization than LLMs have. Common complaints from users attempting to use LLMs for work tasks include hallucinations, basic mistakes in reasoning or understanding, and failure to follow instructions. These problems may stem from the fundamental limitations just listed, or from other limitations, such as the architectures of current LLMs.
To summarize: capabilities must improve significantly to avoid a bubble popping. The prospects for improving capabilities much further through continued scaling seem very poor. And even if scaling could continue and keep delivering improvements, it is quite plausible that this still wouldn't be enough, given the more fundamental limitations of LLMs.
Conclusion
It seems very likely that the AI industry is in a bubble. Indeed, it is hard to imagine how AI could not be in a bubble.