Main

This Perspective examines ‘open’ artificial intelligence (AI). We find that concepts from open-source software are being applied in ill-fitting ways to AI systems. At a time when industry players are seeking to influence policy with claims that open AI is, on the one hand, beneficial to scientific innovation and democracy or, on the other, detrimental to safety, we ground discussions about the affordances of ‘openness’ in AI in a material analysis of what AI is and what openness in AI can and cannot provide.

With this aim, we review the core components of AI systems, examining which of these can and cannot be made open, and we review the ecosystem that has formed around the concept of open AI. We find that open AI systems can offer transparency, reusability and extensibility: they can be scrutinized, reused and built ‘on top of’, to varying degrees. But we also find that claims about openness often lack precision, frequently focusing on only one stage in the development-to-deployment life cycle of AI systems and often neglecting substantial industry concentration in large-scale AI development and deployment, thus warping common-sense understandings of openness carried over from free and open-source software. Discourses that index on openness in isolation from the economic incentives of AI rarely engage issues of context, power and use—how such systems will be used, by whom and on whom—even as these issues profoundly shape the policy outcomes that debates around openness and AI claim to concern themselves with.

These questions are particularly important in our present AI landscape, which is dominated by corporate actors1,2,3,4,5. Creating the conditions under which independent alternatives to industry-dominated tech can thrive is a worthy cause. However, just as many traditional open-source projects were co-opted in various ways by large technology companies, our findings indicate that the rhetoric of openness is frequently wielded in ways that, far from alleviating, instead exacerbate the concentration of power in the AI sector.

The rhetoric of open AI is at present directing political and research attention and shaping policy in both the USA and the European Union, among other jurisdictions6,7,8,9. The ‘open-source AI’ debate has been substantially constructed by AI companies, who have used claims around openness to serve their particular regulatory and market aims. Depending on their business model, companies have used the rhetoric of openness to implicitly back arguments that AI should either be exempt from regulation10 or be subject to stringent licensing requirements or export controls11. Meanwhile, recent work by researchers has helpfully complicated these claims, even if it has not reshaped the public debate, adding nuance and grounding by evaluating the risks and benefits of model openness12,13 and creating taxonomies of more or less open models in an attempt to provide conceptual clarity14,15.

Open AI and definitional arbitrage

The definition of AI itself is contested and unclear, further muddling the question of what ‘open’ means in the context of AI. Over its more than 70-year history, the term AI has been applied to a wide variety of approaches, less as a technical term of art and more as marketing and aspiration4,16. Some AI systems are deterministic, such as rule-based systems, which—given a set of inputs—follow a set of instructions to produce clearly defined outputs. Others are probabilistic, making comparisons to vast pools of data and drawing inferences from the connections between data points. At present, the term often describes probabilistic, large, resource-intensive machine-learning systems, with so-called ‘generative’ AI attracting the most attention in popular discourse. Because large and generative AI systems most clearly perturb traditional definitions of open source and because they are the focus of present policy and discourse, we focus on these systems.

The need for definitional clarity has prompted considerable debate17 and has culminated in a proposal from the Open Source Initiative18. In more popularized discussions about AI, conventional understandings of free and open-source software, drawing on ideologies about free software that were forged decades ago with the aim of resisting corporate control19, are being projected onto open AI systems even when they do not fit19,20. Open source promised to democratize software development, to ensure the integrity and security of code by putting many eyes on it21 and to level the playing field so that the innovative could triumph22,23; open-source software delivered on many of these promises, to varying degrees18.

Methods of asserting dominance through—not in spite of—open-source software

Over the history of free and open-source software, for-profit technology companies have used their resources to capture ecosystems and have leveraged open-source projects to assert dominance in a variety of ways. The following are strategies that companies have used in the past.

1. Invest in open source to challenge your proprietary competitors.

IBM and Linux. In 1999, IBM invested US$1 billion in the open-source operating system Linux—positioned as an open-source alternative to the then-dominant Microsoft—and established the Linux Foundation24.

2. Release open source to control a platform.

Google and Android. In 2007, Google open sourced and heavily invested in Android OS, allowing it to achieve dominance in mobile operating systems over competitor Apple and attracting scrutiny from regulators for anticompetitive practices25.

3. Re-implement and sell as software as a service (SaaS).

Amazon and MongoDB. In 2019, Amazon implemented its own version of the popular open-source database MongoDB, known as DocumentDB26, and sold it as a service on its AWS platform. In 2022, it transitioned to a revenue-sharing agreement with MongoDB27,28,29.

4. Develop an open-source framework that enables the company to integrate open-source products into its proprietary systems.

Meta and PyTorch. Meta CEO Mark Zuckerberg has described how open sourcing the PyTorch framework has made it easier to capitalize on new ideas developed externally and for free30,31.

Open AI is a different story from open-source software in key respects. Unlike with open-source software, identifying harms and flaws in AI systems requires much more than open weights, an accessible application programming interface (API) or an openly licensed AI model (as in Meta’s LLaMA model series). And although provision of the training data and rigorous open documentation have salutary effects on the ability to audit AI systems, which is critical for accountability, there are inherent limits on the ability to predict the behaviour of systems that are probabilistic32.

Likewise, although openness can foster competition at the edges—enabling others to build on top of base AI models through fine-tuning with a high level of efficiency—this does not perturb the characteristics of the market at large. Nor does fine-tuning eliminate the impact of key decisions made during the development of the base model33. Factors that make the playing field in AI uneven include network effects, access to datasets, access to and the cost of the computing needed for inference at scale, the lack of a viable business model and, at present, high interest rates34,35,36,37,38. Together, these factors strongly limit the competitiveness of AI start-ups in the present business environment and contribute to a market in which the paths to profit, by and large, channel through large tech companies—whose infrastructures are imperative for AI development and whose access to markets is imperative for any return on investment39. Openness may make it easier to modify AI models that have already been developed, but these larger environmental factors influence whether the product of such experimentation has a path to market40.

In practice, gradients of AI openness offer greatly differing affordances, even though they are all confusingly clustered under the same term, ‘openness’41. Some systems described as open, such as Meta’s LLaMA-3 (ref. 42), offer little more than an API or the ability to download a model subject to distinctly non-open use restrictions42,43. In such cases, the label amounts to ‘openwashing’ of systems that are better understood as closed44,45,46. Other, maximal variants of open AI, such as EleutherAI’s Pythia series, go much further, offering access to the source code, underlying training data and full documentation, as well as licensing the AI model for wide reuse under terms aligned with the Open Source Initiative’s long-standing definition of open source.

Given these confused definitions, unless quoting claims verbatim, we avoid the term ‘open source’ in the rest of this paper and instead use the blanket term ‘open’.

What is (and is not) open about open AI?

AI systems require distinct development processes and rely on specialized and costly resources concentrated in the hands of a few large tech companies5,47,48,49. Given the resources required to produce large-scale AI systems, commercial AI companies with computing power, datasets and research teams have increasingly dominated the field of AI research and development. As such, these companies shape not only the trajectory of what gets built but also the conditions under which AI systems can be built, including which elements of a system (such as weights and datasets) are made open for others to access and reuse. Although new techniques have made it easier to build leaner, more efficient systems fine-tuned from larger base models50, they have done so without changing these underlying characteristics of the market. Ultimately, the cost and resources needed for training, and the choke points that large companies hold in terms of access to market, mean that open AI does not straightforwardly equate to a shift in competitive conditions for the AI market, although in its more maximal instantiations it provides three key affordances:

1. Transparency. Many AI systems labelled ‘open’ publish weights, documentation or data about the system. Maximal examples of open AI provide access to the underlying training data and information about the weights associated with a given model. Both of these are useful for enabling some forms of validation and auditing51,52, and both enable post-hoc insights into system behaviour that are critical for accountability. Because of the probabilistic nature of present AI systems, assertions about the transparency of AI systems, however open they are, should be measured, particularly when drawing comparisons with traditional software: knowing the weights, code and documentation cannot tell us exactly how a model will perform in a given context, explain why a given outcome occurs or enable us to predict the so-called ‘emergent’ properties of the system53,54,55.

2. Reusability. Some open AI models and data are licensed and made available to third parties to reuse56. Openly licensed data and model weights, and the frequent use of traditional open-source licences in making these available, have contributed to claims that open AI will have inherently beneficial effects on market competition7. However, access to market remains a constrained resource. Even well-resourced actors, who have the capital, talent and data to create large-scale models, do not always have an obvious way to deploy these models or ensure a return on investment, owing to substantial bottlenecks in market access, which at present runs through the large companies via either cloud offerings or large-scale platform integrations. We see this in the example of the ‘open’ AI company Mistral AI’s decision to contract with Microsoft, allowing Microsoft to license a version of its Mistral Large AI model to cloud customers through its Azure Cloud business. This is notable given that Mistral is one of the most well-financed AI start-ups building open models57 and has marketed itself for its efficient use of computing58. But even with these advantages, it still moved—alongside OpenAI and Inflection AI—to access the market through Microsoft’s cloud platform59.

3. Extensibility. Extensibility enables developers to build on top of off-the-shelf models, fine-tuning them for one purpose or another (a minimal illustrative sketch follows this list). It is a key feature championed particularly by corporate actors invested in open AI60, in large part because the work of ‘extending’ off-the-shelf models doubles as free product development for those who might want to repurpose a fine-tuned model. Those extending an open AI model do not start with a blank slate: they take a large model, already laboriously and expensively trained, adjust its parameters and generally train it on further, often specialized, data in service of adapting its performance to a particular domain or task. Notable editorial decisions have already been made during the process of developing the ‘base model’61,62,63.
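To make the notion of extensibility concrete, the following minimal sketch (in Python, using the widely used PyTorch and Hugging Face Transformers libraries; the checkpoint name and the toy domain texts are hypothetical) illustrates what ‘extending’ an open base model typically involves: loading already-trained weights and continuing training on a small specialized corpus, while inheriting every decision made in producing the base model.

```python
# Minimal illustrative sketch of 'extensibility': adapting an already-trained,
# openly licensed base model to a narrow domain by continuing training on a
# small amount of specialized text. The checkpoint name and example texts are
# hypothetical; real fine-tuning involves far larger datasets, careful
# evaluation and substantial computing.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "example-org/open-base-model"  # hypothetical openly licensed checkpoint

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)

domain_texts = [
    "Clause 4.2: the lessee shall maintain the premises in good repair.",
    "Clause 7.1: either party may terminate with 30 days' written notice.",
]  # stand-in for a specialized (for example, legal) corpus

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):
    for text in domain_texts:
        batch = tokenizer(text, return_tensors="pt")
        # For causal language modelling, the inputs double as the labels.
        outputs = model(**batch, labels=batch["input_ids"])
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# The saved weights now diverge from BASE, but every earlier 'editorial'
# decision made in training BASE is inherited.
model.save_pretrained("my-fine-tuned-model")
```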

The political economy of open AI

Here we review the materials—models, data, labour, frameworks and computational power—frequently involved in creating and using large AI systems2,64. This helps us evaluate which parts of these systems are or can be made open, which are not or cannot be, and in what ways.

AI models

Much of the continuing discourse about open AI focuses on AI models, which are only one part of an operational AI system and which on their own do not account for the full development-to-deployment life cycle of an AI system. An AI model is an algorithmic system that has been trained and evaluated using large amounts of data to produce statistically likely outputs in response to a given input; what the model learns during training is stored as numerical weights. For example, ChatGPT works by applying generative pre-trained transformer (GPT) models, which were trained on huge amounts of text data, much of it scraped from the web. These GPT models are one part of ChatGPT’s suite of client-specific software, which includes a web client and iOS and Android apps, each of which requires discrete libraries and skilled people to maintain them for as long as they exist48. These clients incorporate GPT models as only one part of a user-facing interface. Once trained, an AI model can be released in the same way other software code would be released—under an open licence for reuse or otherwise made available online. Reusing an already-trained AI model does not require access to the underlying training or evaluation data, nor does it require that weights or other system details be made available. In this sense, many AI systems that are labelled open use the term loosely. Instead of providing meaningful documentation and access, they are effectively wrappers around closed models, inheriting undocumented data, failing to provide annotated reinforcement learning from human feedback (RLHF) training data and labour-process information, and rarely publishing their findings, let alone documenting these in independently reviewed publications15.
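To illustrate how loosely the label can be applied, the following sketch (with a hypothetical endpoint, key and response schema) shows an ‘open’ client that is no more than a thin wrapper around a closed, hosted model: the wrapper code can be published under an open licence while the weights, training data and labour processes behind the endpoint remain opaque.

```python
# Illustrative sketch (hypothetical endpoint, key and response schema): an
# 'open' client that is effectively a thin wrapper around a closed, hosted
# model. Publishing this code under an open licence reveals nothing about the
# model weights, training data or labour processes behind the endpoint.
import os
import requests

API_URL = "https://api.example-provider.com/v1/generate"  # hypothetical

def generate(prompt: str) -> str:
    """Send a prompt to a closed, hosted model and return its completion."""
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {os.environ['PROVIDER_API_KEY']}"},
        json={"prompt": prompt, "max_tokens": 128},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["text"]  # response schema is assumed for illustration

if __name__ == "__main__":
    print(generate("Summarize the licence terms of this model."))
```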

There are now several examples of large-scale open AI models available for some degree of public reuse: these include Meta’s LLaMA-2 (ref. 60) and LLaMA-3 (ref. 43); Falcon 40B, developed by the UAE’s Technology Innovation Institute and trained on AWS65; MosaicML’s MPT66 models and Mistral AI’s Mixtral 8x22B, both tied to Microsoft’s Azure; and BigScience’s BLOOM model, trained on the French Jean Zay supercomputer67. Placing all of these under the singular label of open does a disservice to the serious distinctions between them and contributes to the confusion around the term.

Companies such as Hugging Face and Stability AI offer open AI models to their customers and the public. Their business models rely not on licensing proprietary models but instead on charging for extra features and labour on top of open models, such as API access, model training on custom data, and security and technical support as a paid service68. They also offer to fine-tune private models for their clients, honing and calibrating the performance of already-trained models for a given task or domain.

The non-profit EleutherAI also offers large-scale open-source AI models, along with documentation and the codebases used to train them. EleutherAI is focused only on fostering research on large-scale AI69, licensing its models under the very permissive Apache 2.0 open-source licence for use by AI researchers56. Among those engaging in open AI, EleutherAI offers arguably the most maximally open AI systems.

A handful of academic projects have also produced large open AI models at smaller scales. These include Stanford’s Alpaca model, well known for having been developed to run on a single laptop—a notable feat given the computationally intensive nature of deploying such models70. However, even a chatbot based on this extremely computationally efficient model became too costly—and risky, owing to the model’s ‘hallucinations’—to continue running, and the team has since taken it down71.

The present pattern in AI development takes a bigger-is-better approach to data, computing and model size33. The bigger the model, the more resource-intensive it is to train and calibrate and thus the more difficult it is to produce outside large technology companies. Although we know that the largest openly available AI model is at present LLaMA-3 and that it was trained on 15 trillion tokens42, information on the datasets used for models has become increasingly opaque, for closed and ostensibly open models alike. OpenAI has not released the size of GPT-4 (ref. 72), Anthropic’s technical report does not discuss the size of Claude 3’s training data73 and Mistral AI has declined to release even the size of the training data of its openly available model, citing the “highly competitive nature of the field”74. Further, although fine-tuning a model for a particular task or domain is less computationally expensive per instance (but much more environmentally costly in aggregate), third parties doing so can only build on top of models that they can neither scrutinize nor replicate, leading to an ‘upper class of AI’33.

Data

Data shaped to exacting (and labour-intensive) specifications are necessary to construct large-scale AI systems. Some researchers have even claimed that access to data may be more important than access to computing when building large-scale AI48,75. Both are essential and, in the present ‘rush-to-scale’ pattern, the more of each, the ‘better’ these models perform33,76.

Data are frequently a closed element of many AI offerings advertising themselves as open: many large-scale AI models described as open neglect to provide even basic information about the underlying data used to train the system77, let alone offering the underlying training data openly or documenting its provenance. Lack of data transparency presents a serious challenge to any claims made around the benefits of open AI and hinders the kind of validation or reproducibility needed for sound science.

Scraping data to create datasets for AI development raises issues of extraction and intellectual property that are particularly relevant to concerns about concentration in the AI sector. Such datasets, whether open or closed, are often assembled by taking copyrighted images, text and code from the web or by copying and reusing datasets compiled by language groups from the majority world, such as GhanaNLP78 and Lesan AI79. This means that, even though it is possible to train models without copyrighted content80, those using these datasets to train and evaluate AI models are often using others’ work and intellectual property to do so, claiming fair use even as such claims are being legally contested81, and are willing and able to weather the cost of lawsuits in either instance82. Legal or not, the practice of indiscriminately trawling web data to create systems that are now poised to undercut the livelihoods of writers, artists and programmers—whose labour created such ‘web’ data in the first place—has raised alarm and ire83, and lawsuits filed on behalf of these actors are now moving forward84.

These concerns are particularly pressing considering the colonial echoes present in current data labour practices: AI systems frequently rely on data and labour resources from the majority world85, and the founder of the GhanaNLP open-source project has noted that big tech’s open source risks enabling continued colonial exploitation86,87,88. Such exploitation also runs directly counter to majority-world movements for data sovereignty, exemplified by projects such as Te Hiku Media, who point out that “the majority of tangata whenua and other indigenous peoples may not have access to the resources that enable them to benefit from open source technologies… By simply open sourcing our data and knowledge, we further allow ourselves to be colonised digitally in the modern world”89.

This is not an argument for closed datasets, which compound this issue. It is an entreaty to be clear about precisely what open datasets can and cannot accomplish. When datasets are not made available for scrutiny, or when they are inscrutably large, it becomes very difficult to check whether these datasets launder others’ intellectual property or commercially use data that were specifically licensed for non-commercial use or were licensed under particular sovereignty mandates. For example, Microsoft’s GitHub Copilot programming assistant—a generative AI system that produces code—has been shown to have been trained on and subsequently regurgitate code licensed under the General Public License90, an open-source licence that requires derivative code to be released under the same terms. However, even using permissively licensed code to train generative AI may similarly violate provisions requiring attribution, which current generative AI systems could, but do not, provide at present.

Datasets such as the Pile91 and Common Crawl92,93 are widely available, but extra labour is required to make such datasets useful for the purpose of building large AI models. Careful curation and remixing of datasets are necessary to create performant AI: BigScience’s BLOOM model was trained on a composite of 498 datasets, which involved a complex data-governance process, as well as a manual quality-filtering process to remove code, spam and other noise67. Although the larger datasets used by companies presumably require proportionately similar levels of labour, we know little to nothing about them, even for models that claim to be open.
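As an illustration of what such quality filtering involves, the following sketch applies simple, invented heuristics to web-scraped text; these rules are ours, for illustration only, and are not the filters used for BLOOM or any other named dataset.

```python
# Illustrative sketch of heuristic quality filtering of web-scraped text, of
# the kind described above. The thresholds and markers here are invented for
# illustration and are not the filters used for any named dataset.
import re

CODE_MARKERS = ("def ", "#include", "{", "};", "</div>")
SPAM_PATTERN = re.compile(r"(click here|buy now|free!!!)", re.IGNORECASE)

def keep_document(text: str) -> bool:
    """Return True if a scraped document passes simple quality heuristics."""
    words = text.split()
    if len(words) < 50:                      # too short to be useful prose
        return False
    if SPAM_PATTERN.search(text):            # obvious spam phrases
        return False
    if any(marker in text for marker in CODE_MARKERS):  # likely code or markup
        return False
    alpha_ratio = sum(c.isalpha() for c in text) / max(len(text), 1)
    return alpha_ratio > 0.6                 # mostly natural-language characters

# Usage: keep only the documents that pass the heuristics.
filtered = [doc for doc in scraped_corpus if keep_document(doc)]  # scraped_corpus is hypothetical
```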

Labour

The insatiable need of large-scale AI systems for curated, labelled, carefully organized data means that building AI at scale requires substantial human labour. This labour creates the ‘intelligence’ that AI systems are marketed as rendering computational94,95. This labour can be roughly categorized according to what it is applied to:

  • Data labelling and classification

  • Model calibration (reinforcement learning with human feedback, and similar processes)

  • Content moderation, trust and safety, and other forms of post-deployment support

  • Engineering, product development and maintenance.

Generative AI systems are trained and evaluated on a broad range of human-generated text, speech or imagery. The process of shaping a model such that it can mimic human-like output without replicating offensive or dangerous material requires intensive human involvement to ensure that the outputs of the model stay within the bounds of ‘acceptable’96—and thus enable it to be marketed, sold and applied in the real world by corporations and other institutions intent on maintaining customers and their reputations. This process is often called reinforcement learning from human feedback, or RLHF: a technical-sounding term that, in practice, refers to thousands of hours of human labour, during which workers might be instructed to select which of a few snippets of text produced by a generative AI system most closely resembles human-generated text, with their choices fed back into the system97. Although data preparation and model calibration require extensive, rarely heralded labour that is fundamental in attaching meaning to the data that shape AI systems, companies generally release little if any information about the labour practices underpinning this data work, and failing to release such information is seldom criticized as a form of closedness. What we do know about these processes is largely the product of either investigative journalism98,99,100 or organizing by workers and researchers101,102,103.
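The following sketch illustrates the kind of record that a single human comparison task produces during RLHF-style calibration; the field names are invented for illustration, and real pipelines additionally feed such judgements into reward-model training.

```python
# Illustrative sketch of the kind of record a human comparison task produces
# during RLHF-style calibration. Field names are invented for illustration;
# real pipelines vary and also use these choices to train a reward model.
from dataclasses import dataclass

@dataclass
class PreferenceRecord:
    prompt: str          # the input shown to the model
    completion_a: str    # one model-generated snippet
    completion_b: str    # an alternative model-generated snippet
    worker_choice: str   # 'a' or 'b': which snippet the worker judged better
    worker_id: str       # the (often low-paid, outsourced) annotator

record = PreferenceRecord(
    prompt="Explain photosynthesis to a ten-year-old.",
    completion_a="Plants use sunlight to turn air and water into food.",
    completion_b="Photosynthesis is the synthesis of photons.",
    worker_choice="a",
    worker_id="annotator-0421",
)
# Thousands of hours of such judgements, aggregated, are what steer a model's
# outputs towards 'acceptable' behaviour.
```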

The labour required to curate and prepare data and to calibrate systems is poorly paid, but it still costs a great deal given the number of workers and the time required to shape the data that build contemporary AI systems. This presents another barrier to democratic and open access to the resources required to create and deploy large AI models (even as we cannot accept the term democratic for a structure that relies on low-paid, precarious workers who receive little benefit while enduring harm and are themselves excluded from such imagined democracy).

Development frameworks

Development frameworks make it easier for those developing software to build and deploy it in regimented, predictable and expedient ways. They are part of standard development practices and are not unique to AI. They work by providing pre-written pieces of code, templatized workflows, evaluation tools and other standardized methods for common development tasks. This helps create more fungible, interoperable and testable computational systems, while minimizing the time spent ‘reinventing the wheel’ and avoiding bugs easily introduced when implementing systems from scratch. As with software development in general, AI development relies on a handful of popular open-source development frameworks. They include increasingly vast repositories of datasets, data-validation tools, evaluation tools, tools for model construction, tools for model training and export, pre-training libraries and more, which together shape the way AI is made and deployed4.
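As a minimal illustration of what such a framework provides, the following sketch (in Python, using PyTorch, one of the frameworks discussed below) composes framework-supplied layers, a loss function, an optimizer and automatic differentiation rather than implementing any of them from scratch.

```python
# Minimal sketch of what a development framework such as PyTorch provides:
# pre-written layers, loss functions and optimizers, so that a developer
# composes standard building blocks rather than implementing them from scratch.
import torch
from torch import nn

model = nn.Sequential(          # framework-supplied layer implementations
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 2),
)
loss_fn = nn.CrossEntropyLoss()  # framework-supplied loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(8, 16)      # toy data for illustration only
targets = torch.randint(0, 2, (8,))

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()              # framework-supplied automatic differentiation
    optimizer.step()
```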

The two dominant AI development frameworks are PyTorch and TensorFlow. Both were created within large commercial technology companies, Meta and Google, respectively, who continue to resource and maintain them. Many more pre-trained AI models work exclusively within the PyTorch framework than work with TensorFlow33,104. PyTorch is also the most popular framework in academic AI research, used in most academic papers105,106.

PyTorch was initially developed for internal use by Meta but was released publicly in 2017. Although PyTorch now operates as a foundation under the umbrella of the Linux Foundation107, it continues to be financially supported by Meta107,108 and its lead maintainers (responsible for governance and decision-making) are all Meta employees109. TensorFlow was originally developed and released by Google Brain in 2015 (ref. 110) and continues to be directed and financially supported by Google, which also employs many of its core contributors111.

Open-source development frameworks offer tools that make the AI development and deployment process quicker, more predictable and more robust. They also have important benefits for the companies developing them. Most notably, they allow Meta, Google and those steering framework development to standardize AI construction so that the results are compatible with their own company platforms—ensuring that their framework leads developers to create AI systems that, like Lego, snap into place with their own company systems112. In the case of Meta, this allows it to more easily integrate and commercialize systems developed, tuned or deployed using PyTorch. Zuckerberg clearly stated these benefits to Meta in a 2023 earnings call, in which he said, “[PyTorch] has generally become the standard in the industry […] it’s generally been very valuable for us […] Because it’s integrated with our technology stack, when there are opportunities to make integrations with products, it’s much easier to make sure that developers and other folks are compatible with the things that we need in the way that our systems work.”113 He reiterated this point in a 2024 earnings call31. The same is true for Google and TensorFlow. In the case of Google, TensorFlow was designed to operate easily and intuitively with Google’s Tensor Processing Unit (TPU) hardware, the powerful proprietary computing infrastructure at the core of Google’s cloud AI computing business. This enables Google to optimize its commercial cloud offerings for AI development, positioning these products as the engine of AI. In this way, open development frameworks can work to entrench and bolster corporate AI dominance.

Open AI development frameworks can also allow those bankrolling and directing their development to create on-ramps to profitable computing and other service offerings. Similarly to how corporate representatives drive governance of internet standards to the exclusion of others114, AI companies shape the work practices of researchers and developers33 such that new AI models can be easily integrated and commercialized. This gives the company offering the framework substantial indirect power within the ecosystem: training developers, researchers and students interacting with these tools in the norms of the company’s preferred framework and thus helping define—and in some ways capture—the AI field4,115.

Computational power

Developing large AI models requires massive datasets, which require massive computational power to process49,76. Contemporary AI development is characterized by a race to scale33, with older estimates showing that the amount of computing used to train models increased about 300,000-fold in 6 years, roughly an 8-fold increase each year116, and recent estimates showing that dataset size grows by around 2.4 times per year117. Access to computing remains a notable barrier to practical reusability for many open AI systems, because of the high cost involved in both training and running inference on large AI models at scale (that is, instrumenting them in a product or API for widespread public use); one case study reports energy costs of 51,686 kWh for training, 7,571 kWh for fine-tuning and 1 × 10⁻⁴ kWh per inference118. Furthermore, eking out maximal computational capacity from specialized hardware requires specialized and, in some cases, proprietary software systems.
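A rough back-of-the-envelope comparison using the per-case figures from ref. 118 gives a sense of these magnitudes; the arithmetic below is illustrative only, and the figures apply to a single reported case.

```python
# Back-of-the-envelope arithmetic using the per-case figures cited above
# (ref. 118): roughly how many single inferences consume as much energy as
# one training run? The point is scale, not precision; real deployments vary.
TRAIN_KWH = 51_686      # training energy for one model, from ref. 118
FINETUNE_KWH = 7_571    # fine-tuning energy, from ref. 118
INFERENCE_KWH = 1e-4    # energy per single inference, from ref. 118

print(TRAIN_KWH / INFERENCE_KWH)     # ~5.2e8 inferences to match one training run
print(FINETUNE_KWH / INFERENCE_KWH)  # ~7.6e7 inferences to match one fine-tuning run
# At the scale of widely deployed products serving millions of users, aggregate
# inference energy can therefore rival or exceed the one-off training cost.
```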

It is hard to overstate Nvidia’s dominance here: the company maintains a 70–90% market share for state-of-the-art AI chips119. Moreover, more than four million developers rely on CUDA120, a partly proprietary framework developed by Nvidia and described as the ‘de facto industry standard’121, which supports training only on the company’s proprietary graphics processing units (GPUs) (specialized computer processors, originally developed for gaming, now primarily used for AI training because they allow many calculations to be performed quickly in parallel). The CUDA development ecosystem is a key element of Nvidia’s powerful market dominance49 (with the company’s market share at 88% for GPUs122) and has been nurtured and extended since 2006, giving it a substantial head start. Like Apple’s developer ecosystem—which offers those wishing to build apps and services for the company’s proprietary operating systems high-quality building blocks—CUDA provides expansive and norm-setting resources to AI researchers and developers.

In short, the computational resources needed to build new AI models and use existing ones at scale, outside privatized enterprise contexts and individual tinkering, are scarce, extremely expensive and concentrated among only a handful of corporations (with Nvidia at the helm49,122), who themselves benefit from economies of scale, the capacity to control the software that optimizes computing and the ability to sell costly access to computational resources33,123. The seamlessness of integration across computational provision and model access is seen by some as powering demand for cloud infrastructure providers124,125, further suggesting that it is ownership of an ecosystem, rather than the ability to produce a successful model or product offering, that determines competitiveness in AI.

Conclusion

By dissecting the pieces that together comprise modern AI systems and examining which of these pieces can and cannot be made open, we reveal a map of open AI showing that, even at its most maximal, open AI is highly dependent on the resources of a few large corporate actors, who effectively control the AI industry and the research ecology beyond.

For this reason, the pursuit of even the most open AI will not on its own lead to a more diverse, accountable or democratized ecosystem, even though it may have other benefits. We also see that, as in the past, big tech companies vying for AI advantage are making use of open AI to consolidate market advantage while deploying the rhetorical wand of openness to deflect accusations of AI monopoly and attendant regulation1. The reality is that, however open they are, AI systems deployed at scale across sensitive domains can have diffuse and profound effects. These effects should not be determined by the small handful of for-profit companies who at present control the resources required to create and deploy such systems at scale and to bring them in front of the millions of customers who will be directly affected by them, particularly when these effects cannot be foreseen simply by examining system code, model weights and documentation. The creation of meaningful alternatives to the present AI model will not be accomplished through the pursuit of open AI development alone, even though elements such as data transparency and documentation are valuable for accountability, and maximally open AI projects helpfully illustrate the limits of what is possible. Focusing policy intervention on whether AI will be open or closed distracts from the overwhelmingly opaque nature of most corporate AI systems, both open and closed, in turn drawing valuable energy and initiative away from questions about the implications of AI in practice.

Unless pursued alongside other strong measures to address the concentration of power in AI, including antitrust enforcement and data privacy protections, the pursuit of openness on its own will be unlikely to yield much benefit. This is because the terms of transparency, and the infrastructures required for reuse and extension, will continue to be set by these same powerful companies, who will be unlikely to consent to meaningful checks that conflict with their profit and growth incentives.

We need a wider scope for AI development and greater diversity of methods, as well as support for technologies that more meaningfully attend to the needs of the public, not of commercial interests. And we need space to ask ‘why AI’ in the context of many pressing social and ecological challenges. Creating the conditions to make such alternatives possible is a project that can coexist with, and even be supported by, regulation. But pinning our hopes on ‘open’ AI in isolation will not lead us to that world, and—in many respects—could make things worse, as policymakers and the public put their hope and momentum behind open AI126, assuming that it will deliver benefits that it cannot offer in the context of concentrated corporate power.