Harvard Business Review
Making Sense of Research on How People Use AI

November 18, 2025

Summary. This year, three major studies have examined the human behavioral side of AI: OpenAI's usage report, Anthropic's Economic Index, and the author's own social listening research. Each sheds a different light on how people are actually using generative AI.

As generative AI plays an increasingly prominent role in our lives—at work as well as at home—we’d all do well to develop our own nuanced viewpoints about the technology. Over the past three years in particular, it’s become far too big to ignore. A useful way to think about this new phenomenon is to separate the rapid changes in the technology itself from the ensuing changes in human behavior as we use generative AI in our daily lives. On the technology side, there’s lots to keep track of: latest model releases, benchmark performance records, breathtaking funding announcements, training data sources, new features, and much more. On the human side, it’s about understanding how people are actually using the new technology, and why. My work is focused on the latter—our responses and adaptations to AI—the anthropological side of the story.

For business leaders, this shouldn’t just be academic curiosity. Companies are investing billions in AI tools but struggling, so far, to see meaningful returns. Understanding how people actually use AI—in contrast to how vendors claim they do—can help business leaders avoid costly mistakes, identify concrete opportunities, and make smarter bets on which tools to adopt and which use cases to evangelize. Business transformation, and AI transformation in particular, is substantially about changing people’s behaviors, and so understanding what they know, how they feel, and what they’re doing is a vital set of inputs.

This year, three major studies have examined the human behavioral side of AI: OpenAI’s usage report, Anthropic’s Economic Index, and my own social listening research. Each sheds a different light on what we’re doing with generative AI.

The path to a sensible, defensible, and useful view of what’s going on lies in the synthesis of many different sources. Taking these three reports together—and with an eye toward each method’s strengths and weaknesses—can provide some clear signals for business leaders to develop an enlightened stance and make good decisions around generative AI.

Reading AI Research Critically

Before we look at the different studies, it’s important to understand the contexts in which they—and all studies—are published. Vested interests lie in plain sight as well as in deep cover. So, it’s important to be vigilant about a few things when reading new studies:

Vested interests

While the research from any source may be methodologically sound, reports do not exist in a vacuum. Companies often publish research because it serves their interests. Usage data from AI companies can demonstrate their platforms’ value and encourage adoption, in addition to the value the research offers on the merits. This doesn’t invalidate their findings, but savvy readers should view them with that context in mind.

Inherent bias

Certain types of studies will have a tendency to tell a certain type of story. For example, usage reports (all three of those mentioned here) tend to emphasize growth, momentum, and positive outcomes. They may highlight the most impressive use cases while downplaying the long tail of failures or limitations. The choice of metrics—active users vs. satisfied users, task completion vs. task quality—can paint very different pictures.

Triangulation

A single study may hold value; more valuable still is when its findings are replicated. Most valuable of all is when studies using different methods consistently converge on the same conclusions. That might mean cross-checking different studies, as this article does. Or it might simply mean validating a paper’s findings against your own experience.

Millions of business leaders are now evaluating AI investments, big and small. And they should be making those decisions with open eyes and a critical lens.

Methods at a Glance

Let’s return now to the question of human behavior.

How might we understand what people are doing with AI? Billions of generative AI queries are sent each day. But those queries are made privately, between the individual and the LLM they’re using—even the LLM companies themselves are restricted in what individual message content they can see. And they certainly can’t include user messages in their research reports. (OpenAI notes explicitly in their report that “no member of the research team ever saw the content of user messages.”)

So far, three main methods have been used:

Telemetry

What users actually do: the usage logs that only the LLM providers themselves can access.

  • Strengths: captures actual behavior, large volume of data.
  • Weaknesses: limited to single platforms, excludes logged-out users and opt-outs, lacks contextual detail (owing to privacy), inaccessible for most of us.

Surveys

What users say, when asked.

  • Strengths: can ask for any level of detail (e.g., demographics), easy to set up.
  • Weaknesses: self-selection bias, may not reflect actual behavior, low volume of data.

Social listening

What users say, unasked.

  • Strengths: candor (anonymity enables psychological safety), rich detail, insight into what people feel most strongly about, downstream reflections and impact.
  • Weaknesses: low volume of data, bias toward whichever platforms are sampled, no demographic data.

No single method will capture everything. But together, these complementary approaches help form a workable, useful picture of how a billion human beings are already weaving AI into their daily lives.

On Social Listening, Specifically

Before examining what these studies found, I’d like to elaborate on the social listening approach—the method I’ve been using.

One major advantage is that it’s accessible for everyone. You don’t need a huge list of survey recipients or to be working at one of the LLM companies. Public forums contain rich data about AI usage that anyone can pick up and parse.

Social listening captures intensity of feeling. There’s what people do, and there’s what they care enough to write (and rant) about publicly. For contexts where emotional investment matters—launching an AI product, protecting vulnerable users, or understanding edge cases—this lens reveals use cases that carry outsized human impact. For example, the AI psychosis headlines (such as this one from The Wall Street Journal) broke just months after my 2025 HBR article—not because millions experienced such episodes, but because the effect on the few individuals who did was profound, and some of them took to Reddit and similar places to share their intense experiences.

Anonymity affects what people share. In online forums, aliases create the psychological safety to discuss therapeutic usage, intimate details, and controversial applications that just won’t show up in privacy-protected, meticulously sanitized corporate data. This doesn’t mean every post is perfectly accurate, but it does show that the method surfaces truths about high-stakes AI usage which other methods systematically miss.

Public forums can also provide downstream context. The original poster and subsequent commenters build on each other’s experiences, occasionally revealing what happened after—whether a use case achieved its goal, what workarounds emerged, or which prompts worked best. Such community-driven, post-hoc insights are largely absent from usage logs or surveys.

Where the Research Is Mostly Aligned

Where distinct reports agree, there is probably some signal to pay attention to. And they agree on plenty.

Unsurprisingly, LLM usage is growing fast

ChatGPT has just hit approximately 800 million weekly active users who collectively send about 20 billion messages per week. Anthropic claims that 40% of employees report using AI at work, double the percentage from just two years ago.

Tasks follow a power-law distribution 

In a power-law distribution, a few events are disproportionately large, while most are very small. OpenAI sees 78% of all messages fitting into its top-three categories (practical guidance, writing, and seeking information). Anthropic observes that 20% of task categories account for 87% of usage. In my research, the top-20 use case buckets account for more than half of the total use cases. This concentration suggests that leaders should focus AI investments on a small handful of high-impact use cases, rather than trying to do everything at once.
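The concentration pattern described above can be made concrete with a few lines of Python. The category names and message counts below are invented purely for illustration (they are not drawn from any of the three reports); the point is simply to show how a "top categories cover most usage" figure is computed from a tally.

```python
# Illustrative sketch with synthetic data: measuring how concentrated
# AI usage is across use-case categories. All counts are invented.
from collections import Counter

# Hypothetical tally of messages per use-case category
usage = Counter({
    "practical guidance": 340, "writing": 290, "seeking information": 150,
    "learning": 95, "coding": 42, "therapy & companionship": 40,
    "translation": 18, "brainstorming": 15, "scheduling": 6, "other": 4,
})

total = sum(usage.values())
# Share of all messages captured by the three largest categories
top3 = sum(count for _, count in usage.most_common(3))
top3_share = top3 / total
print(f"Top 3 of {len(usage)} categories cover {top3_share:.0%} of usage")
# → Top 3 of 10 categories cover 78% of usage
```

With these made-up numbers, three of ten categories account for 78% of messages, which mirrors the shape (though not the substance) of the concentration the reports describe.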

We use AI substantially to write

The OpenAI report states that “Writing is the most common use case at work, accounting for about 40% of work-related messages.” Anthropic doesn’t actually say much on this in its most recent report, but in its February 2025 report, one of the main findings was that “usage is concentrated in software development and technical writing tasks.” At least 20 of the top-100 use cases I cover are squarely about writing. This makes sense, given that text is the predominant output format of generative AI. Given how much writing dominates usage, organizations might see faster ROI from tools that stay close to the words in presentations, reports, and communications than from more exotic, ambitious applications.

Learning & education is a major category

For Anthropic, education usage grew from 9.3% to 12.4%. For OpenAI, about 10% of all ChatGPT messages are requests for tutoring or teaching. In my research, 20% of my use cases fall into the learning and education bucket—unsurprising, given that this is an “intelligent” technology designed for a cognitively advanced, curious, neotenic species.

Augmentation is here and automation is coming

For Anthropic, directive conversations (automation) rose from 27% to 39% in eight months, and among enterprise API transcripts, 77% are now automation-dominant. For OpenAI, a majority (56%) of work-related tasks are “doing” tasks (rather than “expressing” or “asking”). On the social platforms I watch daily, I’m seeing more posts than ever about agentic usage.

Where the Research Diverges

Of course, there are discrepancies too, and these can be illuminating. Let’s dig into a couple:

One discrepancy is how much people are using LLMs for coding. Anthropic reports that 36% of its total sample are using gen AI for coding, whereas my research shows 8%, and OpenAI has coding at just 4.2% of use cases. These are interestingly large deltas. They may be explained partially by Anthropic’s category being slightly broader (it includes math) and partly by the early reputation Anthropic’s models gained as coding assistants. Still, it feels like there’s some further explanatory work to do, by the two LLM firms or others.

Another difference is the apparent discrepancy between my research and OpenAI’s in how much people use LLMs for therapy. Indeed, this is explicitly noted in the original version of OpenAI’s paper. For OpenAI, “relationships and personal reflection” makes up 1.9% of messages, whereas my analysis has “therapy and companionship” as the #1 use case. In fact, this discrepancy is not so big. In my data set, despite being the top single use case, therapy and companionship accounts for only about 4% of the data (bear in mind there are well over 100 use cases). This two-percentage-point difference can be explained by methodological differences.

As senior business leaders rely on data and research to make crucial decisions for implementing AI within their own organizations, it’s important to approach all of the data critically to help separate the signal from the noise, and the patterns from the haze. Consider these tips as you delve into all the research:

Tips for Developing Your Own AI Viewpoint

  1. Follow AI optimists (e.g., Allie Miller), skeptics (e.g., Gary Marcus), and neutrals (e.g., Nicholas Thompson). Don’t get trapped in a filter bubble exacerbated by social media and AI.
  2. Map studies to your own usage of AI—do the findings match what you observe in your organization or in your home life?
  3. Use AI to summarize AI research. (Very meta! This was #34 in my top-100 list.) You can do this yourself or enlist an AI to curate and summarize articles with topics of interest to you.
  4. Read one full scientific paper thoroughly from time to time, rather than just registering the headlines of many. The headlines that we are all so disproportionately exposed to are designed to surprise and shock, not necessarily to reveal essential truth.
  5. Track your organization’s own metrics—don’t rely solely on external studies. External studies may give you an idea of how effective an application of AI is, and they may offer some clues about what to measure and how to measure it. But your context is what matters. Draw what you need from a sanitized, academic study and bring it across to the full technicolor of your real-world business situation.
  6. Look out for disagreement. When studies dramatically disagree (like the coding usage discrepancies here between Anthropic and OpenAI), something interesting is probably going on. Either there’s more to investigate and no one has done so yet, or perhaps there’s an untapped business opportunity.
  7. Finally, experiment for yourself—at the very least, social listening is a method available to you, as it has been to me.

. . .

The AI momentum is dizzying and relentless. We are having to adapt to it, fast. Individual studies can give us a glimpse of what’s going on. Multiple studies, properly and critically synthesized, give us a still-imperfect but more three-dimensional view, as we’ve seen with the three studies discussed here. They collectively reveal some clear signals: usage is growing rapidly, that usage is concentrated on a relatively small set of practical tasks, automation is on the rise, and use cases that have to do with human connection make the headlines because of how deeply they’re felt and needed.

As we watch AI’s continued evolution, let’s be sure to keep an eye on our own.
