17 hours ago - Technology

What Google's AI knows about you

Illustration: Maura Losch/Axios

Google knows all about most of us — our email, our search queries, often our location and our photos — but the search giant isn't using most of that data to train its AI models.

The big picture: Google trains most of its models on content that's publicly available on the web — but free and experimental Google products do often come with caveats that your data could be used to train the company's AI.

In this series on What AI Knows About You, Axios is looking, company by company, at how the data-hungry AI industry uses its own customers' information to create and refine its products.

  • Today's AI developers don't face any requirement to divulge the exact sources for their training data — but under various privacy laws, they do have to share what customer data they collect and how they use it.

Catch up quick: Google has released a host of generative AI services for businesses and consumers, among them its standalone Gemini chatbot (formerly Bard) and AI features within other products such as Maps and Photos.

Zoom in: Google says it largely trains its AI models the way other companies do: using huge swaths of the "publicly available" open web. The company says it gives publishers the ability to control whether their data is used to train future models.

  • Broadly speaking, Google says business users' interactions with Gemini won't be used for training the model, and the company doesn't use corporate data stored in Google Cloud or Workspace to train or improve its systems.
  • On the consumer side, Google does use data shared with its Gemini chatbot to improve and develop its products and services, though individuals have some control based on various settings.
  • Google says your Gmail inbox, your location data and what you type into search aren't used to train Google's foundation models. But your interactions with Gemini typically are.

For Google-owned YouTube, the rules are somewhat different.

  • YouTube confirmed in a blog post earlier this year that it uses content uploaded to YouTube to create and improve its own services, including for the development of AI products. It also said it takes exception to other companies using YouTube content to train their AI models (as OpenAI, Apple, Anthropic and others reportedly have).
  • "As we have for many years, we use content uploaded to YouTube to improve the product experience for creators and viewers across YouTube and Google, including through machine learning and AI applications," Google said. "This encompasses powering our Trust & Safety operations, improving our recommendation systems, and developing new generative AI features like auto dubbing."

Google has different policies for some products that are in its experimental "labs" programs, even for corporate customers. For example, data from users of Google Workspace Labs — including prompts — can be used to improve its services.

Between the lines: Consumers using Google's standard AI services have some choice in how their data is collected and stored.

  • Gemini Apps Activity lets users control whether their conversations with Gemini are stored and potentially used for training new models. It's "on" by default for users 18 and over, and "off" by default for those under 18 (who can choose to turn it on).
  • Google ordinarily stores such activity for 18 months, though users can change the retention period to three months or 36 months.

Go deeper:

  • What AI knows about you (Nov 4, 2024)
  • Meta's AI feasts on user data (Nov 5, 2024)
  • Adobe's pledge: We won't train AI with your data (Nov 12, 2024)