Applications for all Summer 2025 streams are now open!

Applications due April 18

Application process

Steps in the application process

  1. Create an application account. You’ll use this to access all application materials.

  2. Submit a MATS pre-application by April 18. This is required for all streams.

  3. Submit applications to the MATS stream(s) you want to work with. You must submit at least one stream-specific application for your MATS application to be considered. You can and should apply to all of the MATS streams that interest you! Most stream applications will be due April 18. See a list of the streams and their applications below.

  4. Complete additional evaluations. Depending on the streams you apply to, you may be required to complete a coding screen, interviews, or other evaluations after submitting your application. The process is not standardized between streams, however; not being contacted for an interview does not necessarily mean that your application is no longer under consideration.

Tips for applying

  • Make sure to check your spam folder! You may wish to set up an automatic filter for emails from applications@matsprogram.org to ensure you don’t miss any messages.

  • Submit your application materials early. In the past, some applicants have had technical problems in the hour leading up to the application deadline. Additionally, applications are reviewed on a rolling basis.

  • Mentors will primarily evaluate candidates based on the submission of their own stream-specific applications, though all mentors will have access to application materials submitted to other streams.

Summer 2025 Tracks

To decide which mentor(s) to apply to, you can filter the streams below by track and research interest. We recommend reading through each stream’s proposed research projects and mentorship style to assess personal fit.

Click on each track title to read a brief description.

Summer 2025 Streams

A previous version of this website had a stream for just Nicholas Carlini. If you want to work with Nicholas Carlini, please apply to the Anthropic-Redwood stream below, which now includes him.

MATS Research Streams

Oversight & control (8 streams)


Jason Gross, Rajashree Agrawal

Independent, [TBD, new AI security org]

Software verification has been labor-bottlenecked, often requiring PhD-level engineers to write 10x–100x more verification code than the code being verified. Now, progress in advanced mathematical reasoning can be leveraged for progress in software verification. Our work is geared toward using formal methods to build provably robust oversight methods for AI-generated software.

Control, Red-teaming · Scalable oversight · National security · Information security

Samuel Albanie

Google DeepMind

This stream will focus primarily on projects relating to AI monitoring and control.

Control, Red-teaming

Ethan Perez, Buck Shlegeris, Samuel Marks, Joe Benton, Evan Hubinger, Mrinank Sharma, Fabien Roger, Kyle Fish, Stephen McAleer, Nicholas Carlini

Anthropic, Redwood Research, Anthropic alignment science, Anthropic, Anthropic, Anthropic, Anthropic, Anthropic, OpenAI, Anthropic

The Anthropic-Redwood stream spans a range of empirical AI safety research areas on LLMs, from AI control to scalable oversight and model organisms. You’ll be pitched, and have the option to pitch, a variety of safety research projects, and then be matched to projects and mentors based on your research interests/preferences and what you’d like to get out of MATS. Scholars in this stream frequently receive funding and continued mentorship after MATS to complete their research project, usually leading to a (co-)first-author paper. People in this stream often end up in long-term homes for safety research after MATS (e.g. Anthropic).

Control, Red-teaming · Dangerous capability evals/demos · Alignment evals/demos · Scalable oversight · Other · Cooperative AI · National security · Information security

Micah Carroll

UC Berkeley

Studying and demonstrating manipulative behaviors which emerge due to RL and/or mitigating such behaviors. Studying CoT faithfulness and how it’s affected by different forms of training.

Value alignment · Alignment evals/demos · Scalable oversight

Joshua Clymer

Redwood Research

Alignment evaluations and control evaluations (e.g. evals for collusion, white-box techniques, etc.)

Control, Red-teaming · Alignment evals/demos

Marius Hobbhahn

Apollo Research

This stream is generally focused on projects related to scheming. In this cohort, we will build a black-box monitor for scheming in complex agentic settings.

Control, Red-teaming · Alignment evals/demos

Scott Emmons, David Lindner, Erik Jenner

Google DeepMind, Google DeepMind, UC Berkeley / Center for Human-Compatible AI

This stream will focus on monitoring, stress-testing safety methods, and evals, likely with a focus on risks from scheming AIs. Examples include (black-box) AI control techniques, white-box monitors (probes etc.), chain-of-thought monitoring/faithfulness, as well as building evaluation environments for all of these.

Control, Red-teaming · Alignment evals/demos

Tomek Korbak

UK AISI

I think AI control is the most tractable approach to reducing risks from misaligned AI. I’m excited to work with mentees interested in empirical projects building and evaluating control measures for LLM agents. An ideal project ends with a paper submitted to NeurIPS/ICML/ICLR.

UK AISI might be able to provide additional financial, logistical, and engineering support to projects.

Control, Red-teaming · Dangerous capability evals/demos

Evaluations (5 streams)


Oliver Sourbut (Oly), Sid Black

UK AI Safety Institute, UK AISI

We are interested in supporting scholars who evaluate autonomy and loss-of-control risks from AI systems. Our focus areas include sandbagging and deception, long-horizon agentic evaluations, control evaluations, and predicting future evaluation results. We strongly prefer that scholars work out of London, but this is not an absolute deal-breaker.

Dangerous capability evals/demos · Alignment evals/demos · Control, Red-teaming · Information security

Owain Evans

Truthful AI, UC Berkeley

We empirically research topics related to emergent misalignment, self-awareness in LLMs, and faithfulness in reasoning models. I have mentored 35+ researchers in AI safety, and past MATS projects have resulted in the papers “Emergent Misalignment”, “The Reversal Curse”, and many others. My past mentees (https://www.truthfulai.org/about#mentees) have gone on to work at the UK & US AISIs, Anthropic, Apollo Research, OpenAI, Transluce, GDM, etc., as well as joining my group full-time.

Note: We will send out an assignment to selected candidates during the candidate evaluation phase. This assignment will involve coding and writing up a short research experiment. Please plan to spend 1-2 days completing the assignment. Because of this, we encourage applicants to apply early so that we can evaluate them on a rolling basis.

Control, Red-teaming · Value alignment · Alignment evals/demos · Other · Dangerous capability evals/demos

Mary Phuong

Google DeepMind

This stream will run in-person in London, with scholars working in pairs or small groups. Before the program starts, I’ll share a few project proposals with potential scholars for consideration. Topic-wise, the focus will be on scheming precursor capabilities (and how to measure them) and control protocols (resolving design uncertainties, prototyping, red-teaming).

Dangerous capability evals/demos · Control, Red-teaming

Francis Rhys Ward

Imperial College London

Currently I’m focusing on projects which evaluate sabotage, sandbagging, and oversight subversion in frontier agents. Additionally, I’m interested in coming up with novel alignment or control plans.

Control, Red-teaming · Dangerous capability evals/demos · Alignment evals/demos

Dawn Song, Yiyou Sun, Xuandong Zhao

UC Berkeley, UC Berkeley, UC Berkeley

Research in our group spans multiple areas in AI, focusing on improving the reliability, interpretability, and security of large language models (LLMs).

Mechanistic interpretability · Agent foundations · Control, Red-teaming · Dangerous capability evals/demos · Information security · Alignment evals/demos · Other

Governance (9 streams)


Girish Sastry, Steven Adler

Independent, OpenAI (former)

What technical governance measures can AI labs implement to improve the safety of their own models? What technical governance measures can governments lean on to make sure that labs’ AI systems remain safe? What are useful ways to identify or control such risks of an AI system in practice?

AI governance, policy · Control, Red-teaming · Dangerous capability evals/demos · National security

Mauricio Baker

RAND

This stream focuses on AI policy, especially technical governance topics. Tentative project options include: technical projects for verifying AI treaties, metascience for AI safety and governance, and proposals for tracking AI-caused job loss. Scholars can also propose their own projects.

AI governance, policy

Benjamin Bucknall

University of Oxford

This stream will focus on questions in and about technical AI governance – that is, technical analysis and tools for supporting the effective governance of AI. Particular focus will be placed on questions regarding third-party scrutiny of AI systems and developers.

AI governance, policy

Alan Chan

GovAI

New policies and technical tools will be needed to prepare for and manage a world with ubiquitous AI agents at or above human level. This stream is about developing such policies and tools.

AI governance, policy · Cooperative AI

Matthew Gentzel

Longview Philanthropy

Escalation risks from state perceptions of AI capability, AI-enabled targeting, AI-enabled decision manipulation, and the impact of AI integration into nuclear command and control.

National security · AI governance, policy · Dangerous capability evals/demos · Information security · Other

Eli Lifland, Daniel Kokotajlo

AI Futures Project, AI Futures Project

We are interested in mentoring projects in AI forecasting and governance. We are currently working on a detailed mainline scenario forecast, and this work would build on that to either do more scenario forecasting or explore how to positively affect key decision points, informed by our scenario.

AI governance, policy · National security

David Krueger

University of Montreal

Project building on gradual disempowerment OR choose your own adventure.

AI governance, policy · Other · Cooperative AI · Value alignment · Control, Red-teaming

Gabriel Kulp, Jacob Lagerros

RAND, Ulyssean

Our team explores the hardware hacking abilities of present AI models by studying threat-models like self-exfiltration, covert communication, and sabotage. We measure side-channels of GPUs (like power consumption or electromagnetic radiation) while the GPU is running an LLM, then tune the LLM to control these measurements.

Information security · National security · Dangerous capability evals/demos · AI governance, policy

Lisa Thiergart

Institute for Security and Technology

Contributing to technical projects of the Security Level 5 (SL5) Task Force. These include (non-exhaustive list): research to help define industry standards for SL5 together with frontier labs, building prototypes of particular components such as inter-accelerator bandwidth limiters, contributing to the design of components of air-gapped ML dev environments, contributing to testing and iteration processes, and researching security-productivity tradeoffs while designing and testing mitigations.

National security · Information security

Security (2 streams)


Florian Tramèr, Daniel Paleka

ETH Zurich, ETH Zurich

The desired output of the stream is a first draft of a good academic research paper. In our typical paper, we break claimed soft guarantees on AI systems by designing attacks that probe the worst-case performance of a system.

Information security · Control, Red-teaming · Dangerous capability evals/demos · Alignment evals/demos

Keri Warr

Anthropic

Implementing SL4/5 and searching for differentially defense-favored security tools.

Information security

Interpretability (4 streams)


Neel Nanda

Google DeepMind

Applications for Neel’s stream closed on February 28th.

Mechanistic interpretability

Adam Shai, Paul Riechers

Simplex, Simplex, Astera Institute

In this stream we will explore extensions and implications of our discovery that neural networks pretrained on next-token prediction represent belief-state geometry in their activations. We will build on this fundamental theory of neural network representations in order to discover what AI systems are thinking, and understand their emergent behaviors.

Mechanistic interpretability · Agent foundations · Other

Lee Sharkey

Apollo Research

Lee’s stream will focus primarily on improving mechanistic interpretability methods (sometimes known as ‘fundamental interpretability’ research).

Mechanistic interpretability

Hidenori Tanaka

Harvard/NTT Research

As AI systems become more capable and human-like, ensuring alignment requires going beyond traditional benchmarking methods and interpretability techniques. Our research stream introduces a novel paradigm, “cognitive alignment,” blending cognitive science, physics, neuroscience, and psychology to develop rigorous mathematical frameworks. Our goal is to precisely characterize how AI systems represent and interpret their environment, enabling control over AI behavior to ensure safety and trustworthiness.

Concept-based interpretability · Cooperative AI · Scalable oversight · Mechanistic interpretability

AI agency (6 streams)


Andreea Bobu

MIT CSAIL

We are broadly interested in AI agents learning to do tasks for, with, and around humans. Our main research motivation is ensuring that these agents are value-aligned with the humans they are meant to support, whether the human is an expert designer, a novice end user, or a stakeholder of the AI system. The work that we do involves reward learning, learning from (potentially multiple) kinds of human feedback, active learning, representation learning, and quantifying misalignment.

Value alignment · Cooperative AI

Alex Turner, Alex Cloud

Google DeepMind, Independent

In the shard theory stream, we create qualitatively new methods and fields of inquiry, from steering vectors to gradient routing to unsupervised capability elicitation. If you’re theory-minded, maybe you’ll help us formalize shard theory itself.

Scalable oversight · Value alignment

Michael Dennis

Google DeepMind

Progress in AI is driven towards better solving problem specifications like next-token prediction, the policy gradient loss, the diffusion denoising loss, the RLHF objective, or debate. In this stream we will build towards an understanding of how the objectives of a system predict its behavior and failure modes, and aim to design objectives whose failures are more likely to be benign. The approach tends to be a mixture of game- and decision-theoretic analysis of objectives, and empirical implementation of the systems to demonstrate the intended effects.

Agent foundations · Cooperative AI

Lewis Hammond

Cooperative AI Foundation / University of Oxford

Projects in my stream largely focus on multi-agent safety, cooperative AI, and/or governing AI agents. Several projects are also relevant to scalable oversight. They range from conceptual/theoretical to empirical, and include a couple of AI governance/ethics projects for those with less technical backgrounds.

Cooperative AI · Scalable oversight · Agent foundations · AI governance, policy

Richard Ngo

Independent

This stream includes two projects. The first is an agent foundations project aiming to further develop a unified theory of intelligent agency. The second is an AI governance project on designing unprecedentedly trustworthy institutions.

Agent foundations · AI governance, policy

Fernando Rosas

University of Sussex

I am interested in applications of computational mechanics to AI interpretability, with particular interest in RL and transformers. I’m also interested in the application of formal notions of emergence to interpretability.

Agent foundations · Cooperative AI · Mechanistic interpretability
Data last updated: 4/7/2025 10:42:27 PM