Aymeric Roucher

    The codist monk

    What better time to live in? We're at a turning point in human history, and my craft is the lever that can lift the world.

    I studied at Polytechnique and Cambridge, then worked at Hugging Face, where I led the agent development team.

    Now I'm exploring the next steps.

    SmolLM3's Tool Use Post-training

    Jul 2025

    Helped the Smol team post-train SmolLM3 to be better at tool use: this was my introduction to SOTA model training. It was really fun, and the model reached the Pareto frontier on tool use. There is still plenty left to do, though!

    ScreenSuite

    May 2025

    Developed ScreenSuite, a comprehensive benchmarking suite for evaluating GUI agents across perception, single-step, and multi-step agentic behavior. Built on smolagents, it provides a standardized evaluation framework for Multi-Modal LLMs interacting with graphical interfaces.

    Open Computer Agent

    Mar 2025

    GUI agents are agents in which the LLM interacts directly with a GUI through vision: the first well-known one is Claude Computer Use. At Hugging Face, I helped build an open version of it.

    Open Deep Research

    Feb 2025

    Built an open version of OpenAI's Deep Research in 24 hours, reaching SOTA results on the GAIA benchmark. (To be fair, I had started building on this idea before they published.)

    Smolagents

    Dec 2024

    Smolagents is a leading framework for agents, with over 20,000 stars on GitHub. It's the first framework to propose agents that think in code.

    Predicting Wine Prices from the Weather

    Aug 2021

    Developed a predictive model for Bordeaux wine prices using weather data across grape growth stages. Although agriculture experts told me this project was doomed to fail, the model predicts vintage quality more accurately than expert ratings.