Stars
Trae Agent is an LLM-based agent for general purpose software engineering tasks.
Harbor is a framework for running agent evaluations and creating and using RL environments.
[ICLR2026🔥Oral] SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
Edit Banana: A framework for converting statistical formats into editable.
Advances and Frontiers of LLM-based Issue Resolution in Software Engineering: A Comprehensive Survey
The GitHub repo for our survey paper: "Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models"
DART-GUI: Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation
SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving
💻 Terminal-Agent with Human-in-the-Loop Learning
[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training
Coding problems used in aider's polyglot benchmark
A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architectures
Repo2Run is an LLM-based agent that automates environment configuration by generating error-free Dockerfiles for Python repositories.
[ICLR 2026] MMSearch-Plus: Benchmarking Provenance-Aware Search For Multimodal Browsing Agents
Tongyi Deep Research, the Leading Open-source Deep Research Agent
SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner
The 100-line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo, but scores >74% on SWE-bench Verified!
[NeurIPS 2025 D&B Spotlight] Scaling Data for SWE-agents
SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]