lmarena.ai

Image edit models now support multi-turn editing! Instead of trying to fit every edit into one mega-prompt, you can now refine your image step by step, like a natural back-and-forth conversation. Do it in Battle mode, Side by Side, or Direct. Just


lmarena.ai

@lmarena_ai

Create and iterate on image generations with the community's favorite models, like Gemini-2.5-Flash-Image-Preview, aka "nano banana"! You can now help answer the question: which image gen AI handles multi-turn editing best? Jump in to test, compare, and vote


Leaderboard Disrupted! Two new models have entered the Top 10 Text leaderboard:

#6 Qwen3-max-preview (Proprietary) by @Alibaba_Qwen

#8 Kimi-K2-0905-preview (Modified MIT) by @Kimi_Moonshot tied with 7 others. Note that this puts Kimi-K2-0905-preview in a tight race for



Remember, your votes drive the leaderboards. Check out Qwen3-max-preview and Kimi-K2-0905-preview for yourself at:

Learn more about Qwen3-max-preview here:

Quote

Qwen

@Alibaba_Qwen

Sep 6

Big news: Introducing Qwen3-Max-Preview (Instruct) — our biggest model yet, with over 1 trillion parameters!

Now available via Qwen Chat & Alibaba Cloud API. Benchmarks show it beats our previous best, Qwen3-235B-A22B-2507. Internal tests + early user feedback confirm:

Breaking News: Gemini-2.5-Flash-Image-Preview (“nano-banana”) by @GoogleDeepMind now ranks #1 in Image Edit Arena. In just two weeks:

“nano-banana” has driven over 5 million community votes in the Arena

Record-breaking 2.5M+ votes cast for this model alone

It has

A table ranking AI models in Image Edit Arena. Gemini-2.5-Flash-Image-Preview, labeled "nano-banana," is listed first with an Elo score of 1392. Other models like gpt-4-turbo and qwen-image-v1 are ranked below. The table includes columns for rank, model name, score, votes, organization, and license.

Quote

Google DeepMind

@GoogleDeepMind

Aug 26


Image generation with Gemini just got a bananas upgrade and is the new state-of-the-art image generation and editing model.

From photorealistic masterpieces to mind-bending fantasy worlds, you can now natively produce, edit and refine visuals with new levels of reasoning,

This was an amazing week at @MicrosoftAI!! We released MAI 1 preview and a taste of MAI Voice. I’m super happy with this team - only about 100 people and already shipping in @lmarena_ai in less than a year. Strong support. More soon. Thanks for feedback!


Big Reveal: who was "Nano Banana"? The anonymous model, “nano-banana,” that caught the world's attention with its ability to follow complex instructions, preserve character identity, and maintain contextual details was Gemini-2.5-Flash-Image-Preview by @GoogleDeepMind


Our first open model has landed on the Search leaderboard! Diffbot-small-xl by @diffbot debuts at #9 (Apache 2.0). We look forward to more models with search capabilities contributing to ecosystem progress!

Quick reminder that we have turned off style control by default for Search Arena. The effects of response style are different for search than they are for ordinary chat, so we'll be defaulting this leaderboard to ordinary Bradley-Terry while we study search-specific style
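For readers unfamiliar with the term, here is a minimal sketch of what an "ordinary Bradley-Terry" fit looks like: each model gets a positive strength, and the probability that model i beats model j is p_i / (p_i + p_j). This is an illustration only, not LMArena's actual pipeline. The function names and the toy win counts are hypothetical, ties are ignored, and style control is left out entirely.

```python
def fit_bradley_terry(wins, n_iter=200):
    """Estimate Bradley-Terry strengths from pairwise win counts.

    wins[i][j] is the number of times model i beat model j.
    Uses the standard minorization-maximization (MM) update.
    Hypothetical helper for illustration; not LMArena's code.
    """
    n = len(wins)
    p = [1.0] * n  # start with equal strengths
    for _ in range(n_iter):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            # Keep the old value if model i has no recorded comparisons.
            new_p.append(total_wins / denom if denom > 0 else p[i])
        # Rescale so strengths sum to n; only ratios are meaningful.
        scale = n / sum(new_p)
        p = [x * scale for x in new_p]
    return p


def win_probability(p_i, p_j):
    """Modelled probability that a model with strength p_i beats one with p_j."""
    return p_i / (p_i + p_j)


# Toy vote counts (made up): rows are winners, columns are losers.
toy_wins = [
    [0, 6, 8],
    [4, 0, 7],
    [2, 3, 0],
]
strengths = fit_bradley_terry(toy_wins)
ranking = sorted(range(len(strengths)), key=lambda i: -strengths[i])
```

In this sketch `ranking` orders the models by fitted strength. A style-controlled leaderboard would add extra terms to the model, which is the part being switched off by default here.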

Come test out all the best AI models with search capabilities at:

Top 10 Leaderboard Disrupted

DeepSeek V3.1 and DeepSeek V3.1 thinking by @deepseek_ai have landed in the Arena, both ranked at #8. A few highlights:

DeepSeek V3.1 is in the Top 3 for Math, Creative Writing & Longer Query

DeepSeek V3.1 thinking comes in #3 for

Quote

DeepSeek

@deepseek_ai

Aug 21

Introducing DeepSeek-V3.1: our first step toward the agent era!

Hybrid inference: Think & Non-Think — one model, two modes

Faster thinking: DeepSeek-V3.1-Think reaches answers in less time vs. DeepSeek-R1-0528

Stronger agent skills: Post-training boosts tool use and

The new open models by @DeepSeek_AI (MIT license) are tied with Grok-4, Kimi K2, Claude Opus 4, Qwen 3-235b-a22b-Instruct, and even their sibling, DeepSeek R1, making this an incredibly competitive race.

The rankings are shifting fast! Don’t expect this leaderboard to stay put. Test out DeepSeek V3.1 and all the best frontier models at:

hello, world!

Quote

lmarena.ai

@lmarena_ai

Aug 29

Text Leaderboard Update: A new model provider, @MicrosoftAI has broken into the Top 15 this week!

MAI-1-preview by @MicrosoftAI debuts at #13. Congrats to the Microsoft AI team! As the Text Arena is one of the most competitive races, breaking into the Top 15 is no small x.com/mustafasuleyma…


A table listing AI model rankings with columns for rank, model, score, win rate, and license. Microsoft AI's MAI-1-preview is highlighted at rank 13 with a score of 1402, win rate of 40%, and proprietary license. The lmarena.ai logo is at the top left.

Quote

Mustafa Suleyman

@mustafasuleyman

Aug 29

Replying to @mustafasuleyman

Introducing MAI-1-preview - our first foundation model trained end to end in house - in public testing on LMArena - we’re excited to be actively spinning the flywheel to deliver improved models

Learn more about MAI-1-preview on the @MicrosoftAI blog at:

Two in-house models in support of our mission

Come test out MAI-1-preview and all the top AI models at:

GPT-5 by @OpenAI and Claude Opus 4.1 by @anthropicAI have just debuted on the Search Leaderboard. Both came into the Arena strong across categories in the Text Arena, and we've been waiting to see what the community thinks about their search capabilities. Here's how they've

These state-of-the-art models are starting to show their broad flexibility and performance when it comes to real-world use cases. The community has noticed, and the competition just keeps heating up. Check out all the leaderboards here:

Test and push search-enabled models with your toughest prompts at: lmarena.ai/?chat-modality Be sure to have the search icon selected before you prompt!
