Tencent: Hy3 preview (free)

Going away May 8, 2026

tencent/hy3-preview:free

Released Apr 22, 2026 | 262,144 context | $0/M input tokens | $0/M output tokens

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use. It supports configurable reasoning levels across disabled, low, and high modes, allowing it to balance speed and depth depending on the task, while delivering strong code generation and reliable performance across multi-step, real-world workflows.
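As a rough illustration of the three reasoning modes, the sketch below maps them onto a `reasoning` field in a chat request body. The helper names (`reasoningFor`, `buildRequest`) are hypothetical, and the mapping itself (turning reasoning off with `enabled: false`, passing `low` and `high` through as `effort`) is an assumption about how this model's modes are exposed rather than a confirmed contract.

```typescript
// Hypothetical helper types; only the model id comes from this page.
type ReasoningMode = "disabled" | "low" | "high";

interface ReasoningConfig {
  enabled?: boolean;
  effort?: "low" | "high";
}

interface ChatRequest {
  model: string;
  messages: { role: "user" | "assistant" | "system"; content: string }[];
  reasoning: ReasoningConfig;
}

// Assumption: "disabled" switches reasoning off entirely, while "low" and
// "high" are forwarded as effort levels.
function reasoningFor(mode: ReasoningMode): ReasoningConfig {
  return mode === "disabled" ? { enabled: false } : { effort: mode };
}

// Builds a request body; it does not send anything over the network.
function buildRequest(prompt: string, mode: ReasoningMode): ChatRequest {
  return {
    model: "tencent/hy3-preview:free",
    messages: [{ role: "user", content: prompt }],
    reasoning: reasoningFor(mode),
  };
}
```

A quick lookup might use `buildRequest(question, "disabled")` for speed, while a multi-step coding task might use `"high"`.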

Providers for Hy3 preview (free)

OpenRouter routes requests to the best providers that are able to handle your prompt size and parameters, with fallbacks to maximize uptime.
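Routing behavior can usually be influenced per request through a `provider` object in the request body. The sketch below assumes the `order` and `allow_fallbacks` fields and uses `"siliconflow"` as a guessed provider slug; the helper name `withProviderPreferences` is hypothetical. With a single listed provider, as is the case for this model, these preferences have little practical effect and are shown only to illustrate the shape.

```typescript
// Assumed subset of OpenRouter's provider preferences.
interface ProviderPreferences {
  order?: string[];          // providers to try first, in order
  allow_fallbacks?: boolean; // whether other providers may be used on failure
}

// Returns a copy of the request body with provider preferences attached.
function withProviderPreferences<T extends object>(
  body: T,
  prefs: ProviderPreferences
): T & { provider: ProviderPreferences } {
  return { ...body, provider: prefs };
}

const routedBody = withProviderPreferences(
  {
    model: "tencent/hy3-preview:free",
    messages: [{ role: "user", content: "Hello" }],
  },
  { order: ["siliconflow"], allow_fallbacks: true } // slug is an assumption
);
```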

SiliconFlow

Going away May 8, 2026

Latency: 4.64s
Throughput: 34 tps
Uptime: 100.0%
Total Context: 262.1K
Max Output: 262.1K
Input Price: --/M tokens
Output Price: --/M tokens

Performance for Hy3 preview (free)

Compare different providers across OpenRouter

Throughput (SiliconFlow): avg 38 tok/s
Latency (SiliconFlow): avg 4.19 s
E2E Latency (SiliconFlow): avg 19.50 s
Tool Call Error Rate (SiliconFlow): avg 1.66%
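The three timing figures can be related with some back-of-the-envelope arithmetic. The sketch below assumes end-to-end time is roughly time to first token plus output tokens divided by throughput; that model is an assumption, and since the figures are independent averages the result is only indicative.

```typescript
// Assumption: e2e ≈ latency (time to first token) + outputTokens / throughput.
function impliedOutputTokens(
  e2eSeconds: number,
  latencySeconds: number,
  tokensPerSecond: number
): number {
  return (e2eSeconds - latencySeconds) * tokensPerSecond;
}

// Using the averages listed above: (19.50 - 4.19) * 38, about 582 tokens.
const impliedTokens = impliedOutputTokens(19.5, 4.19, 38);
```

Under that assumption, a typical response on this endpoint would be on the order of a few hundred output tokens.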

Effective Pricing for Hy3 preview (free)

Actual cost per million tokens across providers over the past hour

Weighted Average

Weighted Avg Input Price: $0.0000 per 1M tokens
Weighted Avg Output Price: $0.0000 per 1M tokens

Provider	Input $/1M	Output $/1M	Cache Hit Rate
SiliconFlow	$0.0000	$0.0000	82.5%

[Charts: input and output price per 1M tokens over the past 7 days]

Apps using Hy3 preview (free)

Top public apps this month

Kilo Code

Kilo Code is an open-source AI coding agent that works across VS Code, JetBrains, and CLI to help developers ship code faster with agentic workflows.

94.8B tokens

Claude Code

Claude Code is Anthropic's agentic coding tool that reads your entire codebase, plans and executes changes across files, runs tests, and iterates on failures, all from natural language prompts.

74.3B tokens

Hermes Agent

Hermes Agent is an open-source, self-improving AI agent by Nous Research that runs persistently with memory across sessions, and builds reusable skills from experience. It comes with 40+ built-in tools, including web search, browser automation, and vision, plus scheduled automations and subagents.

73.8B tokens

OpenClaw

OpenClaw is an open-source AI agent that connects to your messaging apps and takes real actions on your behalf, from running commands and browsing the web to managing files and sending emails.

53.6B tokens

Cline

Cline is an open-source AI coding agent that lives inside your IDE, autonomously exploring your codebase, editing files, running terminal commands, and using browser automation.

15.5B tokens

Recent activity on Hy3 preview (free)

Total usage per day on OpenRouter

Prompt: 103B
Completion: 2.32B
Reasoning: 12.9M

Prompt tokens measure input size. Reasoning tokens show internal thinking before a response. Completion tokens reflect total output length.
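To make the three token categories concrete, the sketch below splits a usage object into prompt, reasoning, and visible completion tokens. The snake_case field names follow the OpenAI-compatible response shape and are an assumption here, as is the premise that reasoning tokens are counted inside completion tokens. The sample input reuses the daily totals listed above.

```typescript
// Assumed OpenAI-compatible usage shape.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  completion_tokens_details?: { reasoning_tokens?: number };
}

interface UsageBreakdown {
  prompt: number;
  reasoning: number;
  visibleCompletion: number; // completion tokens minus reasoning tokens
  reasoningShare: number;    // reasoning tokens as a fraction of completion tokens
}

function breakDownUsage(usage: Usage): UsageBreakdown {
  const reasoning = usage.completion_tokens_details?.reasoning_tokens ?? 0;
  const completion = usage.completion_tokens;
  return {
    prompt: usage.prompt_tokens,
    reasoning,
    visibleCompletion: completion - reasoning,
    reasoningShare: completion > 0 ? reasoning / completion : 0,
  };
}

// Daily totals from this page: 103B prompt, 2.32B completion, 12.9M reasoning.
const dailyUsage = breakDownUsage({
  prompt_tokens: 103e9,
  completion_tokens: 2.32e9,
  completion_tokens_details: { reasoning_tokens: 12.9e6 },
});
// dailyUsage.reasoningShare is about 0.0056, i.e. roughly 0.56% of completion tokens.
```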

Uptime stats for Hy3 preview (free)

Uptime stats for Hy3 preview (free) on the only provider

When an error occurs in an upstream provider, we can recover by routing to another healthy provider, if your request filters allow it. You can access uptime data programmatically through the Endpoints API.

Learn more about our load balancing and customization options.
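A minimal sketch of querying the Endpoints API follows. The URL pattern (`/api/v1/models/{author}/{slug}/endpoints`), the use of a bearer token, and whether the `:free` suffix is accepted unencoded in the path are all assumptions; the helper names are hypothetical. The HTTP function is injected so the sketch does not depend on a particular runtime, and the response is left untyped because its shape is not described on this page.

```typescript
// Minimal shape of the fetch-like function this sketch needs.
type HttpGet = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Builds the assumed endpoints URL from an "author/slug" model id.
function endpointsUrl(modelId: string): string {
  const [author, slug] = modelId.split("/");
  if (!author || !slug) {
    throw new Error(`Expected "author/slug", got "${modelId}"`);
  }
  return `https://openrouter.ai/api/v1/models/${author}/${slug}/endpoints`;
}

// Fetches endpoint metadata; the caller inspects the untyped payload.
async function getEndpoints(
  modelId: string,
  apiKey: string,
  httpGet: HttpGet
): Promise<unknown> {
  const res = await httpGet(endpointsUrl(modelId), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) {
    throw new Error(`Endpoints request failed with status ${res.status}`);
  }
  return res.json();
}
```

In Node 18+ or a browser, the global `fetch` can be passed as `httpGet`, for example `await getEndpoints("tencent/hy3-preview:free", apiKey, fetch)`.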

Sample code and API for Hy3 preview (free)

OpenRouter normalizes requests and responses across providers for you.

OpenRouter supports reasoning-enabled models that can show their step-by-step thinking process. Use the reasoning parameter in your request to enable reasoning, and access the reasoning_details array in the response to see the model's internal reasoning before the final answer. When continuing a conversation, preserve the complete reasoning_details when passing messages back to the model so it can continue reasoning from where it left off. Learn more about reasoning tokens.
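One way to follow the advice about preserving `reasoning_details` is to copy the assistant message into the history unchanged before appending the next user turn. The sketch below assumes `reasoning_details` sits on the assistant message object and treats its entries as opaque; the `Message` type and `appendTurn` helper are hypothetical.

```typescript
// Hypothetical message type; reasoning_details entries are treated as opaque.
interface Message {
  role: "user" | "assistant" | "system";
  content: string;
  reasoning_details?: unknown[];
}

// Returns a new history containing the assistant reply (with any
// reasoning_details intact) followed by the next user message.
function appendTurn(
  history: Message[],
  assistantReply: Message,
  nextUserInput: string
): Message[] {
  return [
    ...history,
    { ...assistantReply }, // copied as-is, nothing stripped or summarized
    { role: "user", content: nextUserInput },
  ];
}
```

The returned array can be sent as `messages` in the follow-up request; the original history array is not mutated.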

In the examples below, the OpenRouter-specific headers are optional. Setting them allows your app to appear on the OpenRouter leaderboards.

import { OpenRouter } from "@openrouter/sdk";

const openrouter = new OpenRouter({
  apiKey: "<OPENROUTER_API_KEY>"
});

// Stream the response to get reasoning tokens in usage
const stream = await openrouter.chat.send({
  model: "tencent/hy3-preview:free",
  messages: [
    {
      role: "user",
      content: "How many r's are in the word 'strawberry'?"
    }
  ],
  stream: true
});

let response = "";
for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) {
    response += content;
    process.stdout.write(content);
  }

  // Usage information comes in the final chunk
  if (chunk.usage) {
    console.log("\nReasoning tokens:", chunk.usage.reasoningTokens);
  }
}
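The optional attribution headers are not visible in the SDK sample above. When calling the HTTP API directly, they might be assembled as in the sketch below, which assumes the header names `HTTP-Referer` and `X-Title`; the helper name `openRouterHeaders` and the `app` parameter are hypothetical.

```typescript
// Builds request headers; attribution headers are added only when provided.
function openRouterHeaders(
  apiKey: string,
  app?: { url?: string; title?: string }
): Record<string, string> {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
  if (app?.url) headers["HTTP-Referer"] = app.url; // assumed header for the app's site URL
  if (app?.title) headers["X-Title"] = app.title;  // assumed header for the app's display name
  return headers;
}
```

Omitting the `app` argument yields only the authorization and content-type headers, which matches the note that the OpenRouter-specific headers are optional.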

Using third-party SDKs

For information about using third-party SDKs and frameworks with OpenRouter, please see our frameworks documentation.

See the Request docs for all possible fields, and Parameters for explanations of specific sampling parameters.

More models from Tencent

Hy3 preview

Hunyuan A13B Instruct

Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parameter count of 80B and support for reasoning via Chain-of-Thought. It offers competitive benchmark performance across mathematics, science, coding, and multi-turn reasoning tasks, while maintaining high inference efficiency via Grouped Query Attention (GQA) and quantization support (FP8, GPTQ, etc.).
