Try Kimi K2 0905

256K context • Enhanced agentic coding • MoE architecture

Powered by K2-Instruct-0905 (1T total parameters, 32B activated)

Kimi K2 0905 at a glance

Scale and long-context capacity for complex coding tasks.

Total Parameters

1T

Mixture-of-Experts architecture

Activated Parameters

32B

Per-token active compute

Context Window

256K

Long-horizon reasoning

Experts

384

8 experts selected per token

Key Improvements

Purpose-built for long-horizon coding

K2-Instruct-0905 upgrades the K2 line with longer context and stronger agentic coding performance.

256K Context Window

Expanded from 128K to 256K tokens to support long-horizon tasks.

Enhanced Agentic Coding

Improved performance on public benchmarks and real-world coding agent tasks.

Improved Frontend Coding

Improved aesthetics and practicality of generated frontend code.

Tool Calling

Strong tool-calling capabilities for autonomous task execution.

Model Summary

Core architecture specifications for K2-Instruct-0905.

61 Layers (1 Dense)

61 transformer layers in total, including one dense layer.

MLA Attention

Multi-head Latent Attention (MLA) with 64 attention heads and a hidden dimension of 7168.

MoE Hidden Dim 2048

Per-expert hidden dimension of 2048.

SwiGLU Activation

SwiGLU activation function used throughout the model.

160K Vocabulary

A 160K-token vocabulary for robust language understanding.

MLA + MoE Routing

8 experts selected per token with 1 shared expert.
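
For quick reference, the specifications above can be gathered into a single illustrative config. This is only a sketch: the field names are ad hoc, and the values are simply those listed on this page, not an official configuration file.

```python
# Illustrative summary of the K2-Instruct-0905 specifications listed above.
# Field names are ad hoc; values are taken from this page.
K2_INSTRUCT_0905_SPEC = {
    "total_parameters": "1T",
    "activated_parameters": "32B",
    "context_window": "256K tokens",
    "layers": 61,                    # includes 1 dense layer
    "dense_layers": 1,
    "attention": "MLA",              # Multi-head Latent Attention
    "attention_heads": 64,
    "hidden_dim": 7168,
    "moe_expert_hidden_dim": 2048,
    "activation": "SwiGLU",
    "vocabulary": "160K tokens",
    "num_experts": 384,
    "experts_per_token": 8,
    "shared_experts": 1,
}
```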

Benchmark Highlights

Results reported for K2-Instruct-0905.

SWE-Bench Verified

69.2% ± 0.63 accuracy on SWE-Bench Verified.

SWE-Bench Multilingual

55.9% ± 0.72 accuracy on SWE-Bench Multilingual.

Terminal-Bench

44.5% ± 2.03 accuracy on Terminal-Bench.

SWE-Dev

66.6% ± 0.72 accuracy on SWE-Dev.

Deployment & Usage

Operational guidance and recommended settings.

OpenAI & Anthropic Compatible API

Accessible via platform.moonshot.ai using OpenAI- and Anthropic-compatible request formats.
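
A minimal request sketch in Python, assuming the OpenAI-compatible endpoint is https://api.moonshot.ai/v1 and the model identifier is kimi-k2-0905-preview; confirm both on platform.moonshot.ai before use.

```python
# Hedged sketch of a chat completion request to K2-Instruct-0905.
# The base_url and model id are assumptions; verify them on platform.moonshot.ai.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MOONSHOT_API_KEY",         # issued on platform.moonshot.ai
    base_url="https://api.moonshot.ai/v1",   # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="kimi-k2-0905-preview",            # assumed model identifier
    messages=[
        {"role": "system", "content": "You are a careful coding assistant."},
        {"role": "user", "content": "Refactor this recursive function into an iterative one."},
    ],
    temperature=0.6,                         # recommended setting (see below)
)
print(response.choices[0].message.content)
```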

Anthropic Temperature Mapping

The Anthropic-compatible API maps the requested temperature as real_temperature = request_temperature × 0.6.
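
As a worked example of this mapping, a requested temperature of 1.0 is served at an effective 0.6, which matches the recommended setting below. The helper is only an illustration of the documented formula.

```python
# Illustration of the documented Anthropic-compatibility mapping:
# the effective sampling temperature is the requested value scaled by 0.6.
def effective_temperature(request_temperature: float) -> float:
    return request_temperature * 0.6

assert effective_temperature(1.0) == 0.6   # a request of 1.0 samples at 0.6
```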

Block-FP8 Checkpoints

Model checkpoints are stored in block-fp8 format on Hugging Face.

Recommended Inference Engines

vLLM, SGLang, KTransformers, and TensorRT-LLM.
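
For self-hosting, a minimal offline-inference sketch with vLLM (one of the recommended engines) is shown below. The Hugging Face repo id and parallelism setting are assumptions, and a 1T-parameter block-fp8 checkpoint requires a large multi-GPU deployment in practice.

```python
# Hedged vLLM sketch; repo id and tensor_parallel_size are placeholders to adapt.
from vllm import LLM, SamplingParams

llm = LLM(
    model="moonshotai/Kimi-K2-Instruct-0905",  # assumed Hugging Face repo id
    tensor_parallel_size=8,                    # placeholder; size to your hardware
    trust_remote_code=True,
)

params = SamplingParams(temperature=0.6, max_tokens=512)  # recommended temperature
outputs = llm.generate(["Write a Python function that reverses a linked list."], params)
print(outputs[0].outputs[0].text)
```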

Recommended Temperature

Suggested temperature = 0.6 for K2-Instruct-0905.

Tool Calling Support

Pass available tools in each request; the model decides when to invoke them.
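
A hedged tool-calling sketch in the OpenAI-compatible format follows; the tool definition, endpoint, and model id are illustrative placeholders, not confirmed values.

```python
# Sketch of OpenAI-format tool calling; tool name, endpoint, and model id are assumptions.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MOONSHOT_API_KEY",
    base_url="https://api.moonshot.ai/v1",    # assumed endpoint
)

tools = [{
    "type": "function",
    "function": {
        "name": "run_shell",                  # hypothetical tool
        "description": "Run a shell command and return its stdout.",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
}]

resp = client.chat.completions.create(
    model="kimi-k2-0905-preview",             # assumed model identifier
    messages=[{"role": "user", "content": "List the files in the repository root."}],
    tools=tools,
    temperature=0.6,
)

msg = resp.choices[0].message
if msg.tool_calls:                            # the model chose to invoke a tool
    for call in msg.tool_calls:
        print(call.function.name, call.function.arguments)
else:
    print(msg.content)
```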

FAQ

Kimi K2 0905 FAQ

Key details for builders and researchers.

1. What is Kimi K2 0905?

Kimi K2-Instruct-0905 is a high-capability MoE language model with 1T total parameters and 32B activated parameters.

2. What is the context length?

The context window is 256K tokens, expanded from 128K.

3. What temperature is recommended?

The recommended temperature is 0.6 for general use.

4. Which inference engines are recommended?

vLLM, SGLang, KTransformers, and TensorRT-LLM are recommended.

Build with Kimi K2 0905

Start with the API or explore pricing to scale usage.