Try Kimi K2 0905
256K context • Enhanced agentic coding • MoE architecture
Kimi K2 0905 at a glance
Scale and long-context capacity for complex coding tasks.
Total Parameters
1T
Mixture-of-Experts architecture
Activated Parameters
32B
Per-token active compute
Context Window
256K
Long-horizon reasoning
Experts
384
8 experts selected per token
Purpose-built for long-horizon coding
K2-Instruct-0905 upgrades the K2 line with longer context and stronger agentic coding performance.
256K Context Window
Expanded from 128K to 256K tokens to support long-horizon tasks.
Enhanced Agentic Coding
Improved performance on public benchmarks and real-world coding agent tasks.
Improved Frontend Coding
Advances in both the aesthetics and practicality of generated frontend code.
Tool Calling
Strong tool-calling capabilities for autonomous task execution.
Model Summary
Core architecture specifications for K2-Instruct-0905.
61 Layers (1 Dense)
61 transformer layers in total, of which 1 is dense and the rest are MoE layers.
MLA Attention
Multi-head Latent Attention (MLA) with 64 attention heads and an attention hidden dimension of 7168.
MoE Hidden Dim 2048
Per-expert hidden dimension of 2048.
SwiGLU Activation
SwiGLU activation function across the model.
160K Vocabulary
A 160K-token vocabulary for broad multilingual coverage.
MLA + MoE Routing
8 experts selected per token with 1 shared expert.
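A minimal sketch of top-k expert routing under these settings (384 experts, 8 selected per token). Names, shapes, and the routing function are illustrative assumptions, not the model's actual implementation; the shared expert is applied separately.

```python
import torch
import torch.nn.functional as F

def route_tokens(hidden, router_weight, top_k=8):
    """Illustrative top-k routing: score each token against every expert,
    keep the 8 highest-scoring experts, and renormalize their weights."""
    # hidden: [num_tokens, hidden_dim]; router_weight: [hidden_dim, num_experts]
    logits = hidden @ router_weight                    # [num_tokens, 384]
    probs = F.softmax(logits, dim=-1)
    topk_probs, topk_idx = probs.topk(top_k, dim=-1)   # 8 experts per token
    topk_probs = topk_probs / topk_probs.sum(dim=-1, keepdim=True)
    return topk_idx, topk_probs

hidden = torch.randn(4, 7168)        # 4 tokens, hidden dimension 7168
router = torch.randn(7168, 384)      # 384 routed experts
idx, weights = route_tokens(hidden, router)
print(idx.shape, weights.shape)      # both are torch.Size([4, 8])
```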
Benchmark Highlights
Results reported for K2-Instruct-0905.
SWE-Bench Verified
69.2 ± 0.63 accuracy on SWE-Bench Verified.
SWE-Bench Multilingual
55.9 ± 0.72 accuracy on SWE-Bench Multilingual.
Terminal-Bench
44.5 ± 2.03 accuracy on Terminal-Bench.
SWE-Dev
66.6 ± 0.72 accuracy on SWE-Dev.
Deployment & Usage
Operational guidance and recommended settings.
OpenAI & Anthropic Compatible API
Accessible via platform.moonshot.ai with OpenAI/Anthropic-compatible formats.
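A minimal sketch of calling the hosted model through the OpenAI-compatible endpoint. The base URL and model id shown here are assumptions; check platform.moonshot.ai for the exact values.

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MOONSHOT_API_KEY",
    base_url="https://api.moonshot.ai/v1",   # assumed endpoint
)

response = client.chat.completions.create(
    model="kimi-k2-0905-preview",            # assumed model id for K2-Instruct-0905
    temperature=0.6,                          # recommended temperature for this model
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Refactor this function to use pathlib."},
    ],
)
print(response.choices[0].message.content)
```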
Anthropic Temperature Mapping
The Anthropic-compatible API maps temperature as real_temperature = request_temperature × 0.6.
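As an illustration of this mapping, a request temperature of 1.0 yields the recommended effective temperature of 0.6:

```python
def real_temperature(request_temperature: float) -> float:
    """The Anthropic-compatible endpoint scales the requested temperature by 0.6."""
    return request_temperature * 0.6

print(real_temperature(1.0))  # 0.6, the recommended effective temperature
```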
Block-FP8 Checkpoints
Model checkpoints are published on Hugging Face in block-FP8 format.
Recommended Inference Engines
vLLM, SGLang, KTransformers, and TensorRT-LLM.
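For self-hosted deployments, vLLM and SGLang expose OpenAI-compatible servers, so the same client code works against a local instance. The port and model id below are assumptions; adjust them to your serving setup.

```python
from openai import OpenAI

# Point the client at a locally served checkpoint (e.g. via vLLM or SGLang).
local = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")

resp = local.chat.completions.create(
    model="moonshotai/Kimi-K2-Instruct-0905",   # assumed Hugging Face repo id
    temperature=0.6,
    messages=[{"role": "user", "content": "Write a binary search in Go."}],
)
print(resp.choices[0].message.content)
```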
Recommended Temperature
Suggested temperature = 0.6 for K2-Instruct-0905.
Tool Calling Support
Pass available tools in each request; the model decides when to invoke them.
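A hedged sketch of tool calling through the OpenAI-compatible format. The tool definition and model id are hypothetical; the model returns tool calls only when it decides a tool is needed.

```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_MOONSHOT_API_KEY", base_url="https://api.moonshot.ai/v1")

# Hypothetical tool definition passed with the request.
tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's test suite and return the output.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string", "description": "Test directory"}},
            "required": ["path"],
        },
    },
}]

response = client.chat.completions.create(
    model="kimi-k2-0905-preview",   # assumed model id
    temperature=0.6,
    messages=[{"role": "user", "content": "Fix the failing test in tests/test_io.py"}],
    tools=tools,
)

msg = response.choices[0].message
if msg.tool_calls:                   # the model chose to invoke a tool
    for call in msg.tool_calls:
        print(call.function.name, call.function.arguments)
```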
Kimi K2 0905 FAQ
Key details for builders and researchers.
What is Kimi K2 0905?
Kimi K2-Instruct-0905 is a high-capability MoE language model with 1T total parameters and 32B activated parameters.
What is the context length?
The context window is 256K tokens, expanded from 128K.
What temperature is recommended?
The recommended temperature is 0.6 for general use.
Which inference engines are recommended?
vLLM, SGLang, KTransformers, and TensorRT-LLM are recommended.
Build with Kimi K2 0905
Start with the API or explore pricing to scale usage.