API Documentation
Overview
The Kimi K2 API provides programmatic access to the Kimi K2 language model. This API supports both OpenAI and Anthropic message formats, allowing seamless integration with existing applications.
Base URL
https://kimi-k2.ai/api
Supported Protocols
- REST API over HTTPS
- JSON request and response bodies
- UTF-8 character encoding
- CORS support for browser-based applications
Quick Start
Get started with the Kimi K2 API in three steps:
- Create an account and receive 100 free credits
- Generate an API key from your dashboard
- Make your first request (1 credit per request)
Example Request
curl https://kimi-k2.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $KIMI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2-0905",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Authentication
API Keys
Authentication is performed using API keys. Include your API key in the request header:
Authorization: Bearer YOUR_API_KEY
Or for Anthropic-compatible endpoints:
X-API-Key: YOUR_API_KEY
Authentication Methods
| Method | Header | Format | Endpoints |
|---|---|---|---|
| Bearer Token | Authorization | Bearer YOUR_API_KEY | /v1/chat/completions |
| API Key | X-API-Key | YOUR_API_KEY | /v1/messages |
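Since the header differs by endpoint family, it can help to build it in one place. A minimal sketch (the helper is illustrative, not part of any official SDK):

```python
def auth_headers(api_key: str, style: str = "bearer") -> dict:
    """Build the authentication header for either endpoint family."""
    if style == "bearer":  # OpenAI-compatible: /v1/chat/completions
        return {"Authorization": f"Bearer {api_key}"}
    if style == "x-api-key":  # Anthropic-compatible: /v1/messages
        return {"X-API-Key": api_key}
    raise ValueError(f"unknown auth style: {style}")
```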
API Reference
List Models
List Available Models
GET /v1/models
Returns a list of all models available for use with the API.
Response Format
{
  "object": "list",
  "data": [
    {
      "id": "kimi-k2",
      "object": "model",
      "created": 1735785600,
      "owned_by": "moonshot-ai",
      "permission": [...],
      "root": "kimi-k2",
      "parent": null
    },
    {
      "id": "kimi-k2-0905",
      "object": "model",
      "created": 1735785600,
      "owned_by": "moonshot-ai",
      "permission": [...],
      "root": "kimi-k2-0905",
      "parent": null
    }
  ]
}
Response Fields
| Field | Type | Description |
|---|---|---|
| object | string | Always list |
| data | array | List of available models |
| data[].id | string | Model identifier to use in API requests |
| data[].object | string | Always model |
| data[].owned_by | string | Organization that owns the model |
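To choose a model programmatically, extract the ids from the list response. A small sketch against the response shape documented above (the sample payload is abbreviated):

```python
import json

# Abbreviated sample in the shape returned by GET /v1/models
SAMPLE = """{
  "object": "list",
  "data": [
    {"id": "kimi-k2", "object": "model", "owned_by": "moonshot-ai"},
    {"id": "kimi-k2-0905", "object": "model", "owned_by": "moonshot-ai"}
  ]
}"""

def model_ids(payload: str) -> list:
    """Return the model identifiers usable in API requests."""
    body = json.loads(payload)
    return [m["id"] for m in body["data"]]
```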
Chat Completions
The Chat Completions API generates model responses for conversations. This endpoint is compatible with OpenAI's API format.
Create Completion
POST /v1/chat/completions
Generates a model response for the given conversation.
Request Format
{
  "model": "kimi-k2-0905",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Explain quantum computing"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 2048,
  "top_p": 1.0,
  "frequency_penalty": 0,
  "presence_penalty": 0,
  "stream": false,
  "n": 1
}
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | - | Model identifier: kimi-k2 or kimi-k2-0905 |
| messages | array | Yes | - | Input messages. Each message has a role and content |
| temperature | number | No | 0.6 | Sampling temperature between 0 and 2. Lower values make output more deterministic |
| max_tokens | integer | No | 1024 | Maximum tokens to generate (output is capped at 8,192 tokens) |
| top_p | number | No | 1.0 | Nucleus sampling threshold. Alternative to temperature |
| frequency_penalty | number | No | 0 | Penalize repeated tokens. Range: -2.0 to 2.0 |
| presence_penalty | number | No | 0 | Penalize tokens already present in the text. Range: -2.0 to 2.0 |
| stream | boolean | No | false | Stream responses incrementally |
| n | integer | No | 1 | Number of completions to generate |
| stop | string/array | No | null | Stop sequences (up to 4) |
| user | string | No | null | Unique identifier for end-user tracking |
Message Object
| Field | Type | Description |
|---|---|---|
| role | string | One of: system, user, assistant |
| content | string | Message content |
Response Format
{
  "id": "chatcmpl-9d4c2f68-5e3a-4b2f-a3c9-7d8e6f5c4b3a",
  "object": "chat.completion",
  "created": 1709125200,
  "model": "kimi-k2-0905",
  "system_fingerprint": "fp_a7c4d3e2",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Quantum computing leverages quantum mechanical phenomena..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 189,
    "total_tokens": 214
  }
}
Response Fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique request identifier |
| object | string | Object type: chat.completion |
| created | integer | Unix timestamp |
| model | string | Model used |
| choices | array | Generated completions |
| usage | object | Token usage statistics |
Finish Reasons
| Value | Description |
|---|---|
| stop | Natural end of message or stop sequence reached |
| length | Maximum token limit reached |
Streaming
When stream is set to true, responses are delivered as server-sent events:
data: {"id":"chatcmpl-...","choices":[{"delta":{"content":"Hello"},"index":0}]}
data: {"id":"chatcmpl-...","choices":[{"delta":{"content":" there"},"index":0}]}
data: [DONE]
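Client libraries handle this parsing for you, but if you consume the stream directly, each data: line carries a JSON chunk whose choices[0].delta.content holds the next text fragment, and [DONE] marks the end. A sketch of accumulating the fragments:

```python
import json

def stream_text(sse_lines):
    """Join the content deltas from a sequence of SSE lines."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        parts.append(delta.get("content", ""))
    return "".join(parts)
```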
Messages
The Messages API provides Anthropic-compatible message generation.
Create Message
POST /v1/messages
Creates a model response using the Messages format.
Request Format
{
  "model": "kimi-k2-0905",
  "messages": [
    {
      "role": "user",
      "content": "What is the capital of France?"
    }
  ],
  "max_tokens": 1024,
  "system": "You are a knowledgeable geography assistant.",
  "temperature": 0.7,
  "top_p": 1.0,
  "stop_sequences": ["\n\nHuman:"]
}
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | - | Model identifier |
| messages | array | Yes | - | Conversation messages (user/assistant only) |
| max_tokens | integer | Yes | - | Maximum tokens to generate |
| system | string | No | null | System prompt for behavior guidance |
| temperature | number | No | 0.6 | Sampling temperature (0-1) |
| top_p | number | No | 1.0 | Nucleus sampling threshold |
| stop_sequences | array | No | null | Stop generation sequences (max 4) |
| stream | boolean | No | false | Enable streaming responses |
| metadata | object | No | null | Request metadata |
Response Format
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "The capital of France is Paris."
    }
  ],
  "model": "kimi-k2-0905",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 15,
    "output_tokens": 9
  }
}
Response Fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique message identifier |
| type | string | Object type: message |
| role | string | Always assistant |
| content | array | Message content blocks |
| model | string | Model used |
| stop_reason | string | Why generation stopped |
| usage | object | Token usage |
System Prompts
System prompts in the Messages API are specified separately:
{
  "system": "You are a helpful assistant.",
  "messages": [
    {"role": "user", "content": "Hello"}
  ],
  "max_tokens": 1024
}
Models
Available Models
| Model ID | Context Window | Description |
|---|---|---|
| kimi-k2 | 128,000 tokens | Original Kimi K2 release |
| kimi-k2-0905 | 256,000 tokens | Updated release (2025-09-05) with extended context |
Model Selection
Both identifiers serve the Kimi K2 model; kimi-k2-0905 is the newer release with the larger context window and works with both API formats:
- OpenAI format: kimi-k2-0905
- Anthropic format: kimi-k2-0905
Request Limits
Rate Limits
Rate limits are applied per API key based on credit balance:
| Credit Balance | Requests/Minute | Requests/Hour | Requests/Day |
|---|---|---|---|
| 1-100 | 20 | 600 | 5,000 |
| 101-1,000 | 60 | 2,000 | 20,000 |
| 1,001-10,000 | 200 | 6,000 | 50,000 |
| 10,000+ | 500 | 15,000 | 100,000 |
Rate limit headers:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 59
X-RateLimit-Reset: 1709125800
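One way to stay under the limit is to check these headers before retrying. A sketch (the header names come from this doc; the policy itself is illustrative):

```python
import time

def seconds_until_reset(headers, now=None):
    """Seconds to wait before the next request, per X-RateLimit-* headers."""
    if now is None:
        now = time.time()
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0  # quota left; no need to wait
    reset = int(headers.get("X-RateLimit-Reset", "0"))  # Unix timestamp
    return max(0.0, reset - now)
```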
Token Limits
| Limit Type | Value |
|---|---|
| Maximum input tokens | 128,000 (kimi-k2) / 256,000 (kimi-k2-0905) |
| Maximum output tokens | 8,192 |
| Maximum total tokens | Limited by the model's context window |
Timeout Settings
| Timeout Type | Duration |
|---|---|
| Connection timeout | 30 seconds |
| Read timeout | 600 seconds |
| Stream timeout | 600 seconds |
Error Codes
HTTP Status Codes
| Status | Meaning |
|---|---|
| 200 | Success |
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Invalid or missing API key |
| 403 | Forbidden - Insufficient credits or permissions |
| 404 | Not Found - Invalid endpoint |
| 429 | Too Many Requests - Rate limit exceeded |
| 500 | Internal Server Error |
| 503 | Service Unavailable |
Error Types
OpenAI Format Errors
{
  "error": {
    "message": "Invalid API key provided",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
| Error Code | Type | Description |
|---|---|---|
| invalid_api_key | invalid_request_error | API key is invalid or malformed |
| insufficient_credits | insufficient_quota | Credit balance is insufficient |
| rate_limit_exceeded | rate_limit_error | Too many requests |
| invalid_request | invalid_request_error | Request validation failed |
| model_not_found | invalid_request_error | Specified model doesn't exist |
| context_length_exceeded | invalid_request_error | Input exceeds context window |
Anthropic Format Errors
{
  "type": "error",
  "error": {
    "type": "authentication_error",
    "message": "Invalid API key"
  }
}
| Error Type | Description |
|---|---|
| authentication_error | Authentication failed |
| invalid_request_error | Request validation failed |
| rate_limit_error | Rate limit exceeded |
| api_error | Server-side error |
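Both formats nest the human-readable details under error, so one helper can serve either endpoint family. A sketch against the two shapes shown above:

```python
def parse_error(body: dict):
    """Return (error_type, message) from either error format."""
    err = body.get("error", {})
    return err.get("type", ""), err.get("message", "")
```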
Error Handling
Implement exponential backoff with jitter for retries:
import time
import random

from openai import RateLimitError  # or your client library's rate-limit exception

def retry_with_backoff(func, max_retries=3, base_delay=1, max_delay=60):
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            # Exponential backoff with jitter, capped at max_delay
            delay = min(
                base_delay * (2 ** attempt) + random.uniform(0, 1),
                max_delay,
            )
            time.sleep(delay)
Client Libraries
Python
Installation
pip install openai
# or
pip install anthropic
OpenAI Client
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://kimi-k2.ai/api/v1"
)

# List available models
models = client.models.list()
for model in models.data:
    print(f"Model ID: {model.id}")

# Create chat completion
response = client.chat.completions.create(
    model="kimi-k2",
    messages=[
        {"role": "user", "content": "Hello"}
    ]
)
Anthropic Client
from anthropic import Anthropic

client = Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://kimi-k2.ai/api/v1"
)

response = client.messages.create(
    model="kimi-k2",
    messages=[
        {"role": "user", "content": "Hello"}
    ],
    max_tokens=1024
)
Node.js
Installation
npm install openai
# or
npm install @anthropic-ai/sdk
OpenAI Client
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.KIMI_API_KEY,
  baseURL: 'https://kimi-k2.ai/api/v1',
});

// List available models
const models = await openai.models.list();
for (const model of models.data) {
  console.log(`Model ID: ${model.id}`);
}

// Create chat completion
const response = await openai.chat.completions.create({
  model: 'kimi-k2-0905',
  messages: [{ role: 'user', content: 'Hello' }],
});
Anthropic Client
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({
  apiKey: process.env.KIMI_API_KEY,
  baseURL: 'https://kimi-k2.ai/api/v1',
});

const response = await anthropic.messages.create({
  model: 'kimi-k2-0905',
  messages: [{ role: 'user', content: 'Hello' }],
  max_tokens: 1024,
});
Go
Installation
go get github.com/sashabaranov/go-openai
Example
package main

import (
	"context"
	"fmt"

	openai "github.com/sashabaranov/go-openai"
)

func main() {
	config := openai.DefaultConfig("YOUR_API_KEY")
	config.BaseURL = "https://kimi-k2.ai/api/v1"
	client := openai.NewClientWithConfig(config)

	resp, err := client.CreateChatCompletion(
		context.Background(),
		openai.ChatCompletionRequest{
			Model: "kimi-k2",
			Messages: []openai.ChatCompletionMessage{
				{
					Role:    openai.ChatMessageRoleUser,
					Content: "Hello",
				},
			},
		},
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Choices[0].Message.Content)
}
REST API
Direct HTTP requests without client libraries:
cURL
curl -X POST https://kimi-k2.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $KIMI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2",
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
Python (requests)
import requests

response = requests.post(
    "https://kimi-k2.ai/api/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    },
    json={
        "model": "kimi-k2",
        "messages": [{"role": "user", "content": "Hello"}]
    }
)
Node.js (fetch)
const response = await fetch('https://kimi-k2.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'kimi-k2-0905',
    messages: [{ role: 'user', content: 'Hello' }],
  }),
});
Billing
Credit System
API usage is billed through a credit system:
- 1 credit = 1 API request
- Credits are deducted upon successful completion
- Failed requests (4xx errors) are not charged
- Server errors (5xx) are not charged
- New users receive 100 free credits upon registration
- Invite rewards:
  - 50 credits when someone registers with your invite code
  - 500 credits when an invited user makes their first payment
Credit Packages
| Package | Credits | Price | Per Credit | Validity |
|---|---|---|---|---|
| Starter | 500 | $4.99 | $0.0099 | No expiration |
| Standard | 5,000 | $29.99 | $0.0060 | 1 month |
| Premium | 20,000 | $59.99 | $0.0030 | 1 month |
| Enterprise | Custom | Contact sales | Custom | Custom |
Usage Tracking
Monitor your usage through:
- Response headers: X-Credits-Remaining: 4523
- Dashboard: Real-time usage statistics at /my-credits
- API endpoint: GET /api/user/credits
Usage data includes:
- Total credits consumed
- Credits remaining
- Usage by day/hour
- Average tokens per request
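The per-response header makes it easy to track balance in code. A minimal sketch reading it (the header name comes from this doc; the fallback value is a choice of this example):

```python
def credits_remaining(headers) -> int:
    """Read the credit balance from the X-Credits-Remaining response header.

    Returns -1 when the header is absent.
    """
    value = headers.get("X-Credits-Remaining")
    return int(value) if value is not None else -1
```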
Migration Guide
From OpenAI
Migrating from the OpenAI API requires minimal changes:
- Update the base URL:
# From
client = OpenAI(api_key="sk-...")
# To
client = OpenAI(
    api_key="sk-...",
    base_url="https://kimi-k2.ai/api/v1"
)
- Update the model name:
# From
model="gpt-4"
# To
model="kimi-k2-0905"
- No other changes required - the API is fully compatible
From Anthropic
Migrating from the Anthropic API:
- Update the base URL:
# From
client = Anthropic(api_key="sk-ant-...")
# To
client = Anthropic(
    api_key="sk-...",
    base_url="https://kimi-k2.ai/api/v1"
)
- Update authentication:
  - Generate an API key from the Kimi K2 dashboard
  - Replace your Anthropic API key with it
- Model compatibility: Kimi K2 is supported via the kimi-k2-0905 identifier
Changelog
2025-09-05
- 256K context window support
- kimi-k2-0905 model support
2025-01-30
- Added Anthropic Messages API compatibility
- Introduced X-API-Key authentication method
- Enhanced error response formats
2025-01-15
- Initial API release
- OpenAI Chat Completions compatibility
- 128K context window support
- Credit-based billing system