Moonshot · Llm model

Kimi K2.6

Kimi AI is the flagship intelligence layer from Moonshot AI

One API key. Every model.

per 1m tokens
Example
AI text
Commercial use included Verified May 26, 2026 Outputs are yours No training on your data
Endpoints

Start building with Kimi K2.6.

One model. Three ways to call it. Same key, same bill.

IN

Llm

One call. Same key. Same bill.

$1.62 / 1m tokens

OUT

Llm

One call. Same key. Same bill.

$6.8 / 1m tokens

Capabilities

What it does best.

Kimi AI capabilities at a glance

Context

  • 256K total context window (262,144 tokens precisely)[^1]
  • Automatic context caching: cache-hit input costs drop to $0.16/1M tokens vs. $0.95/1M on a cache miss[^1]

Modalities

  • Accepts text and image input natively[^2]
  • Video input listed as a supported modality per Moonshot pricing docs (verify against latest API docs before use)[^1]

Reasoning modes

  • Selectable thinking (deep reasoning) and non-thinking modes per request[^1]
  • Dialogue mode and agent task-execution mode, switchable at the API level[^1]

Tooling

  • Native ToolCalls (function calling) with multi-step task completion[^1]
  • JSON Mode, Partial Mode, and built-in internet search for web-grounded responses[^1]

Coding and instruction following

  • Stable long-horizon code writing across extended sessions[^1]
  • Significantly improved instruction compliance and self-correction versus predecessor models[^1]
  • Open-source state-of-the-art performance on agent, code, and visual understanding benchmarks[^2]
API

Call Kimi K2.6 in three lines.

One key. One base URL. Same SDK shape you already use.

# 1. set your key
export COINIS_API_KEY="sk_live_..."

# 2. call the model
curl https://api.app.coinis.com/v1/llm/generate \
  -H "Authorization: Bearer $COINIS_API_KEY" \
  -d '{"prompt":"neon city, rain, tracking shot"}'
import { Coinis } from "@coinis/sdk";
const coinis = new Coinis(process.env.COINIS_API_KEY);

const job = await coinis.llm.generate({
  model: "models/moonshot/kimi",
  prompt: "neon city, rain, tracking shot",
});
from coinis import Coinis
coinis = Coinis(os.environ["COINIS_API_KEY"])

job = coinis.llm.generate(
    model="models/moonshot/kimi",
    prompt="neon city, rain, tracking shot",
)
Response
{
  "id": "gen_8fa2c1",
  "status": "succeeded",
  "model": "models/moonshot/kimi",
  "output": {
    "image_url": 
                "https://cdn.coinis.com/gen_8fa2c1.mp4"
              
              ,
    "format": "mp4"
  },
  "tokens_used": 10
}

Already on another provider's SDK? Change the host. Keep the call.

Pricing

Token pricing. No surprises.

One wallet across every model. No API accounts to juggle.

Kimi K2.6 · IN
16.2 tokens
per 1m tokens · $1.62
Frontier LLM
$1.62 / 1m tokens
One key. Every model. One invoice. 1 token = $0.10
1 1m tokens ≈ 16 tokens ($1.62)
Start free. 15 tokens a week.

No credit card.

Why pay through Coinis
  • One wallet for every model. No API keys. No separate bills.
  • Generate ads. Launch to Meta. Track in one place.
  • On-brand output from your Brand Profile.

1 token = $0.10 pay-as-you-go. Less on a plan.

Standard vs Fast

Pick the run for the job.

IN

Final renders, studios
Resolution
Price $1.62 / 1m tokens

OUT

Rapid tests, high volume
Resolution
Price $6.8 / 1m tokens
Use cases

Two buyers. One model.

For builders

Resell every model. One key. One bill.

Unified API across video, image, audio, and LLM.

Generate 500 variants overnight.

Async queue plus webhooks. Batch at scale.

White-label the output.

Ship it under your brand. Outputs are yours.

For creatives

Ship a Reel before lunch.

Prompt to platform-native clip in minutes.

Same product. Ten formats.

One generation, every aspect ratio.

Commercial UGC without a creator.

Authentic selfie-style ads, on brand.

Long-context agentic coding Feed an entire repository into a 256K context window and let Kimi K2.6 write, refactor, and debug across multiple files in a single session. Stable long-horizon code writing means output quality holds across extended completions[^1].

Autonomous tool-using agents ToolCalls and agent-mode execution are built into the model rather than wrapped on top[^1]. Connect external APIs, run multi-step workflows, and let the model self-correct when a tool call fails.

Multimodal document and screenshot analysis Pass in images alongside text prompts for document parsing, screenshot Q&A, and UI review tasks. Image input is natively supported. video input is documented by Moonshot and should be verified in the latest API docs before production use[^1].

Deep-reasoning research workflows Switch on thinking mode for complex analytical problems that require multi-step reasoning. The model works through intermediate steps before producing a final answer, reducing the need for manual chain-of-thought prompting[^1].

Web-grounded RAG and live-data chat Built-in internet search lets the model retrieve real-time information inside a conversation[^1]. Use it for competitive research, news summarisation, or any workflow where static training data is insufficient.

Renders in seconds. Set a seed. Get the same frame back.

Outputs are yours. Sell them.

Safe for paid ads.

Your prompts are never used for training.

FAQ

Kimi K2.6 FAQs

How much does Kimi AI cost per 1M tokens on Coinis vs. the Moonshot API?

On Coinis, Kimi K2.6 is priced at $1.615/1M input tokens (variant kimi-k2-6-in) and $6.80/1M output tokens (variant kimi-k2-6-out). Moonshot charges $0.95/1M input and $4.00/1M output on their direct API. Coinis pricing includes managed access, unified billing across models, and no infrastructure setup.

Is there a Kimi API I can call directly from my application?

Yes. Send a POST request to https://api.app.coinis.com/v1/llm/generate with model set to moonshot/kimi-k2.6. ToolCalls, JSON Mode, and Partial Mode are all passed through to the Moonshot backend. Full parameter reference is on the API sub-page.

What is the context window of Kimi K2.6, and does it support caching?

Kimi K2.6 supports 256K tokens (262,144 precisely) as a combined input-plus-output context window. Automatic context caching is supported: cache hits are billed at $0.16/1M input tokens versus $0.95/1M on a cache miss. That is an approximately 83% discount for repeated or large shared contexts.

How is Kimi K2.6 different from the older Kimi K2 series?

The kimi-k2 series was officially discontinued by Moonshot AI on May 25, 2026. Kimi K2.6 is the current supported successor. It adds selectable thinking mode, stronger instruction compliance, improved self-correction, and stable long-horizon code writing that the predecessor series did not offer.

Does Kimi AI support tool calling, JSON mode, and web search?

Yes. Kimi K2.6 supports native ToolCalls (function calling), JSON Mode, Partial Mode, and built-in internet search. These are first-class features of the model, not post-processing layers. Agent-mode execution allows multi-step tool orchestration within a single API call.

Can Kimi K2.6 handle images and video, or is it text-only?

Kimi K2.6 accepts text and image input natively. Video input is documented as a supported modality on the Moonshot pricing page. Check the official docs at https://platform.kimi.ai/docs/pricing/chat-k26 to confirm video input availability before building a production pipeline around it.

When should I pick Kimi over other coding-focused LLMs like DeepSeek or Qwen?

Pick Kimi K2.6 when you need a very large shared context (256K tokens), built-in agent-mode tool use, and selectable deep-reasoning on the same model. DeepSeek V3 and Qwen 3 are strong alternatives with different context and pricing profiles. Coinis lists all three so you can benchmark them on the same prompt without switching providers.

Start free

Your wallet. Every model. One call away.

Start free. 15 tokens a week. No card.

Generate on Coinis

No credit card.

Pricing and capabilities verified 2026-05-26. Read the docs .