Compute · Ignite Models

Call a model, not a vendor.

OpenAI-compatible Chat & Embeddings. Cohere-style Rerank. Whisper for audio. A generic Infer for vision, OCR, NER, and translation. 23 models across 5 capabilities — and you can register your own.

api.dodil.io / v1
# OpenAI-compatible Chat — point at /v1/chat/completions.
$ curl https://api.dodil.io/v1/chat/completions \
    -H "Authorization: Bearer $DODIL_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "model": "kimi-k2.5",
      "messages": [{"role": "user", "content": "Summarise this PDF."}],
      "stream": false
    }'

# Or stream via /v1/chat/completions/stream — SSE chunks
# carry usage stats + finish_reason on the last chunk.
OpenAI-compatible: drop in the SDK you already use and swap the base URL.
ModelService@v1
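
For readers who prefer Python to curl, here is a minimal sketch of the same request using only the standard library. The function names `build_chat_request` and `send` are illustrative, not part of any SDK. The URL, headers, model name, and body mirror the curl example above.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.dodil.io/v1"  # base URL taken from the curl example


def build_chat_request(prompt, model="kimi-k2.5", stream=False):
    """Build (but do not send) a POST to /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The key is read from the same variable the curl example uses.
            "Authorization": f"Bearer {os.environ.get('DODIL_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON reply (needs a valid key and network access)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `send(build_chat_request("Summarise this PDF."))` should be equivalent to the curl command. With the OpenAI Python SDK, the same swap would presumably be done by passing `base_url` and `api_key` when constructing the client.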
Surface

Six endpoints.
Every model behind them.

Chat, embeddings, rerank, transcribe — drop-in for the SDKs you already know. Anything else lands behind a single generic Infer route with a JSON schema you can fetch.

Chat completions (live)
POST /v1/chat/completions · 47 ms · p50
Embeddings (live)
POST /v1/embeddings · 18 ms · p50
Rerank (live)
POST /v1/rerank · 62 ms · p50
Transcribe (live)
POST /v1/audio/transcriptions · 3.2 s · avg
Infer (live)
POST /v1/infer · 92 ms · p50
List models (cached)
GET /v1/models · 4 ms · cached
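
As a rough illustration of how the six routes fit together, the sketch below keeps them in one lookup table and builds request bodies for two of them. The body shapes are assumptions based on the conventions the page names (OpenAI-style embeddings, Cohere-style rerank). The helper names and model names are hypothetical placeholders, so check `/v1/models` for real identifiers.

```python
import json

BASE_URL = "https://api.dodil.io"

# Method and path for each of the six endpoints listed above.
ROUTES = {
    "chat": ("POST", "/v1/chat/completions"),
    "embeddings": ("POST", "/v1/embeddings"),
    "rerank": ("POST", "/v1/rerank"),
    "transcribe": ("POST", "/v1/audio/transcriptions"),
    "infer": ("POST", "/v1/infer"),
    "models": ("GET", "/v1/models"),
}


def describe_call(name, body=None):
    """Return the method, URL, and serialized JSON body for one endpoint."""
    method, path = ROUTES[name]
    return {
        "method": method,
        "url": BASE_URL + path,
        "body": None if body is None else json.dumps(body),
    }


def embeddings_body(texts, model):
    # Assumed OpenAI-style shape: a model name plus a list of input strings.
    return {"model": model, "input": list(texts)}


def rerank_body(query, documents, model, top_n=3):
    # Assumed Cohere-style shape: query, candidate documents, number of results.
    return {"model": model, "query": query, "documents": list(documents), "top_n": top_n}
```

For example, `describe_call("rerank", rerank_body("refund policy", docs, "example-rerank-model"))` yields everything needed to issue the call with any HTTP client. Audio transcription usually takes a file upload rather than a JSON body, so it is listed in the table only for its path.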
Catalog

23 models.
Five capabilities.

Curated mix of API-proxied flagships (Kimi) and open-weight models we host (Whisper, Jina, YOLO, GLiNER, …). Each is tagged with the endpoint it serves and whether it runs on GPU or CPU.

Reasoning, long-context, multimodal chat. Served behind /v1/chat/completions.

Kimi K2.5
Moonshot AI · API

Flagship reasoning + vision + chat. 1M-token context. Extended thinking.

/v1/chat/completions · 1M tokens

Kimi K2
Moonshot AI · API

Step-by-step reasoning. 128K context. Lighter than K2.5.

/v1/chat/completions · 128K tokens

Moonshot V1 Auto
Moonshot AI · API

Auto-routes between 8K / 32K / 128K context based on input length.

/v1/chat/completions · Auto
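
To make the auto-routing idea behind Moonshot V1 Auto concrete, here is a hypothetical sketch of picking the smallest context tier that fits a request. This is not the provider's actual logic. The tier sizes are rounded assumptions (a real "8K" may mean 8,192 tokens), and `pick_context_tier` is an illustrative name.

```python
# Assumed tier limits in tokens, smallest first.
TIERS = (8_000, 32_000, 128_000)


def pick_context_tier(prompt_tokens, reserved_for_output=0):
    """Return the smallest tier that fits the prompt plus any reserved output tokens."""
    needed = prompt_tokens + reserved_for_output
    for limit in TIERS:
        if needed <= limit:
            return limit
    raise ValueError(f"{needed} tokens exceeds the largest tier ({TIERS[-1]})")
```

Under these assumptions a 5,000-token prompt lands in the 8K tier, while the same prompt with 4,000 tokens reserved for the reply moves up to 32K.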
Pricing

Per token.
Per model.

Base credit rates below. Each model declares its own multiplier in model.yaml — see the per-model rate sheet for effective prices.

Per-model × multiplier (e.g. Kimi K2.5 · 15.6×)
Talk
$0.05 / 1M prompt credits
$0.40 / 1M completion credits
Base credit × per-model multiplier
Embed
$0.03 / 1M embedding tokens
Text · image · multimodal
Search
$2.00 / 1M reads
$2.00 / 1M writes
Vector Units (VU)
Example: embed 1M tokens of docs ($0.03), then run 100k vector queries ($0.20). Total: $0.23.
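
The arithmetic above can be checked with a short sketch. It assumes credits equal tokens times the per-model multiplier, and that the multiplier applies to both prompt and completion rates. Neither assumption is confirmed here, so treat the per-model rate sheet as authoritative. The `RATES` table and `cost` helper are illustrative names.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)

# Base rates in USD per 1M units, copied from the list above.
RATES = {
    "prompt": Decimal("0.05"),
    "completion": Decimal("0.40"),
    "embedding": Decimal("0.03"),
    "vector_read": Decimal("2.00"),
    "vector_write": Decimal("2.00"),
}


def cost(kind, units, multiplier=Decimal(1)):
    """USD cost for `units` of `kind`, scaled by an optional per-model multiplier."""
    return RATES[kind] * multiplier * Decimal(units) / MILLION


# The page's own example: embed 1M tokens, then run 100k vector queries (reads).
example_total = cost("embedding", 1_000_000) + cost("vector_read", 100_000)
print(f"${example_total:.2f}")  # $0.23

# Assumed effect of the 15.6x multiplier on 1M prompt plus 1M completion tokens.
kimi = Decimal("15.6")
chat_total = cost("prompt", 1_000_000, kimi) + cost("completion", 1_000_000, kimi)
print(f"${chat_total:.2f}")  # $7.02
```

Decimal is used instead of float so the cents add up exactly.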
See pricing
FAQ

Questions, answered.

One key. One bill. Every model.

Drop in the OpenAI SDK. Swap the base URL. Done.

Get an API key
Regions
UK: Live · EU: Live · Middle East: Soon · Africa: Soon
Compliance
SOC 2: In progress · ISO 27001: In progress · GDPR-ready · Data residency: Enforced
© 2026 Circle Technologies Pte Ltd. All rights reserved.