One serverless compute service, two surfaces. Deploy your own Python or Rust functions on demand, or call 25+ catalog models through OpenAI- and Cohere-compatible endpoints: /v1/chat/completions, /v1/embeddings, /v1/rerank, and /v1/predict. Same auth, same scaling, same observability. MCP-ready by default.
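Any OpenAI SDK works against the model surface as-is. A minimal sketch, assuming a placeholder base URL, key, and catalog model name (none of these are documented values):

```python
# Minimal sketch: the standard OpenAI Python SDK pointed at Ignite.
# Base URL, API key, and model name below are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.ignite.example/v1",  # assumed Ignite endpoint
    api_key="YOUR_IGNITE_API_KEY",             # same auth as Functions
)

resp = client.chat.completions.create(
    model="public/llama-3.1-8b-instruct",  # catalog models resolve under the public org
    messages=[{"role": "user", "content": "Summarize the warm-pool design."}],
)
print(resp.choices[0].message.content)
```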
Lambda for code, OpenAI for models, Cohere for rerank, Whisper somewhere else. Four bills, four SDKs, four auth flows.
Ignite gives you both halves of the AI app: bring your own code as Functions, or call our catalog through the ModelService. Same scheduler, same warm pool, same observability — and both surfaces show up to MCP-aware agents as tools.
The catalog is open. New models drop in as YAML in ignite-catalog; the ModelService resolves them under the public org. Pick a category to see which models are live today and which OpenAI/Cohere endpoint serves each.
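For a sense of the shape, here's what a catalog entry might look like. The field names are illustrative guesses, not the actual ignite-catalog schema:

```yaml
# Illustrative only: the real ignite-catalog schema may differ.
name: bge-reranker-v2-m3
org: public            # resolved by the ModelService under the public org
category: rerank
endpoint: /v1/rerank   # served by the Cohere-compatible surface
```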
Ignite runs every function on the same warm executor pool, whatever the runtime. Python for the obvious cases, Rust for the hot path, WASM when you want a tight sandbox, and more on the way. The scheduler, the invocation API, and the observability surface stay the same across all of them.
Save your handler to a mutable draft slot. Compile (Rust only) and test against a payload — none of it touches production.
`ignite draft publish` promotes the draft to an immutable, versioned release. Older versions are kept (10 by default) for instant rollback.
Call sync (wait for the result), async (fire-and-forget), or fan out via run-once. The scheduler routes each call to a warm executor in the dedicated pool.
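A sketch of the three modes over plain HTTP. The paths, query parameters, and response fields here are hypothetical, for illustration only:

```python
# Hypothetical paths and fields; consult the API reference for real shapes.
import requests

BASE = "https://api.ignite.example"  # placeholder
HEADERS = {"Authorization": "Bearer YOUR_IGNITE_API_KEY"}
payload = {"ticker": "IGN"}

# Sync: block until the warm executor returns the result.
result = requests.post(f"{BASE}/v1/functions/quote/invoke",
                       json=payload, headers=HEADERS).json()

# Async: fire-and-forget; stream or poll the execution later.
exec_id = requests.post(f"{BASE}/v1/functions/quote/invoke?mode=async",
                        json=payload, headers=HEADERS).json()["execution_id"]

# Run-once fan-out: one call, many concurrent executions.
requests.post(f"{BASE}/v1/functions/quote/run-once",
              json={"payloads": [payload] * 50}, headers=HEADERS)
```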
`ignite execution stream` tails logs over NATS JetStream in real time. Per-function metrics, traces, and rollups land in PostgreSQL automatically.
Functions ship through the CLI — scaffold, save the draft, publish. Models ship through standard HTTP endpoints any OpenAI/Cohere SDK already speaks. The Python SDK (decorator-based) is in development; it's shown in the last Functions tab.
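Since the decorator API is still in development, here is a purely illustrative guess at its shape, not the shipped interface:

```python
# Illustrative only: the in-development SDK's real decorator and
# parameter names may differ from this sketch.
from ignite import function  # hypothetical import path

@function(name="quote", runtime="python3.12", mcp_enabled=True)
def handler(payload: dict) -> dict:
    # Runs on the same warm executor pool as Rust and WASM functions.
    return {"ok": True, "echo": payload}
```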
Toggle mcp_enabled to see the function flip from a regular service into an agent tool. Hit Invoke for a single dedicated execution, or fan it out across 50 concurrent runs and watch the wall clock barely move.
Numbers replayed from the warm-pool benchmark in docs/PERFORMANCE.md. No requests leave the page.