The managed memory layer for AI apps. Store, search, and retrieve vectors with confidence.
Serverless preview • Up to 100 collections per instance
Dedicated + Shared
Deployment Modes
Private Endpoints
Secure Access
ABAC + RBAC
Policy Control
Why vectors became the default
Vector search grew out of decades of information retrieval research, where text and documents were mapped into vector spaces to enable similarity-based ranking. As datasets became high-dimensional and massive, brute-force search stopped scaling. Modern approximate-nearest-neighbor indexes like HNSW made similarity search fast enough for production workloads while keeping recall high.

VBase brings that evolution into a managed, serverless shape, so teams can ship retrieval, recommendation, and memory features without running vector infrastructure.
1970s
Vector space IR
Similarity ranking becomes a practical model for retrieval.
2016
HNSW indexing
Graph-based ANN indexing scales k‑NN for large collections.
Today
Vector databases
AI applications rely on fast retrieval for RAG and memory.
Platform Capabilities
Vector Storage
High-density collections with multi-vector schemas and filters.
Ingestion Pipelines
Stream embeddings from files, DBs, and event sources.
Orchestration Layer
Sync and normalize knowledge sources at scale.
Vector search at any scale
Designed for high-recall retrieval with predictable performance, no matter the collection size.
A short research note
Vector search isn’t just about speed; it’s about relevance under real-world constraints. As embeddings grow in dimension and datasets reach millions of items, exact search becomes prohibitively expensive. Approximate indexing preserves similarity while trading a small amount of recall for massive performance gains.

VBase is built around that philosophy. We expose familiar index types while automating the operational pieces that matter most in production: scaling, access control, and predictable latency as collections grow.
Indexing in Practice
Indexes reduce search from a full scan to a guided traversal. In production, that’s the difference between “interesting demo” and “default retrieval layer.”
Dedicated or Shared
Start in a shared, serverless pool to ship quickly, then move to a fully isolated cluster when you need dedicated performance, strict separation, and predictable capacity.
Milvus-Compatible API
Use the Milvus-compatible API and keep your existing SDKs, schemas, and query flows. Move workloads without rewriting your stack.
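As a sketch of what “no rewrite” means in practice, the standard pymilvus client can simply be pointed at a VBase endpoint. The URI and token below are placeholders for your instance’s connection details, not real values.

```python
# Sketch: connect the standard Milvus SDK (pymilvus) to a VBase endpoint.
# The URI and token are placeholders; substitute your instance's details.
from pymilvus import MilvusClient

client = MilvusClient(
    uri="https://<your-instance>.vbase.example.com:19530",  # placeholder endpoint
    token="<api-key>",
)

# Existing Milvus-style calls work unchanged. Quick-setup collections
# enable dynamic fields, so extra keys like "title" are accepted as-is.
client.create_collection(collection_name="docs", dimension=768)
client.insert(
    collection_name="docs",
    data=[{"id": 1, "vector": [0.1] * 768, "title": "hello"}],
)
```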
Private Endpoints + Vault
Lock access down with private endpoints and managed secrets. Credentials and service URLs live in Vault, not in app code.
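A minimal sketch of that pattern using the hvac client for HashiCorp Vault. The secret path (`vbase/prod`) and key names are assumptions about how you lay out your secrets, not a fixed convention.

```python
# Sketch: resolve VBase connection details from Vault at startup, so
# credentials never land in app code. Path and key names are assumptions;
# the Vault token is read from the VAULT_TOKEN environment variable.
import hvac
from pymilvus import MilvusClient

vault = hvac.Client(url="https://vault.internal:8200")
secret = vault.secrets.kv.v2.read_secret_version(path="vbase/prod")
conn = secret["data"]["data"]  # KV v2 wraps the payload in data.data

client = MilvusClient(uri=conn["uri"], token=conn["token"])
```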
Policy-Aware Proxy
Enforce ABAC policies at the proxy layer, capture audit trails, and keep usage accountable across teams and environments.
Async Provisioning
Provision resources asynchronously so teams aren’t blocked. Add capacity or scale down on your schedule.
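As a sketch of the async pattern: submit a provisioning request, get a job handle back immediately, and poll until it settles. The endpoint paths, payload fields, and status values here are hypothetical illustrations, not a documented VBase API.

```python
# Hypothetical sketch of async provisioning: request now, poll later.
# Endpoint paths, payload, and status values are assumptions.
import time

import requests

BASE = "https://api.vbase.example.com"  # hypothetical API host
headers = {"Authorization": "Bearer <token>"}

# Request a dedicated cluster; the call returns a job handle immediately.
job = requests.post(
    f"{BASE}/v1/clusters",
    json={"name": "prod-retrieval", "tier": "dedicated", "cu": 4},
    headers=headers,
).json()

# Poll until the job settles instead of blocking the deploy pipeline.
while True:
    status = requests.get(f"{BASE}/v1/jobs/{job['job_id']}", headers=headers).json()
    if status["state"] in ("ready", "failed"):
        break
    time.sleep(10)
```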
Multi-Vector Schemas
Store multiple vector fields per collection to power hybrid retrieval, re-ranking, and richer semantic matches.
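For illustration, a minimal sketch of a collection with two vector fields through the Milvus-compatible API. Field names, dimensions, and the `products` collection are illustrative assumptions; hybrid search across fields also requires a recent pymilvus release, built indexes, and a loaded collection.

```python
# Sketch: one collection, two vector fields, hybrid retrieval.
# All names and dimensions below are illustrative assumptions.
from pymilvus import AnnSearchRequest, DataType, MilvusClient, RRFRanker

client = MilvusClient(
    uri="https://<your-instance>.vbase.example.com:19530",  # placeholder
    token="<api-key>",
)

schema = MilvusClient.create_schema(auto_id=False, enable_dynamic_field=True)
schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="category", datatype=DataType.VARCHAR, max_length=64)
schema.add_field(field_name="text_vec", datatype=DataType.FLOAT_VECTOR, dim=768)
schema.add_field(field_name="image_vec", datatype=DataType.FLOAT_VECTOR, dim=512)
client.create_collection(collection_name="products", schema=schema)

# Hybrid retrieval: query both fields, fuse with reciprocal-rank fusion.
# (Assumes indexes exist on both vector fields and the collection is loaded.)
text_q, image_q = [0.1] * 768, [0.2] * 512  # stand-in query embeddings
reqs = [
    AnnSearchRequest(data=[text_q], anns_field="text_vec",
                     param={"metric_type": "COSINE"}, limit=20),
    AnnSearchRequest(data=[image_q], anns_field="image_vec",
                     param={"metric_type": "COSINE"}, limit=20),
]
hits = client.hybrid_search("products", reqs, RRFRanker(), limit=10)
```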
Choose your index type
Tune for recall, latency, and cost with familiar index types.
Indexes are additional data structures built on top of your collections. They are the reason vector search and filters stay fast at scale. They also come with trade-offs: build time, extra storage and memory, and potential recall loss when using approximate methods.

In VBase, indexes are created per field. Vector fields (dense, binary, sparse) use specialized vector indexes, while scalar fields use scalar indexes to accelerate filtering. This section focuses on vector indexes because they drive the biggest performance and cost differences.

Most ANN indexes follow the same pattern: narrow candidates with a coarse structure, score them efficiently (often using quantization), then optionally refine to restore recall.
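As a concrete sketch through the Milvus-compatible API: indexes are declared per field and built in one call. The field names reuse the earlier illustrative schema, and the parameter values (M, efConstruction) are starting points, not tuned recommendations.

```python
# Sketch: per-field index creation. Field names reuse the illustrative
# "products" schema; parameter values are starting points, not tuning advice.
index_params = client.prepare_index_params()

# Vector field: graph-based HNSW index for low-latency ANN search.
index_params.add_index(
    field_name="text_vec",
    index_type="HNSW",
    metric_type="COSINE",
    params={"M": 16, "efConstruction": 200},
)

# Scalar field: inverted index to accelerate filtered queries.
index_params.add_index(field_name="category", index_type="INVERTED")

client.create_index(collection_name="products", index_params=index_params)
```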
Flat
Exact search for small datasets.
Recall: 100% · Latency: 1–10 ms
IVF
Balanced speed for large collections.
Recall: High · Latency: Low
HNSW
Low-latency graph search at scale.
Recall: Very high · Latency: Low
How to pick an index (practical guide)
Small dataset or highest accuracy: start with a flat index if you can afford it.
Low-latency interactive search: prefer graph-based indexes like HNSW.
High throughput on very large datasets: favor IVF.
Memory constrained: add quantization (SQ or PQ) and enable refinement, then tune recall with search parameters.

Indexes matter most when collections are large (millions+ vectors), latency targets are strict, or you combine vector search with filters. If you’re still prototyping, start simple and add indexes as you scale.

Next: review vector index types (dense, binary, sparse), search parameters like topK and candidate expansion, and the SDK steps to create or modify indexes, as sketched below.
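As a preview of those search parameters, a sketch of a filtered HNSW search through the Milvus-compatible API. `limit` is topK, `ef` is HNSW’s candidate-expansion knob, and the field names reuse the earlier illustrative schema; all values are starting points.

```python
# Sketch: tune topK and candidate expansion at query time.
# Field names reuse the illustrative "products" schema above.
query_vector = [0.1] * 768  # stand-in query embedding

results = client.search(
    collection_name="products",
    data=[query_vector],
    anns_field="text_vec",            # which vector field to search
    limit=10,                         # topK: results returned per query
    search_params={"metric_type": "COSINE", "params": {"ef": 128}},
    filter='category == "books"',     # scalar filter combined with ANN search
    output_fields=["id", "category"],
)
```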