First solution to promote growing tenants into dedicated shards inside one vector search engine environment. Delivers predictable performance, lowers infrastructure costs, and simplifies scaling for multi-tenant RAG and agentic platforms.
Qdrant, the open-source vector search engine used by enterprises and AI-native teams, today announced Tiered Multitenancy, a new capability that helps organizations isolate heavy-traffic tenants, improve performance, and scale vector search workloads more efficiently. The feature is part of the v1.16 release.
Modern AI platforms often serve thousands of small tenants alongside a few large enterprise users with significantly higher throughput requirements. This uneven distribution causes a classic noisy-neighbor problem: a single high-volume tenant can force the cluster to scale for everyone, increasing costs and reducing performance for smaller tenants.
Tiered Multitenancy addresses this issue by storing all tenants in one shared collection and allowing operators to promote any large or latency-sensitive tenant to a dedicated shard when needed. Promotion happens without downtime, without reindexing, and without requiring any changes to client applications. Shared and dedicated paths operate within the same collection, keeping operations simple while ensuring predictable performance for high-demand tenants.
"Customers want strong tenant isolation without the operational burden of maintaining dozens of separate indexes," said Andre Zayarni, CEO at Qdrant. "Tiered Multitenancy offers that balance. Teams can scale their largest tenants independently while keeping the rest of the system simple and unified."
How the Feature Works
Qdrant combines payload-based filtering and custom sharding inside a single architecture. Tenants begin in a shared fallback shard. When a tenant grows or requires dedicated resources, operators can promote it through a single API call that uses a filtered streaming transfer mechanism. Throughout the transfer, Qdrant automatically routes reads and writes to the correct shard and maintains consistency guarantees so applications remain fully operational.
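The routing behavior described above can be illustrated with a small in-memory model. This is a conceptual sketch only, not Qdrant's API or implementation: the `TieredCollection` class and its `upsert`, `search`, and `promote` methods are hypothetical names chosen for illustration, and the "transfer" here is a simple list copy rather than a streaming operation.

```python
from dataclasses import dataclass, field


@dataclass
class TieredCollection:
    """Toy model (hypothetical, not Qdrant code): one collection with a
    shared fallback shard plus optional dedicated per-tenant shards."""

    shared: list = field(default_factory=list)     # (tenant_id, point) pairs
    dedicated: dict = field(default_factory=dict)  # tenant_id -> list of pairs

    def _shard_for(self, tenant_id: str) -> list:
        # Route to the tenant's dedicated shard if it exists,
        # otherwise fall back to the shared shard.
        return self.dedicated.get(tenant_id, self.shared)

    def upsert(self, tenant_id: str, point: str) -> None:
        self._shard_for(tenant_id).append((tenant_id, point))

    def search(self, tenant_id: str) -> list:
        # Payload-style filter: return only points tagged with this tenant.
        return [p for t, p in self._shard_for(tenant_id) if t == tenant_id]

    def promote(self, tenant_id: str) -> None:
        # Filtered transfer: move only this tenant's points out of the
        # shared shard and into a new dedicated shard.
        if tenant_id in self.dedicated:
            return
        self.dedicated[tenant_id] = [(t, p) for t, p in self.shared if t == tenant_id]
        self.shared = [(t, p) for t, p in self.shared if t != tenant_id]


collection = TieredCollection()
collection.upsert("acme", "doc-1")
collection.upsert("small-co", "doc-2")
collection.promote("acme")          # the caller's code does not change
collection.upsert("acme", "doc-3")  # now lands in the dedicated shard
```

The point of the sketch is that callers use the same `upsert` and `search` calls before and after promotion; only the internal routing changes.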
This approach removes the complex client-side routing logic common in multi-tenant systems. Some competing platforms require a separate index for every tenant, while others cannot support cross-tenant search at all. Qdrant is the first vector search engine to provide both tenant isolation and global search inside the same collection. This makes it possible to support hybrid workloads such as agents that access tenant-specific memory while also querying a global knowledge base.
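To make the hybrid workload concrete, the following sketch shows tenant-scoped and global search running over the same set of shards. It is an illustrative toy, not Qdrant's query API: the `search` function, the shard layout, the sample vectors, and the dot-product scoring are all assumptions made for the example.

```python
def dot(a, b):
    """Dot product used as a stand-in similarity score."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical layout: one shared shard plus one dedicated shard,
# all belonging to the same logical collection.
shards = {
    "shared": [
        {"tenant": "small-co", "vector": [1.0, 0.0], "text": "reset password"},
        {"tenant": "global", "vector": [0.6, 0.8], "text": "product handbook"},
    ],
    "acme": [
        {"tenant": "acme", "vector": [0.9, 0.1], "text": "acme runbook"},
        {"tenant": "acme", "vector": [0.0, 1.0], "text": "acme pricing"},
    ],
}


def search(shards, query, tenant=None, limit=2):
    """Rank points by similarity; restrict to one tenant when `tenant` is set,
    otherwise search every shard in the collection."""
    candidates = [
        point
        for points in shards.values()
        for point in points
        if tenant is None or point["tenant"] == tenant
    ]
    ranked = sorted(candidates, key=lambda p: dot(query, p["vector"]), reverse=True)
    return [p["text"] for p in ranked[:limit]]


tenant_hits = search(shards, [1.0, 0.0], tenant="acme")  # tenant-specific memory
global_hits = search(shards, [1.0, 0.0])                 # cross-tenant retrieval
```

In this model an agent could call `search` once with a tenant filter and once without, against the same collection, which is the pattern the paragraph above describes.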
Designed for AI at Scale
Multi-tenant RAG platforms, coding agents, and enterprise copilots frequently host customers with very different performance profiles. Tiered Multitenancy gives teams the ability to isolate large customers, scale compute only where necessary, and maintain a global index for cross-tenant retrieval. It also reduces operational complexity by consolidating multi-index architectures into a single collection.
About Qdrant
Qdrant is the leading high-performance, scalable, open-source vector search engine, essential for building the next generation of AI/ML applications. Qdrant is able to handle billions of vectors and is implemented in Rust for performance, memory safety, and scale. Recently, Qdrant's open-source project surpassed 250 million installs across all open-source packages and earned a place in The Forrester Wave: Vector Databases, Q3 2024. The company was also recognized as one of Europe's top 10 startups in Sifted's 2025 B2B SaaS Rising 100, an annual ranking of the most promising B2B SaaS companies valued under $1 billion. Today, Qdrant powers real-time Agentic RAG applications at scale in enterprises like Tripadvisor, HubSpot, and Deutsche Telekom.
View source version on businesswire.com: https://www.businesswire.com/news/home/20251119343840/en/
Contacts:
For more information, please visit qdrant.tech or contact press@qdrant.com.