2024-06-11, Frannz Salon
We've recently seen a boom in specialized vector databases. At the same time, almost all popular database projects have added support for vectors. So a lot of people are asking themselves if and when they really need a specialized vector database, and when they could use a tool they have already deployed.
We're going to look in particular at (at least) two vector search implementations in popular tools that a lot of people already use:
- pgvector for PostgreSQL
- the Lucene vector implementation for Elasticsearch and OpenSearch
We recently had to evaluate the two for a particular use case, and the comparison is quite interesting. There are pros to each, for example:
- pgvector means less infra and cost, and is always strongly consistent
- Elasticsearch/OpenSearch can do automatic sharding
- in Postgres, sharding by tenant is easier, using schemas or partitioned indexes
- Lucene can combine functionality with full-text search
We'll go through the above and also discuss when going for a dedicated vector DB makes sense.
Tudor is CTO at Xata, a Postgres platform that brings in extra features like branching, automatic replication to search, and schema migration improvements. Before Xata, Tudor worked at Elastic for several years.