Redis 8.4 Hybrid Search — Combining Full-Text and Vector Search for AI
Posted on: 4/20/2026 9:08:53 AM
Table of contents
- 1. The Problem — Why a Single Search Mode Isn't Enough
- 2. Redis 8.x — The Evolution Journey
- 3. Vector Sets — A Brand-New Data Type
- 4. FT.HYBRID — The Hybrid Search Command in Redis 8.4
- 5. Score Fusion — RRF vs Linear Combination
- 6. Building a RAG Pipeline with FT.HYBRID
- 7. Advanced FT.HYBRID Use Cases
- 8. Redis 8.4 vs Dedicated Vector Databases
- 9. FT.HYBRID Performance Tuning
- 10. Conclusion
1. The Problem — Why a Single Search Mode Isn't Enough
In the era of AI and LLMs, accurate information retrieval is the foundation of every RAG (Retrieval-Augmented Generation) application. However, the two popular search methods each have their own weaknesses:
- Full-text search (BM25): Excellent at exact keyword matching but "blind" to semantics. Searching "car" won't return results containing "motor vehicle."
- Vector search: Deeply understands semantics via embeddings, but tends to "hallucinate" when you need to match proper names, product codes, or specific technical terms exactly.
Research from Anthropic (2025) and Apple ML Research (2024) indicates that combining the two methods outperforms either one alone. Redis 8.4 realizes this with the FT.HYBRID command, performing score fusion inside the engine with no external post-processing needed.
2. Redis 8.x — The Evolution Journey
Before diving into FT.HYBRID, a quick recap of where it sits in the Redis 8.x line: Redis 8.0 folded the former Redis Stack modules (Query Engine, JSON, probabilistic structures) into the open-source core and introduced Vector Sets; 8.4 builds on that foundation with in-engine hybrid search.
3. Vector Sets — A Brand-New Data Type
Vector Sets are an invention of Redis creator Salvatore Sanfilippo himself. Inspired by Sorted Sets, they store a string element paired with a multi-dimensional embedding instead of a string paired with a numeric score.
3.1 Basic Commands
# Add a vector to a set (the element name comes after the vector values;
# SETATTR attaches JSON attributes in the same call)
VADD product-vectors VALUES 768 0.12 -0.34 0.56 ... "laptop-001"
SETATTR '{"category":"electronics","price":1299}'
# Find the 5 most similar vectors
VSIM product-vectors VALUES 768 0.11 -0.33 0.55 ... COUNT 5
# Find with attribute filters
VSIM product-vectors VALUES 768 0.11 -0.33 0.55 ... COUNT 5
FILTER '.category == "electronics" && .price < 2000'
# Update attributes
VSETATTR product-vectors "laptop-001"
'{"category":"electronics","price":1199,"on_sale":true}'
3.2 Quantization and Dimensionality Reduction
Vector Sets support 3 quantization modes to balance memory against accuracy:
| Mode | Size/vector | Accuracy | Use case |
|---|---|---|---|
| No quantization (NOQUANT) | 100% | Highest | Maximum recall required |
| 8-bit quantization (Q8, default) | ~25% | Nearly lossless | Mainstream production |
| Binary quantization (BIN) | ~3% | Slight drop | Massive datasets, coarse search |
Additionally, dimensionality reduction via random projection lets you shrink vector dimensions while preserving similarity relationships — particularly useful when embedding models output 1536+ dimensional vectors but your application only needs moderate discriminative power.
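As a toy illustration of the idea (not the engine's internal implementation), a random Gaussian projection roughly preserves cosine similarity while cutting the dimension. The sketch below uses only the standard library; all names are illustrative:

```python
import math
import random

def random_projection_matrix(in_dim, out_dim, seed=42):
    """Gaussian random projection matrix, scaled by 1/sqrt(out_dim)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0, 1) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]

def project(vec, matrix):
    """Multiply the projection matrix by the input vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

rng = random.Random(0)
v1 = [rng.gauss(0, 1) for _ in range(256)]
v2 = [x + rng.gauss(0, 0.1) for x in v1]  # a near-duplicate of v1

m = random_projection_matrix(256, 32)     # 256 dims -> 32 dims
p1, p2 = project(v1, m), project(v2, m)
# similar vectors stay similar after projection
print(cosine(v1, v2), cosine(p1, p2))
```

In Vector Sets, the equivalent effect comes from the REDUCE option on VADD; the code above only demonstrates why the trick works.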
3.3 Vector Sets vs Redis Query Engine
When to use what?
Vector Sets: When you only need pure similarity search, a minimal API, single-node. Think "Sorted Sets for vectors."
Redis Query Engine (FT.SEARCH/FT.HYBRID): When you need to combine full-text + vector + geo + numeric filtering, horizontal scaling, and enterprise features.
| Feature | Vector Sets | Redis Query Engine |
|---|---|---|
| Vector search | Native, simple | Advanced, scalable |
| Full-text search | No | BM25, stemming, fuzzy |
| Hybrid queries | No | FT.HYBRID |
| Geo/Numeric filter | JSON attributes | Native indexing |
| Scalability | Single-node | Cluster, Active-Active |
| Complexity | Very low | Medium |
4. FT.HYBRID — The Hybrid Search Command in Redis 8.4
FT.HYBRID is the new star of Redis 8.4. Rather than running 2 separate queries and merging results at the application layer, this command performs score fusion inside the Redis engine with O(N+M) complexity.
graph TD
A["Input Query"] --> B["FT.HYBRID Engine"]
B --> C["SEARCH Component
BM25 Full-Text"]
B --> D["VSIM Component
Vector Similarity"]
C --> E["Score Fusion
RRF / Linear"]
D --> E
E --> F["Ranked Results"]
F --> G["LOAD / APPLY / SORTBY
Post-processing"]
G --> H["Final Results"]
style A fill:#e94560,stroke:#fff,color:#fff
style B fill:#2c3e50,stroke:#fff,color:#fff
style C fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style D fill:#f8f9fa,stroke:#e94560,color:#2c3e50
style E fill:#e94560,stroke:#fff,color:#fff
style F fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
style G fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
style H fill:#4CAF50,stroke:#fff,color:#fff
4.1 Full Syntax
FT.HYBRID index
SEARCH query
[SCORER scorer]
[YIELD_SCORE_AS name]
VSIM vector_field $vector_param
[KNN count K k [EF_RUNTIME ef]]
[RANGE count RADIUS radius [EPSILON epsilon]]
[YIELD_SCORE_AS name]
[FILTER filter]
[COMBINE RRF count [CONSTANT c] [WINDOW w] [YIELD_SCORE_AS name]]
[COMBINE LINEAR count [ALPHA a] [BETA b] [WINDOW w] [YIELD_SCORE_AS name]]
[LIMIT offset num]
[SORTBY count field [ASC | DESC]]
[LOAD count field ...]
[GROUPBY ... REDUCE ...]
[APPLY expression AS name]
PARAMS nargs vector_param vector_blob [name value ...]
[TIMEOUT timeout]
4.2 Main Components
Each FT.HYBRID command is built around three core components (COMBINE falls back to RRF defaults when omitted):
graph LR
A["SEARCH
Full-text query
BM25 scorer"] --> D["COMBINE
Score Fusion"]
B["VSIM
Vector similarity
KNN or RANGE"] --> D
D --> E["Ranked
Results"]
style A fill:#e94560,stroke:#fff,color:#fff
style B fill:#2c3e50,stroke:#fff,color:#fff
style D fill:#4CAF50,stroke:#fff,color:#fff
style E fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
5. Score Fusion — RRF vs Linear Combination
This is the most important part of hybrid search: how do you combine scores from two different search systems into a single unified ranking?
5.1 Reciprocal Rank Fusion (RRF)
RRF is the default method, scoring based on rank position rather than absolute score values:
# RRF formula for each document d:
RRF_score(d) = 1/(rank_text(d) + k) + 1/(rank_vector(d) + k)
# Where k = CONSTANT (default 60)
# rank_text(d) = position of d in the full-text results
# rank_vector(d) = position of d in the vector results
Advantages of RRF
- No score normalization needed: BM25 scores and cosine similarity live on completely different scales. RRF ignores absolute values and looks only at ranks, sidestepping the problem entirely.
- Robust to outliers: a document with an abnormally high BM25 score cannot dominate the final ranking.
- Sensible default k = 60: the value comes from Cormack et al. (2009) and balances the influence of top-ranked and lower-ranked results.
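To make the formula concrete, here is a minimal standalone RRF in plain Python (illustrative only; Redis performs this fusion inside the engine):

```python
def rrf_fuse(text_ranking, vector_ranking, k=60):
    """Fuse two ranked lists of doc IDs with Reciprocal Rank Fusion.

    Ranks are 1-based; a document missing from one list simply
    contributes nothing for that list.
    """
    scores = {}
    for ranking in (text_ranking, vector_ranking):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank + k)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

text = ["d1", "d2", "d3"]    # BM25 ranking
vector = ["d3", "d1", "d4"]  # vector-similarity ranking
fused = rrf_fuse(text, vector)
# d1 appears near the top of both lists, so it ranks first
```

Note how d1 beats d3 even though d3 is the top vector hit: appearing in both lists at good ranks outweighs one first place.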
# Example: RRF with window = 40 and constant = 80
FT.HYBRID products-idx
SEARCH "@category:electronics laptop"
SCORER 4 BM25 1.5 0.8
YIELD_SCORE_AS text_score
VSIM @embedding $query_vec
KNN 4 K 20 EF_RUNTIME 200
YIELD_SCORE_AS vector_score
COMBINE RRF 4 WINDOW 40 CONSTANT 80
YIELD_SCORE_AS hybrid_score
LOAD 3 @title @price @category
LIMIT 0 10
PARAMS 2 query_vec "\x00\x01..."
5.2 Linear Combination
When you want to control the weights between full-text and vector, Linear combination lets you set specific ratios:
# Formula:
Linear_score(d) = alpha * normalized_text_score(d) + beta * normalized_vector_score(d)
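The raw score distributions must be normalized before the weights apply. The exact normalization is an engine detail, so the sketch below uses min-max scaling as one plausible choice; all names are illustrative:

```python
def min_max(scores):
    """Normalize a dict of raw scores into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all scores tie
    return {d: (s - lo) / span for d, s in scores.items()}

def linear_fuse(text_scores, vector_scores, alpha=0.3, beta=0.7):
    """Weighted sum of normalized text and vector scores."""
    t = min_max(text_scores)
    v = min_max(vector_scores)
    docs = set(t) | set(v)
    combined = {d: alpha * t.get(d, 0.0) + beta * v.get(d, 0.0)
                for d in docs}
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

text_scores = {"d1": 12.0, "d2": 7.5, "d3": 3.1}      # raw BM25
vector_scores = {"d2": 0.91, "d3": 0.88, "d4": 0.40}  # cosine similarity
ranked = linear_fuse(text_scores, vector_scores)
# with beta = 0.7, d2 wins: strong in both lists, dominant on vectors
```

Unlike RRF, the result here is sensitive to how scores are distributed, which is exactly why Linear needs more tuning.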
# Example: prioritize semantic search (70% vector, 30% text)
FT.HYBRID docs-idx
SEARCH "machine learning optimization"
VSIM @content_vector $query_vec
KNN 2 K 15
COMBINE LINEAR 4 ALPHA 0.3 BETA 0.7
LOAD *
PARAMS 2 query_vec "\x00\x01..."
5.3 Which Method to Choose?
| Criterion | RRF | Linear |
|---|---|---|
| Needs parameter tuning? | Minimal (constant, window) | A lot (alpha, beta need experimentation) |
| Robust to new data? | High — rank-based | Medium — depends on score distribution |
| Weight control? | Not direct | Yes — explicit alpha/beta |
| When to use | Don't know the optimal ratio upfront | Already A/B tested, know the weights |
| Recommendation | Start with RRF | Optimize once you have metrics |
6. Building a RAG Pipeline with FT.HYBRID
Hands-on: integrating Redis 8.4 hybrid search into a complete RAG pipeline.
sequenceDiagram
participant U as User
participant App as Application
participant Emb as Embedding Model
participant R as Redis 8.4
participant LLM as LLM (Claude/GPT)
U->>App: Question: "How to optimize PostgreSQL queries for large tables?"
App->>Emb: Generate embedding
Emb-->>App: query_vector [768d]
App->>R: FT.HYBRID docs-idx
SEARCH "optimize PostgreSQL queries"
VSIM @embedding $vec
COMBINE RRF
R-->>App: Top 5 documents (ranked)
App->>LLM: Prompt + context from 5 docs
LLM-->>U: Answer with citations
6.1 Step 1 — Create the Index
# Create an index supporting both full-text and vector search
FT.CREATE docs-idx ON HASH PREFIX 1 doc:
SCHEMA
title TEXT WEIGHT 2.0
content TEXT
category TAG
created_at NUMERIC SORTABLE
embedding VECTOR HNSW 6
TYPE FLOAT32
DIM 768
DISTANCE_METRIC COSINE
6.2 Step 2 — Index Documents
import time

import numpy as np
import redis
from openai import OpenAI

client = OpenAI()
r = redis.Redis(host='localhost', port=6379)
def index_document(doc_id, title, content, category):
# Generate embedding
response = client.embeddings.create(
model="text-embedding-3-small",
input=f"{title} {content}"
)
embedding = np.array(response.data[0].embedding, dtype=np.float32)
# Store in a Redis Hash
r.hset(f"doc:{doc_id}", mapping={
"title": title,
"content": content,
"category": category,
"created_at": int(time.time()),
"embedding": embedding.tobytes()
})
6.3 Step 3 — Hybrid Search
def hybrid_search(query, top_k=5):
# Generate query embedding
response = client.embeddings.create(
model="text-embedding-3-small",
input=query
)
query_vec = np.array(
response.data[0].embedding, dtype=np.float32
).tobytes()
# FT.HYBRID — combine BM25 + vector similarity
result = r.execute_command(
"FT.HYBRID", "docs-idx",
"SEARCH", query,
"SCORER", "4", "BM25", "1.2", "0.75",
"YIELD_SCORE_AS", "text_score",
"VSIM", "@embedding", "$query_vec",
"KNN", "2", "K", str(top_k * 3),
"YIELD_SCORE_AS", "vector_score",
"COMBINE", "RRF", "4",
"WINDOW", "30", "CONSTANT", "60",
"YIELD_SCORE_AS", "hybrid_score",
"LOAD", "4", "@title", "@content",
"@text_score", "@vector_score",
"LIMIT", "0", str(top_k),
"PARAMS", "2", "query_vec", query_vec
)
return parse_results(result)
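The parse_results helper is left undefined above. Its shape depends on the reply format your server and client negotiate (RESP2 flat arrays vs RESP3 maps), so treat this as a sketch assuming a classic FT.SEARCH-style flat reply of [total, key, [field, value, ...], ...]:

```python
def parse_results(reply):
    """Turn a flat [total, key1, [f, v, ...], key2, ...] reply
    into a list of dicts. Byte strings are decoded to str."""
    def dec(x):
        return x.decode() if isinstance(x, bytes) else x

    docs = []
    # reply[0] is the total count; (key, fields) pairs follow
    for i in range(1, len(reply), 2):
        key, fields = dec(reply[i]), reply[i + 1]
        doc = {"id": key}
        for f, v in zip(fields[::2], fields[1::2]):
            doc[dec(f).lstrip("@")] = dec(v)
        docs.append(doc)
    return docs

# example reply as redis-py might return it over RESP2
sample = [2,
          b"doc:1", [b"title", b"Intro", b"hybrid_score", b"0.031"],
          b"doc:2", [b"title", b"Deep dive", b"hybrid_score", b"0.027"]]
parsed = parse_results(sample)
```

Verify the actual reply shape with a quick execute_command in redis-cli or a REPL before relying on any parser.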
6.4 Step 4 — Combine with the LLM
import anthropic
client_ai = anthropic.Anthropic()
def rag_answer(question):
# Hybrid retrieval
docs = hybrid_search(question, top_k=5)
# Build context
context = "\n---\n".join(
f"[{d['title']}] {d['content']}" for d in docs
)
# Call Claude with the context
response = client_ai.messages.create(
model="claude-sonnet-4-6",
max_tokens=2048,
messages=[{
"role": "user",
"content": f"""Based on the following documents, answer the question.
Documents:
{context}
Question: {question}"""
}]
)
return response.content[0].text
7. Advanced FT.HYBRID Use Cases
7.1 Boosting Recent Documents (Recency Boost)
# Combine hybrid search + time-based sorting
FT.HYBRID docs-idx
SEARCH "kubernetes deployment strategy"
VSIM @embedding $query_vec
KNN 2 K 20
COMBINE RRF 2 WINDOW 30
SORTBY 2 created_at DESC
LOAD 3 @title @created_at @category
LIMIT 0 10
PARAMS 2 query_vec "\x00\x01..."
7.2 Location-Based Search
# Hybrid search + geo filter for location-based apps
FT.HYBRID places-idx
SEARCH "quiet cafe with wifi"
VSIM @description_vec $query_vec
KNN 2 K 15
FILTER "@location:[106.6 10.7 5 km]"
COMBINE LINEAR 4 ALPHA 0.4 BETA 0.6
LOAD 4 @name @address @rating @distance
PARAMS 2 query_vec "\x00\x01..."
7.3 Fuzzy Matching Combined with Semantic
# Find both typos and semantically similar terms
FT.HYBRID products-idx
SEARCH "%%samsuung%% %%gallaxy%% ~phone ~smartphone"
VSIM @features_vec $query_vec
KNN 2 K 10
COMBINE RRF 2 WINDOW 20
LOAD *
PARAMS 2 query_vec "\x00\x01..."
# %%word%% = fuzzy matching; each pair of % allows one extra edit of
#            Levenshtein distance (%w% = 1 edit, %%w%% = 2, up to 3)
# ~word = optional term (boosts the score if matched, doesn't exclude otherwise)
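To see what the fuzzy operator tolerates, a standard dynamic-programming Levenshtein distance (illustrative only, not RediSearch internals) shows both typos above are a single edit from the correct spelling, so even %word% would match; the double form just adds slack:

```python
def levenshtein(a, b):
    """Classic DP edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # deletion
                            curr[j - 1] + 1,     # insertion
                            prev[j - 1] + cost)) # substitution
        prev = curr
    return prev[-1]

print(levenshtein("samsuung", "samsung"))  # 1: one extra 'u'
print(levenshtein("gallaxy", "galaxy"))    # 1: one extra 'l'
```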
8. Redis 8.4 vs Dedicated Vector Databases
| Criterion | Redis 8.4 | Pinecone | Weaviate | Milvus |
|---|---|---|---|---|
| Hybrid search | Native (FT.HYBRID) | Sparse + Dense vectors | BM25 + Vector | Sparse + Dense |
| Score fusion | RRF + Linear (in-engine) | Client-side | In-engine | RRF (in-engine) |
| Latency | Sub-ms (in-memory) | ~10-50ms | ~10-100ms | ~10-50ms |
| Integrated caching | Native (it's a cache) | No | No | No |
| Full-text quality | BM25, stemming, fuzzy | Basic | BM25 | Basic |
| Data structures | Hash, JSON, Stream, TS... | Vectors only | Objects + vectors | Collections |
| Operational | Familiar, big ecosystem | Managed only | Self-host / Cloud | Self-host / Cloud |
| Pricing | Free + Enterprise | Freemium | Free + Cloud | Free + Cloud |
When is Redis the best choice?
- You already use Redis: no new infrastructure needed. Vector search + caching + sessions + pub/sub in one system.
- Ultra-low latency: in-memory engine for sub-millisecond responses — critical for real-time RAG, autocomplete, and recommendations.
- Complex hybrid queries: FT.HYBRID with geo, time, and numeric filtering in a single command.
When should you go with a dedicated vector DB?
- Billion-scale vectors: if the dataset exceeds RAM capacity, disk-based systems like Milvus are a better fit.
- Vector search only: if you don't need full-text, caching, or other data structures, a dedicated vector DB is simpler.
- Complex multi-tenancy: Pinecone offers stronger namespace isolation for multi-tenant SaaS.
9. FT.HYBRID Performance Tuning
9.1 Tuning Parameters
| Parameter | Default | Recommended | Impact |
|---|---|---|---|
| EF_RUNTIME | 10 | 50-200 | Higher = better recall, higher latency |
| K (KNN) | 10 | 2-3× desired top_k | Candidate pool for score fusion |
| WINDOW | 20 | 20-50 | Candidates considered during fusion |
| CONSTANT (RRF) | 60 | 60 (leave as-is) | Balance between top and lower ranks |
| ALPHA/BETA | — | Try 0.3/0.7 → 0.5/0.5 → 0.7/0.3 | Weights between text and vector |
9.2 Pre-filtering Strategies
# FILTER inside VSIM narrows the vector space BEFORE search
# → significantly reduces compute time
FT.HYBRID products-idx
SEARCH "gaming laptop RTX"
VSIM @features_vec $query_vec
KNN 2 K 20
FILTER "@category:{electronics} @price:[500 3000]"
COMBINE RRF 2 WINDOW 30
LOAD 3 @title @price @rating
LIMIT 0 10
PARAMS 2 query_vec "\x00\x01..."
# Two pre-filter policies:
# ADHOC_BF — brute-force scan, good for selective filters (<10% data)
# BATCHES — batch processing, good for less-selective filters
10. Conclusion
Redis 8.4 with FT.HYBRID marks an important shift: from a pure cache / data-structure tool to a comprehensive AI-native platform. Integrating score fusion directly in the engine not only simplifies architecture but delivers superior performance — sub-millisecond latency for hybrid queries that previously required orchestrating multiple services.
For systems already using Redis for caching or sessions, extending into hybrid search for a RAG pipeline is nearly "free" — no new infrastructure, no complex migration. That's the power of the "One Redis" philosophy the development team has pursued since version 8.0.
Get Started Now
1. Upgrade to Redis 8.4
2. Create an index with a VECTOR field (HNSW or FLAT)
3. Start with FT.HYBRID ... COMBINE RRF — the default parameters work well for most use cases
4. A/B test RRF vs Linear once you have enough metrics
5. Tune EF_RUNTIME and WINDOW based on your latency budget
References:
- FT.HYBRID Command Documentation — Redis
- Revamping Context-Oriented Retrieval with Hybrid Search in Redis 8.4 — Redis Blog
- Announcing Vector Sets — Redis Blog
- What's New in Redis 8.0 — Redis Docs
- Hybrid Search Benefits: Why RAG Systems Need Both Methods — Redis Blog
- Advanced RAG — Understanding Reciprocal Rank Fusion in Hybrid Search
Disclaimer: The opinions expressed in this blog are solely my own and do not reflect the views or opinions of my employer or any affiliated organizations. The content provided is for informational and educational purposes only and should not be taken as professional advice. While I strive to provide accurate and up-to-date information, I make no warranties or guarantees about the completeness, reliability, or accuracy of the content. Readers are encouraged to verify the information and seek independent advice as needed. I disclaim any liability for decisions or actions taken based on the content of this blog.