Redis 8.4 Hybrid Search — Combining Full-Text and Vector Search for AI

Posted on: 4/20/2026 9:08:53 AM

1. The Problem — Why a Single Search Mode Isn't Enough

In the era of AI and LLMs, accurate information retrieval is the foundation of every RAG (Retrieval-Augmented Generation) application. However, the two popular search methods each have their own weaknesses:

  • Full-text search (BM25): Excellent at exact keyword matching but "blind" to semantics. Searching "car" won't return results containing "motor vehicle."
  • Vector search: Deeply understands semantics via embeddings, but tends to "hallucinate" when you need to match proper names, product codes, or specific technical terms exactly.

  • 49% reduction in context failure rate with hybrid vs. single-mode search
  • 3.5× retrieval recall improvement in Blended RAG systems
  • 15% end-to-end accuracy gain on complex reasoning tasks
  • 87% faster command execution vs. Redis 7.x

Research from Anthropic (2025) and Apple ML Research (2024) suggests that combining both methods is the optimal answer. Redis 8.4 realizes this with the FT.HYBRID command — performing score fusion inside the engine, with no external post-processing needed.

2. Redis 8.x — The Evolution Journey

Before diving into FT.HYBRID, let's look at the bigger picture of Redis 8.x:

Redis 8.0 (2025)
"One Redis" — merging Redis Stack + Community Edition. Vector Sets (beta), hash field expiration (HGETEX/HSETEX/HGETDEL), 30+ performance optimizations, BM25 replacing TF-IDF as the default scorer.
Redis 8.4 (Q1/2026)
FT.HYBRID — native hybrid search command. Score fusion (RRF + Linear), atomic cluster slot migration for zero-downtime scaling, SIMD-driven optimizations.
Vector Sets (ongoing)
A new data type for vector similarity — lighter than the Redis Query Engine, suitable for simple use cases. Supports 8-bit quantization, dimensionality reduction, and JSON attribute filtering.

3. Vector Sets — A Brand-New Data Type

Vector Sets are an invention of Redis creator Salvatore Sanfilippo himself, inspired by Sorted Sets but storing string + multi-dimensional vector embedding instead of string + score.

3.1 Basic Commands

# Add a vector to a set (the element name follows the vector values)
VADD product-vectors VALUES 768 0.12 -0.34 0.56 ... "laptop-001"
    SETATTR '{"category":"electronics","price":1299}'

# Find the 5 most similar vectors
VSIM product-vectors VALUES 768 0.11 -0.33 0.55 ... COUNT 5

# Find with attribute filters
VSIM product-vectors VALUES 768 0.11 -0.33 0.55 ... COUNT 5
    FILTER '.category == "electronics" && .price < 2000'

# Update attributes
VSETATTR product-vectors "laptop-001"
    '{"category":"electronics","price":1199,"on_sale":true}'
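
From application code, these commands can be issued through redis-py's generic `execute_command`. The helper below is a minimal sketch that only builds the argument list (the key, element, and attribute names are illustrative, and the VADD argument order follows the pattern above, with the element name after the vector values):

```python
import json

def build_vadd_args(key, element, vector, attrs=None):
    """Build the argument list for:
    VADD key VALUES <dim> <v1> ... <vn> <element> [SETATTR <json>]
    """
    args = ["VADD", key, "VALUES", str(len(vector))]
    args += [str(v) for v in vector]
    args.append(element)
    if attrs is not None:
        args += ["SETATTR", json.dumps(attrs)]
    return args

# Usage with redis-py (assumes a running Redis 8 instance):
#   r.execute_command(*build_vadd_args("product-vectors", "laptop-001",
#                                      embedding, {"category": "electronics"}))
print(build_vadd_args("product-vectors", "laptop-001", [0.12, -0.34],
                      {"category": "electronics", "price": 1299}))
```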

3.2 Quantization and Dimensionality Reduction

Vector Sets support 3 quantization modes to balance memory against accuracy:

Mode                       Size/vector   Accuracy          Use case
Full precision (default)   100%          Highest           Maximum recall required
8-bit quantization         ~25%          Nearly lossless   Mainstream production
Binary quantization        ~3%           Slight drop       Massive datasets, coarse search
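
A quick back-of-the-envelope calculation shows where the table's percentages come from (a sketch, assuming 4 bytes per dimension for FLOAT32, 1 byte for 8-bit, and 1 bit for binary):

```python
def bytes_per_vector(dim, mode="full"):
    """Approximate in-memory size of one vector under each quantization mode."""
    if mode == "full":    # FLOAT32: 4 bytes per dimension
        return dim * 4
    if mode == "q8":      # 8-bit quantization: 1 byte per dimension
        return dim
    if mode == "binary":  # binary quantization: 1 bit per dimension
        return dim // 8
    raise ValueError(mode)

for mode in ("full", "q8", "binary"):
    size = bytes_per_vector(768, mode)
    print(f"{mode:>6}: {size:>5} B ({size / bytes_per_vector(768) * 100:.1f}%)")
```

For a 768-dimensional embedding this gives 3072 B, 768 B (25%), and 96 B (about 3%), matching the table.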

Additionally, dimensionality reduction via random projection lets you shrink vector dimensions while preserving similarity relationships — particularly useful when embedding models output 1536+ dimensional vectors but your application only needs moderate discriminative power.
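
The idea behind random projection can be illustrated in a few lines of NumPy (this is not the engine's internal implementation, just a sketch of the Johnson-Lindenstrauss-style technique; dimensions and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)

def random_projection(vectors, out_dim):
    """Project rows of `vectors` to out_dim with a Gaussian random matrix."""
    in_dim = vectors.shape[1]
    proj = rng.normal(size=(in_dim, out_dim)) / np.sqrt(out_dim)
    return vectors @ proj

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

x = rng.normal(size=(1, 1536))
y = 0.8 * x + 0.2 * rng.normal(size=(1, 1536))  # y is deliberately similar to x
pair = np.vstack([x, y])
reduced = random_projection(pair, 256)          # 1536 -> 256 dimensions

# Cosine similarity is approximately preserved after projection
print(cosine(pair[0], pair[1]), cosine(reduced[0], reduced[1]))
```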

3.3 Vector Sets vs Redis Query Engine

When to use what?

Vector Sets: When you only need pure similarity search, a minimal API, single-node. Think "Sorted Sets for vectors."
Redis Query Engine (FT.SEARCH/FT.HYBRID): When you need to combine full-text + vector + geo + numeric filtering, horizontal scaling, and enterprise features.

Feature              Vector Sets       Redis Query Engine
Vector search        Native, simple    Advanced, scalable
Full-text search     No                BM25, stemming, fuzzy
Hybrid queries       No                FT.HYBRID
Geo/Numeric filter   JSON attributes   Native indexing
Scalability          Single-node       Cluster, Active-Active
Complexity           Very low          Medium

4. FT.HYBRID — The Hybrid Search Command in Redis 8.4

FT.HYBRID is the new star of Redis 8.4. Rather than running 2 separate queries and merging results at the application layer, this command performs score fusion inside the Redis engine with O(N+M) complexity.

graph TD
    A["Input Query"] --> B["FT.HYBRID Engine"]
    B --> C["SEARCH Component<br/>BM25 Full-Text"]
    B --> D["VSIM Component<br/>Vector Similarity"]
    C --> E["Score Fusion<br/>RRF / Linear"]
    D --> E
    E --> F["Ranked Results"]
    F --> G["LOAD / APPLY / SORTBY<br/>Post-processing"]
    G --> H["Final Results"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style C fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style D fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style E fill:#e94560,stroke:#fff,color:#fff
    style F fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style G fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style H fill:#4CAF50,stroke:#fff,color:#fff
FT.HYBRID processing flow — score fusion happens in the engine, no external post-processing needed

4.1 Full Syntax

FT.HYBRID index
  SEARCH query
    [SCORER scorer]
    [YIELD_SCORE_AS name]
  VSIM vector_field $vector_param
    [KNN count K k [EF_RUNTIME ef]]
    [RANGE count RADIUS radius [EPSILON epsilon]]
    [YIELD_SCORE_AS name]
    [FILTER filter]
  [COMBINE RRF count [CONSTANT c] [WINDOW w] [YIELD_SCORE_AS name]]
  [COMBINE LINEAR count [ALPHA a] [BETA b] [WINDOW w] [YIELD_SCORE_AS name]]
  [LIMIT offset num]
  [SORTBY count field [ASC | DESC]]
  [LOAD count field ...]
  [GROUPBY ... REDUCE ...]
  [APPLY expression AS name]
  PARAMS nargs vector_param vector_blob [name value ...]
  [TIMEOUT timeout]

4.2 Main Components

Each FT.HYBRID command is built from three main parts. SEARCH and VSIM are required; COMBINE is optional and defaults to RRF when omitted:

graph LR
    A["SEARCH<br/>Full-text query<br/>BM25 scorer"] --> D["COMBINE<br/>Score Fusion"]
    B["VSIM<br/>Vector similarity<br/>KNN or RANGE"] --> D
    D --> E["Ranked<br/>Results"]
    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style D fill:#4CAF50,stroke:#fff,color:#fff
    style E fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
The core components of FT.HYBRID

5. Score Fusion — RRF vs Linear Combination

This is the most important part of hybrid search: how do you combine scores from two different search systems into a single unified ranking?

5.1 Reciprocal Rank Fusion (RRF)

RRF is the default method, scoring based on rank position rather than absolute score values:

# RRF formula for each document d:
RRF_score(d) = 1/(rank_text(d) + k) + 1/(rank_vector(d) + k)

# Where k = CONSTANT (default 60)
# rank_text(d) = position of d in the full-text results
# rank_vector(d) = position of d in the vector results
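
The formula translates directly into a few lines of Python. This is a sketch of the fusion logic for intuition, not Redis's internal code; ranks are 1-based, and a document missing from one list simply contributes nothing from that list:

```python
def rrf_fuse(text_ranking, vector_ranking, k=60):
    """Fuse two ranked lists of doc ids with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in (text_ranking, vector_ranking):
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (rank + k)
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)

# "b" is 2nd in text and 1st in vector, so it overtakes both single-list winners
print(rrf_fuse(["a", "b", "c"], ["b", "c", "d"]))  # → ['b', 'c', 'a', 'd']
```

Note how "b" wins without ever being first in the text ranking: appearing near the top of both lists beats dominating one of them, which is exactly the robustness property described above.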

Advantages of RRF

No score normalization needed: BM25 scores and cosine similarity live on completely different scales. RRF ignores absolute values and only looks at ranks — solving this problem entirely.
Robust to outliers: A document with an abnormally high BM25 score doesn't "dominate" the results.
Parameter k = 60: The default value based on Cormack et al. (2009), which favors balance between the two sources.

# Example: RRF with window = 40 and constant = 80
FT.HYBRID products-idx
  SEARCH "@category:electronics laptop"
  SCORER 4 BM25 1.5 0.8
  YIELD_SCORE_AS text_score
  VSIM @embedding $query_vec
  KNN 4 K 20 EF_RUNTIME 200
  YIELD_SCORE_AS vector_score
  COMBINE RRF 4 WINDOW 40 CONSTANT 80
  YIELD_SCORE_AS hybrid_score
  LOAD 3 @title @price @category
  LIMIT 0 10
  PARAMS 2 query_vec "\x00\x01..."

5.2 Linear Combination

When you want to control the weights between full-text and vector, Linear combination lets you set specific ratios:

# Formula:
Linear_score(d) = alpha * normalized_text_score(d) + beta * normalized_vector_score(d)

# Example: prioritize semantic search (70% vector, 30% text)
FT.HYBRID docs-idx
  SEARCH "machine learning optimization"
  VSIM @content_vector $query_vec
  KNN 2 K 15
  COMBINE LINEAR 4 ALPHA 0.3 BETA 0.7
  LOAD *
  PARAMS 2 query_vec "\x00\x01..."
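
Because ALPHA and BETA act on normalized scores, a normalization step is implied. The sketch below uses min-max normalization to illustrate the math (the engine's exact normalization is an assumption here; the scores and weights are made up):

```python
def min_max(scores):
    """Rescale a {doc: score} map to the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {d: (s - lo) / span for d, s in scores.items()}

def linear_fuse(text_scores, vector_scores, alpha=0.3, beta=0.7):
    """alpha * normalized BM25 score + beta * normalized vector similarity."""
    t, v = min_max(text_scores), min_max(vector_scores)
    docs = set(t) | set(v)
    fused = {d: alpha * t.get(d, 0.0) + beta * v.get(d, 0.0) for d in docs}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# "a" wins on BM25, "b" wins on vector similarity; with BETA 0.7 the
# semantic side dominates and "b" comes out on top
print(linear_fuse({"a": 12.0, "b": 4.0}, {"a": 0.70, "b": 0.95}))
```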

5.3 Which Method to Choose?

Criterion                 RRF                                    Linear
Needs parameter tuning?   Minimal (constant, window)             A lot (alpha, beta need experimentation)
Robust to new data?       High — rank-based                      Medium — depends on score distribution
Weight control?           Not direct                             Yes — explicit alpha/beta
When to use               Don't know the optimal ratio upfront   Already A/B tested, know the weights
Recommendation            Start with RRF                         Optimize once you have metrics

6. Building a RAG Pipeline with FT.HYBRID

Hands-on: integrating Redis 8.4 hybrid search into a complete RAG pipeline.

sequenceDiagram
    participant U as User
    participant App as Application
    participant Emb as Embedding Model
    participant R as Redis 8.4
    participant LLM as LLM (Claude/GPT)

    U->>App: Question: "How to optimize PostgreSQL queries for large tables?"
    App->>Emb: Generate embedding
    Emb-->>App: query_vector [768d]
    App->>R: FT.HYBRID docs-idx<br/>SEARCH "optimize PostgreSQL queries"<br/>VSIM @embedding $vec<br/>COMBINE RRF
    R-->>App: Top 5 documents (ranked)
    App->>LLM: Prompt + context from 5 docs
    LLM-->>U: Answer with citations
RAG flow with FT.HYBRID — hybrid retrieval provides higher-quality context

6.1 Step 1 — Create the Index

# Create an index supporting both full-text and vector search
FT.CREATE docs-idx ON HASH PREFIX 1 doc:
  SCHEMA
    title TEXT WEIGHT 2.0
    content TEXT
    category TAG
    created_at NUMERIC SORTABLE
    embedding VECTOR HNSW 6
      TYPE FLOAT32
      DIM 768
      DISTANCE_METRIC COSINE

6.2 Step 2 — Index Documents

import time

import redis
from openai import OpenAI
import numpy as np

client = OpenAI()
r = redis.Redis(host='localhost', port=6379)

def index_document(doc_id, title, content, category):
    # Generate embedding
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=f"{title} {content}"
    )
    embedding = np.array(response.data[0].embedding, dtype=np.float32)

    # Store in a Redis Hash
    r.hset(f"doc:{doc_id}", mapping={
        "title": title,
        "content": content,
        "category": category,
        "created_at": int(time.time()),
        "embedding": embedding.tobytes()
    })

6.3 Step 3 — Hybrid Search

def hybrid_search(query, top_k=5):
    # Generate query embedding
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=query
    )
    query_vec = np.array(
        response.data[0].embedding, dtype=np.float32
    ).tobytes()

    # FT.HYBRID — combine BM25 + vector similarity
    result = r.execute_command(
        "FT.HYBRID", "docs-idx",
        "SEARCH", query,
        "SCORER", "4", "BM25", "1.2", "0.75",
        "YIELD_SCORE_AS", "text_score",
        "VSIM", "@embedding", "$query_vec",
        "KNN", "2", "K", str(top_k * 3),
        "YIELD_SCORE_AS", "vector_score",
        "COMBINE", "RRF", "4",
        "WINDOW", "30", "CONSTANT", "60",
        "YIELD_SCORE_AS", "hybrid_score",
        "LOAD", "4", "@title", "@content",
        "@text_score", "@vector_score",
        "LIMIT", "0", str(top_k),
        "PARAMS", "2", "query_vec", query_vec
    )
    return parse_results(result)
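
The snippet above calls a parse_results helper that isn't shown. Assuming FT.HYBRID replies in the same RESP2 shape as FT.AGGREGATE — a total count followed by flat field/value arrays — a minimal parser could look like this (the reply layout is an assumption; adjust it to what your Redis version actually returns):

```python
def parse_results(reply):
    """Turn [count, [f1, v1, f2, v2, ...], ...] into a list of dicts."""
    docs = []
    for row in reply[1:]:
        # redis-py returns bytes unless decode_responses=True
        fields = [f.decode() if isinstance(f, bytes) else f for f in row]
        docs.append(dict(zip(fields[0::2], fields[1::2])))
    return docs

sample = [2,
          [b"title", b"Intro to Redis", b"hybrid_score", b"0.032"],
          [b"title", b"Vector search 101", b"hybrid_score", b"0.028"]]
print(parse_results(sample))
```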

6.4 Step 4 — Combine with the LLM

import anthropic

client_ai = anthropic.Anthropic()

def rag_answer(question):
    # Hybrid retrieval
    docs = hybrid_search(question, top_k=5)

    # Build context
    context = "\n---\n".join(
        f"[{d['title']}] {d['content']}" for d in docs
    )

    # Call Claude with the context
    response = client_ai.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=2048,
        messages=[{
            "role": "user",
            "content": f"""Based on the following documents, answer the question.

Documents:
{context}

Question: {question}"""
        }]
    )
    return response.content[0].text

7. Advanced FT.HYBRID Use Cases

7.1 Boosting Recent Documents (Recency Boost)

# Combine hybrid search + time-based sorting
FT.HYBRID docs-idx
  SEARCH "kubernetes deployment strategy"
  VSIM @embedding $query_vec
  KNN 2 K 20
  COMBINE RRF 2 WINDOW 30
  SORTBY 2 created_at DESC
  LOAD 3 @title @created_at @category
  LIMIT 0 10
  PARAMS 2 query_vec "\x00\x01..."

7.2 Geo Filtering for Location-Based Search

# Hybrid search + geo filter for location-based apps
FT.HYBRID places-idx
  SEARCH "quiet cafe with wifi"
  VSIM @description_vec $query_vec
  KNN 2 K 15
  FILTER "@location:[106.6 10.7 5 km]"
  COMBINE LINEAR 4 ALPHA 0.4 BETA 0.6
  LOAD 4 @name @address @rating @distance
  PARAMS 2 query_vec "\x00\x01..."

7.3 Fuzzy Matching Combined with Semantic

# Find both typos and semantically similar terms
FT.HYBRID products-idx
  SEARCH "%%samsuung%% %%gallaxy%% ~phone ~smartphone"
  VSIM @features_vec $query_vec
  KNN 2 K 10
  COMBINE RRF 2 WINDOW 20
  LOAD *
  PARAMS 2 query_vec "\x00\x01..."

# %%word%% = fuzzy matching (Levenshtein distance)
# ~word = optional term (boost if matched, not excluded otherwise)
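
To see why %%samsuung%% reaches "samsung", recall that double percent signs allow an edit distance of up to 2. A quick check with the classic dynamic-programming edit distance (an illustration of the metric, not RediSearch internals):

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions, and substitutions from a to b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

print(levenshtein("samsuung", "samsung"))  # extra 'u': distance 1
print(levenshtein("gallaxy", "galaxy"))    # extra 'l': distance 1
```

Both typos sit at distance 1, comfortably inside the %%…%% budget of 2, so the fuzzy terms match the correctly spelled product names.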

8. Redis 8.4 vs Dedicated Vector Databases

Criterion           Redis 8.4                  Pinecone                 Weaviate            Milvus
Hybrid search       Native (FT.HYBRID)         Sparse + Dense vectors   BM25 + Vector       Sparse + Dense
Score fusion        RRF + Linear (in-engine)   Client-side              In-engine           RRF (in-engine)
Latency             Sub-ms (in-memory)         ~10-50ms                 ~10-100ms           ~10-50ms
Integrated caching  Native (it's a cache)      No                       No                  No
Full-text quality   BM25, stemming, fuzzy      Basic                    BM25                Basic
Data structures     Hash, JSON, Stream, TS...  Vectors only             Objects + vectors   Collections
Operational         Familiar, big ecosystem    Managed only             Self-host / Cloud   Self-host / Cloud
Pricing             Free + Enterprise          Freemium                 Free + Cloud        Free + Cloud

When is Redis the best choice?

You already use Redis: No new infrastructure needed. Vector search + caching + sessions + pub/sub in one system.
Ultra-low latency: In-memory engine for sub-millisecond responses — critical for real-time RAG, autocomplete, and recommendations.
Complex hybrid queries: FT.HYBRID with geo, time, and numeric filtering in a single command.

When should you go with a dedicated vector DB?

Billion-scale vectors: If the dataset exceeds RAM capacity, disk-based systems like Milvus are a better fit.
Vector search only: If you don't need full-text, caching, or other data structures, a dedicated vector DB is simpler.
Complex multi-tenancy: Pinecone has better namespace isolation for multi-tenant SaaS.

9. FT.HYBRID Performance Tuning

9.1 Tuning Parameters

Parameter        Default   Recommended                       Impact
EF_RUNTIME       10        50-200                            Higher = better recall, higher latency
K (KNN)          10        2-3× desired top_k                Candidate pool for score fusion
WINDOW           20        20-50                             Candidates considered during fusion
CONSTANT (RRF)   60        60 (leave as-is)                  Balance between top and lower ranks
ALPHA/BETA       n/a       Try 0.3/0.7 → 0.5/0.5 → 0.7/0.3   Weights between text and vector

9.2 Pre-filtering Strategies

# FILTER inside VSIM narrows the vector space BEFORE search
# → significantly reduces compute time
FT.HYBRID products-idx
  SEARCH "gaming laptop RTX"
  VSIM @features_vec $query_vec
  KNN 2 K 20
  FILTER "@category:{electronics} @price:[500 3000]"
  COMBINE RRF 2 WINDOW 30
  LOAD 3 @title @price @rating
  LIMIT 0 10
  PARAMS 2 query_vec "\x00\x01..."

# Two pre-filter policies:
# ADHOC_BF — brute-force scan, good for selective filters (<10% data)
# BATCHES — batch processing, good for less-selective filters
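
The 10% rule of thumb in the comment above can be captured in a tiny helper for deciding which policy to configure (the threshold is this article's heuristic, not an engine constant, and the counts would come from your own index statistics):

```python
def choose_prefilter_policy(matching_docs, total_docs, threshold=0.10):
    """Pick a pre-filter policy from estimated filter selectivity.

    ADHOC_BF (brute-force) tends to win when the filter keeps only a
    small slice of the index; BATCHES tends to win otherwise.
    """
    selectivity = matching_docs / total_docs
    return "ADHOC_BF" if selectivity < threshold else "BATCHES"

print(choose_prefilter_policy(5_000, 1_000_000))    # highly selective filter
print(choose_prefilter_policy(400_000, 1_000_000))  # broad filter
```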

10. Conclusion

Redis 8.4 with FT.HYBRID marks an important shift: from a pure cache / data-structure tool to a comprehensive AI-native platform. Integrating score fusion directly in the engine not only simplifies architecture but delivers superior performance — sub-millisecond latency for hybrid queries that previously required orchestrating multiple services.

For systems already using Redis for caching or sessions, extending into hybrid search for a RAG pipeline is nearly "free" — no new infrastructure, no complex migration. That's the power of the "One Redis" philosophy the development team has pursued since version 8.0.

Get Started Now

1. Upgrade to Redis 8.4
2. Create an index with a VECTOR field (HNSW or FLAT)
3. Start with FT.HYBRID ... COMBINE RRF — default parameters are already optimal for most use cases
4. A/B test RRF vs Linear once you have enough metrics
5. Tune EF_RUNTIME and WINDOW based on your latency budget

References: