# Redis 8.4 Hybrid Search: Combining Full-Text and Vector Search for AI

Posted on: 4/20/2026 9:08:53 AM

## 1. Background: Why Is a Single Search Mode Not Enough?

In the era of AI and LLMs, accurate information retrieval is the foundation of every RAG (Retrieval-Augmented Generation) application. The two common search methods each have their own weakness:

- **Full-text search (BM25)**: Excellent at exact keyword matching, but blind to meaning. A search for "car" will not return documents that only mention "motor vehicle".
- **Vector search**: Captures meaning through embeddings, but tends to return near matches that are wrong when the query needs an exact proper name, product code, or specific technical term.

Headline figures:

- **49%** lower context failure rate with hybrid search compared with a single search mode
- **3.5x** better retrieval recall in Blended RAG systems
- **15%** higher end-to-end accuracy on complex reasoning tasks
- **Up to 87%** faster command execution compared with Redis 7.x

Research from Anthropic (2025) and Apple ML Research (2024) points to the same conclusion: **combining both methods** gives the best results. Redis 8.4 implements this with the `FT.HYBRID` command, which performs score fusion inside the engine, with no external post-processing.

## 2. Redis 8.x: The Evolution

Before digging into FT.HYBRID, here is the bigger picture of Redis 8.x:

**Redis 8.0 (2025)**

**"One Redis"**: merges Redis Stack and Community Edition. Vector Sets (beta), new hash field expiration commands (HGETEX/HSETEX/HGETDEL), 30+ performance optimizations, and BM25 replacing TF-IDF as the default scorer.

**Redis 8.4 (Q1/2026)**

**FT.HYBRID**: a native hybrid search command. Score fusion (RRF + Linear), atomic cluster slot migration for zero-downtime scaling, and SIMD-driven optimizations.

**Vector Sets (ongoing development)**

A new data type for vector similarity that is lighter than the Redis Query Engine and suited to simple use cases. Supports 8-bit quantization, dimensionality reduction, and JSON attribute filtering.

## 3. Vector Sets: A Brand-New Data Type

Vector Sets were designed by Salvatore Sanfilippo, the creator of Redis. They are modeled on Sorted Sets, but instead of storing string + score, they store **string + multi-dimensional vector embedding**.

### 3.1 Basic Commands

```
# Add a vector to the set (the vector comes before the element name)
VADD product-vectors VALUES 768 0.12 -0.34 0.56 ... "laptop-001"
    SETATTR '{"category":"electronics","price":1299}'

# Find the 5 most similar vectors
VSIM product-vectors VALUES 768 0.11 -0.33 0.55 ... COUNT 5

# Search with an attribute filter
VSIM product-vectors VALUES 768 0.11 -0.33 0.55 ... COUNT 5
    FILTER '.category == "electronics" && .price < 2000'

# Update attributes
VSETATTR product-vectors "laptop-001"
    '{"category":"electronics","price":1199,"on_sale":true}'
```
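The argument order of these commands is easy to get wrong from application code, so here is a minimal sketch of two helpers that assemble the token lists. The helper names `build_vadd_args` and `build_vsim_args` are hypothetical (they are not part of any client library); the resulting list could be sent with redis-py's generic `execute_command(*args)`.

```python
import json
from typing import Optional, Sequence


def build_vadd_args(key: str, element: str, vector: Sequence[float],
                    attrs: Optional[dict] = None) -> list:
    """Assemble VADD tokens: key, VALUES n v1..vn, element, [SETATTR json]."""
    args = ["VADD", key, "VALUES", len(vector), *vector, element]
    if attrs is not None:
        # Compact JSON, matching the attribute strings used in the article
        args += ["SETATTR", json.dumps(attrs, separators=(",", ":"))]
    return args


def build_vsim_args(key: str, vector: Sequence[float], count: int = 5,
                    filter_expr: Optional[str] = None) -> list:
    """Assemble VSIM tokens: key, VALUES n v1..vn, COUNT c, [FILTER expr]."""
    args = ["VSIM", key, "VALUES", len(vector), *vector, "COUNT", count]
    if filter_expr:
        args += ["FILTER", filter_expr]
    return args


if __name__ == "__main__":
    # A 3-dimensional toy vector stands in for a real 768-dimensional embedding
    print(build_vadd_args("product-vectors", "laptop-001", [0.12, -0.34, 0.56],
                          {"category": "electronics", "price": 1299}))
    print(build_vsim_args("product-vectors", [0.11, -0.33, 0.55], count=5,
                          filter_expr='.category == "electronics" && .price < 2000'))
```

With a live connection `r = redis.Redis()`, the call would be `r.execute_command(*build_vadd_args(...))`.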

### 3.2 Quantization and Dimensionality Reduction

Vector Sets support three quantization modes that trade memory for accuracy:

| Mode | Size per vector | Accuracy | Use case |
| --- | --- | --- | --- |
| **Full precision** (`NOQUANT`) | 100% | Highest | Maximum recall required |
| **8-bit quantization** (`Q8`, the default) | ~25% | Nearly lossless | General production use |
| **Binary quantization** (`BIN`) | ~3% | Lower | Very large datasets, coarse first-pass search |

In addition, **dimensionality reduction** through random projection (the `REDUCE` option of `VADD`) lowers the number of dimensions while approximately preserving similarity relationships between points. This is especially useful when the embedding model outputs vectors of 1536+ dimensions but the application only needs moderate discrimination.
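The size ratios in the table follow from the bits stored per component. A small worked calculation, assuming 32 bits per component for full precision, 8 for 8-bit quantization, and 1 for binary, and ignoring graph links and per-vector metadata (so real memory use will be higher):

```python
BITS_PER_COMPONENT = {"NOQUANT": 32, "Q8": 8, "BIN": 1}


def bytes_per_vector(dim: int, mode: str) -> int:
    """Raw payload size of one vector, rounded up to whole bytes."""
    return (dim * BITS_PER_COMPONENT[mode] + 7) // 8


if __name__ == "__main__":
    dim = 1536
    full = bytes_per_vector(dim, "NOQUANT")
    for mode in ("NOQUANT", "Q8", "BIN"):
        size = bytes_per_vector(dim, mode)
        print(f"{mode:8s} {size:5d} bytes  ({size / full:.1%} of full precision)")
```

For a 1536-dimensional embedding this gives 6144, 1536, and 192 bytes, which is where the roughly 25% and 3% figures come from.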

### 3.3 Vector Sets vs Redis Query Engine

#### When to use which?

**Vector Sets**: When you only need plain similarity search with a minimal API on a single node. Think of them as "Sorted Sets for vectors".  
**Redis Query Engine (FT.SEARCH/FT.HYBRID)**: When you need to combine full-text + vector + geo + numeric filtering, horizontal scaling, and enterprise features.

| Feature | Vector Sets | Redis Query Engine |
| --- | --- | --- |
| Vector search | Native, simple | Advanced, scalable |
| Full-text search | No | BM25, stemming, fuzzy |
| Hybrid queries | No | FT.HYBRID |
| Geo/Numeric filter | JSON attributes | Native indexing |
| Scalability | Single-node | Cluster, Active-Active |
| Complexity | Very low | Medium |

## 4. FT.HYBRID: The Hybrid Search Command in Redis 8.4

`FT.HYBRID` is the headline feature of Redis 8.4. Instead of running two separate queries and merging the results in the application layer, the command performs **score fusion inside the Redis engine**, with O(N+M) complexity, where N is the cost of the text search and M is the cost of the vector search.

```
graph TD
    A["Input query"] --> B["FT.HYBRID Engine"]
    B --> C["SEARCH Component<br/>BM25 Full-Text"]
    B --> D["VSIM Component<br/>Vector Similarity"]
    C --> E["Score Fusion<br/>RRF / Linear"]
    D --> E
    E --> F["Ranked Results"]
    F --> G["LOAD / APPLY / SORTBY<br/>Post-processing"]
    G --> H["Final results"]

    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style C fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style D fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style E fill:#e94560,stroke:#fff,color:#fff
    style F fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style G fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style H fill:#4CAF50,stroke:#fff,color:#fff
```

FT.HYBRID processing flow: score fusion happens inside the engine, with no external post-processing

### 4.1 Full Syntax

```
FT.HYBRID index
  SEARCH query
    [SCORER scorer]
    [YIELD_SCORE_AS name]
  VSIM vector_field $vector_param
    [KNN count K k [EF_RUNTIME ef]]
    [RANGE count RADIUS radius [EPSILON epsilon]]
    [YIELD_SCORE_AS name]
    [FILTER filter]
  [COMBINE RRF count [CONSTANT c] [WINDOW w] [YIELD_SCORE_AS name]]
  [COMBINE LINEAR count [ALPHA a] [BETA b] [WINDOW w] [YIELD_SCORE_AS name]]
  [LIMIT offset num]
  [SORTBY count field [ASC | DESC]]
  [LOAD count field ...]
  [GROUPBY ... REDUCE ...]
  [APPLY expression AS name]
  PARAMS nargs vector_param vector_blob [name value ...]
  [TIMEOUT timeout]
```
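Several clauses in this syntax take a `count` that must equal the number of tokens that follow (`KNN`, `COMBINE`, `LOAD`, `PARAMS`). A minimal sketch of a builder that derives those counts instead of hard-coding them; `build_hybrid_args` is a hypothetical helper name, it covers only the KNN form, and it mirrors the argument layout used in this article's examples rather than the full command grammar.

```python
from typing import Optional, Sequence, Tuple


def build_hybrid_args(index: str, query: str, vector_field: str, vector_blob: bytes,
                      *, k: int = 10, ef_runtime: Optional[int] = None,
                      combine: Optional[Tuple[str, dict]] = None,
                      load: Optional[Sequence[str]] = None,
                      limit: Optional[Tuple[int, int]] = None,
                      param_name: str = "query_vec") -> list:
    """Assemble FT.HYBRID tokens, computing every `count` from its arguments."""
    args = ["FT.HYBRID", index, "SEARCH", query,
            "VSIM", vector_field, f"${param_name}"]

    knn = ["K", k]
    if ef_runtime is not None:
        knn += ["EF_RUNTIME", ef_runtime]
    args += ["KNN", len(knn), *knn]

    if combine is not None:
        method, options = combine  # e.g. ("RRF", {"window": 40, "constant": 80})
        flat = [tok for name, value in options.items() for tok in (name.upper(), value)]
        args += ["COMBINE", method.upper(), len(flat), *flat]

    if load:
        args += ["LOAD", len(load), *load]
    if limit is not None:
        args += ["LIMIT", limit[0], limit[1]]

    # PARAMS nargs counts names and values together: one pair means 2
    args += ["PARAMS", 2, param_name, vector_blob]
    return args


if __name__ == "__main__":
    print(build_hybrid_args(
        "products-idx", "laptop", "@embedding", b"\x00\x01",
        k=20, ef_runtime=200,
        combine=("RRF", {"window": 40, "constant": 80}),
        load=["@title", "@price", "@category"], limit=(0, 10),
    ))
```

When `combine` is left as `None`, the clause is omitted entirely and the server-side default applies.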

### 4.2 Main Components

Every FT.HYBRID command is built from three parts. `SEARCH` and `VSIM` are required; `COMBINE` is optional and falls back to RRF when omitted:

```
graph LR
    A["SEARCH<br/>Full-text query<br/>BM25 scorer"] --> D["COMBINE<br/>Score Fusion"]
    B["VSIM<br/>Vector similarity<br/>KNN or RANGE"] --> D
    D --> E["Ranked<br/>results"]

    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style D fill:#4CAF50,stroke:#fff,color:#fff
    style E fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
```

The three main components of FT.HYBRID

## 5. Score Fusion: RRF vs Linear Combination

This is the most important part of hybrid search: **how do you combine scores from two different search systems** into a single ranking?

### 5.1 Reciprocal Rank Fusion (RRF)

RRF is the default method. It scores by **rank position** rather than by absolute score value:

```
# RRF formula for each document d:
RRF_score(d) = 1/(rank_text(d) + k) + 1/(rank_vector(d) + k)

# where k = CONSTANT (default 60)
# rank_text(d) = position of d in the full-text results
# rank_vector(d) = position of d in the vector results
```

#### Advantages of RRF

**No score normalization needed**: BM25 scores and cosine similarity live on completely different scales. RRF ignores absolute values and looks only at rank, which sidesteps the problem.  
**Robust to outliers**: A document with an abnormally high BM25 score cannot dominate the results.  
**Parameter k = 60**: The default follows Cormack et al. (2009). A larger k flattens the gap between top and lower ranks, which keeps the two sources balanced.
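To make the formula concrete, here is a minimal client-side sketch of RRF over two ranked ID lists. It illustrates the formula only and is not Redis's internal implementation; it assumes ranks start at 1 and that a document missing from one list simply contributes no term for that list.

```python
from collections import defaultdict
from typing import Iterable, List, Sequence, Tuple


def rrf_fuse(rankings: Iterable[Sequence[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rank + k)
    # Highest fused score first; ties broken by ID for a stable order
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    text_hits = ["doc:a", "doc:b", "doc:c"]    # full-text order
    vector_hits = ["doc:b", "doc:d", "doc:a"]  # vector order
    for doc_id, score in rrf_fuse([text_hits, vector_hits]):
        print(f"{doc_id}  {score:.6f}")
```

In this toy input `doc:b` ranks first because it sits near the top of both lists (1/62 + 1/61), ahead of `doc:a` (1/61 + 1/63), while documents found by only one side trail behind.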

```
# Example: RRF with window = 40 and constant = 80
FT.HYBRID products-idx
  SEARCH "@category:{electronics} laptop"
  SCORER 4 BM25 1.5 0.8
  YIELD_SCORE_AS text_score
  VSIM @embedding $query_vec
  KNN 4 K 20 EF_RUNTIME 200
  YIELD_SCORE_AS vector_score
  COMBINE RRF 4 WINDOW 40 CONSTANT 80
  YIELD_SCORE_AS hybrid_score
  LOAD 3 @title @price @category
  LIMIT 0 10
  PARAMS 2 query_vec "\x00\x01..."
```

### 5.2 Linear Combination

When you want **explicit control over the weighting** between full-text and vector results, linear combination lets you set a specific ratio:

```
# Formula:
Linear_score(d) = alpha * normalized_text_score(d) + beta * normalized_vector_score(d)

# Example: favor semantic search (70% vector, 30% text)
FT.HYBRID docs-idx
  SEARCH "machine learning optimization"
  VSIM @content_vector $query_vec
  KNN 2 K 15
  COMBINE LINEAR 4 ALPHA 0.3 BETA 0.7
  LOAD *
  PARAMS 2 query_vec "\x00\x01..."
```
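The formula depends on both score sets being normalized to a common scale first. The sketch below uses min-max normalization, which is an assumption made for illustration; the article does not say which normalization Redis applies internally. It also assumes higher is better on both sides, so a cosine *distance* would have to be converted to a similarity beforehand.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def linear_fuse(text_scores: Dict[str, float], vector_scores: Dict[str, float],
                alpha: float = 0.3, beta: float = 0.7) -> List[Tuple[str, float]]:
    """alpha * normalized text score + beta * normalized vector score."""
    t, v = min_max(text_scores), min_max(vector_scores)
    fused = {doc: alpha * t.get(doc, 0.0) + beta * v.get(doc, 0.0)
             for doc in set(t) | set(v)}
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    bm25 = {"a": 12.0, "b": 6.0, "c": 3.0}       # unbounded BM25 scores
    cosine = {"b": 0.92, "c": 0.90, "d": 0.80}   # similarities in [0, 1]
    for doc, score in linear_fuse(bm25, cosine):
        print(doc, round(score, 4))
```

Without the normalization step the raw BM25 values (12.0 versus 0.92) would swamp the vector side regardless of the chosen weights, which is the scale problem RRF avoids by design.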

### 5.3 Which Method Should You Choose?

| Criterion | RRF | Linear |
| --- | --- | --- |
| Parameter tuning needed? | Little (only constant, window) | A lot (alpha and beta need experimentation) |
| Robust to new data? | High (rank-based) | Medium (depends on the score distribution) |
| Weight control? | Not directly | Yes (explicit alpha/beta) |
| Use when | The optimal ratio is not known in advance | You have A/B tested and know the weights |
| Recommendation | Start with RRF | Optimize once you have metrics |

## 6. Building a RAG Pipeline with FT.HYBRID

This is the hands-on part: wiring Redis 8.4 hybrid search into a complete RAG pipeline.

```
sequenceDiagram
    participant U as User
    participant App as Application
    participant Emb as Embedding Model
    participant R as Redis 8.4
    participant LLM as LLM (Claude/GPT)

    U->>App: Question: "How do I optimize PostgreSQL queries on large tables?"
    App->>Emb: Generate embedding
    Emb-->>App: query_vector [768d]
    App->>R: FT.HYBRID docs-idx<br/>SEARCH "optimize PostgreSQL query"<br/>VSIM @embedding $vec<br/>COMBINE RRF
    R-->>App: Top 5 documents (ranked)
    App->>LLM: Prompt + context from 5 docs
    LLM-->>U: Answer with source references
```

RAG flow with FT.HYBRID: hybrid retrieval yields higher-quality context

### 6.1 Step 1: Create the Index

```
# Create an index that supports both full-text and vector search
FT.CREATE docs-idx ON HASH PREFIX 1 doc:
  SCHEMA
    title TEXT WEIGHT 2.0
    content TEXT
    category TAG
    created_at NUMERIC SORTABLE
    embedding VECTOR HNSW 6
      TYPE FLOAT32
      DIM 768
      DISTANCE_METRIC COSINE
```

### 6.2 Step 2: Index Documents

```
import time

import numpy as np
import redis
from openai import OpenAI

client = OpenAI()
r = redis.Redis(host='localhost', port=6379)

def index_document(doc_id, title, content, category):
    # Generate the embedding. text-embedding-3-small returns 1536 dimensions
    # by default, so request 768 to match DIM 768 in the index schema.
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=f"{title} {content}",
        dimensions=768
    )
    embedding = np.array(response.data[0].embedding, dtype=np.float32)

    # Store in a Redis Hash; the vector is written as raw float32 bytes
    r.hset(f"doc:{doc_id}", mapping={
        "title": title,
        "content": content,
        "category": category,
        "created_at": int(time.time()),
        "embedding": embedding.tobytes()
    })
```
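A vector field declared as `FLOAT32` with `DIM 768` expects exactly 768 × 4 = 3072 bytes per document, so a dimension mismatch between the embedding model and the schema is worth catching before the write. A minimal sketch of that check using only the standard library; `to_float32_blob` is a hypothetical helper name, and little-endian byte order is assumed (which is what `numpy.ndarray.tobytes()` produces on common x86 and ARM hosts).

```python
import struct
from typing import Sequence


def to_float32_blob(vector: Sequence[float], expected_dim: int) -> bytes:
    """Pack a vector as little-endian float32 bytes, validating its length."""
    if len(vector) != expected_dim:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, index expects {expected_dim}"
        )
    return struct.pack(f"<{expected_dim}f", *vector)


if __name__ == "__main__":
    blob = to_float32_blob([0.0] * 768, expected_dim=768)
    print(len(blob))  # 768 values * 4 bytes each

    try:
        to_float32_blob([0.0] * 1536, expected_dim=768)
    except ValueError as exc:
        print("rejected:", exc)
```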

### 6.3 Step 3: Hybrid Search

```
def hybrid_search(query, top_k=5):
    # Generate the query embedding with the same model and dimension
    # used at indexing time (DIM 768 in the schema)
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=query,
        dimensions=768
    )
    query_vec = np.array(
        response.data[0].embedding, dtype=np.float32
    ).tobytes()

    # FT.HYBRID: combines BM25 with vector similarity.
    # K is 3x top_k so the fusion step has a larger candidate pool.
    result = r.execute_command(
        "FT.HYBRID", "docs-idx",
        "SEARCH", query,
        "SCORER", "4", "BM25", "1.2", "0.75",
        "YIELD_SCORE_AS", "text_score",
        "VSIM", "@embedding", "$query_vec",
        "KNN", "2", "K", str(top_k * 3),
        "YIELD_SCORE_AS", "vector_score",
        "COMBINE", "RRF", "4",
        "WINDOW", "30", "CONSTANT", "60",
        "YIELD_SCORE_AS", "hybrid_score",
        "LOAD", "4", "@title", "@content",
        "@text_score", "@vector_score",
        "LIMIT", "0", str(top_k),
        "PARAMS", "2", "query_vec", query_vec
    )
    # parse_results is an application-defined helper that turns the raw
    # reply into a list of dicts; it is not defined in this article
    return parse_results(result)
```
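The article does not show `parse_results`, so the following is only a sketch under a stated assumption: that the raw reply is either a dict (RESP3) or a flat list alternating keys and values (RESP2), with a `results` entry holding one flat field/value list per document. That shape is an assumption and should be checked against the reply your server and client actually return.

```python
from typing import Any, Dict, List


def _decode(value: Any) -> Any:
    """Decode bytes to str; leave every other type untouched."""
    return value.decode("utf-8", errors="replace") if isinstance(value, bytes) else value


def _to_dict(obj: Any) -> Dict[str, Any]:
    """Accept a dict or a flat [k1, v1, k2, v2, ...] list."""
    if isinstance(obj, dict):
        return {_decode(k): v for k, v in obj.items()}
    it = iter(obj)
    return {_decode(k): v for k, v in zip(it, it)}  # pair up consecutive items


def parse_results(reply: Any) -> List[Dict[str, Any]]:
    """ASSUMED reply shape: top-level map with a 'results' list of field maps."""
    top = _to_dict(reply)
    return [
        {field: _decode(value) for field, value in _to_dict(row).items()}
        for row in top.get("results", [])
    ]


if __name__ == "__main__":
    # Hand-written stand-in for a server reply, not captured output
    fake_reply = [
        b"total_results", 1,
        b"results", [[b"title", b"Indexing", b"content", b"Use EXPLAIN",
                      b"hybrid_score", b"0.0325"]],
    ]
    print(parse_results(fake_reply))
```

Scores arrive as strings in this sketch; convert them with `float()` before doing arithmetic on them.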

### 6.4 Step 4: Combine with the LLM

```
import anthropic

client_ai = anthropic.Anthropic()

def rag_answer(question):
    # Hybrid retrieval
    docs = hybrid_search(question, top_k=5)

    # Build the context, one titled block per retrieved document
    context = "\n---\n".join(
        f"[{d['title']}] {d['content']}" for d in docs
    )

    # Call Claude with the retrieved context
    response = client_ai.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=2048,
        messages=[{
            "role": "user",
            "content": f"""Answer the question based on the documents below.

Documents:
{context}

Question: {question}"""
        }]
    )
    return response.content[0].text
```
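Retrieved documents can be long, and joining them without a limit may overflow the prompt. A minimal sketch of a context builder with a size budget: `build_context` and its `max_chars` parameter are hypothetical, and a character count is only a rough stand-in for a token count. Numbering the blocks gives the model something to point at when it references sources.

```python
from typing import Dict, List


def build_context(docs: List[Dict[str, str]], max_chars: int = 6000) -> str:
    """Join ranked documents into numbered blocks until the budget is spent."""
    parts, used = [], 0
    for i, doc in enumerate(docs, start=1):
        block = f"[{i}] {doc['title']}\n{doc['content']}"
        if used + len(block) > max_chars:
            break  # documents are ranked, so drop the tail rather than the head
        parts.append(block)
        used += len(block)
    return "\n---\n".join(parts)


if __name__ == "__main__":
    docs = [
        {"title": "A", "content": "alpha"},
        {"title": "B", "content": "beta"},
    ]
    print(build_context(docs))
    print("=====")
    print(build_context(docs, max_chars=15))  # only the first block fits
```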

## 7. Advanced Use Cases with FT.HYBRID

### 7.1 Prioritizing Recent Documents (Recency Boost)

```
# Hybrid search, with the results sorted by creation time
FT.HYBRID docs-idx
  SEARCH "kubernetes deployment strategy"
  VSIM @embedding $query_vec
  KNN 2 K 20
  COMBINE RRF 2 WINDOW 30
  SORTBY 2 created_at DESC
  LOAD 3 @title @created_at @category
  LIMIT 0 10
  PARAMS 2 query_vec "\x00\x01..."
```
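Note that `SORTBY created_at DESC` orders the returned candidates purely by time, so within that set the fused relevance score no longer decides the order. If the goal is a softer boost, one option is to blend relevance and freshness on the client. The sketch below is an illustrative alternative, not something the article or Redis prescribes: `recency_rerank`, the 30-day half-life, and the 0.3 weight are all assumptions, and `hybrid_score` is expected to be numeric already.

```python
from typing import Dict, List


def recency_rerank(docs: List[Dict], now: float,
                   half_life_days: float = 30.0, weight: float = 0.3) -> List[Dict]:
    """Blend normalized relevance with an exponential freshness decay.

    Each doc needs 'hybrid_score' (float) and 'created_at' (Unix seconds).
    """
    max_score = max((d["hybrid_score"] for d in docs), default=0.0) or 1.0
    ranked = []
    for d in docs:
        age_days = max(0.0, (now - d["created_at"]) / 86400.0)
        freshness = 0.5 ** (age_days / half_life_days)  # 1.0 new, 0.5 at half-life
        final = (1 - weight) * (d["hybrid_score"] / max_score) + weight * freshness
        ranked.append({**d, "final_score": final})
    return sorted(ranked, key=lambda d: d["final_score"], reverse=True)


if __name__ == "__main__":
    now = 1_000_000_000  # arbitrary fixed timestamp for the example
    day = 86400
    docs = [
        {"id": "old-strong", "hybrid_score": 0.032, "created_at": now - 300 * day},
        {"id": "new-good", "hybrid_score": 0.030, "created_at": now - 1 * day},
        {"id": "new-weak", "hybrid_score": 0.010, "created_at": now - 1 * day},
    ]
    for d in recency_rerank(docs, now):
        print(d["id"], round(d["final_score"], 4))
```

Here the fresh, nearly-as-relevant document moves to the top, while the old but strong match still outranks the fresh but weak one, which a pure time sort would not guarantee.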

### 7.2 Searching by Geographic Area

```
# Hybrid search + geo filter for location-based applications
FT.HYBRID places-idx
  SEARCH "quiet coffee shop with wifi"
  VSIM @description_vec $query_vec
  KNN 2 K 15
  FILTER "@location:[106.6 10.7 5 km]"
  COMBINE LINEAR 4 ALPHA 0.4 BETA 0.6
  LOAD 4 @name @address @rating @distance
  PARAMS 2 query_vec "\x00\x01..."
```

### 7.3 Fuzzy Matching Combined with Semantic Search

```
# Match both misspellings and semantically similar items
FT.HYBRID products-idx
  SEARCH "%%samsuung%% %%gallaxy%% ~phone ~smartphone"
  VSIM @features_vec $query_vec
  KNN 2 K 10
  COMBINE RRF 2 WINDOW 20
  LOAD *
  PARAMS 2 query_vec "\x00\x01..."

# %%word%% = fuzzy matching (Levenshtein distance up to 2)
# ~word = optional term (boosts the score on a match, does not exclude documents without it)
```
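In the query syntax the number of `%` pairs sets the maximum edit distance: `%word%` allows 1, `%%word%%` allows 2. To see why the misspellings above match, here is a small standard dynamic-programming implementation of Levenshtein distance. It is written for illustration and is not how the engine evaluates fuzzy terms internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    for typo, target in [("samsuung", "samsung"), ("gallaxy", "galaxy")]:
        print(f"{typo} -> {target}: distance {levenshtein(typo, target)}")
```

Both typos are a single edit away from the intended word, so they fall well inside the distance-2 budget of `%%...%%`.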

## 8. Redis 8.4 vs Dedicated Vector Databases

| Criterion | Redis 8.4 | Pinecone | Weaviate | Milvus |
| --- | --- | --- | --- | --- |
| **Hybrid search** | Native (FT.HYBRID) | Sparse + Dense vectors | BM25 + Vector | Sparse + Dense |
| **Score fusion** | RRF + Linear (in-engine) | Client-side | In-engine | RRF (in-engine) |
| **Latency (indicative)** | Sub-ms (in-memory) | ~10-50ms | ~10-100ms | ~10-50ms |
| **Built-in caching** | Native (Redis is a cache at heart) | No | No | No |
| **Full-text quality** | BM25, stemming, fuzzy | Basic | BM25 | Basic |
| **Data structures** | Hash, JSON, Stream, TS... | Vectors only | Objects + vectors | Collections |
| **Operational** | Familiar, large ecosystem | Managed only | Self-host / Cloud | Self-host / Cloud |
| **Pricing** | Free + Enterprise | Freemium | Free + Cloud | Free + Cloud |

#### When is Redis the best choice?

**You already run Redis**: No new infrastructure is needed. Vector search, caching, sessions, and pub/sub live in the same system.  
**Very low latency**: The in-memory engine can deliver sub-millisecond responses, which matters for real-time RAG, autocomplete, and recommendations.  
**Complex hybrid queries**: FT.HYBRID combines geo, time, and numeric filtering in a single command.

#### When should you use a dedicated vector DB?

**Billion-scale vectors**: If the dataset exceeds RAM capacity, disk-based systems such as Milvus are a better fit.  
**Vector search only**: If you do not need full-text search, caching, or other data structures, a dedicated vector DB is simpler.  
**Complex multi-tenancy**: Pinecone offers stronger namespace isolation for multi-tenant SaaS.

## 9. Tuning FT.HYBRID Performance

### 9.1 Tuning Parameters

| Parameter | Default | Recommended | Effect |
| --- | --- | --- | --- |
| `EF_RUNTIME` | 10 | 50-200 | Higher = better recall, higher latency |
| `K` (KNN) | 10 | 2-3x the desired top_k | Candidate pool for score fusion |
| `WINDOW` | 20 | 20-50 | Number of candidates considered during fusion |
| `CONSTANT` (RRF) | 60 | 60 (leave as is) | Balance between top-ranked and lower-ranked results |
| `ALPHA/BETA` | n/a | Try 0.3/0.7 → 0.5/0.5 → 0.7/0.3 | Weight of text vs vector |

### 9.2 Pre-filtering Strategies

```
# FILTER inside VSIM narrows the vector space BEFORE the search runs,
# which can significantly reduce computation time
FT.HYBRID products-idx
  SEARCH "gaming laptop RTX"
  VSIM @features_vec $query_vec
  KNN 2 K 20
  FILTER "@category:{electronics} @price:[500 3000]"
  COMBINE RRF 2 WINDOW 30
  LOAD 3 @title @price @rating
  LIMIT 0 10
  PARAMS 2 query_vec "\x00\x01..."

# There are two pre-filter policies:
# ADHOC_BF: brute-force scan, good for highly selective filters (<10% of the data)
# BATCHES: processes candidates in batches, good for filters that match many documents
```

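## 10. Conclusion

Redis 8.4 with `FT.HYBRID` marks an important shift: from a pure cache and data-structure tool to a **comprehensive AI-native platform**. Building score fusion into the engine simplifies the architecture and also improves performance: hybrid queries that previously required orchestrating multiple services now run in memory with very low latency (sub-millisecond in favorable cases).

For systems that already use Redis as a cache or session store, adding hybrid search for a RAG pipeline is almost "free": no new infrastructure and no complex migration. This is the strength of the "One Redis" philosophy that the development team has pursued since version 8.0.

#### Get Started

1. Upgrade to Redis 8.4  
2. Create an index with a VECTOR field (HNSW or FLAT)  
3. Start with `FT.HYBRID ... COMBINE RRF`; the default parameters are a reasonable starting point for most use cases  
4. A/B test RRF against Linear once you have enough metrics  
5. Tune EF_RUNTIME and WINDOW against your latency budget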

**References:**

- [FT.HYBRID Command Documentation — Redis](https://redis.io/docs/latest/commands/ft.hybrid/)
- [Revamping Context-Oriented Retrieval with Hybrid Search in Redis 8.4 — Redis Blog](https://redis.io/blog/revamping-context-oriented-retrieval-with-hybrid-search-in-redis-84/)
- [Announcing Vector Sets — Redis Blog](https://redis.io/blog/announcing-vector-sets-a-new-redis-data-type-for-vector-similarity/)
- [What's New in Redis 8.0 — Redis Docs](https://redis.io/docs/latest/develop/whats-new/8-0/)
- [Hybrid Search Benefits: Why RAG Systems Need Both Methods — Redis Blog](https://redis.io/blog/hybrid-search-benefits-rag-systems/)
- [Advanced RAG — Understanding Reciprocal Rank Fusion in Hybrid Search](https://glaforge.dev/posts/2026/02/10/advanced-rag-understanding-reciprocal-rank-fusion-in-hybrid-search/)
