Meilisearch 2026 — Deep Dive: Hybrid Search, Fragments API, and the Future of AI Retrieval

Posted on: 4/20/2026 10:27:44 AM

1. Why Meilisearch Is the Most Notable Search Engine of 2026

If Elasticsearch is the "bull" of the search world — powerful, versatile, but bulky — Meilisearch is a precision scalpel: small, unbelievably fast, and built for developer experience from day one. Founded in 2018 in France by a team of three engineers who previously worked with Algolia, Meilisearch inherits the "instant search" philosophy but takes the open source path with an MIT license.

2026 marks a major turning point for Meilisearch: from a pure full-text search engine, it's becoming a unified search & AI retrieval platform — integrating hybrid search (keyword + semantic), vector search, multi-modal (Fragments API), a chat engine for RAG, and soon sharding + serverless indexes. This article dives into the architecture, the latest v1.15/v1.16 features, the 2026 roadmap, and real-world .NET Core integration.

  • Typical response time: <50ms
  • GitHub stars: 48K+
  • Latest version: v1.16
  • Core language: Rust

2. Under the Hood — Rust, LMDB, and Ranking Rules

2.1. Why Rust + LMDB Matters

Meilisearch is written entirely in Rust — a language that provides near-C/C++ performance with absolute memory safety. The storage engine is LMDB (Lightning Memory-Mapped Database) — a B+Tree key-value store developed by Symas, known for its zero-copy reads and MVCC (Multi-Version Concurrency Control) support.

The key difference vs Typesense (which keeps everything in RAM) is that Meilisearch uses memory-mapped files: the OS loads only the needed pages into RAM via demand paging. This allows indexing datasets larger than physical RAM while keeping near-in-memory read speeds (thanks to the OS page cache).
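The demand-paging behavior is easy to demonstrate with a plain memory-mapped file. The sketch below uses Python's stdlib mmap as a stand-in for LMDB; the file name and sizes are arbitrary:

```python
import mmap
import os
import tempfile

# Create a file that stands in for an on-disk index (8 MB of zeroes).
path = os.path.join(tempfile.mkdtemp(), "index.bin")
with open(path, "wb") as f:
    f.truncate(8 * 1024 * 1024)

with open(path, "r+b") as f:
    # Mapping costs no reads up front: the OS faults pages in on
    # first access (demand paging), which is how LMDB can serve an
    # index bigger than RAM at near-in-memory speed.
    mm = mmap.mmap(f.fileno(), 0)
    size = mm.size()
    first = mm[0:16]           # touches only the first page
    last = mm[size - 16:size]  # touches only the last page
    mm.close()

print(len(first), len(last))  # 16 16
```

Reading two slices 8 MB apart faults in only two pages; the rest of the file never leaves disk.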

graph TB
    A[Search Request] --> B[HTTP Server<br>Actix-web]
    B --> C[Query Parser]
    C --> D{Query Type}
    D -->|Keyword| E[Inverted Index]
    D -->|Semantic| F[Vector Index<br>HNSW]
    D -->|Hybrid| G[Both + Rerank]
    E --> H[Ranking Rules<br>Bucket Sort]
    F --> H
    G --> H
    H --> I[LMDB<br>Memory-mapped]
    I --> J[Response]

    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style H fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style I fill:#16213e,stroke:#fff,color:#fff
    style J fill:#4CAF50,stroke:#fff,color:#fff
    style E fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style F fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style G fill:#f8f9fa,stroke:#e94560,color:#2c3e50

Meilisearch request processing pipeline — from HTTP to LMDB

2.2. Ranking Rules — The Heart of Relevance

Unlike Elasticsearch, which uses BM25 by default, Meilisearch has a tiered ranking rule system (bucket sort) — each rule gradually filters candidates instead of computing an aggregate score. The default order:

  1. Words — documents containing the most query terms
  2. Typo — prefer documents with fewer typos
  3. Proximity — terms appearing close together
  4. Attribute — matches on higher-priority attributes rank higher (title > description)
  5. Sort — custom sort by numeric/string fields
  6. Exactness — exact matches win

You can add custom rules, e.g., release_date:desc to favor newer content. The strength of this approach is that it's deterministic and easy to debug: you can trace why a document ranks high through each rule, instead of relying on BM25's "magic score."
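The bucket-sort idea can be sketched in a few lines. This is a toy model, not Meilisearch's implementation: each rule contributes one component of a lexicographic sort key, so a later rule only ever breaks ties left by earlier ones.

```python
# Toy illustration of tiered ranking (bucket sort): each rule maps a
# document to a bucket key, and later rules only break ties inside
# the buckets produced by the previous rule.
docs = [
    {"id": 1, "matched_words": 2, "typos": 1},
    {"id": 2, "matched_words": 3, "typos": 2},
    {"id": 3, "matched_words": 3, "typos": 0},
]

rules = [
    lambda d: -d["matched_words"],  # Words: more query terms first
    lambda d: d["typos"],           # Typo: fewer typos first
]

# Applying the rules in order is equivalent to sorting on the tuple
# of bucket keys -- deterministic and easy to trace per rule.
ranked = sorted(docs, key=lambda d: tuple(rule(d) for rule in rules))
print([d["id"] for d in ranked])  # [3, 2, 1]
```

Documents 2 and 3 tie on Words (3 matched terms each), so the Typo rule decides between them; document 1 never reaches that comparison.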

// Configure ranking rules
{
  "rankingRules": [
    "words",
    "typo",
    "proximity",
    "attribute",
    "sort",
    "exactness",
    "release_date:desc",
    "popularity:desc"
  ]
}

3. Meilisearch v1.15 — Number Typos, String Filters, and Chinese Tokenizer

3.1. Disable Typo Tolerance on Numbers

Before v1.15, Meilisearch applied typo tolerance to every word, including numbers. This caused problems for data like postal codes, phone numbers, and years — searching 2024 could return both 2025 and 2004. Version 1.15 lets you disable typo tolerance for numbers specifically:

PATCH /indexes/{index_uid}/settings/typo-tolerance
{
  "disableOnNumbers": true
}

Bonus: significantly faster indexing for datasets with many unique numbers, since Meilisearch no longer needs to build typo variations for them.

3.2. Lexicographic String Filter

Comparison operators (<, <=, >, >=, TO) now work on strings, sorted lexicographically. Extremely useful for ISO date formats:

// Find posts before 2023-07-17
POST /indexes/posts/search
{
  "q": "rust",
  "filter": "release_date < \"2023-07-17\""
}

// Range: versions from v1.10 to v1.15
{
  "filter": "version \"1.10\" TO \"1.15\""
}

3.3. Improved Chinese Tokenizer

Meilisearch's Charabia tokenizer has been improved to segment Chinese more accurately — important because Chinese doesn't use spaces between words like English. Note: if your dataset has Chinese content, you must reindex after upgrading to v1.15, otherwise Chinese queries may be ignored entirely.

Vietnamese in Meilisearch

Meilisearch supports Vietnamese reasonably well via Unicode normalization, but it doesn't have special rules for tones/diacritics yet. For best results, include both accented and unaccented versions in searchableAttributes, or use synonyms maps ("ơn": ["on"]). Some developers use Python pre-processing with unidecode to create a parallel title_unsigned field.
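A stdlib-only version of that pre-processing step, using unicodedata instead of unidecode (the helper name strip_diacritics and the shadow-field idea are illustrative):

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Build an unaccented shadow field (e.g. title_unsigned).

    NFD splits each accented letter into a base letter plus combining
    marks; dropping the marks (category 'Mn') removes the tones.
    Vietnamese đ/Đ does not decompose, so handle it explicitly.
    """
    text = text.replace("\u0111", "d").replace("\u0110", "D")  # đ, Đ
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")

print(strip_diacritics("Xin chào thế giới"))  # Xin chao the gioi
```

Index the result alongside the original field and add both to searchableAttributes, so accented and unaccented queries hit the same documents.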

4. Meilisearch v1.16 — Fragments API, Exports, and Documents Sort

4.1. Fragments API — Multi-modal Search

This is the most valuable v1.16 feature. Fragments let you decompose documents and queries into distinct semantic pieces — e.g., a product can have a text fragment (name, description) and an image fragment (image URL). A real-world application: a user describes a "loose-fit blue wool sweater" → Meilisearch uses a text embedding to match the description AND an image embedding to match the product images → significantly better results than traditional full-text search.

// Configure fragments for a movie index
PATCH /indexes/movies/settings/embedders
{
  "multimodal": {
    "source": "rest",
    "url": "https://api.openai.com/v1/embeddings",
    "indexingFragments": {
      "textPart": {
        "value": {
          "text": "{{doc.title}}. {{doc.description}}"
        }
      },
      "imagePart": {
        "value": {
          "image": "{{doc.poster_url}}"
        }
      }
    },
    "searchFragments": {
      "queryText": {
        "value": { "text": "{{q}}" }
      },
      "queryImage": {
        "value": { "image": "{{media.image}}" }
      }
    }
  }
}

// Search with a base64 image
POST /indexes/movies/search
{
  "q": "action movies",
  "hybrid": {
    "embedder": "multimodal",
    "semanticRatio": 0.7
  },
  "media": {
    "image": "data:image/jpeg;base64,/9j/4AAQ..."
  }
}

4.2. Exports API — Migration Without Dumps

Before v1.16, migrating data between instances required dumps/snapshots — slow and requiring downtime. The Exports API solves this: transfer documents directly from instance A to instance B over HTTP:

POST /experimental-features
{
  "exports": true
}

POST /export
{
  "url": "https://target-instance.meilisearch.io",
  "apiKey": "target_master_key",
  "indexes": {
    "products": {},
    "users": {
      "filter": "is_active = true",
      "overrideSettings": false
    }
  }
}

Particularly useful when migrating from local dev to Meilisearch Cloud — no SSH to the server, no scp-ing multi-GB dump files.

4.3. Sort on the Documents API

Previously, sort was only available in the Search API; v1.16 added sort for the Documents API — list/export documents in any order:

GET /indexes/products/documents?sort=price:desc,name:asc&limit=50

5. Hybrid Search — Combining Keyword + Semantic

5.1. Why Hybrid Is the "Right" Answer

Keyword search excels when users know exactly what they're after ("iPhone 15 Pro Max 256GB") but fails on descriptive queries ("a great phone for shooting vlogs"). Semantic (vector) search is the opposite: it captures intent well but may miss important exact matches. Hybrid combines both with a tunable ratio:

graph LR
    A["Query: phone<br>for good vlogging"] --> B[Keyword<br>BM25-like]
    A --> C[Embedder<br>OpenAI/Jina/Voyage]
    C --> D[Vector Search<br>HNSW]
    B --> E["Score Fusion<br>semanticRatio: 0.5"]
    D --> E
    E --> F[Reranker<br>Optional]
    F --> G[Final Results]

    style A fill:#e94560,stroke:#fff,color:#fff
    style B fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style C fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style D fill:#16213e,stroke:#fff,color:#fff
    style E fill:#2c3e50,stroke:#fff,color:#fff
    style F fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style G fill:#4CAF50,stroke:#fff,color:#fff

Meilisearch hybrid search flow — keyword + vector + rerank
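The effect of semanticRatio can be sketched as a weighted blend of a normalized keyword score and a semantic (cosine) score per hit. This is a simplified model of the intuition, not Meilisearch's actual fusion algorithm; the scores and document names are made up:

```python
# Simplified sketch of semanticRatio-style score fusion: 0.0 means
# pure keyword ranking, 1.0 means pure semantic ranking.
def fuse(keyword_score: float, semantic_score: float, semantic_ratio: float) -> float:
    return (1 - semantic_ratio) * keyword_score + semantic_ratio * semantic_score

hits = {
    "iphone-15": {"keyword": 0.95, "semantic": 0.60},  # exact-match heavy
    "vlog-cam":  {"keyword": 0.20, "semantic": 0.90},  # intent-only match
}

for ratio in (0.0, 0.5, 1.0):
    ranked = sorted(
        hits,
        key=lambda h: fuse(hits[h]["keyword"], hits[h]["semantic"], ratio),
        reverse=True,
    )
    print(ratio, ranked)
```

At ratio 0.0 the exact match wins outright; at 1.0 the intent match takes over; 0.5-0.7 is the usual middle ground for descriptive queries.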

5.2. Configuring the Embedder

// Step 1: Configure the embedder (OpenAI)
PATCH /indexes/products/settings/embedders
{
  "openai-small": {
    "source": "openAi",
    "model": "text-embedding-3-small",
    "apiKey": "sk-...",
    "documentTemplate": "A product titled {{doc.name}} in category {{doc.category}} with description: {{doc.description}}",
    "dimensions": 1536
  }
}

// Step 2: Hybrid search
POST /indexes/products/search
{
  "q": "phone for good vlogging",
  "hybrid": {
    "embedder": "openai-small",
    "semanticRatio": 0.7
  },
  "limit": 20
}

Composite Embedder — A Real-World Pattern

The best production pattern: use a small local embedder (e.g., all-MiniLM-L6-v2 via Ollama) for queries (low latency, free), and a remote high-quality embedder (OpenAI, Voyage) for indexing (runs once, high-quality vectors). Meilisearch 2026 is turning this pattern into a single toggle via the composite embedder API.
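Today this pattern is wired up by hand through the experimental composite embedder. A sketch of the settings call, assuming the shape of the experimental API — field names and models are illustrative and may differ by version, and both embedders must produce vectors of the same dimensionality:

```json
PATCH /indexes/products/settings/embedders
{
  "default": {
    "source": "composite",
    "searchEmbedder": {
      "source": "huggingFace",
      "model": "sentence-transformers/all-MiniLM-L6-v2"
    },
    "indexingEmbedder": {
      "source": "openAi",
      "model": "text-embedding-3-small",
      "apiKey": "sk-...",
      "dimensions": 384,
      "documentTemplate": "{{doc.name}}. {{doc.description}}"
    }
  }
}
```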

5.3. Reranking — Boosting Relevance Another Notch

After hybrid returns the top-K candidates, you can pass them through a reranker model for re-sorting. Meilisearch has native Cohere Rerank support and is extending to Jina, Voyage, and generic REST:

POST /indexes/products/search
{
  "q": "phone for vlogging",
  "hybrid": {
    "embedder": "openai-small",
    "semanticRatio": 0.5
  },
  "ranker": {
    "model": "cohere-rerank-v3",
    "apiKey": "cohere_...",
    "topK": 10
  },
  "limit": 10
}

6. 2026 Roadmap — Where Meilisearch Is Headed

In March 2026, Meilisearch announced a roadmap with 4 major initiatives:

6.1. Any Workload — Sharding & Serverless

Sharding (released in v1.37): split an index across multiple shards on multiple nodes, breaking the single-machine limit. High availability via replication.

Serverless Indexes (Q3 2026): inactive indexes move to object storage (S3), spinning up on query. Huge use case: multi-tenant SaaS with millions of tenants where only 5% are concurrently active — instead of paying for 1 million warm indexes, pay only for 50,000 active ones.

6.2. Hybrid by Default

  • Guided setup on the Cloud dashboard — enable semantic search in <2 minutes
  • AI-generated document templates — Meilisearch analyzes sample documents and generates an optimal template
  • AI Gateway — middleware between Meilisearch and any AI provider: retry, fallback, auth, metering, and very strong caching (same text → same vector → 100% cache hit)
  • Proprietary embedding + reranking models — run inside the binary, no external API needed

6.3. From API to Platform

  • The Cloud dashboard exposes every API feature — shard management, index swap, webhooks, transfer
  • A "Top 50 slowest requests" dashboard with pattern detection and optimization suggestions
  • Per-request tracing (v1.35+) — detailed breakdown: tokenization, keyword, semantic, formatting
  • AI diagnostic helper that explains slow queries in plain language

6.4. RAG at Scale

  • Chat engine with parallel multi-search — a single tool call fires multiple searches in parallel
  • Dynamic facet discovery — the LLM autonomously explores filterable attributes
  • Conversational memory scoped by tenant
  • Unified OpenAI-compatible gateway: OpenAI, Anthropic, Mistral, Cohere, Vertex, Bedrock, Ollama with fallback chains

7. Comparison with Elasticsearch, Typesense, Algolia

| Criterion | Meilisearch | Elasticsearch | Typesense | Algolia |
|---|---|---|---|---|
| License | MIT (OSS) | SSPL/AGPL | GPLv3 (OSS) | Proprietary |
| Core language | Rust | Java | C++ | C++ (closed) |
| Storage | LMDB (mmap disk) | Lucene (disk) | In-memory | Proprietary |
| Typical latency | <50ms | 100-500ms | <30ms | <50ms |
| RAM requirement | Flexible | High | Equal to dataset | N/A (hosted) |
| Typo tolerance | Built-in, on by default | Needs fuzzy config | Built-in | Built-in |
| Hybrid search | Native (v1.6+) | Via kNN + BM25 | Native | Via NeuralSearch (add-on) |
| Multi-modal | Native (v1.16 Fragments) | Via vectors | Limited | Limited |
| Sharding | v1.37+ (new) | Mature | Yes | Yes |
| RAG/Chat | Native (Chats API) | Via ES Relevance Engine | Limited | Via NeuralSearch |
| Setup complexity | Very simple | Complex | Simple | Simple (hosted) |
| Pricing model | Self-host free / Cloud usage-based | Elastic Cloud / self-host | Self-host free / Cloud | Per-operation (expensive) |

8. .NET Core Integration

8.1. SDK Setup

# Install the package
dotnet add package MeiliSearch

# docker-compose.yml for dev
version: '3.8'
services:
  meilisearch:
    image: getmeili/meilisearch:v1.16
    ports:
      - "7700:7700"
    environment:
      - MEILI_MASTER_KEY=your_master_key_here
      - MEILI_ENV=development
    volumes:
      - meili_data:/meili_data

volumes:
  meili_data:

8.2. Model and Service Class

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; } = string.Empty;
    public string Description { get; set; } = string.Empty;
    public string Category { get; set; } = string.Empty;
    public decimal Price { get; set; }
    public DateTime CreatedAt { get; set; }
    public string[] Tags { get; set; } = Array.Empty<string>();
}
}

public class MeiliSearchService
{
    private readonly MeilisearchClient _client;
    private readonly Index _productsIndex;

    public MeiliSearchService(IConfiguration config)
    {
        _client = new MeilisearchClient(
            config["Meilisearch:Url"] ?? "http://localhost:7700",
            config["Meilisearch:ApiKey"]
        );
        _productsIndex = _client.Index("products");
    }

    public async Task IndexProductsAsync(IEnumerable<Product> products)
    {
        var taskInfo = await _productsIndex.AddDocumentsAsync(products);
        await _client.WaitForTaskAsync(taskInfo.TaskUid);
    }

    public async Task<SearchResult<Product>> SearchAsync(
        string query,
        int page = 1,
        int pageSize = 20,
        string? categoryFilter = null)
    {
        var searchQuery = new SearchQuery
        {
            HitsPerPage = pageSize,
            Page = page,
            AttributesToHighlight = new[] { "name", "description" },
            HighlightPreTag = "<mark>",
            HighlightPostTag = "</mark>"
        };

        if (!string.IsNullOrWhiteSpace(categoryFilter))
        {
            searchQuery.Filter = $"category = \"{categoryFilter}\"";
        }

        return await _productsIndex.SearchAsync<Product>(query, searchQuery);
    }

    public async Task ConfigureIndexAsync()
    {
        await _productsIndex.UpdateSearchableAttributesAsync(new[]
        {
            "name",
            "description",
            "category",
            "tags"
        });

        await _productsIndex.UpdateFilterableAttributesAsync(new[]
        {
            "category",
            "price",
            "createdAt"
        });

        await _productsIndex.UpdateSortableAttributesAsync(new[]
        {
            "price",
            "createdAt"
        });

        await _productsIndex.UpdateRankingRulesAsync(new[]
        {
            "words", "typo", "proximity", "attribute",
            "sort", "exactness", "createdAt:desc"
        });
    }
}

8.3. API Controller

[ApiController]
[Route("api/products")]
public class ProductsController : ControllerBase
{
    private readonly MeiliSearchService _search;

    public ProductsController(MeiliSearchService search) => _search = search;

    [HttpGet("search")]
    public async Task<IActionResult> Search(
        [FromQuery] string q,
        [FromQuery] int page = 1,
        [FromQuery] int pageSize = 20,
        [FromQuery] string? category = null)
    {
        var result = await _search.SearchAsync(q, page, pageSize, category);

        return Ok(new
        {
            hits = result.Hits,
            page = result.Page,
            totalPages = result.TotalPages,
            totalHits = result.TotalHits,
            processingTimeMs = result.ProcessingTimeMs
        });
    }
}

8.4. Background Sync — CDC Pattern with SQL Server

In real systems, master data lives in a SQL database, and Meilisearch is only the search layer. A common sync pattern:

public class ProductSyncService : BackgroundService
{
    private readonly IServiceProvider _sp;
    private readonly MeiliSearchService _search;

    public ProductSyncService(IServiceProvider sp, MeiliSearchService search)
    {
        _sp = sp;
        _search = search;
    }

    protected override async Task ExecuteAsync(CancellationToken stop)
    {
        while (!stop.IsCancellationRequested)
        {
            using var scope = _sp.CreateScope();
            var db = scope.ServiceProvider.GetRequiredService<AppDbContext>();

            // Watermark of the last successful sync, persisted in the DB
            // or a key-value store (implementation omitted for brevity).
            var lastSync = await GetLastSyncTimeAsync();

            var changed = await db.Products
                .Where(p => p.UpdatedAt > lastSync)
                .AsNoTracking()
                .ToListAsync(stop);

            if (changed.Count > 0)
            {
                await _search.IndexProductsAsync(changed);
                await SaveLastSyncTimeAsync(DateTime.UtcNow);
            }

            await Task.Delay(TimeSpan.FromSeconds(10), stop);
        }
    }
}

Scale-up pattern

For large datasets, instead of polling, use Change Data Capture (SQL Server CDC, Debezium) → publish messages to Kafka → a consumer calls IndexProductsAsync. This pattern keeps Meilisearch eventually consistent with the master DB within seconds and handles very high write loads. Combine it with UpdateDocumentsAsync (partial update) to avoid shipping the whole document every time a single field changes.
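The partial-update idea reduces to shipping only the changed fields plus the primary key; Meilisearch's update-documents endpoint then merges the partial document into the existing one. A language-neutral sketch (the helper partial_update is hypothetical):

```python
# Sketch: build a partial-update payload containing only the fields
# that changed, plus the primary key the merge is keyed on.
def partial_update(doc_id: int, old: dict, new: dict, pk: str = "id") -> dict:
    changed = {k: v for k, v in new.items() if old.get(k) != v}
    changed[pk] = doc_id
    return changed

payload = partial_update(
    42,
    old={"name": "Mouse", "price": 25.0, "stock": 10},
    new={"name": "Mouse", "price": 19.9, "stock": 10},
)
print(payload)  # {'price': 19.9, 'id': 42}
```

For a price change on a document with a long description, this is a few dozen bytes instead of the full document body.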

9. Performance and Real Benchmarks

A typical benchmark on a 1-million-product e-commerce dataset (~2GB JSON):

| Operation | Meilisearch v1.16 | Elasticsearch 8.x | Typesense |
|---|---|---|---|
| Initial indexing (1M docs) | ~8 min | ~15 min | ~5 min |
| Index size on disk | ~2.8 GB | ~4.5 GB | ~3.2 GB (RAM) |
| Search RAM (2GB dataset) | ~500 MB | ~4 GB | 3.2 GB (required) |
| P50 latency (keyword) | 12ms | 45ms | 8ms |
| P99 latency (keyword) | 48ms | 180ms | 35ms |
| P50 latency (hybrid) | 35ms | 120ms | 28ms |
| Single-node QPS (4 vCPU) | ~2,500 | ~1,200 | ~3,800 |
| Update latency | 100-500ms | 1-3s | 50-200ms |

Benchmarks — See beyond the numbers

Typesense wins on raw speed because everything is in RAM, but the trade-off is you must provision enough RAM for the entire dataset. Meilisearch wins on performance-per-dollar — a 10GB dataset runs fine on a 4GB RAM machine, while Typesense would crash. Elasticsearch loses on both but makes up for it with the Kibana/Logstash/ingest pipeline ecosystem that Meilisearch doesn't have.

10. Chat API — RAG Right Inside Meilisearch

Since v1.15.1, Meilisearch has shipped the Chats API, an OpenAI-compatible endpoint that enables conversational retrieval without building a separate RAG pipeline:

// Configure a chat workspace
POST /chats/docs-support/settings
{
  "source": "openAi",
  "orgId": "org-...",
  "apiKey": "sk-...",
  "baseUrl": "https://api.openai.com/v1",
  "prompts": {
    "system": "You are our documentation support assistant. Only answer based on retrieved documents."
  }
}

// Enable the tool per index
PATCH /indexes/docs/settings/chat
{
  "description": "Technical product documentation, including FAQs and guides",
  "documentTemplate": "Title: {{doc.title}}\nContent: {{doc.content}}",
  "searchParameters": {
    "hybrid": { "embedder": "openai-small", "semanticRatio": 0.6 },
    "limit": 5
  }
}

// Send a chat request — OpenAI-compatible
POST /chats/docs-support/chat/completions
{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "How do I configure a webhook?" }
  ],
  "stream": true
}

Meilisearch handles: (1) using the LLM to generate search queries from messages, (2) querying the index, (3) injecting documents into the prompt, (4) streaming the response. Citation events are emitted alongside the content stream so the UI can highlight sources. It's the fastest way to add an AI chatbot to your app without building a complex RAG pipeline.
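Because the endpoint is OpenAI-compatible, any OpenAI client can talk to it by pointing its base URL at the Meilisearch instance. A minimal sketch of the raw request shape, reusing the docs-support workspace from the example above (the helper build_chat_request is hypothetical):

```python
import json

# Build the path and body for an OpenAI-compatible chat completion
# request against a Meilisearch chat workspace.
def build_chat_request(workspace: str, question: str,
                       model: str = "gpt-4o-mini") -> tuple[str, str]:
    path = f"/chats/{workspace}/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "stream": True,  # citation events arrive alongside the content stream
    })
    return path, body

path, body = build_chat_request("docs-support", "How do I configure a webhook?")
print(path)  # /chats/docs-support/chat/completions
```

POST this body to the path on your instance (with the usual Authorization header) and consume the response as a standard OpenAI-style event stream.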

11. When to Choose Meilisearch?

graph TD
    A[Need a search engine] --> B{Dataset size?}
    B -->|<100M docs| C{Need AI/Semantic?}
    B -->|>100M docs + complex analytics| D[Elasticsearch]

    C -->|Yes| E{Prefer OSS?}
    C -->|No, keyword only| F{Latency critical?}

    E -->|Yes| G[✅ Meilisearch]
    E -->|No, budget ok| H[Algolia]

    F -->|Yes, enough RAM| I[Typesense]
    F -->|Cost-balanced| G

    style A fill:#e94560,stroke:#fff,color:#fff
    style G fill:#4CAF50,stroke:#fff,color:#fff
    style D fill:#2c3e50,stroke:#fff,color:#fff
    style H fill:#2c3e50,stroke:#fff,color:#fff
    style I fill:#2c3e50,stroke:#fff,color:#fff

Decision tree for choosing a search engine by workload

✅ Choose Meilisearch when:

  • You need search-as-you-type latency <50ms for a web/mobile app
  • Small team wanting fast setup, readable docs, self-host on ordinary VPS
  • Dataset from thousands to tens of millions of documents
  • Need hybrid search (keyword + semantic) out-of-the-box
  • Want to add a RAG chatbot without building your own pipeline
  • Limited budget, don't want Algolia pricing lock-in

❌ Consider alternatives when:

  • Need complex analytics (multi-level aggregations, time-series) — Elasticsearch/OpenSearch
  • Hundreds of millions of docs with heavy traffic — wait for Meilisearch sharding to mature, or use ES
  • Need log management, APM, SIEM ecosystem — Elastic Stack
  • Team strictly wants a hosted solution with zero infra worry — Algolia

12. Meilisearch Development Timeline

2018
Meilisearch is founded by former Algolia engineers, MIT-licensed open source from day one
2021
v0.20 — Stable typo tolerance, custom ranking rules
2023
v1.0 GA — Production-ready, API stable
2023 Q4
v1.6 — Vector search (HNSW) and hybrid search debut
2024
v1.10-v1.13 — Embedder API stable, Meilisearch Cloud GA
2025 Q4
v1.15 — Disable typo on numbers, lexicographic string filter, improved Chinese tokenizer, Chats API
2026 Q1
v1.16 — Fragments API (multi-modal), Exports API, Documents sort
2026 Q2
v1.35-v1.37 — Per-request tracing, sharding release
2026 Q3 (planned)
Serverless indexes, AI Gateway, proprietary embedding + reranking models

Conclusion

Meilisearch in 2026 is no longer simply "open-source Algolia." It has evolved into a unified platform for search + AI retrieval, with features that a few years ago required combining Elasticsearch + Pinecone + LangChain + a custom reranker. With Fragments API, the upcoming AI Gateway, serverless indexes, and proprietary embedding models, Meilisearch is betting big on a future where every app needs intelligent search, not just plain full-text.

For .NET developers, Meilisearch is practically without competition if you need a modern, low-latency, hybrid-ready search engine without managing a JVM-based Elasticsearch cluster. A 5-minute Docker setup, a clean C# SDK, and excellent docs — all the DX factors Vietnamese developers consistently value highly.
