LangGraph: Orchestrating Complex AI Agents with a Graph Architecture

Posted on: 5/8/2026 10:15:02 AM


As AI agents grow more complex, from simple chatbots to multi-step, multi-tool automation systems, the question is no longer "which LLM should I use" but "how do I orchestrate agent workflows so they are reliable in production". LangGraph, the graph framework from LangChain, is emerging as a leading answer: it models the entire agent workflow as a stateful directed graph, with persistence, human-in-the-loop, and multi-agent orchestration supported by design.

This article digs into LangGraph's core architecture, the key patterns for building production-ready AI agents, and a practical comparison with competing frameworks such as CrewAI and AutoGen.

- 34%: share of agent frameworks at large enterprises (Gartner Q1/2026)
- 76%: benchmark score on complex tasks (highest among the frameworks compared)
- 40K+: GitHub stars
- 2: supported languages (Python and TypeScript)

1. What is LangGraph

LangGraph is a low-level orchestration framework for building, managing, and deploying stateful, long-running AI agents. Instead of a linear chain, as in traditional LangChain pipelines, LangGraph models a workflow as a directed graph that may contain cycles: it allows loops, conditional branching, and a checkpoint at every node.

This addresses a core problem: real agents do not run sequentially from A to Z. They need to loop when a result is not good enough, branch on feedback, pause for human approval, and recover from failures. LangGraph is designed for exactly these requirements.

LangGraph ≠ LangChain

LangGraph is a standalone library and can be used without LangChain. It focuses on the orchestration layer (controlling the flow of execution), while LangChain provides abstractions for LLM calls, prompt templates, and tool integrations. Many production deployments use only LangGraph plus an LLM SDK directly and skip LangChain entirely.

2. Core architecture: State, Node, Edge

Everything in LangGraph revolves around three concepts: State (shared state), Node (a unit of processing), and Edge (control flow). Together they form a StateGraph, a graph with typed state that is updated incrementally by each node.

graph TD
    START(["__start__"]) --> A["Agent Node<br/>(LLM reasoning)"]
    A -->|"tool_calls detected"| B["Tool Node<br/>(execute tools)"]
    A -->|"no tool_calls"| END(["__end__"])
    B --> A

    style START fill:#4CAF50,stroke:#fff,color:#fff
    style END fill:#e94560,stroke:#fff,color:#fff
    style A fill:#2c3e50,stroke:#fff,color:#fff
    style B fill:#f8f9fa,stroke:#e94560,color:#2c3e50

Figure 1: Basic ReAct agent loop. The agent reasons, calls a tool, receives the result, and repeats until done.

2.1. State: shared state

State in LangGraph is a typed dictionary that represents the current snapshot of the whole workflow. Each node takes the state as input, does its work, and returns a partial update; LangGraph merges that update into the shared state instead of overwriting it wholesale.

from typing import Annotated, TypedDict
from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    messages: Annotated[list, add_messages]
    next_action: str
    iteration_count: int

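The reducer mechanism (such as add_messages) defines how an update is merged into a field: appending to a list, accumulating a number, or any custom logic. A field declared without a reducer, such as next_action or iteration_count above, is simply overwritten by the latest update. Reducers are what let several nodes run in parallel without conflict: each node returns its own update and the reducer combines the results.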
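
To make the merge rule concrete, here is a minimal pure-Python sketch of reducer-based merging. It is not LangGraph's implementation: `REDUCERS` and `apply_update` are hypothetical names introduced for illustration, and `operator.add` stands in for `add_messages`, which does more than plain list concatenation.

```python
import operator
from typing import Any, Callable

# Hypothetical reducer table: field name -> function(old, new) -> merged value.
REDUCERS: dict[str, Callable[[Any, Any], Any]] = {
    "messages": operator.add,         # concatenate lists
    "iteration_count": operator.add,  # accumulate numbers
}


def apply_update(state: dict, update: dict) -> dict:
    """Return a new state with a partial update merged in, field by field."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)
        else:
            merged[key] = value  # no reducer: the latest write wins
    return merged


state = {"messages": ["hi"], "next_action": "plan", "iteration_count": 0}
# Two nodes each return only the fields they changed.
state = apply_update(state, {"messages": ["hello"], "iteration_count": 1})
state = apply_update(state, {"next_action": "act", "iteration_count": 1})
print(state)
```

Fields with a reducer accumulate across updates, while `next_action` is replaced; this mirrors the distinction between annotated and plain fields in `AgentState`.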

2.2. Node: a unit of processing logic

A node is a Python function (or TypeScript function) that takes the state and returns a partial state update. A node can be:

- LLM call: send messages to the model and receive its reply
- Tool execution: run a function or API based on the tool_calls from the LLM
- Pure logic: transform data, validate, filter
- Subgraph: a nested StateGraph that runs as a single node

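from langchain_core.messages import ToolMessage
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o")  # use model.bind_tools(...) so it can emit tool_calls
TOOLS_BY_NAME = {}  # hypothetical registry: tool name -> LangChain tool object

def execute_tools(tool_calls):
    # Hypothetical helper: run each requested tool and wrap its output in a ToolMessage.
    return [
        ToolMessage(content=str(TOOLS_BY_NAME[c["name"]].invoke(c["args"])), tool_call_id=c["id"])
        for c in tool_calls
    ]

def agent_node(state: AgentState):
    response = model.invoke(state["messages"])  # LLM reasoning step
    return {"messages": [response]}

def tool_node(state: AgentState):
    last_message = state["messages"][-1]  # the AI message that requested tools
    results = execute_tools(last_message.tool_calls)
    return {"messages": results}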

2.3. Edge: conditional flow

Edges connect nodes and decide which node runs next. LangGraph supports two kinds of edge:

- Normal edge: always goes from node A to node B
- Conditional edge: a function takes the state and returns the name of the next node

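from langgraph.graph import StateGraph, END

def should_continue(state: AgentState) -> str:
    # Route to the tool node only if the last message requested tool calls.
    last_message = state["messages"][-1]
    return "tools" if getattr(last_message, "tool_calls", None) else END

graph = StateGraph(AgentState)

graph.add_node("agent", agent_node)
graph.add_node("tools", tool_node)

graph.set_entry_point("agent")

graph.add_conditional_edges(
    "agent",
    should_continue,  # returns "tools" or END ("__end__")
    {"tools": "tools", END: END}
)
graph.add_edge("tools", "agent")  # after running the tools, go back to the agent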

Conditional edges are the heart of LangGraph: they let the agent decide its own execution path from the current state, which makes it possible to express workflows with loops and branches that a linear pipeline cannot represent.
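
The routing idea can be sketched without any framework. The toy runner below is an illustration only, not how LangGraph executes graphs: `NODES`, `EDGES`, `route_after_agent`, and `run` are hypothetical names, the "agent" is a stand-in for an LLM, and nodes return the full state rather than a partial update to keep the sketch short.

```python
from typing import Callable

State = dict
END = "__end__"


def agent(state: State) -> State:
    # Stand-in for an LLM call: keep asking for a tool until two results exist.
    wants_tool = len(state["results"]) < 2
    return {**state, "pending_tool": wants_tool}


def tools(state: State) -> State:
    # Stand-in for tool execution: record one result.
    return {**state, "results": state["results"] + ["ok"], "pending_tool": False}


def route_after_agent(state: State) -> str:
    # Conditional edge: inspect the state and name the next node.
    return "tools" if state["pending_tool"] else END


NODES: dict[str, Callable[[State], State]] = {"agent": agent, "tools": tools}
# A normal edge is a fixed target; a conditional edge is a router function.
EDGES = {"agent": route_after_agent, "tools": "agent"}


def run(state: State, entry: str = "agent", max_steps: int = 10) -> tuple[State, list[str]]:
    current, visited = entry, []
    for _ in range(max_steps):  # guard against endless cycles
        if current == END:
            break
        visited.append(current)
        state = NODES[current](state)
        edge = EDGES[current]
        current = edge(state) if callable(edge) else edge
    return state, visited


final_state, path = run({"results": [], "pending_tool": False})
print(path)
```

The cycle between `agent` and `tools` ends only when the router returns `END`, which is the same shape as the ReAct loop in Figure 1. The `max_steps` cap is a design choice of this sketch to keep a faulty router from looping forever.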

3. Persistence & Checkpointer

3.1. Why persistence is needed

Production agents often do not finish within a single request. They may wait hours for approval, be interrupted by a deploy, or crash midway. Without persistence, all progress is lost. LangGraph handles this with a checkpointer, which automatically saves the state after each node execution.

graph LR
    A["Node A<br/>executes"] -->|"save state"| CP[("Checkpointer<br/>PostgreSQL / Redis")]
    CP -->|"load state"| B["Node B<br/>executes"]
    B -->|"save state"| CP
    CP -->|"crash recovery"| B

    style A fill:#2c3e50,stroke:#fff,color:#fff
    style B fill:#2c3e50,stroke:#fff,color:#fff
    style CP fill:#e94560,stroke:#fff,color:#fff

Figure 2: The checkpointer saves state after each node, which allows a run to resume after a crash or restart.
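
The save-then-resume cycle in Figure 2 can be illustrated with a small standard-library sketch. This is an assumption-laden toy, not LangGraph's checkpointer: the `checkpoints` table, the `save`, `load`, and `run` helpers, and the two steps are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL or Redis.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkpoints (thread_id TEXT PRIMARY KEY, next_step INTEGER, state TEXT)"
)


def save(thread_id: str, next_step: int, state: dict) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO checkpoints VALUES (?, ?, ?)",
        (thread_id, next_step, json.dumps(state)),
    )
    conn.commit()


def load(thread_id: str):
    row = conn.execute(
        "SELECT next_step, state FROM checkpoints WHERE thread_id = ?", (thread_id,)
    ).fetchone()
    return (row[0], json.loads(row[1])) if row else None


calls = {"charge": 0}


def validate(state: dict) -> dict:
    return {**state, "validated": True}


def charge(state: dict) -> dict:
    calls["charge"] += 1
    if calls["charge"] == 1:
        raise RuntimeError("simulated crash")  # the first attempt dies mid-run
    return {**state, "charged": True}


STEPS = [validate, charge]


def run(thread_id: str, initial_state: dict) -> dict:
    # Resume from the last checkpoint if one exists, otherwise start fresh.
    next_step, state = load(thread_id) or (0, initial_state)
    for index in range(next_step, len(STEPS)):
        state = STEPS[index](state)
        save(thread_id, index + 1, state)  # checkpoint after every step
    return state


try:
    run("order-42", {"order_id": 42})
except RuntimeError:
    pass  # the process "crashed" after validate had been checkpointed

resumed_from = load("order-42")[0]
final_state = run("order-42", {"order_id": 42})
print(resumed_from, final_state)
```

The second call skips `validate` because its result is already stored under the same thread identifier; only the step that failed is executed again.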

3.2. Checkpointer types

| Checkpointer | Use when | Characteristics |
| --- | --- | --- |
| MemorySaver | Development, testing | Stored in RAM, lost on restart |
| SqliteSaver | Single-process, prototype | File-based, simple |
| PostgresSaver | Production | Multi-process, durable, scales well |
| RedisSaver | High-throughput production | In-memory + persistence, TTL support |

from langgraph.checkpoint.postgres import PostgresSaver

DB_URI = "postgresql://user:pass@localhost:5432/langgraph_db"

with PostgresSaver.from_conn_string(DB_URI) as checkpointer:
    checkpointer.setup()  # create the tables if they do not exist yet
    app = graph.compile(checkpointer=checkpointer)

    # Each thread_id is a separate conversation/workflow instance
    config = {"configurable": {"thread_id": "order-processing-42"}}
    result = app.invoke(initial_state, config)

Each thread_id identifies one workflow instance. You can resume any thread by invoking the graph again with the same thread_id; the state is loaded from that thread's latest checkpoint.

4. Human-in-the-Loop

4.1. The interrupt mechanism

One of LangGraph's most powerful features is interrupt: pausing a workflow at any node, waiting for human input (possibly for hours or days), then resuming from the node where it paused.

A real-world use case

Consider a refund agent: it automatically analyzes the request, checks the order history, and calculates the amount, but pauses for a manager's approval before actually moving money. Without interrupt you would have to build state persistence, queueing, and polling yourself; LangGraph takes care of all of it.

from langgraph.types import interrupt, Command

def approval_node(state: AgentState):
    # Pause the workflow and send the question to a human
    decision = interrupt({
        "question": "Approve refund of $150 for order #42?",
        "options": ["approve", "reject", "escalate"]
    })

    if decision == "approve":
        return Command(goto="process_refund")
    elif decision == "reject":
        return Command(goto="notify_customer_rejected")
    else:
        return Command(goto="escalate_to_senior")

# Resume the workflow once the human has decided
app.invoke(
    Command(resume="approve"),
    config={"configurable": {"thread_id": "refund-request-42"}}
)

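When interrupt() is called, LangGraph saves the full state to the checkpointer, marks the thread as interrupted, and returns control to the caller; this requires the graph to be compiled with a checkpointer. When the human's decision arrives via Command(resume=...), LangGraph re-runs the interrupted node from its beginning, and this time interrupt() returns the resume value instead of pausing. Code placed before interrupt() in that node therefore runs again, so keep it free of non-idempotent side effects.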
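
The pause-and-resume behavior can be mimicked in a few lines of plain Python. The sketch below is a simplified model under stated assumptions, not LangGraph's runtime: `Interrupted`, `Runtime`, and the `log` list are hypothetical, and an exception stands in for the checkpoint-and-return step.

```python
class Interrupted(Exception):
    """Raised to hand control back to the caller while waiting for input."""

    def __init__(self, payload):
        super().__init__("waiting for human input")
        self.payload = payload


class Runtime:
    def __init__(self):
        self.has_resume = False
        self.resume_value = None
        self.log = []

    def interrupt(self, payload):
        if self.has_resume:
            return self.resume_value  # second run: hand back the human's answer
        raise Interrupted(payload)    # first run: pause the node


def approval_node(rt: Runtime, state: dict) -> str:
    rt.log.append("computed refund")  # runs on every execution of the node
    decision = rt.interrupt({"question": f"Approve refund for order {state['order']}?"})
    return "process_refund" if decision == "approve" else "notify_customer_rejected"


rt = Runtime()
state = {"order": 42}
try:
    approval_node(rt, state)
except Interrupted as pause:
    question = pause.payload["question"]  # what the human reviewer would see

# Later, the human answers and the node is executed again from the top.
rt.resume_value, rt.has_resume = "approve", True
next_node = approval_node(rt, state)
print(question, next_node, rt.log)
```

The log ends up with two entries because the code before `interrupt` ran twice, which is why work placed ahead of an interrupt should be safe to repeat.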

5. Multi-Agent Patterns

LangGraph supports three main patterns for building multi-agent systems:

5.1. Supervisor Pattern

A central agent (the supervisor) coordinates specialized agents (the workers). The supervisor decides which worker handles the next task based on the current state and earlier results.

graph TD
    S["Supervisor Agent<br/>(coordinator)"] -->|"research task"| R["Research Agent"]
    S -->|"code task"| C["Coding Agent"]
    S -->|"review task"| V["Review Agent"]
    R -->|"result"| S
    C -->|"result"| S
    V -->|"result"| S
    S -->|"done"| END(["__end__"])

    style S fill:#e94560,stroke:#fff,color:#fff
    style R fill:#2c3e50,stroke:#fff,color:#fff
    style C fill:#2c3e50,stroke:#fff,color:#fff
    style V fill:#2c3e50,stroke:#fff,color:#fff
    style END fill:#4CAF50,stroke:#fff,color:#fff

Figure 3: Supervisor pattern. A central agent coordinates specialized workers.

from langgraph_supervisor import create_supervisor

supervisor = create_supervisor(
    model=ChatOpenAI(model="gpt-4o"),
    agents=[research_agent, coding_agent, review_agent],
    prompt="You are a tech lead. Delegate tasks to the right team member."
)

app = supervisor.compile()
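
Stripped of the LLM, the supervisor loop is a dispatcher that inspects the state and picks the next worker. The following sketch assumes a rule-based `supervisor` function in place of a model; all names in it (`WORKERS`, `supervisor`, `run`) are hypothetical and unrelated to the `langgraph_supervisor` package.

```python
def research(state: dict) -> dict:
    return {**state, "notes": "findings"}


def coding(state: dict) -> dict:
    return {**state, "code": "print('hi')"}


def review(state: dict) -> dict:
    return {**state, "approved": True}


WORKERS = {"research": research, "coding": coding, "review": review}


def supervisor(state: dict):
    """Pick the next worker from the current state, or None when finished."""
    if "notes" not in state:
        return "research"
    if "code" not in state:
        return "coding"
    if not state.get("approved"):
        return "review"
    return None


def run(state: dict, max_turns: int = 10):
    order = []
    for _ in range(max_turns):  # cap the number of delegations
        worker = supervisor(state)
        if worker is None:
            break
        order.append(worker)
        state = WORKERS[worker](state)  # the worker reports back to the supervisor
    return state, order


final_state, order = run({"task": "build a CLI"})
print(order)
```

Every worker returns to the supervisor, so routing decisions stay in one place; in a real system that decision would come from a model prompt rather than fixed rules.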

5.2. Subgraph & Hierarchical Teams

As the system grows, you can nest subgraphs inside the main graph: each team is a subgraph with its own supervisor. The top-level graph sees only team-level nodes and does not need to know what happens inside them.

# Research team: 3 specialized agents
research_team = StateGraph(ResearchState)
research_team.add_node("web_searcher", web_search_agent)
research_team.add_node("analyst", data_analyst_agent)
research_team.add_node("team_lead", research_supervisor)
research_subgraph = research_team.compile()

# Main graph: combines the teams
main_graph = StateGraph(MainState)
main_graph.add_node("research_team", research_subgraph)
main_graph.add_node("dev_team", dev_subgraph)
main_graph.add_node("orchestrator", orchestrator_node)

5.3. Handoff Pattern

Instead of routing through a central supervisor, an agent can hand off control directly to another agent, along with a payload. This pattern fits flows whose sequence is clear.

from langgraph.prebuilt import create_react_agent
from langgraph.types import Command

def transfer_to_billing(state):
    """Transfer to the billing agent when a payment issue needs handling."""
    return Command(
        goto="billing_agent",
        update={"context": "Customer needs billing help"}
    )

support_agent = create_react_agent(
    model=model,
    tools=[transfer_to_billing, search_knowledge_base]
)
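
As a framework-free illustration of the handoff idea, the sketch below lets each agent return both a state update and the name of the agent that should run next. The `Handoff` dataclass, the two agents, and `run` are hypothetical stand-ins that only loosely mirror `Command(goto=..., update=...)`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Handoff:
    goto: Optional[str]                          # next agent, or None to finish
    update: dict = field(default_factory=dict)   # payload merged into the state


def support_agent(state: dict) -> Handoff:
    if "invoice" in state["request"]:
        # Hand control straight to billing, with context attached.
        return Handoff(goto="billing_agent", update={"context": "Customer needs billing help"})
    return Handoff(goto=None, update={"answer": "Resolved by support"})


def billing_agent(state: dict) -> Handoff:
    return Handoff(goto=None, update={"answer": f"Billing handled: {state['context']}"})


AGENTS = {"support_agent": support_agent, "billing_agent": billing_agent}


def run(state: dict, start: str = "support_agent", max_hops: int = 5):
    current, hops = start, []
    while current is not None and len(hops) < max_hops:
        hops.append(current)
        result = AGENTS[current](state)
        state = {**state, **result.update}
        current = result.goto
    return state, hops


final_state, hops = run({"request": "question about my invoice"})
print(hops, final_state["answer"])
```

There is no central router here: the agent that holds control decides who gets it next, which keeps simple sequential flows short at the cost of spreading routing logic across agents.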

6. Comparison with CrewAI and AutoGen

| Criterion | LangGraph | CrewAI | AutoGen |
| --- | --- | --- | --- |
| Architecture | Graph-based (nodes & edges) | Role-based (crew & tasks) | Conversational (chat-based) |
| State management | Typed state + reducers, incremental updates | Basic shared memory | Chat history as state |
| Persistence | Built-in checkpointer (Postgres, Redis) | No native support | No native support |
| Human-in-the-loop | interrupt() API, pause/resume at any node | Manual, via callbacks | Chat-based input |
| Benchmark (medium tasks) | 76% | 71% | 68% |
| Learning curve | Steep: requires understanding graph theory | Gentle: roles and tasks are easy to picture | Moderate |
| Production readiness | Highest: deterministic execution | Good for prototyping | Maintenance mode (Microsoft has moved to Agent Framework) |
| Enterprise adoption | Uber, JP Morgan, Klarna | Startups, SMBs | Azure ecosystem |
| Languages | Python, TypeScript | Python | Python, .NET |

AutoGen is in maintenance mode

Microsoft has shifted its focus to the broader Agent Framework, and new feature development for AutoGen has stopped. If you are building a new system, consider LangGraph or CrewAI rather than AutoGen.

7. Production deployment

7.1. LangSmith Deployment

LangGraph Platform (now renamed LangSmith Deployment) provides dedicated deployment infrastructure for agents:

| Option | Description | Best for |
| --- | --- | --- |
| Cloud SaaS | Hosted by LangChain, zero-ops | Startups, fast prototypes |
| BYOC (AWS) | Runs in your VPC, LangChain manages provisioning | Enterprises that need data sovereignty |
| Self-hosted | Full control, on Kubernetes | Regulated industries (finance, healthcare) |
| Standalone | Lightweight: only Agent Server + Postgres + Redis | Small teams, single-service deployments |

7.2. Self-hosted Architecture

The self-hosted architecture consists of a Control Plane (deployment management, routing) and a Data Plane (the Agent Servers that run graphs). The Data Plane needs PostgreSQL (state and checkpoints) and Redis (task queue and pub/sub). Kubernetes is required for both planes.

graph TB
    subgraph CP["Control Plane"]
        API["LangSmith API"]
        UI["Dashboard UI"]
    end

    subgraph DP["Data Plane"]
        AS1["Agent Server 1"]
        AS2["Agent Server 2"]
        AS3["Agent Server N"]
    end

    PG[("PostgreSQL<br/>State & Checkpoints")]
    RD[("Redis<br/>Task Queue")]

    UI --> API
    API --> AS1
    API --> AS2
    API --> AS3
    AS1 --> PG
    AS2 --> PG
    AS3 --> PG
    AS1 --> RD
    AS2 --> RD
    AS3 --> RD

    style CP fill:#f8f9fa,stroke:#e94560,color:#2c3e50
    style DP fill:#f8f9fa,stroke:#2c3e50,color:#2c3e50
    style PG fill:#e94560,stroke:#fff,color:#fff
    style RD fill:#2c3e50,stroke:#fff,color:#fff
    style API fill:#4CAF50,stroke:#fff,color:#fff
    style UI fill:#4CAF50,stroke:#fff,color:#fff
    style AS1 fill:#2c3e50,stroke:#fff,color:#fff
    style AS2 fill:#2c3e50,stroke:#fff,color:#fff
    style AS3 fill:#2c3e50,stroke:#fff,color:#fff

Figure 4: Self-hosted LangGraph architecture. The Control Plane manages deployments, the Data Plane runs the agents.

8. When to use LangGraph

Use it when

- Complex workflows: many steps, branches, and loops, such as order processing systems, data analysis pipelines, or multi-tool AI assistants.
- Persistence is required: workflows run for hours and must survive restarts and crashes.
- Human-in-the-loop: human approval is needed at critical steps.
- Multi-agent: several specialized agents need to cooperate.

Avoid it when

- Simple chatbot: if you only need an LLM plus a few tools, use the prebuilt create_react_agent or call the LLM SDK directly.
- Fast prototyping: CrewAI has a much gentler learning curve.
- Conversation-heavy: if the agents mostly chat back and forth between roles, AutoGen is a better fit.

9. Best practices

- Start small: build a single-agent ReAct loop first and add complexity as needed. Do not jump straight to a multi-agent supervisor.
- Strict state typing: use TypedDict with full type hints. Untyped state becomes very hard to debug as the graph grows.
- Checkpointer from day one: use MemorySaver for development and switch to PostgresSaver for staging and production. Do not bolt persistence on later; the refactor will be painful.
- Keep nodes small: each node should do one thing. A "god" node that calls the LLM, parses, and validates all at once is very hard to test and debug.
- Observability: integrate LangSmith tracing to visualize graph execution. When an agent misbehaves, the trace shows exactly which node made the wrong decision.
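
One practical payoff of small nodes is that each can be tested as an ordinary function, with no graph or model involved. The example below is a hedged sketch: `validate_refund` and its `auto_approve_limit` field are invented for illustration and do not come from the article's code.

```python
def validate_refund(state: dict) -> dict:
    """Single-purpose node: flag refunds above a limit for manual review."""
    needs_review = state["amount"] > state.get("auto_approve_limit", 100)
    return {"needs_review": needs_review}  # partial update only


def test_small_refund_is_auto_approved():
    assert validate_refund({"amount": 40}) == {"needs_review": False}


def test_large_refund_needs_review():
    assert validate_refund({"amount": 150}) == {"needs_review": True}


def test_custom_limit_is_respected():
    state = {"amount": 150, "auto_approve_limit": 200}
    assert validate_refund(state) == {"needs_review": False}


# Run the checks directly; a test runner such as pytest would discover them too.
test_small_refund_is_auto_approved()
test_large_refund_needs_review()
test_custom_limit_is_respected()
```

Because the node takes a plain dictionary and returns a partial update, the tests need no mocks; a node that also called the LLM and parsed its output would be much harder to cover this way.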

10. Conclusion

LangGraph has established itself as a leading framework for building production-ready AI agents. Its stateful graph architecture addresses what linear pipelines cannot: loops, branching, persistence, and human-in-the-loop. With adoption by Uber, JP Morgan, and Klarna and a 34% enterprise share, LangGraph is more than a framework; it is shaping how the industry builds AI agents.

If you are moving from prototype to production, LangGraph is worth the time it takes to master. Start with a simple ReAct agent, add persistence, then expand to multi-agent; each step has a fitting abstraction.


Disclaimer: The opinions expressed in this blog are solely my own and do not reflect the views or opinions of my employer or any affiliated organizations. The content provided is for informational and educational purposes only and should not be taken as professional advice. While I strive to provide accurate and up-to-date information, I make no warranties or guarantees about the completeness, reliability, or accuracy of the content. Readers are encouraged to verify the information and seek independent advice as needed. I disclaim any liability for decisions or actions taken based on the content of this blog.