Link

Forgemax: an MCP gateway that reduces 76 tools to 2

February 25, 2026 · github

TL;DR

This post covers a tool that lets an LLM call MCP tools without overflowing its context. Instead of putting the full schema of all 76 tools into the prompt (~15,000 tokens), Forgemax needs only ~1,000 tokens for just 2 tools.

Put simply: the tool lets an AI call the APIs of many services (GitHub, Figma, Stripe…) without overloading its context.

Who is this for?

1. People who want AI to do work for them

Problem: You use several MCP servers (GitHub, Figma, Supabase…) but hit context overflow and excessive token consumption.
When you need it: When the AI has to call many tools automatically and you don't want a prompt that runs to thousands of tokens.
What you get: ~93% lower token usage and faster tool calls (1 call instead of 5-10 round-trips).

2. People who want to build a product

Problem: You need to orchestrate several MCP services but are worried about security and scalability.
When you need it: When building an AI agent that has to call many external APIs.
What you get: A secure solution that has been through 3 rounds of security testing and has full audit logging.

Key points

1. Traditional MCP suffers from context bloat
76 tools = ~15,000 tokens of schema.
Every added tool makes the prompt longer.
A single task takes 5-10 round-trips to complete.
→ What to do: Consider Forgemax if you run many MCP servers.

2. Forgemax collapses N servers × M tools into 2 tools
search: queries the capability manifest to discover tools (read-only)
execute: runs JavaScript that calls tools through typed proxy objects
Only ~1,000 tokens instead of ~15,000.
→ What to do: Try configuring Forgemax against 1-2 MCP servers.
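To make the search/execute split more concrete, here is a minimal in-memory sketch of the idea. It is not Forgemax's code or API: the registry contents and the names search, execute, and makeProxy are all hypothetical, and the "servers" are plain local functions rather than real MCP connections. It only shows how two entry points can stand in front of any number of tools.

```typescript
// Hypothetical model of a two-tool gateway. None of these names come from
// Forgemax; they only illustrate the search/execute split.
type ToolFn = (args: Record<string, unknown>) => Promise<unknown>;
type Registry = Record<string, Record<string, ToolFn>>;

// Stand-in for N connected MCP servers with M tools each (fake data).
const registry: Registry = {
  github: {
    listIssues: async (args) => [{ repo: args.repo, title: "Example issue" }],
  },
  stripe: {
    listInvoices: async () => [{ id: "inv_example", paid: false }],
  },
};

// Tool 1: read-only discovery. Returns matching "server.tool" names only,
// so no full schema has to be placed in the prompt.
function search(query: string): string[] {
  const q = query.toLowerCase();
  const hits: string[] = [];
  for (const [server, tools] of Object.entries(registry)) {
    for (const tool of Object.keys(tools)) {
      const name = `${server}.${tool}`;
      if (name.toLowerCase().includes(q)) hits.push(name);
    }
  }
  return hits;
}

// Builds an object where api.<server>.<tool>(args) dispatches to the registry.
function makeProxy(reg: Registry): any {
  return new Proxy({}, {
    get: (_t, server) =>
      new Proxy({}, {
        get: (_t2, tool) => (args: Record<string, unknown> = {}) => {
          const fn = reg[String(server)]?.[String(tool)];
          if (!fn) {
            return Promise.reject(
              new Error(`unknown tool ${String(server)}.${String(tool)}`));
          }
          return fn(args);
        },
      }),
  });
}

// Tool 2: run one script against a fresh proxy, so several tool calls
// can be chained inside a single round-trip.
async function execute(script: (api: any) => Promise<unknown>): Promise<unknown> {
  return script(makeProxy(registry));
}
```

With this sketch, search("issues") returns the single name github.listIssues, and one execute call can chain api.github.listIssues and api.stripe.listInvoices in the same script instead of paying one round-trip per tool. In the real tool the script would be LLM-generated JavaScript running in a sandbox rather than a local closure.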

3. The V8 sandbox is the core innovation
Uses deno_core to run LLM-generated JavaScript in a locked-down V8 isolate.
No fs, network, env, or child process access.
Each execute call gets a fresh runtime, so no state leaks between runs.
Credentials, file paths, and internal state never leave the host.
→ What to do: Read the forge-sandbox section of the docs to understand the security model.

4. Layered discovery helps it scale
Layer 0: Server names + descriptions (~50 tokens)
Layer 1: Categories per server (~200 tokens)
Layer 2: Tool list per category (~500 tokens)
Layer 3: Full schema per tool (~200 tokens each)
→ What to do: Don't load every schema up front; load each one only when it is needed.
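The layer figures above can be turned into a rough budget. The sketch below assumes a single drill-down path (one server, one category) and treats the quoted numbers as exact, which they are not. The function names are hypothetical, and the source does not say how these discovery tokens combine with the ~1,000 tokens for the two tool definitions, so only discovery is counted here.

```typescript
// Approximate per-layer token costs quoted in the post.
const LAYER_TOKENS = {
  servers: 50,        // Layer 0: server names + descriptions
  categories: 200,    // Layer 1: categories for one server
  toolList: 500,      // Layer 2: tool list for one category
  schemaPerTool: 200, // Layer 3: full schema, per tool
};

// All 76 tool schemas loaded up front, as quoted in the post.
const FULL_SCHEMA_TOKENS = 15000;

// Cost of one drill-down path plus the full schemas actually needed.
// Assumption: only one server and one category are visited.
function discoveryCost(toolsNeeded: number): number {
  return LAYER_TOKENS.servers
    + LAYER_TOKENS.categories
    + LAYER_TOKENS.toolList
    + LAYER_TOKENS.schemaPerTool * toolsNeeded;
}

// Fraction of tokens saved compared with loading every schema up front.
function savings(toolsNeeded: number): number {
  return 1 - discoveryCost(toolsNeeded) / FULL_SCHEMA_TOKENS;
}
```

Under these assumptions, drilling down to one tool costs 50 + 200 + 500 + 200 = 950 tokens, and three tools in the same category cost 1,350 tokens, against ~15,000 for the full set.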

5. Multi-layer security model
Code Validator → V8 Bootstrap → V8 Isolate → API Boundary → Manifest Sanitization → Content Size Limits → Error Redaction → Resource Limits → Circuit Breakers → Server Groups → Process Isolation → Binary Security → IPC Protocol → Audit Logging
It has been through 3 rounds of adversarial testing, resolving 19 findings.
→ What to do: Treat it as trustworthy, since it has had a proper security review.
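Two of the layers named above, Content Size Limits and Error Redaction, are easy to illustrate as generic techniques. The sketch below is not taken from Forgemax: the limit value, the regular expressions, and the function names are all assumptions, chosen only to show what "nothing sensitive leaves the host" can look like at the output boundary.

```typescript
// Hypothetical limit; the post does not state Forgemax's actual value.
const MAX_RESULT_CHARS = 200;

// Replace things that look like file paths or credentials before an
// error message is returned to the model. The patterns are illustrative
// and deliberately simple, not exhaustive.
function redactError(message: string): string {
  return message
    // Unix-style paths with at least two segments, e.g. /home/alice/app.json
    .replace(/(?:\/[\w.-]+){2,}/g, "[path]")
    // key=value pairs whose key suggests a secret
    .replace(/\b(token|key|secret)=\S+/gi, "$1=[redacted]");
}

// Cap the size of any result so one tool call cannot flood the context.
function limitSize(text: string, max: number = MAX_RESULT_CHARS): string {
  if (text.length <= max) return text;
  return text.slice(0, max) + `... [truncated ${text.length - max} chars]`;
}
```

For example, redactError("ENOENT: open /home/alice/.config/app.json failed, token=abc123") yields "ENOENT: open [path] failed, token=[redacted]". A real gateway would apply checks like these alongside the other layers in the chain rather than relying on them alone.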