MCP Server & Client Design: When Developers Have to Design AI Agent Infrastructure Themselves

June 11, 2026 · 14 min read

After several days of optimizing a DGX Spark and debugging LiteLLM, I've come to understand that "AI infrastructure" doesn't end at running a model. It means designing the whole stack of layers: model serving → routing → agent framework → tool integration. One of the things many developers (myself included) have to learn along the way is the Model Context Protocol (MCP), a standard created by Anthropic so that agents can talk to tools in a plug-and-play fashion.

Today I'll walk through MCP architecture end to end: transport types, server design, client design, and finally how to connect it to our own self-hosted models.

TL;DR

MCP is an open standard that defines how AI agents talk to external tools in a structured way, instead of hardcoding APIs into every agent. This post digs into transport types (stdio / HTTP+SSE / Streamable HTTP), server/client design patterns, the often-overlooked token bloat problem, security considerations for production, and how to connect MCP to our own self-hosted stack (DGX Spark + LiteLLM + Qwen3.6-35B-A3B).

What Is MCP (in Brief)

MCP (Model Context Protocol) is an open standard from Anthropic that defines how an AI agent (such as Claude Code, OpenClaw, or Hermes) talks to external tools and data in a structured way, instead of having APIs hardcoded into the agent.

A simple analogy: if OpenAPI is "the standard for describing REST APIs", then MCP is "the standard for agents calling tools".

Note: MCP is not a framework. It is a protocol, in the same way that HTTP is a protocol and not a web framework. You can build an MCP server in many languages (Python, TypeScript, Go, Rust), and a client can be written in any language that can speak JSON-RPC 2.0.

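Transport Types - How to Choose

You will encounter three MCP transport mechanisms in practice: stdio and Streamable HTTP, which the current spec defines, and the older HTTP+SSE transport that Streamable HTTP replaced. Each suits a different use case.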

Transport Comparison

Transport	Local/Remote	Clients per Server	Complexity	OAuth	Best For
stdio	Local only	1	Low	No	Dev tools, IDE plugins, scripts
HTTP+SSE	Remote	Many	Medium	Limited	Legacy systems (deprecated in favor of Streamable HTTP)
Streamable HTTP	Remote	Many	Medium	Yes (2.1)	Production, enterprise, multi-user

Note: Choose the transport based on your deployment scenario. If the agent and server run on the same machine, stdio is the simplest and fastest option. If you need remote access or multi-user support, default to Streamable HTTP from the start (HTTP+SSE is deprecated).

Designing an MCP Server

When designing an MCP server, I think in terms of three layers: Tools (actions), Resources (data), and Prompts (templates). Each has a distinct role.

MCP Server Lifecycle

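Here is what happens behind the scenes when an agent connects to an MCP server: the client sends an initialize request, the server replies with its capabilities, the client confirms with an initialized notification, and only then does it discover tools (tools/list) and call them (tools/call).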
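
As a rough illustration of that connection sequence, the sketch below builds the JSON-RPC 2.0 messages a client would send, in order. The method names come from the MCP spec; the client name, the message ids, and the chosen protocol revision are assumptions for the example, and the server's responses are left out.

```python
import json

# Assumption: any protocol revision that both client and server support
PROTOCOL_VERSION = "2025-03-26"


def request(msg_id, method, params=None):
    """Build a JSON-RPC 2.0 request (the server answers with the same id)."""
    msg = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def notification(method, params=None):
    """Build a JSON-RPC 2.0 notification (no id, so no response is expected)."""
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def handshake_sequence():
    """Client-side messages, in the order they are sent."""
    return [
        # 1. Negotiate protocol version and capabilities
        request(1, "initialize", {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.1.0"},
        }),
        # 2. Confirm that initialization is complete
        notification("notifications/initialized"),
        # 3. Discover the available tools
        request(2, "tools/list"),
        # 4. Call one of them
        request(3, "tools/call", {
            "name": "get_market_data",
            "arguments": {"symbol": "BTCUSDT", "timeframe": "1h"},
        }),
    ]


if __name__ == "__main__":
    for message in handshake_sequence():
        print(json.dumps(message))
```

Over stdio these messages travel as newline-delimited JSON on the server's stdin; the SDKs hide all of this behind calls such as initialize() and list_tools().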

Code Example: Python (FastMCP)

If you use Python, FastMCP is the easiest SDK to start with:

# Conceptual example - simplified for clarity
import json

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("trading-data-server")


def fetch_from_db(symbol: str, timeframe: str) -> list[dict]:
    """Placeholder for the real data-access layer (not shown in this post)."""
    return []


@mcp.tool()
def get_market_data(symbol: str, timeframe: str = "1h") -> dict:
    """Fetch OHLCV candles for a symbol."""
    data = fetch_from_db(symbol, timeframe)
    return {"symbol": symbol, "candles": data}


@mcp.tool()
def get_account_balance() -> dict:
    """Get current account balance."""
    # Hard-coded sample values
    return {"USD": 12500.00, "BTC": 0.15}


@mcp.resource("market://{symbol}/price")
def current_price(symbol: str) -> str:
    """Live price for any symbol."""
    # Hard-coded sample price; json.dumps avoids hand-escaping braces and quotes
    return json.dumps({"symbol": symbol, "price": 67450.32})


@mcp.prompt()
def analyze_market(symbol: str) -> str:
    """Pre-filled prompt for market analysis."""
    return f"""Analyze {symbol} market conditions:
- Current trend
- Key support/resistance
- Recommended action (LONG/SHORT/HOLD)
- Confidence level (0-100)"""


if __name__ == "__main__":
    # stdio transport (default)
    mcp.run()
    # OR for HTTP: in the official SDK, host and port are FastMCP settings,
    # not run() arguments:
    # mcp = FastMCP("trading-data-server", host="0.0.0.0", port=8000)
    # mcp.run(transport="streamable-http")

Note: In Python SDK 1.x, the @mcp.tool() decorator works much like Flask routes. The function signature automatically becomes the tool schema, and your type hints become the JSON Schema in inputSchema.
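
To make the signature-to-schema idea concrete, here is a minimal standard-library sketch of how type hints could be turned into a JSON-Schema-like dict. It is not the SDK's implementation (which builds on Pydantic and handles far more types); sketch_input_schema and its small type map are hypothetical names invented for this illustration.

```python
import inspect
from typing import get_type_hints

# Assumption: a deliberately tiny type map, enough for the example below
_JSON_TYPES = {
    str: "string",
    int: "integer",
    float: "number",
    bool: "boolean",
    dict: "object",
    list: "array",
}


def sketch_input_schema(func):
    """Derive a JSON-Schema-like dict from a function's parameters."""
    hints = get_type_hints(func)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string"
        prop = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value, so the caller must pass it
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


def get_market_data(symbol: str, timeframe: str = "1h") -> dict:
    """Fetch OHLCV candles for a symbol."""
    return {"symbol": symbol, "candles": []}


schema = sketch_input_schema(get_market_data)

if __name__ == "__main__":
    print(schema)
```

Here symbol ends up in required because it has no default, while timeframe carries its default of "1h". This is also why precise type hints and defaults matter: they are what the model sees when it decides how to call the tool.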

Code Example: TypeScript (Official SDK)

// Conceptual example - simplified for clarity
import { McpServer, ResourceTemplate } from "@modelcontextprotocol/sdk/server/mcp.js"
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js"
import { z } from "zod"

// Placeholder for the real data-access layer (not shown in this post)
async function fetchFromDB(symbol: string, timeframe: string): Promise<unknown[]> {
  return []
}

const server = new McpServer({
  name: "trading-data-server",
  version: "1.0.0",
})

// Tool with Zod schema validation
server.tool(
  "get_market_data",
  "Fetch OHLCV candles for a symbol",
  {
    symbol: z.string().describe("Trading pair, e.g. BTCUSDT"),
    timeframe: z.enum(["1m", "5m", "1h", "1d"]).default("1h"),
  },
  async ({ symbol, timeframe }) => {
    const data = await fetchFromDB(symbol, timeframe)
    return {
      content: [{ type: "text", text: JSON.stringify(data) }],
    }
  }
)

// Resource template: wrapping the URI in ResourceTemplate is what turns
// {symbol} into a variable that is passed to the callback
server.resource(
  "current-price",
  new ResourceTemplate("market://{symbol}/price", { list: undefined }),
  async (uri, { symbol }) => ({
    contents: [{
      uri: uri.href,
      // Hard-coded sample price
      text: JSON.stringify({ symbol, price: 67450.32 })
    }]
  })
)

// Start server
const transport = new StdioServerTransport()
await server.connect(transport)

Note: The TypeScript SDK uses Zod for schema validation. The upside is IDE autocomplete plus runtime type safety; the downside is that it is more verbose than Python when schemas get complex.

Designing an MCP Client

The MCP client is the component inside an agent framework (such as Hermes, OpenClaw, or Claude Code) that manages connections to one or more MCP servers.

Code Example: Python MCP Client

# Conceptual example - simplified
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
import asyncio

async def use_trading_mcp():
    # 1. Connect to server
    server_params = StdioServerParameters(
        command="python",
        args=["trading_mcp_server.py"]
    )
    
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            
            # 2. Initialize
            await session.initialize()
            
            # 3. Discover tools
            tools = await session.list_tools()
            print(f"Available: {[t.name for t in tools.tools]}")
            
            # 4. Call a tool
            result = await session.call_tool(
                "get_market_data",
                arguments={"symbol": "BTCUSDT", "timeframe": "1h"}
            )
            print(f"Result: {result.content[0].text}")
            
            # 5. Read a resource
            resource = await session.read_resource("market://BTCUSDT/price")
            print(f"Current price: {resource.contents[0].text}")

asyncio.run(use_trading_mcp())

Note: An MCP client is more than a "library call". It has to manage the async lifecycle, error recovery, and session reconnection. In a real production system, I recommend using a framework that already handles this (for example LangChain MCP adapters or the MCP support in the OpenAI Agents SDK).

Token Budget Visualization

If tool definitions consume 40-50% of the context, consolidate servers or use tool filtering to reduce the token load.

Token Efficiency - A Big Problem to Design for from the Start

Note: When I designed my first server, I put long descriptions on every tool. The result: 8 tools consumed almost 20K tokens. After cutting each description down to 1-2 lines and merging related tools, that dropped to about 8K tokens, which is workable in practice.

Budget: 32,768 tokens

Component	Tokens	%
System prompt	~500	1.5%
MCP tool schemas	~8,000	24%
Conversation history	~12,000	37%
User prompt	~2,000	6%
LLM response	~1,500	5%
Tool call/result	~8,000	24%
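
The arithmetic behind the table can be reproduced with a few lines of Python. The token counts are the approximate figures from the table; budget_report is a hypothetical helper name.

```python
CONTEXT_BUDGET = 32_768  # context window from the table above

# Approximate token counts, taken from the table
components = {
    "System prompt": 500,
    "MCP tool schemas": 8_000,
    "Conversation history": 12_000,
    "User prompt": 2_000,
    "LLM response": 1_500,
    "Tool call/result": 8_000,
}


def budget_report(parts, budget):
    """Return (percentage share per component, total tokens used, tokens left)."""
    shares = {name: round(100 * tokens / budget, 1) for name, tokens in parts.items()}
    used = sum(parts.values())
    return shares, used, budget - used


shares, used, headroom = budget_report(components, CONTEXT_BUDGET)

if __name__ == "__main__":
    for name, pct in shares.items():
        print(f"{name:<22}{components[name]:>7,} tokens  {pct:>5}%")
    print(f"used {used:,} of {CONTEXT_BUDGET:,}; headroom {headroom:,}")
```

With these figures the components add up to 32,000 tokens, leaving only 768 tokens of headroom, so a single oversized tool result is enough to overflow the context.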

Fixes:

Filter tools by request type
Use tool aliases (short names)
Cache tool schemas (version + hash)
Summarize history (sliding window)
Set max_tokens for tool responses
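
As one possible reading of "filter tools by request type", the sketch below exposes only the tool schemas whose group matches keywords in the user's request, and falls back to the full set when nothing matches. The grouping, the keyword lists, and select_tools are all assumptions made up for this example; a production system might use a classifier or embeddings instead of keywords.

```python
# Hypothetical grouping of tools by request type
TOOL_GROUPS = {
    "market": ["get_market_data"],
    "account": ["get_account_balance"],
    "trading": ["place_order"],
}

# Hypothetical keywords that signal each request type
KEYWORDS = {
    "market": ("price", "candle", "chart", "trend"),
    "account": ("balance", "account", "portfolio"),
    "trading": ("buy", "sell", "order"),
}


def select_tools(user_request, all_tools):
    """Return only the tool definitions relevant to this request."""
    text = user_request.lower()
    wanted = set()
    for group, words in KEYWORDS.items():
        if any(word in text for word in words):
            wanted.update(TOOL_GROUPS[group])
    if not wanted:
        # No keyword matched: send everything rather than nothing
        return list(all_tools)
    return [tool for tool in all_tools if tool["name"] in wanted]


ALL_TOOLS = [
    {"name": "get_market_data", "description": "Fetch OHLCV candles for a symbol."},
    {"name": "get_account_balance", "description": "Get current account balance."},
    {"name": "place_order", "description": "Place an order."},
]

if __name__ == "__main__":
    selected = select_tools("What's my account balance?", ALL_TOOLS)
    print([tool["name"] for tool in selected])
```

Only the selected schemas are then sent to the model, so the tool-definition share of the context shrinks in proportion to how narrow the request is.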

Security Considerations

Note: stdio looks safe, but in reality it means "trust everything local". An MCP server running locally has the same privileges as the user running the agent. If the server wants to write to ~/.ssh/authorized_keys, it can. Always audit the MCP servers you install, just as you would audit npm packages.
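
If you write your own local server, one way to limit the damage is to confine every file operation to an allowed root directory. The sketch below shows the idea with pathlib; resolve_inside and the /data root are assumptions for this example, and the check does not replace OS-level sandboxing or auditing third-party servers.

```python
from pathlib import Path

# Assumption: the only directory this server is allowed to touch
ALLOWED_ROOT = Path("/data")


def resolve_inside(root: Path, user_path: str) -> Path:
    """Resolve user_path under root; refuse anything that escapes it."""
    root = root.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):  # requires Python 3.9+
        raise PermissionError(f"{user_path!r} is outside {root}")
    return candidate


if __name__ == "__main__":
    print(resolve_inside(ALLOWED_ROOT, "reports/2026-06.csv"))
    try:
        resolve_inside(ALLOWED_ROOT, "../home/user/.ssh/authorized_keys")
    except PermissionError as exc:
        print("blocked:", exc)
```

Absolute paths such as /etc/passwd are rejected as well, because joining an absolute path onto the root discards the root entirely.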

OAuth 2.1 for Remote MCP

Note: Always use an external IdP in production, such as Keycloak, Okta, or Auth0. Don't build your own OAuth server, because the security consequences of getting it wrong are severe. The MCP server should only be a "resource server" that validates tokens, not one that issues them.
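
As a sketch of the "resource server" role, the function below checks the claims of an access token after its signature has been verified. It assumes a JWT library has already validated the signature against the IdP's published keys and handed over the claims as a dict. check_claims, the audience URL, and the scope names are illustrative assumptions.

```python
import time


def check_claims(claims, *, audience, required_scopes, now=None):
    """Validate expiry, audience, and scopes of already-verified token claims."""
    now = time.time() if now is None else now

    # Reject expired tokens (a missing exp claim counts as expired)
    if claims.get("exp", 0) <= now:
        raise PermissionError("token expired")

    # The aud claim may be a single string or a list of strings
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        raise PermissionError("token was not issued for this server")

    # The scope claim is a space-delimited string
    granted = set(claims.get("scope", "").split())
    missing = set(required_scopes) - granted
    if missing:
        raise PermissionError(f"missing scopes: {sorted(missing)}")
    return granted


if __name__ == "__main__":
    claims = {
        "exp": time.time() + 300,
        "aud": "http://internal-mcp:8000/mcp",
        "scope": "tools:read trading:execute",
    }
    granted = check_claims(
        claims,
        audience="http://internal-mcp:8000/mcp",
        required_scopes=["trading:execute"],
    )
    print(sorted(granted))
```

Checking the audience matters as much as checking the signature: it stops a token issued for a different service from being replayed against the MCP server.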

Connecting to Self-Hosted Models

Model	Tool call format
GPT-4	`place_order(sym, qty)`
Claude	`place_order(sym, qty, side)`
Gemini	`place_order(sym, qty, type)`
Local	`place_order(...)` ← different!

Problem: each model formats its tool calls differently. Fix: use strict schema validation plus a graceful fallback.
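
One way to combine strict validation with a graceful fallback is to parse the model's arguments, reject anything that does not match the schema, and return an error message that can be fed back to the model for a retry instead of raising. The schema below is a hypothetical one loosely based on the argument names in the table, and parse_tool_call is an invented helper.

```python
import json

# Hypothetical schema for place_order, based on the argument names in the table
PLACE_ORDER_SCHEMA = {
    "required": {"sym": str, "qty": (int, float)},
    "optional": {"side": str, "type": str},
}


def parse_tool_call(raw, schema):
    """Return (arguments, None) on success, or (None, error_message) for the fallback path."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(args, dict):
        return None, "arguments must be a JSON object"

    allowed = {**schema["required"], **schema["optional"]}
    unknown = sorted(set(args) - set(allowed))
    if unknown:
        return None, f"unknown arguments: {unknown}"
    missing = sorted(set(schema["required"]) - set(args))
    if missing:
        return None, f"missing arguments: {missing}"
    for name, value in args.items():
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, allowed[name]):
            return None, f"wrong type for {name!r}"
    return args, None


if __name__ == "__main__":
    print(parse_tool_call('{"sym": "BTCUSDT", "qty": 0.1, "side": "buy"}', PLACE_ORDER_SCHEMA))
    print(parse_tool_call('{"symbol": "BTCUSDT"}', PLACE_ORDER_SCHEMA))
```

The fallback is the second element of the tuple: rather than crashing the agent loop, the error text is sent back to the model as the tool result so it can correct the call, with a cap on how many retries you allow.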

Hermes Agent + MCP: Real Example

Hermes Agent (from Nous Research) is a good example of a self-hosted agent that integrates well with MCP:

# hermes-config.yaml (conceptual)
agent:
  name: "trading-assistant"
  llm:
    provider: openai  # OpenAI-compatible API
    base_url: http://10.0.0.155:4000/v1  # LiteLLM gateway
    model: qwen3.6-35b
    max_tokens: 4000
    temperature: 0.2

mcp_servers:
  # Local stdio server
  - name: filesystem
    transport: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "/data"]
    auth: none  # trust local

  # Remote HTTP server (internal)
  - name: trading-db
    transport: streamable-http
    url: http://internal-mcp:8000/mcp
    auth: oauth2
    client_id: "trading-assistant"
    scopes: ["tools:read", "trading:execute"]
    # token cached + auto-refresh

Note: Hermes Agent and OpenClaw follow similar patterns. The difference is that OpenClaw is a "feature-rich agent runtime", while Hermes emphasizes being "self-improving" and offers a migration path (hermes claw migrate). Choose based on your use case: if you customize heavily, use OpenClaw; if you want behavior that keeps improving over time, use Hermes.

Understand the protocol first
- Read the MCP spec (modelcontextprotocol.io)
- Try the stdio transport with a simple tool
Build a server
- Implement the tools/list endpoint
- Add a handler for tool invocation
- Test with MCP Inspector
Build a client
- Connect to your server
- Handle tool discovery
- Integrate the tools into the LLM prompt
Production considerations
- Add OAuth 2.1 authentication
- Set up audit logging
- Handle errors properly
- Track token usage

What I Would Do Differently Next Time

After a lot of trial and error, here is what I would do differently:

Start with no more than 5 tools: use them for real for 1-2 weeks, then add more
Keep descriptions short from the start: don't "write a lot" and trim later
Plan the transport before deploying: stdio vs. HTTP is not an afterthought
Test token usage from week 1: measure exactly how much the tool definitions consume
OAuth 2.1 from the start: even for internal use, add it early so you don't have to retrofit it
Document your MCP server: a README that clearly states which tools exist and when to use them

Final Thought: MCP Is Not "Magic"

MCP is a good standard. It lets the agent framework and tool integration be decoupled, so you don't have to hardcode APIs into every agent. But it is not "magic": it has to be designed well from the start, otherwise token bloat, security holes, and a maintenance nightmare will follow.

If you are about to start building your own MCP server, I recommend:

Start with stdio: learn the concepts in a local environment first
Use FastMCP (Python) or @modelcontextprotocol/sdk (TypeScript): less friction
Test token usage from the start: tiktoken's cl100k_base encoder gives a quick estimate (your model's own tokenizer will count somewhat differently)
Plan the transport upgrade: start with stdio, then move to HTTP once you need remote access
Use OAuth 2.1 from day one if you are going remote

Note: If you're interested in integrating MCP with a self-hosted model stack, I'm working on exactly that myself (DGX Spark + LiteLLM + Hermes Agent + custom MCP servers). I'll write a follow-up post about the real production setup ✨
