Agentic AI Era Developer: เมื่อเราต้องเก่งเกินกว่าแค่เขียนโค้ด

June 18, 2026 · 14 min read

Author

สารบัญ

TL;DR
The Shift - จาก Developer สู่ AI-Era Developer
ทำไมองค์กรต้องลงทุนสร้าง AI Infra เอง
AI-Era Developer ต้องรู้อะไรบ้าง
1. Model Context Awareness
2. Model Serving Knowledge
3. MCP as Standard
4. Middleware / Routing Layer
5. Security & Governance
Self-Hosted Models + Agents: Pattern ที่ผมใช้
Real Config Example (Hermes Agent + vLLM + LiteLLM)
Sweet Spot Finding - ศิลปะที่ Developer ต้องเรียน
Example: หา Sweet Spot สำหรับ Backtest Bot
ความท้าทายที่ Developer ต้องเจอ
ความท้าทายที่ 1: Token Economics
ความท้าทายที่ 2: Specs ≠ Performance
ความท้าทายที่ 3: Model + Tool Coupling
ความท้าทายที่ 4: กรณีศึกษา MIT — 95% ของ AI Project ล้มเหลวที่ Production
The New Developer Archetype
What's Next - Roadmap สำหรับ Developer
Quick Start: What to Learn First
Conclusion
References
Related Blogs

เมื่อก่อนตอนผมเริ่มเขียนโปรแกรม เครื่องมือที่ต้องรู้คือภาษาโปรแกรม, frameworks, databases, CI/CD - แค่นั้นจบ แต่ตอนนี้ผมต้องปรับแต่ง DGX Spark, แก้ไข LiteLLM proxy, ออกแบบ MCP servers, ทดสอบประสิทธิภาพ token, ตั้งค่า middleware สำหรับ routing ระหว่าง local กับ cloud, แล้วก็ยังต้องเข้าใจรูปแบบ security/auth สำหรับ AI agent อีก - Developer ในยุค Agentic AI ไม่ใช่แค่ developer ธรรมดาอีกต่อไป

วันนี้อยากเล่าเรื่องที่ผมเจอตอนสร้าง "AI Agent Infrastructure" เอง - ตั้งแต่ความคาดหวังขององค์กร, ความท้าทายที่ไม่คาดคิด, ไปจนถึง framework ที่ช่วยให้ชีวิตง่ายขึ้น

TL;DR

Developer ยุค Agentic AI ต้องมีชุดทักษะที่หลากหลายขึ้นมาก ตั้งแต่ model serving, MCP protocol, routing layer, ไปจนถึง security และ cost optimization การลงทุนสร้าง AI infrastructure เองช่วยให้ควบคุม privacy, cost, และ reliability ได้ แต่ต้องเข้าใจ trade-offs ระหว่าง speed, quality, cost และ context — ไม่มี model ที่ดีที่สุด มีแต่ config ที่ตอบโจทย์ use case ที่สุด

The Shift - จาก Developer สู่ AI-Era Developer

ย้อนกลับไปสมัยผมเริ่มเขียนโค้ด (ราวๆ 2015) "developer" หมายถึง:

เขียน Rust / TypeScript / Python ได้
รู้จัก framework (React, Express, Actix)
ออกแบบ database schema
Deploy ผ่าน Docker / K8s
CI/CD pipeline

แต่ตอนนี้ถ้าผมทำงานกับระบบ AI ผมต้องรู้เพิ่มอีก:

2015 Developer	2025 Developer
• Programming language	+ Model selection
• Framework	+ Inference optimization
• Database	+ Token budget awareness
• Docker / K8s	+ vLLM / Ollama
• CI/CD	+ MCP / Agent frameworks
• Monitoring	+ Routing layer
	+ LLM security
	+ Cost tracking
Stack: 5 tools	Stack: 15+ tools
Problem: features	Problem: tokens
Optimization: caching	Optimization: $0/1M tokens

Note: "เขียนโค้ดเก่ง" ไม่พอแล้ว ในยุคนี้ developer ต้องเข้าใจทั้ง software engineering + AI/ML infrastructure + DevOps + Security ของ agent - เป็น "T-shaped engineer" ที่กว้างขึ้นเรื่อยๆ

ทำไมองค์กรต้องลงทุนสร้าง AI Infra เอง

คำถามแรกที่หลายคนถาม: "ใช้ ChatGPT API ก็ได้ ทำไมต้องเสียเงินซื้อ GPU เอง?" - คำตอบขึ้นอยู่กับ use case แต่มี 4 เหตุผลหลักที่ผมเจอจริง

Note: สี่ข้อนี้ไม่ใช่เชิงทฤษฎี - มันคือเหตุผลที่ IDC คาดการณ์ว่า AI infrastructure market จะ > $100B ภายในปี 2028 องค์กรขนาดกลาง-ใหญ่กำลังเปลี่ยนจาก "AI เป็น SaaS" เป็น "AI เป็นความสามารถภายในองค์กร"

AI-Era Developer ต้องรู้อะไรบ้าง

ตอนนี้ developer ที่จะทำงานกับระบบ AI จริงๆ ต้องมีชุดทักษะที่หลากหลายขึ้น - ผมสรุปเป็น 5 ทักษะหลักที่ต้องเรียนรู้:

1. Model Context Awareness

ก่อนหน้านี้ developer คิดแค่ "request → response" - ตอนนี้ต้องคิดเรื่อง context window ด้วย

Source: "We've seen tool metadata take 40-50% of available context" — Feig, The New Stack

Note: ทุก feature ที่เพิ่มต้องคิดเรื่อง tokens ไม่ใช่แค่ "ทำงานได้" แต่ "ทำงานได้ใน context ที่มี" - เพราะถ้า tools + history กิน tokens หมด model จะตอบไม่ได้เลย

2. Model Serving Knowledge

ต้องเข้าใจว่า model เป็น "server" ที่ต้อง deploy, scale, monitor - ไม่ใช่แค่เรียก API

Engine	Use Case	Properties
vLLM	Production	Dense / MoE, high throughput
llama.cpp	Edge / local	FP4/FP8/BF16 quantization
SGLang	Alternative	8K-256K context window
TensorRT-LLM	NVIDIA GPU	Vision / code optimized

Infra Tradeoffs: Throughput vs latency · Single node vs cluster · Memory vs bandwidth · Cost vs performance

Note: ไม่ต้องเป็น ML engineer แต่ต้องอ่านเอกสารได้ - เวลาเลือก model ต้องรู้ว่า "MoE = ประหยัด compute แพง memory", "FP8 = sweet spot บน Blackwell", "MTPs ใช้ k=1 ดีกว่า k=15" พวกนี้ต้องค้นคว้าเอง

3. MCP as Standard

Model Context Protocol (MCP) กลายเป็น de facto standard สำหรับ agent ↔ tool communication - developer ที่ไม่รู้จัก MCP จะตกยุค

Note: MCP ทำให้ "write tool once, use everywhere" - ถ้าเขียน MCP server ดีๆ ตัวเดียว ใช้ได้กับ Claude Code, Hermes, OpenClaw, OpenCode เลย - ไม่ต้องเขียน integration ใหม่ทุก agent

ผมเขียนบล็อกแยกเรื่อง MCP architecture แล้ว (transport types, server design, client patterns) - แนะนำให้อ่านต่อถ้าสนใจ

4. Middleware / Routing Layer

ถ้ามีทั้ง local model และ cloud model ต้องมี middleware ที่จัดการ routing ให้ request ไปถูกที่

Note: "Try local first, promote to cloud" เป็น pattern ที่ทุกคนใช้ - ตาม case study ของ Daniel Vaughan: "A routing layer that tries local models first and promotes to cloud only when necessary is how you survive the transition" - 70% cost reduction เป็นไปได้

เครื่องมือที่ผมใช้:

LiteLLM - gateway ที่ abstract providers (local + cloud)
Aliases - qwen3.6-35b (thinking) vs qwen3.6-35b-gpt (low reasoning)
Caching - Redis cache ช่วย retry pattern

5. Security & Governance

Agent + tool = ฝันร้ายด้านความปลอดภัย ถ้าไม่คิดตั้งแต่ต้น

Note: "stdio MCP servers ดูเหมือน local = safe" แต่จริงๆ มีสิทธิ์เท่ากับ user ที่รัน agent ถ้า MCP server เขียนไฟล์ ~/.ssh/authorized_keys ก็ทำได้ - audit installed MCP servers เสมอ เหมือน audit npm packages

Self-Hosted Models + Agents: Pattern ที่ผมใช้

นี่คือ full stack ที่ผมตั้งค่าเอง - DGX Spark + LiteLLM + MCP + Hermes/OpenClaw

Real Config Example (Hermes Agent + vLLM + LiteLLM)

# ~/.hermes/config.yaml (ตัวอย่าง config)
llm:
  primary: qwen3.6-35b        # via LiteLLM (thinking enabled)
  fallback: gpt-4o-mini       # cloud, only if local fails
  timeout: 60s

# LiteLLM model definitions
model_list:
  - model_name: qwen3.6-35b
    litellm_params:
      model: openai/qwen3.6-35b-base
      api_base: http://10.0.0.246:8000/v1   # ตัวอย่าง IP ภายใน
      reasoning_effort: medium

  - model_name: qwen3.6-35b-gpt
    litellm_params:
      model: openai/qwen3.6-35b-base
      api_base: http://10.0.0.246:8000/v1   # ตัวอย่าง IP ภายใน
      reasoning_effort: low
      max_tokens: 500

# MCP server config
mcp_servers:
  - name: trading-data
    transport: stdio
    command: python
    args: ["./mcp_servers/trading.py"]

  - name: internal-api
    transport: streamable-http
    url: http://internal-mcp:8000/mcp
    auth: oauth2
    scopes: ["tools:read", "trading:execute"]

Note: ใช้ "alias trick" เพื่อให้ 1 model ทำงานได้หลาย config - qwen3.6-35b (thinking) + qwen3.6-35b-gpt (fast) ชี้ไป vLLM endpoint เดียวกัน แค่ config ต่างกันใน LiteLLM - ไม่ต้อง deploy model ซ้ำ

Sweet Spot Finding - ศิลปะที่ Developer ต้องเรียน

นี่คือทักษะที่ผมคิดว่า สำคัญที่สุด ในยุค Agentic - ไม่ใช่แค่ "เขียนโค้ด" แต่ "หา sweet spot ระหว่าง 4 มิติ"

Every use case has a different sweet spot:

Use Case	Quality	Latency	Cost	Best Fit
Trading	High	Mid	$0	Local model + tuned config
Commit msg	Low	Low	$0	Small model, fast inference
Code Review	High	High	$$$	Cloud API for accuracy
Q&A bot	Mid	Low	$0	Cached local responses

Note: ไม่มี "model ที่ดีที่สุด" - มีแต่ "model + config ที่ตอบโจทย์ use case ที่สุด" developer ต้อง:

Define use case ชัดเจน

เลือก model candidate

ทดสอบหลายๆ config (reasoning_effort, max_tokens, temperature)

Benchmark throughput / latency / quality

Optimize จนเจอ sweet spot

Monitor in production

Example: หา Sweet Spot สำหรับ Backtest Bot

Context	Reasoning	Max Tokens	Result
16K	low	500	❌ cut
16K	low	2000	⚠️ tight
32K	low	3000	✅ win
32K	medium	3000	⚠️ slow
32K	low	4000	✅ fast
256K	low	3000	❌ slow

Sweet spot: 32K ctx + low reasoning + 3K tokens — deterministic, fast, not truncated

ความท้าทายที่ Developer ต้องเจอ

ตอนทำ infra เอง ผมเจอ 4 ความท้าทายใหญ่ - อาจเป็น "เรื่องปกติใหม่" ของ developer ยุค AI

ความท้าทายที่ 1: Token Economics

Provider	Monthly	Relative
GPT-4	$3,000-6,000	💸💸💸
Claude Sonnet	$3,000-6,000	💸💸💸
Self-hosted GPU (electricity)	$200-500	💸

วิธีรับมือ:

Use small models for simple tasks
Cache aggressively (Redis, prefix caching)
Tune context window to minimum needed
Use thinking mode only when necessary

Note: "Fine-tuning is back" ไม่ใช่ hype - ตาม case study จริง small fine-tuned model (7B-13B) รัน 1000+ calls/day ได้ที่ cost ต่ำกว่า GPT-4 แบบ orders of magnitude ถ้า fine-tune สำหรับ domain-specific task ได้ quality เทียบเท่า 80-90%

ความท้าทายที่ 2: Specs ≠ Performance

คำกล่าวอ้างทางการตลาด: "128GB unified memory = fast" ความจริง: 273 GB/s memory bandwidth

Hardware	Memory Bandwidth
RTX 5090	1,700 GB/s (6× faster)
Apple M4 Max	~400 GB/s
DGX Spark	273 GB/s ← bottleneck!

สำหรับการสร้าง token ของ LLM bandwidth สำคัญกว่าขนาดหน่วยความจำทั้งหมด 128GB เหมาะกับ context แต่ถ้า bandwidth ต่ำ ก็ใช้งานได้ไม่เต็มประสิทธิภาพ

💡 บทเรียน: Read benchmarks, not spec sheets

Note: เครื่องแพงไม่ได้แปลว่าเร็วเสมอ - DGX Spark มี unified memory สุดอลังการ แต่ bandwidth จำกัด RTX 5090 build ถูกกว่า + bandwidth สูงกว่า 6 เท่า - ต้องดู workload ว่า memory-bound หรือ bandwidth-bound

ความท้าทายที่ 3: Model + Tool Coupling

Model	Tool Call Format
GPT-4	place_order(sym, qty)
Claude	place_order(sym, qty, side)
Gemini	place_order(sym, qty, type)
Local	place_order(...) ← different!

MCP helps, but you still need:

Strict tool schemas (Zod / Pydantic)
Server-side validation
Testing across multiple models
Graceful fallback when tool call fails

Note: ใช้ strict schema + validation ช่วยได้ แต่ต้องยอมรับว่า model ทุกตัวมี "personality" ต่างกัน - เคสที่ tool ใช้ได้กับ GPT-4 อาจพังกับ local model → test matrix เป็นสิ่งจำเป็น

ความท้าทายที่ 4: กรณีศึกษา MIT — 95% ของ AI Project ล้มเหลวที่ Production

Note: MIT Project NANDA (Aug 2025) - "The GenAI Divide" - 95% ของ AI projects ล้มเหลวที่ production ไม่ใช่เพราะ model ไม่ดี แต่เพราะ:

No clear use case

Skip evaluation

Skip monitoring

Skip iteration

Demo-to-production gap คือปัญหาจริง ไม่ใช่แค่ "model ไม่เก่งพอ"

The New Developer Archetype

สรุปแล้ว - developer ในยุค Agentic AI ต้องเป็น "AI Infrastructure Engineer" ที่มีทักษะหลายด้าน

What's Next - Roadmap สำหรับ Developer

ถ้าคุณเพิ่งเริ่มต้นเป็น AI-era developer นี่คือ roadmap ที่ผมแนะนำ:

Note: ไม่ต้องเรียนรู้ทุกอย่างพร้อมกัน เริ่มจาก foundation ก่อน (MCP, local model), แล้วค่อยขยาย ใช้เวลา 3-4 เดือนถึงจะคล่อง - แต่ถ้าผ่านแล้วจะเป็น developer ที่หายากมากในตลาด

Quick Start: What to Learn First

ถ้าคุณเพิ่งเริ่มต้นกับ AI infrastructure ให้ตาม roadmap 4 เดือนนี้ ถ้ามี use case เฉพาะทางแล้วสามารถข้ามไปตามความเหมาะสมได้เลย

Conclusion

ยุค Agentic AI ไม่ได้แทนที่ developer - แต่ทำให้ developer ต้อง evolve คนที่ยังเขียนแค่ CRUD app จะถูก AI ช่วยได้ แต่คนที่ออกแบบ AI infrastructure, tune model per use case, build custom MCP servers, balance local vs cloud routing, แล้วก็รู้จัก sweet spot ระหว่าง speed/cost/quality - คนพวกนี้จะหายากและมีค่ามาก

ผมเชื่อว่า "AI-era developer" ไม่ใช่ role ใหม่ - แต่เป็นวิวัฒนาการของ software engineer ที่ต้องปรับตัว ถ้าใครกำลังจะเริ่ม ผมแนะนำ:

เริ่มเล็กๆ - Ollama + Open WebUI เล่น local
สร้างอะไรซักอย่างจริงจัง - MCP server สำหรับ use case ตัวเอง
วัดทุกอย่าง - latency, cost, tokens, quality
ปรับปรุงซ้ำ - fine-tune ไปเรื่อยๆ
แบ่งปัน - blog, community, contribute back

ไม่มี shortcut - แต่ก็ไม่ยากเกินจะเรียน แค่ต้อง commit เวลาเข้าใจ ecosystem ใหม่

สิ่งที่ผมจะเขียนต่อไป:

MCP Server & Client Design - transport types, pseudo code, security patterns
เจาะลึก production stack - DGX + LiteLLM + Hermes + custom MCP
Cost optimization case study - วงจรลองใช้ local ก่อน → ขยับไป cloud ที่ทำงานจริง

ถ้ามีคำถามหรืออยากให้เจาะลึกเรื่องไหน บอกได้เลยครับ!

References

Honcho Memory Layer - memory infrastructure ที่ผมใช้กับ AI agents (reasoning-first, MCP-native, self-host)

agentic-ai mcp model-engineer llm-infrastructure devops

แชร์บทความ

Facebook X

☕

เนื้อหานี้มีประโยชน์ไหม? ช่วยสนับสนุนค่ากาแฟให้ผู้เขียนสักแก้ว

Buy Me a Coffee

สารบัญ

TL;DR​

The Shift - จาก Developer สู่ AI-Era Developer​

ทำไมองค์กรต้องลงทุนสร้าง AI Infra เอง​

AI-Era Developer ต้องรู้อะไรบ้าง​

1. Model Context Awareness​

2. Model Serving Knowledge​

3. MCP as Standard​

4. Middleware / Routing Layer​

5. Security & Governance​

Self-Hosted Models + Agents: Pattern ที่ผมใช้​

Real Config Example (Hermes Agent + vLLM + LiteLLM)​

Sweet Spot Finding - ศิลปะที่ Developer ต้องเรียน​

Example: หา Sweet Spot สำหรับ Backtest Bot​

ความท้าทายที่ Developer ต้องเจอ​

ความท้าทายที่ 1: Token Economics​

ความท้าทายที่ 2: Specs ≠ Performance​

ความท้าทายที่ 3: Model + Tool Coupling​

ความท้าทายที่ 4: กรณีศึกษา MIT — 95% ของ AI Project ล้มเหลวที่ Production​

The New Developer Archetype​

What's Next - Roadmap สำหรับ Developer​

Quick Start: What to Learn First​

Conclusion​

References​

Related Blogs​