Tech Thoughts
Deep Dives & Analysis
Personal takes on AI trends, cloud engineering, developer experience, and the craft of building software.
Notes on Agent Skill Evolution
Reading notes on six recent systems that let agents turn past experience into reusable skills: Trace2Skill, SkillClaw, CoEvoSkills, SkillRL, Skill-SD, and Hermes. Four sub-questions (extraction, organization, verification, update cadence) and an honest take on what actually ships.
Context Engineering and the Real Agent Problem
The bottleneck for production AI agents is no longer model capability. A deep dive into context engineering: memory architecture, KV-cache optimization, multi-tenancy isolation, and how Anthropic, Google, and AWS approach agent context management differently.
Hermes and the Closed Learning Loop
Most AI agents forget everything between sessions. Hermes Agent takes a different approach: it extracts skills from experience, improves them through GEPA (a reflection-based prompt evolution method from ICLR 2026), and maintains a four-layer memory system with hard token limits. Here is how it compares to OpenClaw, and what I borrowed for Agent Greenhouse.
Graph of Skills and PageRank for Agent Skill Retrieval
When an AI agent has 2,000 skills, semantic search alone misses critical dependencies. A new paper proposes Graph of Skills, which uses Personalized PageRank to walk backward through dependency graphs and retrieve complete skill bundles. It belongs to the same family of algorithms I used in my PhD research on trust propagation.
SKILL0 Puts Skills in the Weights
SKILL0 introduces In-Context RL to internalize agent skills into model parameters. A 3B model matches skill-augmented baselines while using 5.8x fewer tokens per step — with zero skill retrieval at inference time. An analysis of what this means for the agent skills ecosystem.
With Agent Skills, Do We Still Need MCP?
MCP and Agent Skills solve different problems. This post examines both patterns, proposes a hybrid architecture with capability declaration and three-tier fallback, and argues they’re different layers of the same stack.
Is That Agent Skill Any Good? A Complete Framework for Evaluating AI Agent Skills
A four-dimension evaluation framework for agent skills — covering safety, quality, reliability, and regression. Introduces skill-eval, an open-source tool that measures all four dimensions with a single command. Includes CI/CD integration patterns and lifecycle management.
Agent Foundry and Why Heavy Skills Backfire
We tested three skill designs across three domains and found that heavy, checklist-driven skills degraded agent performance by up to 81%. Lightweight knowledge injection improved results by up to 34% while using fewer tokens. The key insight: skills should enhance judgment, not replace it.
How I Built Two AI Agents That Talk to Each Other
A practical guide to multi-agent communication — the architecture, the Brain-Sync Protocol, shared memory via Git, and the hard-won lessons from watching two AI agents collaborate in real time.
How I Actually Use OpenClaw — From Content Pipeline to Foundational Agent
Real workflows, real examples, and what I learned after three weeks of raising a personal AI. From a content pipeline with Claude Code dispatch to Notion as an AI-powered second brain.
SKILL.md Is Quietly Replacing Team Wikis
Agent Skills are now an open standard supported across Claude Code, Codex, Kiro, Strands, and more. But the real story isn’t the tech — it’s how skills are fundamentally changing how teams distribute expertise, how organizations will evolve, and the security challenges we need to solve before it’s too late.
OpenFang on AWS (Part 2): Security Review, Deployment & Lessons Learned
Part 2 of the OpenFang on AWS series. Covers the WAF security review that uncovered critical findings, the fixes applied, full CDK deployment architecture, and lessons learned from putting an open-source agent OS into production on AWS.
OpenFang on AWS — Deploying an Open-Source Agent OS with Amazon Bedrock
A hands-on guide to deploying OpenFang — an open-source autonomous agent OS — on AWS with Amazon Bedrock via LiteLLM proxy. Covers architecture, model routing, CDK infrastructure, and the emerging category of agents that run on schedules and produce deliverables without human prompting.
Deep Dive: Agent Skills & Skill Evaluation — How to Build, Measure, and Trust What Your AI Agent Can Do
A deep dive into how Anthropic and OpenAI independently built agent skill evaluation frameworks — and what their convergence means for the future of agentic AI. Covers the SKILL.md open standard, four-dimension eval framework, two-layer evaluation architecture, and why skill testing is now a governance question.
The Complete Guide to RAG on AWS — Architecture, Deep Dives & Evaluation
A comprehensive 21,000-word guide to Retrieval-Augmented Generation on AWS — covering architecture patterns, chunking strategies, vector search with Amazon OpenSearch and Aurora, Amazon Bedrock Knowledge Bases, evaluation frameworks, and production best practices.
AgentCore Gateway Deep Dive: Architecture, Interceptors, and Semantic Discovery
A deep dive into Amazon Bedrock AgentCore Gateway — its architecture, interceptor system, semantic tool discovery, and enterprise patterns for production agent systems.
Claude Code Best Practices: From Zero to Hero
A comprehensive guide to mastering Claude Code — from environment setup and effective prompting to advanced workflows, custom slash commands, and building your own MCP servers.
Running Claude Agent SDK Inside OpenClaw on AgentCore
Six iterations of trying to run the Claude Agent SDK inside OpenClaw on AgentCore Runtime — from naive subprocess spawning to discovering that the real integration tax is authentication threading. A practical account of what works, what doesn’t, and when to use an agent pattern vs. a simple script.
Amazon Bedrock AgentCore Memory: Isolation Patterns for Multi-Agent Applications
How to architect memory isolation in multi-agent platforms on Amazon Bedrock AgentCore. Covers app-level isolation (Silo vs Pool patterns), user-level isolation via actorId and namespace scoping, IAM defense-in-depth, and end-to-end identity propagation with Cognito.
Deploy OpenClaw on AWS with Amazon Bedrock AgentCore
Walk through deploying OpenClaw as a multi-user, multi-channel AI assistant on AgentCore Runtime. Covers per-user microVM isolation, S3 workspace persistence, webhook ingestion from Telegram and Slack, and defense-in-depth security — all deployed via six CDK stacks.
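OpenClaw In-Depth Technical Analysis: Installation, Security, and Best Practices Guide
An open-source personal AI assistant with 219K+ GitHub stars that can do almost anything for you, but you need to understand its risks first. An in-depth analysis of its technical architecture, 6+ CVE vulnerabilities, supply-chain attack incidents, enterprise-grade deployment on AWS Bedrock, and 20 security best practices.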