Tech Thoughts - Dr. Melanie Li

Deep Dives & Analysis

Personal takes on AI trends, cloud engineering, developer experience, and the craft of building software.

Agent Skills · Self-Evolving Agents · Research Notes · LLM Agents

Notes on Agent Skill Evolution

Reading notes on six recent systems that let agents turn past experience into reusable skills: Trace2Skill, SkillClaw, CoEvoSkills, SkillRL, Skill-SD, and Hermes. Four sub-questions (extraction, organization, verification, update cadence) and an honest take on what actually ships.

April 18, 2026 · 10 min read

Context Engineering · AI Agents · Production AI · Bedrock AgentCore

Context Engineering and the Real Agent Problem

The bottleneck for production AI agents is no longer model capability. A deep dive into context engineering: memory architecture, KV-cache optimization, multi-tenancy isolation, and how Anthropic, Google, and AWS approach agent context management differently.

April 17, 2026 · 14 min read

Self-Improving Agents · GEPA · Agent Memory

Hermes and the Closed Learning Loop

Most AI agents forget everything between sessions. Hermes Agent takes a different approach: it extracts skills from experience, improves them through GEPA (a reflection-based prompt evolution method from ICLR 2026), and maintains a four-layer memory system with hard token limits. Here is how it compares to OpenClaw, and what I borrowed for Agent Greenhouse.

April 13, 2026 · 12 min read

Research · Agent Skills · Graph Algorithms

Graph of Skills and PageRank for Agent Skill Retrieval

When an AI agent has 2,000 skills, semantic search alone misses critical dependencies. A new paper proposes Graph of Skills, which uses Personalized PageRank to walk backward through dependency graphs and retrieve complete skill bundles. It belongs to the same family of algorithms I used in my PhD research on trust propagation.

April 13, 2026 · ~15 min read · English

Research Paper · Reinforcement Learning · Agent Skills · Skill Internalization

SKILL0 Puts Skills in the Weights

SKILL0 introduces In-Context RL to internalize agent skills into model parameters. A 3B model matches skill-augmented baselines while using 5.8x fewer tokens per step — with zero skill retrieval at inference time. An analysis of what this means for the agent skills ecosystem.

April 8, 2026 · ~15 min read · English

AI Agents · MCP · Agent Skills · Architecture

With Agent Skills, Do We Still Need MCP?

MCP and Agent Skills solve different problems. This post examines both patterns, proposes a hybrid architecture with capability declaration and three-tier fallback, and argues they’re different layers of the same stack.

March 20, 2026 · ~15 min read

Agent Skills · Evaluation · Security · Testing

Is That Agent Skill Any Good? A Complete Framework for Evaluating AI Agent Skills

A four-dimension evaluation framework for agent skills — covering safety, quality, reliability, and regression. Introduces skill-eval, an open-source tool that measures all four dimensions with a single command. Includes CI/CD integration patterns and lifecycle management.

March 2026 · 15 min read · English

Agent Skills · Evaluation · Claude · Bedrock

Agent Foundry and Why Heavy Skills Backfire

We tested three skill designs across three domains and found that heavy, checklist-driven skills degraded agent performance by up to 81%. Lightweight knowledge injection improved results by up to 34% while using fewer tokens. The key insight: skills should enhance judgment, not replace it.

March 2026 · 15 min read · English

Multi-Agent · AI · Communication · OpenClaw

How I Built Two AI Agents That Talk to Each Other

A practical guide to multi-agent communication — the architecture, the Brain-Sync Protocol, shared memory via Git, and the hard-won lessons from watching two AI agents collaborate in real time.

March 14, 2026 · ~15 min read · English

OpenClaw · AI Agents · Productivity

How I Actually Use OpenClaw — From Content Pipeline to Foundational Agent

Real workflows, real examples, and what I learned after 3 weeks of raising a personal AI. From content pipeline with Claude Code dispatch to Notion as an AI-powered second brain.

March 14, 2026 · ~8 min read

AgentSkills · AI · FutureOfWork

SKILL.md Is Quietly Replacing Team Wikis

Agent Skills are now an open standard supported across Claude Code, Codex, Kiro, Strands, and more. But the real story isn’t the tech — it’s how skills are fundamentally changing how teams distribute expertise, how organizations will evolve, and the security challenges we need to solve before it’s too late.

March 12, 2026 · ~16 min read · English

OpenFang · AWS · Security · CDK

OpenFang on AWS (Part 2): Security Review, Deployment & Lessons Learned

Part 2 of the OpenFang on AWS series. Covers the WAF security review that uncovered critical findings, the fixes applied, full CDK deployment architecture, and lessons learned from putting an open-source agent OS into production on AWS.

March 9, 2026 · ~20 min read · English

OpenFang · AWS · Bedrock · Agent OS · LiteLLM

OpenFang on AWS — Deploying an Open-Source Agent OS with Amazon Bedrock

A hands-on guide to deploying OpenFang — an open-source autonomous agent OS — on AWS with Amazon Bedrock via LiteLLM proxy. Covers architecture, model routing, CDK infrastructure, and the emerging category of agents that run on schedules and produce deliverables without human prompting.

March 9, 2026 · ~25 min read · English

Agent Skills · AI Evaluation · Agentic AI · LLMOps

Deep Dive: Agent Skills & Skill Evaluation — How to Build, Measure, and Trust What Your AI Agent Can Do

A deep dive into how Anthropic and OpenAI independently built agent skill evaluation frameworks — and what their convergence means for the future of agentic AI. Covers the SKILL.md open standard, four-dimension eval framework, two-layer evaluation architecture, and why skill testing is now a governance question.

March 9, 2026 · ~20 min read · English

RAG · AWS · Bedrock · Vector Search · LLM

The Complete Guide to RAG on AWS — Architecture, Deep Dives & Evaluation

A comprehensive 21,000-word guide to Retrieval-Augmented Generation on AWS — covering architecture patterns, chunking strategies, vector search with Amazon OpenSearch and Aurora, Amazon Bedrock Knowledge Bases, evaluation frameworks, and production best practices.

March 7, 2026 · ~45 min read · English

AgentCore · Gateway · Architecture

AgentCore Gateway Deep Dive: Architecture, Interceptors, and Semantic Discovery

A deep dive into Amazon Bedrock AgentCore Gateway — its architecture, interceptor system, semantic tool discovery, and enterprise patterns for production agent systems.

March 4, 2026 · ~20 min read · English

Claude Code · Best Practices · Developer Experience

Claude Code Best Practices: From Zero to Hero

A comprehensive guide to mastering Claude Code — from environment setup and effective prompting to advanced workflows, custom slash commands, and building your own MCP servers.

March 3, 2026 · ~15 min read · English

Claude SDK · OpenClaw · AgentCore

Running Claude Agent SDK Inside OpenClaw on AgentCore

Six iterations of trying to run the Claude Agent SDK inside OpenClaw on AgentCore Runtime — from naive subprocess spawning to discovering that the real integration tax is authentication threading. A practical account of what works, what doesn’t, and when to use an agent pattern vs. a simple script.

March 1, 2026 · ~15 min read · English

AgentCore · Memory · Multi-Tenant

Amazon Bedrock AgentCore Memory: Isolation Patterns for Multi-Agent Applications

How to architect memory isolation in multi-agent platforms on Amazon Bedrock AgentCore. Covers app-level isolation (Silo vs Pool patterns), user-level isolation via actorId and namespace scoping, IAM defense-in-depth, and end-to-end identity propagation with Cognito.

February 28, 2026 · ~18 min read · English

AgentCore · OpenClaw · How-To

Deploy OpenClaw on AWS with Amazon Bedrock AgentCore

Walk through deploying OpenClaw as a multi-user, multi-channel AI assistant on AgentCore Runtime. Covers per-user microVM isolation, S3 workspace persistence, webhook ingestion from Telegram and Slack, and defense-in-depth security — all deployed via six CDK stacks.

February 23, 2026 · ~22 min read · English

AI Agent · Security · AWS

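OpenClaw In-Depth Technical Analysis: Installation, Security, and Best Practices Guide

An open-source personal AI assistant with 219K+ GitHub stars that can do almost anything for you, but you need to understand its risks first. An in-depth analysis of its technical architecture, 6+ CVE vulnerabilities, supply-chain attack incidents, an enterprise-grade deployment approach on AWS Bedrock, and 20 security best practices.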

February 23, 2026 · ~25 min read · Chinese