With Agent Skills, Do We Still Need MCP?
- What MCP Actually Does
- What Agent Skills Actually Do
- The Real Difference: Who Owns Control?
- The Hybrid Architecture: Skills as the Brain, MCP as the Hands
- Making It Work: Capability Declaration
- Where Not to Split: The Monolith Has Its Place
- The Current Gap: What’s Missing
- Conclusion: Not Either/Or — It’s a Stack
You just connected 15 MCP servers to your AI agent. GitHub, Slack, databases, file systems, APIs — the works. Context window: 40% consumed before a single user message. The agent sees 120 tool definitions and picks the wrong one half the time. Sound familiar?
Meanwhile, a colleague installs an Agent Skill. One line loads into memory. The agent handles the same task flawlessly — but the skill’s Python script crashes on their machine because pandas isn’t installed.
Two extensibility patterns. Two very real failure modes. And a growing question in the AI agent community: now that we have Agent Skills, do we still need MCP? Or does one make the other redundant?
The answer is more nuanced than a simple yes or no. This post examines both patterns, their architectural trade-offs, and proposes a hybrid architecture that plays to both strengths.
What MCP Actually Does
Model Context Protocol is an open standard — originally introduced by Anthropic — that standardizes how AI agents discover and invoke external tools. Think of it as the USB-C of the agent world: one standard interface that replaces a mess of proprietary integrations.
Before MCP, if you had N agents and M tool providers, you needed N×M custom integrations. MCP reduces this to N+M: each tool provider implements one MCP server, each agent implements one MCP client, and they just work together. This is the exact same value proposition that LSP (Language Server Protocol) brought to code editors — and it worked spectacularly there.
An MCP server exposes tools (functions the agent can call), resources (data the agent can read), and prompts (reusable templates). The protocol handles discovery, invocation, and response formatting through a standardized JSON-RPC interface.
Example: A weather-data MCP server might expose tools like get_forecast(location), get_historical(location, date_range), and get_alerts(region). Any MCP-compatible agent can discover and call these tools without knowing anything about the implementation — whether it uses Open-Meteo, WeatherAPI, or a proprietary data feed.
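To make the discovery mechanism concrete, here is a sketch of the tool metadata such a server might advertise in response to a `tools/list` request. The descriptor shape (`name`, `description`, `inputSchema` as JSON Schema) follows the MCP specification; the tools themselves are the hypothetical weather tools from the example above.

```python
# Sketch of the tool descriptors a weather MCP server might return from
# tools/list. An MCP client discovers and calls these by name without
# knowing whether the backend is Open-Meteo, WeatherAPI, or a
# proprietary feed.
WEATHER_TOOLS = {
    "tools": [
        {
            "name": "get_forecast",
            "description": "Return the multi-day forecast for a location.",
            "inputSchema": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
        {
            "name": "get_historical",
            "description": "Return historical observations for a date range.",
            "inputSchema": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"},
                    "date_range": {"type": "string"},
                },
                "required": ["location", "date_range"],
            },
        },
        {
            "name": "get_alerts",
            "description": "Return active weather alerts for a region.",
            "inputSchema": {
                "type": "object",
                "properties": {"region": {"type": "string"}},
                "required": ["region"],
            },
        },
    ]
}

# What a client sees at discovery time: names and schemas, no internals.
tool_names = [t["name"] for t in WEATHER_TOOLS["tools"]]
```

Note that every one of these descriptors is loaded into the agent's context at connection time, which is exactly the cost discussed below.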
What MCP does well
- Interoperability: Write a tool once, use it from Claude Code, Cursor, OpenClaw, or any MCP-compatible agent
- Encapsulation: The server manages its own dependencies, authentication, and runtime environment
- Discovery: Agents can dynamically discover available tools and their schemas
- Ecosystem growth: The standard enables a marketplace of tools — anyone can publish an MCP server
Where MCP struggles
Context window consumption. This is MCP’s most practical limitation. When an agent connects to MCP servers, it loads all available tool schemas into its system prompt — tool names, descriptions, parameter definitions, everything. Connect 10 MCP servers, each exposing 5–10 tools, and you’ve consumed several thousand tokens before the conversation even starts. Every tool call’s request and response further accumulates in the context window.
No orchestration logic. MCP tools are independent and stateless. A GitHub MCP server doesn’t know a Slack MCP server exists. If you want a workflow like “fetch PR details from GitHub, run analysis, post results to Slack,” the orchestration has to come from somewhere else — either the agent’s own reasoning or a higher-level abstraction.
Eager loading. MCP’s design is inherently eager: connect a server, load all its tools. There’s no standardized mechanism for lazy loading or dynamic tool registration based on the current task. The community is exploring solutions (tool routing layers, dynamic registration/deregistration), but nothing is standardized yet.
Tool proliferation. As the ecosystem grows, agents face a paradox of choice. With 200 available tools, the agent has to read through all their descriptions to decide which one to use — burning context and increasing the chance of picking the wrong tool.
What Agent Skills Actually Do
Agent Skills (also called AgentSkills, based on an open standard by Anthropic) are a fundamentally different kind of extension. While MCP extends what an agent can do, skills extend what an agent knows how to do.
A skill is essentially a domain expert’s decision tree, encoded in a format an agent can internalize. It consists of:
- SKILL.md — The core knowledge file: when to trigger, what steps to follow, what quality standards to apply, what pitfalls to avoid
- scripts/ — Lightweight executable scripts (bash, Python) the skill can invoke
- references/ — Deep reference material loaded on demand
- assets/ — Supporting files (templates, configs, sample data)
The critical mechanism is the description field: a one-line summary that tells the agent when this skill is relevant. The agent framework matches incoming tasks against skill descriptions and loads the full SKILL.md only when there’s a match.
Example: An aws-blog-review skill doesn’t just “check articles” — it contains the complete AWS editorial standards: prohibited language lists, structural requirements, SEO rules, compliance checklists. This domain knowledge can’t be expressed as a single MCP tool call.
What skills do well
- Lazy loading: Only the one-line description lives in memory. The full SKILL.md loads on demand, and unloads after use. This is fundamentally more context-efficient than MCP’s eager approach.
- Domain knowledge encoding: Skills capture how to think about a problem, not just how to execute a function. They encode expert judgment, quality standards, edge cases, and decision logic.
- Multi-tool orchestration: A single skill can coordinate multiple tools in a coherent workflow, with conditional logic, error handling, and fallback strategies.
- Progressive disclosure: Skills can structure knowledge in layers — load the overview first, then drill into references only when needed, minimizing context consumption.

Where skills struggle
Environment dependencies. This is the Achilles’ heel. A skill’s scripts/ directory might contain a Python script that requires boto3, pandas, and matplotlib. But the host machine might not have these installed — or might have the wrong version. Current solutions are ad hoc:
- `uv run --with boto3,pandas python3 scripts/analysis.py` — creates a temporary virtual environment (fast, clean, but only works if `uv` is installed)
- A setup script in the skill — fragile and not standardized
- Assuming the environment is pre-configured — works on your machine, breaks on everyone else's
There is no standardized dependency declaration or runtime resolution mechanism for skills today. This severely limits portability: a skill that works perfectly on one machine may fail silently on another.
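One ad-hoc convention worth sketching: a `# requires:` comment at the top of a skill script, resolved at run time via `uv run --with`. To be clear, this header format is an assumption of this post, not part of any skills standard, and the path below is hypothetical.

```python
# Sketch of resolving a skill script's dependencies from an assumed
# `# requires:` header. The header convention is hypothetical; the
# `uv run --with <pkgs>` invocation is real uv behavior.
from pathlib import Path

def build_run_command(script: Path) -> list[str]:
    """Build a run command from the script's `# requires:` header, if any."""
    deps = []
    for line in script.read_text().splitlines():
        if line.startswith("# requires:"):
            deps = [d.strip() for d in line.split(":", 1)[1].split(",")]
            break
    if deps:
        # Temporary virtualenv with the declared deps: fast and clean,
        # but only works if uv itself is installed on the host.
        return ["uv", "run", "--with", ",".join(deps), "python3", str(script)]
    # No declared deps: fall back to the host interpreter as-is.
    return ["python3", str(script)]
```

A convention like this only solves half the problem: it declares the packages, but says nothing about Python versions, system libraries, or credentials. That is the gap a standard would need to close.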
No standard runtime contract. Unlike MCP servers (which are self-contained processes with their own environments), skills run in the agent’s environment. They inherit whatever Python version, system packages, and permissions the host has. This makes them lighter but more fragile.
The Real Difference: Who Owns Control?
Beyond the technical details, there’s a philosophical difference that matters for architecture decisions.
MCP is tool-centric. The tool provider defines what you can do, what parameters to pass, and what format comes back. The agent is the caller, but the rules are set by the tool. It’s like ordering from a menu — the kitchen decides what’s available.
Skills are agent-centric. The agent decides when to trigger, what workflow to follow, and what quality standards to apply. Tools are just execution means. It’s like cooking at home — the recipe is yours, and you can swap utensils freely.
This control difference determines composability. Two MCP servers are unaware of each other. But a skill can orchestrate multiple tools in a single workflow: “get data from source A, cross-reference with source B, apply business rules, format output, deliver to destination C.” This cross-tool orchestration capability exists only at the skill layer or in the agent’s own reasoning.
The Hybrid Architecture: Skills as the Brain, MCP as the Hands
So, can we combine them? Not only can we — we probably should. The two patterns solve complementary problems, and their weaknesses are each other’s strengths.
The core idea: MCP encapsulates environment dependencies. Skills encapsulate domain logic.
Consider a concrete example: multi-cloud infrastructure monitoring. The current approach crams everything into a single Python script inside a skill — data collection (boto3, azure-sdk), metric calculation (pandas/numpy), visualization (matplotlib), and alerting logic (when to page on-call, what thresholds matter). Environment dependencies, data access, and domain knowledge are all tangled together.
With a hybrid architecture:
MCP layer (solves the environment problem)
- A `cloud-metrics` MCP server encapsulates the cloud SDKs, exposing `get_cpu_utilization`, `get_cost_breakdown`, and `get_alert_history`
- It manages its own Python environment, dependency versions, and credential management
- It can run in a container, on a remote server, or as a Lambda function — the skill never needs to know
Skill layer (solves the cognition problem)
- SKILL.md defines: when to trigger monitoring, the escalation criteria for different severity levels, how to correlate cross-service anomalies
- It calls MCP tools to get data, then applies domain knowledge to make judgments
- It doesn’t need to know how boto3 is installed or what Python version is running
What this solves
- Portability: The `infra-monitor` skill works on a Mac calling a local MCP server or on an EC2 instance calling a remote one — identical logic, different execution backends.
- Reusability: The `cloud-metrics` MCP server can be used by multiple skills: a cost optimization skill, an incident response skill, and a capacity planning skill. Each skill brings different domain logic, but the underlying data source is shared.

The architecture in layers
```
┌─────────────────────────────────────────────┐
│                 Agent Layer                 │
│  Matches tasks to skills, manages context   │
├─────────────────────────────────────────────┤
│                 Skill Layer                 │
│  Domain knowledge, orchestration,           │
│  decisions (SKILL.md + lightweight scripts) │
├─────────────────────────────────────────────┤
│                  MCP Layer                  │
│  Tool execution, environment isolation,     │
│  standardized interfaces                    │
└─────────────────────────────────────────────┘
```
- Skill layer: Pure text knowledge + lightweight scripts (bash/curl-level), zero external dependencies
- MCP layer: Encapsulated capabilities with full runtime environments, standardized interfaces
- Agent layer: Uses skill guidance to invoke MCP tools, manages the conversation
Making It Work: Capability Declaration
For this hybrid to work at scale, we need a mechanism for skills to declare what capabilities they need without binding to specific MCP servers. Think of it like Kubernetes resource requests: declare what you need, let the runtime figure out who provides it.
A skill could declare:
```yaml
requires_capabilities:
  - cloud_metrics
  - chart_generation
  - notification_delivery
```
The agent runtime resolves these against available MCP servers:
- `cloud_metrics` → `aws-cloudwatch-mcp` (local) or `datadog-mcp` (remote)
- `chart_generation` → `matplotlib-mcp` (container) or `plotly-mcp` (local)
- `notification_delivery` → `slack-mcp` or `pagerduty-mcp`
If an MCP server isn’t available, the runtime falls back to the skill’s built-in scripts (if any). Same behavior, different execution path. This gives skills maximum portability: they work in MCP-rich environments with full isolation, and degrade gracefully in simpler setups.
Fallback in practice
Here’s what the resolution flow looks like concretely. Say an infra-monitor skill declares requires_capabilities: [chart_generation] and the agent triggers it:
1. Runtime checks the MCP registry: Is there an MCP server that provides `chart_generation`? → Yes, `matplotlib-mcp` is running locally. Use it. The skill calls `generate_chart(service, period, metrics)` through the MCP interface. The server handles matplotlib, numpy, font rendering — all inside its own environment.
2. MCP server unavailable (not installed, crashed, remote endpoint down): Runtime checks whether the skill has a built-in fallback. The skill's `scripts/` contains `generate_chart.py` with a `# requires: matplotlib, pandas` header. Runtime uses `uv run --with matplotlib,pandas python3 scripts/generate_chart.py` to execute it in a temporary virtual environment. Same output, slightly slower, no pre-configured server needed.
3. No MCP, no script, no `uv`: Runtime reports to the agent: "capability `chart_generation` unavailable." The skill's SKILL.md has a fallback instruction: "If chart generation is unavailable, output the data as a formatted table instead." The agent degrades gracefully — the user still gets the analysis, just without the visual.
This three-tier fallback (MCP → skill script → graceful degradation) means a skill author can target the best-case environment without breaking in the worst case. The skill’s logic is the same at every tier; only the execution path changes.
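The resolution logic itself is small enough to sketch. Everything below is hypothetical glue: the registry shape, the capability names, and the script layout are assumptions for illustration; only the MCP → skill script → graceful degradation ordering comes from the flow described above.

```python
# Sketch of the three-tier capability fallback. The registry is modeled
# as a simple {capability: is_available} map, which is an assumption --
# a real runtime would query live MCP connections.
from pathlib import Path

def resolve_capability(capability: str,
                       mcp_registry: dict[str, bool],
                       skill_dir: Path,
                       uv_available: bool) -> str:
    """Return which execution tier will satisfy `capability`."""
    # Tier 1: a running MCP server provides the capability.
    if mcp_registry.get(capability):
        return "mcp"
    # Tier 2: a bundled skill script, run in a throwaway env via uv.
    script = skill_dir / "scripts" / f"{capability}.py"
    if script.exists() and uv_available:
        return "skill-script"
    # Tier 3: no executable path. Report to the agent so the SKILL.md
    # degradation instructions (e.g. "output a table instead") apply.
    return "degrade"
```

The skill author never sees this branching: the same SKILL.md logic runs regardless of which tier the runtime picks.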
This is fundamentally different from hardcoding requires_mcp: ["cloudwatch-server"] in a skill, which binds to a specific implementation. Capability-based declaration is provider-agnostic.
Where Not to Split: The Monolith Has Its Place
Not every script should be decomposed into MCP tools. If a Python script is a tightly-coupled pipeline — “fetch data → calculate → output conclusions” — with intermediate steps that depend on each other, splitting it into multiple MCP tool calls introduces unnecessary overhead:
- Latency: Each call round-trips back to the agent for the next decision
- Token cost: Each intermediate result consumes context window
- Complexity: The agent has to reason about a multi-step data pipeline that the script handles in milliseconds
The right heuristic: extract reusable data capabilities into MCP; keep one-shot orchestration pipelines in skill scripts. It’s the microservices vs. monolith decision — more granular isn’t always better. Cut at the right boundaries.
| Scenario | MCP? | Skill Script? | Why |
|---|---|---|---|
| Cloud metrics API used by 3+ skills | ✅ | | Reusable, environment-heavy, stable interface |
| 50-line fetch → calculate → format | | ✅ | Tightly coupled, single-purpose |
| Chart generation (matplotlib/plotly) | ✅ | | Heavy dependencies, reusable across skills |
| Conditional workflow with business logic | | ✅ | Domain logic, not a reusable capability |
| Database connector with auth | ✅ | | Credentials + pooling belong in isolated process |
| One-off data transformation | | ✅ | Too specific to be worth the MCP overhead |
A real-world example of when not to split: Consider an infrastructure health check script that queries 15 services, calculates availability percentages for each, cross-references against SLA thresholds, correlates with recent deployment events, and outputs a single go/no-go deployment readiness verdict. The entire pipeline takes 3 seconds as a monolithic script. Splitting it into MCP calls would mean 15+ round-trips back to the agent (one per service), each burning context tokens for intermediate results the user never sees. Keep it as a skill script. Extract the cloud metrics API as MCP for reuse elsewhere, but let the analysis pipeline stay monolithic.
The Current Gap: What’s Missing
For this hybrid architecture to mature, several pieces are still missing from the ecosystem:

- Dependency declaration for skills: There is no standardized way for a skill to declare the runtimes and packages its scripts need. Today's `uv run --with` workaround is ad hoc and assumes `uv` is present.
- Capability-based binding: No standard exists for skills to declare required capabilities, or for agent runtimes to resolve those declarations against available MCP servers and fall back gracefully.
- Lazy loading for MCP: The protocol has no standardized mechanism for loading tool schemas on demand instead of eagerly at connection time, so context cost still scales with connected servers rather than with the task at hand.
Conclusion: Not Either/Or — It’s a Stack
So, do we still need MCP now that we have Agent Skills?
Yes — but for different reasons than you might think.
MCP isn’t just about connecting to tools. It’s about environment isolation and standardized execution. Skills aren’t just about knowing things. They’re about encoding expert judgment and orchestrating complex workflows.
The real question isn’t “MCP or skills?” — it’s “where do you draw the line between them?” Our answer:
- MCP owns the execution boundary: environment management, dependency isolation, standardized tool interfaces, cross-agent interoperability
- Skills own the knowledge boundary: when to act, what steps to follow, how to evaluate quality, how to orchestrate multiple tools into coherent workflows
- The agent owns the decision boundary: matching tasks to skills, resolving capabilities to MCP servers, managing context
Neither replaces the other. MCP without skills is a toolbox without a craftsman’s knowledge — powerful tools sitting idle because no one knows when to use them. Skills without MCP are expertise without reliable hands — knowing exactly what to do but struggling with environment dependencies and tool fragmentation.
The future of agent extensibility isn’t choosing between MCP and skills. It’s building the stack that lets them work together — where skills declare what capabilities they need, MCP servers provide those capabilities with full environment isolation, and agent runtimes handle the binding between them.
We’re not there yet. But the direction is clear.
⚠️ Disclaimer: The views and opinions expressed in this article are my own and do not represent those of my employer. This content is for educational purposes only.