Claude Code Skills 2026 — Progressive Disclosure and How to Standardise Workflow for Engineering Teams
Posted on: 4/16/2026 11:11:27 PM
Table of contents
- 1. The Problem of Scaling Claude Code Across a Team
- 2. Timeline: From System Prompt to Skills
- 3. Progressive Disclosure — The Core Engineering Principle
- 4. Anatomy of a Skill: Detailed Structure
- 5. Skills vs CLAUDE.md vs Sub-Agents vs Hooks
- 6. Examples from Anthropic's First-Party Skills
- 7. Building a Custom Skill for a Team: a Real-World Case Study
- 8. Discovery: How Claude Finds the Right Skill
- 9. Skills in the 2026 Plugins Architecture
- 10. Best Practices for High-Quality Skills
- 11. Governance, Security, and Risk Control
- 12. Rollout Patterns for 30–100-Engineer Organisations
- 13. Limits and Anti-Patterns to Avoid
- 14. The Future of Skills and the 2026+ Landscape
- 15. Conclusion: Skills Are the Standard Library of Agentic Engineering
1. The Problem of Scaling Claude Code Across a Team
When one engineer uses Claude Code alone, things are easy: open a project, type a few messages, tweak the CLAUDE.md, and the agent picks up your personal conventions. But as the team grows to ten, thirty, then a hundred engineers, three problems surface. First, instruction-file bloat: every project ends up with dozens of conventions for commits, code style, deploys, testing, and PR review, and dumping all of it into CLAUDE.md produces a multi-thousand-line file the agent has to chew through on every question, even when it's only tweaking CSS. Second, duplicated knowledge: each developer writes their own prompt for the same recurring task (running a security review, generating migration SQL), with no shared standard. Third, poor reusability: a good workflow for Team A is hard to package so Team B can use it without copy-pasting disjointed prompts.
This is exactly the gap Skills was designed to fill. Announced alongside the major updates to Claude Code and the Agent SDK in late 2025, Skills is a unit that packages one agent capability as a folder containing markdown, with a name, a description, and optional scripts or supporting assets. Claude doesn't load the whole Skill into context up front — the agent only sees the name and a short description, and when your question matches a Skill, only then does it read the detailed content. This mechanism has an official name: Progressive Disclosure, and it is the key that allows a single agent to carry hundreds of capabilities while keeping its context lean.
This article dives into Skills in 2026 from three angles: the engineering mechanism (how progressive disclosure actually works), the practical how-to (building a Skill for a team, testing it, versioning it, packaging it into a plugin), and the role of Skills in the CLAUDE.md — Sub-Agents — Hooks — Skills quartet. You'll know exactly when to reach for a Skill instead of the other tools, what a canonical folder layout looks like, how to write a description so the agent self-activates at the right moment, and how to govern dozens of Skills in an organisation without chaos.
2. Timeline: From System Prompt to Skills
- CLAUDE.md at the project root, loaded automatically every session. "Project-level memory" at last. But the problem remained: the file grew long, wasn't split by context, and was duplicated across projects.
- .claude/commands/*.md: typing /review-pr loads the file content as a prompt. Useful but manual, since users had to know which commands existed and type them by name.
- Hooks (PreToolUse, PostToolUse). Powerful tools but hard to share across teams, since each team built its own.
3. Progressive Disclosure — The Core Engineering Principle
Progressive Disclosure is an idea that originates in UI design: don't dump everything at once — reveal information in steps as the user needs it. Applied to prompt design for an LLM, the idea becomes: don't load all instructions into context; only load the slice relevant to the task at hand. A context window, even at 1M tokens, is still precious — attention drops off in far regions, token cost scales linearly, and noise hurts accuracy.
Skills implement progressive disclosure through three distinct tiers:
graph TD
START[User message arrives] --> T1
subgraph T1["Tier 1: Metadata (always in context)"]
M1[name + description of ALL available skills]
end
T1 --> DECIDE{Claude decides<br/>which skill fits?}
DECIDE -- No match --> NORMAL[Respond normally]
DECIDE -- Matches 1 skill --> T2
subgraph T2["Tier 2: SKILL.md (loaded on activation)"]
B1[Full body: how-to, code samples, script paths]
end
T2 --> ACT[Claude starts executing]
ACT --> NEED{Supporting files needed?<br/>references, scripts, assets}
NEED -- Yes --> T3
subgraph T3["Tier 3: Supporting files (loaded on demand)"]
S1[references/*.md, scripts/*.py, assets/*.png]
end
T3 --> ACT
NEED -- No --> DONE[Task done]
Tier one is metadata: each Skill has a name and description pair listed in the system prompt. With ten or twenty Skills, this typically costs 1–2 thousand tokens — negligible. Tier two is SKILL.md: the main markdown file with detailed instructions. It is loaded into context only when Claude sees your question matching the description. Tier three is supporting files: accompanying files (long reference docs, Python scripts, templates, assets) that SKILL.md points to — read only when genuinely needed during execution.
The subtlety is that Claude reads a Skill the way a person reads a document: not all at once, but choosing which section to read based on the current goal. A "write-pdf-report" Skill might have a 3000-word SKILL.md and 10 reference files totalling 50,000 words — but for a specific session, the agent may only actually load 5000 words in total. Average token cost drops sharply compared with cramming everything into the system prompt up front.
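To make the arithmetic concrete, here is a small sketch using the figures from the write-pdf-report example. Only the two totals (3,000 and 50,000 words) come from the text; the per-file split and the choice of which reference a session opens are assumptions for illustration.

```python
# Hypothetical per-file word counts; only the totals (3,000 and 50,000) come from the example.
SKILL_MD_WORDS = 3_000
REFERENCE_WORDS = [2_000, 3_000, 4_000, 5_000, 5_000, 5_000, 6_000, 6_000, 7_000, 7_000]


def words_loaded(opened_refs: list[int]) -> int:
    """Words in context once the skill activates and opens the given reference files."""
    return SKILL_MD_WORDS + sum(REFERENCE_WORDS[i] for i in opened_refs)


everything = SKILL_MD_WORDS + sum(REFERENCE_WORDS)  # load-it-all baseline
one_session = words_loaded([0])                     # a session that needs one short reference

print(everything, one_session, round(one_session / everything, 3))  # 53000 5000 0.094
```

Under these assumptions, the session carries under a tenth of the words it would if the whole Skill were loaded up front.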
Principle: the context window is not a parking lot
Don't treat 1M tokens as a closet. Model attention is not uniformly distributed — tokens near the start and end attract more than those in the middle of a long scroll. Progressive disclosure ensures the "hot" slice of context is always the most relevant for the task at hand. The result: higher accuracy, lower cost, lower latency.
4. Anatomy of a Skill: Detailed Structure
Physically, a Skill is a folder containing at least a SKILL.md. The folder name is the Skill name. A fully fleshed-out professional Skill typically looks like:
pptx/
├── SKILL.md                      (required: frontmatter + body)
├── references/
│   ├── python-pptx-guide.md      (long reference material)
│   └── design-principles.md      (slide design principles)
├── scripts/
│   ├── create_slides.py          (main execution script)
│   └── validate_output.py        (output validator)
└── assets/
    ├── template-modern.pptx      (sample template)
    └── fonts/Inter.ttf           (embedded font)
The heart of the Skill is SKILL.md, which must have a YAML frontmatter block at the top with at least two fields, name and description:
---
name: pptx
description: Use this skill whenever the user wants to create, read,
  edit, or manipulate PowerPoint presentations (.pptx files). Triggers
  include mentions of "PowerPoint", "slide deck", "presentation" or
  requests to produce professional slide presentations.
---
# PowerPoint Skill
## When to use
This skill applies when the user's task involves...
## Creation workflow
1. Read `references/python-pptx-guide.md` for API details
2. Use `scripts/create_slides.py` to generate the base file
3. Apply templates from `assets/` if user specifies a style
...
Three parts need careful design:
| Part | Purpose | Requirements |
|---|---|---|
| name | Short identifier, used as a key and in logs | Kebab-case, no punctuation, short (< 30 chars) |
| description | Decides when the Skill is activated | 1–3 sentences, describing the task AND the trigger signals (keywords) |
| Body (markdown) | The instructions executed when the Skill activates | Clear, with examples, concrete commands, and references to supporting files when long |
Description is the most important piece because Claude reads it to decide activation. A weak description ("pptx skill for PowerPoint") leaves the agent unsure when to use it. A good description clearly states the task (create/read/edit PowerPoint), trigger keywords (mentions of "PowerPoint", "slide deck"), and sometimes anti-triggers ("do not use for PDF exports").
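The requirements in the table are easy to check mechanically before a Skill reaches review. The sketch below is a minimal linter under stated assumptions: the frontmatter is flat `key: value` pairs with indented continuation lines, and the tiny parser is a stand-in for a real YAML library. The function names are hypothetical.

```python
import re

KEBAB = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse a flat `key: value` block; indented lines continue the previous value.

    Deliberately tiny; this is not a YAML implementation.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter line")
    fields: dict[str, str] = {}
    key = None
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        if line[:1].isspace() and key is not None:
            fields[key] += " " + line.strip()
        elif ":" in line:
            key, _, value = line.partition(":")
            key = key.strip()
            fields[key] = value.strip()
    raise ValueError("frontmatter is not closed with '---'")


def check_skill(text: str) -> list[str]:
    """Return a list of problems; an empty list means the metadata looks fine."""
    fields = parse_frontmatter(text)
    problems = []
    name = fields.get("name", "")
    if not KEBAB.match(name):
        problems.append("name must be kebab-case")
    if len(name) >= 30:
        problems.append("name should be shorter than 30 characters")
    if not fields.get("description"):
        problems.append("description is required")
    return problems


sample = """---
name: pptx
description: Use this skill whenever the user wants to create, read,
  edit, or manipulate PowerPoint presentations (.pptx files).
---
# PowerPoint Skill
"""
print(check_skill(sample))  # []
```

A check like this catches format mistakes only. Whether the description triggers at the right moments still has to be tested against real prompts.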
The description trap: write like a label, not an ad
The description is not marketing — don't write "An amazing skill that handles PowerPoint lightning-fast". Write it like onboarding notes for a new colleague: "Use this skill when the user asks to create/read/edit a .pptx file or slide deck". Claude matches semantically — the more specific the task and context, the more accurate the match.
5. Skills vs CLAUDE.md vs Sub-Agents vs Hooks
These tools (the quartet above plus Custom Commands) are easily confused because they all "teach" Claude Code. But each serves a different purpose, and they work together:
| Tool | Loading tier | Activation trigger | Typical use case |
|---|---|---|---|
| CLAUDE.md | System prompt, always | Every message in the session | Fixed project conventions: commit style, where tests live, language preferences |
| Custom Commands | Inline when user types /command | User-initiated | Repeat shortcuts: /review-pr, /write-migration |
| Sub-Agents | Separate session, isolated context | Spawned by the parent agent | Offload heavy work from main context: research agent, review agent |
| Hooks | Scripts outside Claude, running in the harness | On events (PreToolUse, PostToolUse...) | Auto-run a linter after edit, block dangerous tools, emit notifications |
| Skills | Progressive: metadata → body → files | Claude decides when the task matches the description | Capabilities with a workflow: handle pptx, call an internal API, deploy an app |
The rule of thumb is simple: if the guidance applies to every question in the project, use CLAUDE.md. If it's a user-typed shortcut, use a Command. If you need context isolation or parallel work, use a Sub-Agent. If you need automation outside Claude (running a script after a tool call), use a Hook. If it's a capability that self-activates based on context, a Skill is the right choice.
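Written as code, the rule of thumb is a short chain of checks in the order the paragraph gives them. This is only a sketch for thinking through a new piece of guidance; the `Need` fields and `pick_mechanism` are hypothetical names, not part of Claude Code.

```python
from dataclasses import dataclass


@dataclass
class Need:
    applies_to_every_message: bool = False
    user_types_shortcut: bool = False
    needs_isolated_context: bool = False
    runs_outside_claude: bool = False


def pick_mechanism(need: Need) -> str:
    """Walk the rule of thumb in the order the paragraph states it."""
    if need.applies_to_every_message:
        return "CLAUDE.md"
    if need.user_types_shortcut:
        return "Command"
    if need.needs_isolated_context:
        return "Sub-Agent"
    if need.runs_outside_claude:
        return "Hook"
    return "Skill"  # a capability that self-activates based on context


print(pick_mechanism(Need(runs_outside_claude=True)))  # Hook
print(pick_mechanism(Need()))                          # Skill
```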
Real-world composition usually looks like this: CLAUDE.md pins team conventions, Skills provide context-sensitive capabilities (produce an Excel report, generate migration SQL, run a security review), Sub-Agents handle background work (research, large test suites), and Hooks enforce safety rails (auto-format, forbid editing .env). None replaces another; they are four layers of one architecture.
graph LR
USER[User message] --> HARNESS[Claude Code harness]
HARNESS --> HOOK1[Hook: PreToolUse?]
HARNESS --> AGENT[Main Claude agent]
CLAUDE_MD[CLAUDE.md<br/>always present] --> AGENT
SKILLS_META[Skills metadata<br/>always present] --> AGENT
AGENT -- "task matches description" --> SKILL_BODY[SKILL.md loaded]
SKILL_BODY --> SKILL_FILES[references/ scripts/ assets/]
AGENT -- "heavy work" --> SUBAGENT[Sub-Agent separate session]
AGENT -- "user types /cmd" --> CMD[Inline Command]
AGENT --> TOOL[Tool call]
TOOL --> HOOK2[Hook: PostToolUse]
6. Examples from Anthropic's First-Party Skills
Anthropic open-sourced its own Skill set — the most practical learning source because these have been exercised at scale and follow every best practice.
pptx / docx / pdf / xlsx are four office-document Skills with a similar structure: a SKILL.md describing the create/read/edit workflow, a handful of reference files in references/, and a Python script in scripts/ using python-pptx, python-docx, pypdf, openpyxl. An interesting point: these Skills don't bundle the Python libraries — they assume pip install is available and take responsibility only for "directing" Claude to use the library correctly. The Skill stays lean (a few hundred KB) but has large capabilities because it rides the existing Python ecosystem.
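Because the libraries are assumed rather than bundled, a script in a Skill's scripts/ folder could start with a preflight check like the one below. This is a sketch of the idea and is not taken from Anthropic's repository; `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec

# Import names for the four libraries mentioned above
# (python-pptx imports as `pptx`, python-docx as `docx`).
OFFICE_MODULES = ["pptx", "docx", "pypdf", "openpyxl"]


def missing_modules(names: list[str]) -> list[str]:
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(OFFICE_MODULES)
if missing:
    print("install first:", ", ".join(missing))
```

Failing early with a clear message gives the agent something actionable (run pip install) instead of a stack trace halfway through the workflow.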
skill-creator is a Skill used to create new Skills — a superb demonstration of the "agent improves its own capabilities" mindset. Its content is a five-step workflow: ask the user about the purpose, scaffold the folder structure, write a compliant SKILL.md with frontmatter, add supporting files if needed, and run random tests to confirm the description is distinct enough.
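The scaffolding step is simple enough to sketch by hand. The function below creates the folder layout from section 4 with a stub SKILL.md. It illustrates the step and is not skill-creator's actual implementation; the stub headings are an assumption.

```python
from pathlib import Path
import tempfile


def scaffold_skill(root: Path, name: str, description: str) -> Path:
    """Create <root>/<name>/ with a SKILL.md stub and the three optional folders."""
    skill_dir = root / name
    for sub in ("references", "scripts", "assets"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    skill_md = skill_dir / "SKILL.md"
    skill_md.write_text(
        f"---\nname: {name}\ndescription: {description}\n---\n"
        f"# {name}\n\n## When to use\n\n## Workflow\n",
        encoding="utf-8",
    )
    return skill_md


# Demo in a throwaway directory so nothing is written into a real project.
with tempfile.TemporaryDirectory() as tmp:
    path = scaffold_skill(Path(tmp), "add-entity", "Use when the user asks to add a new domain entity.")
    print(path.read_text(encoding="utf-8"))
```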
consolidate-memory is a very short Skill (a few dozen lines) but it hints at a powerful pattern: using a Skill to encode the agent's own internal workflow. It instructs Claude to review a memory file (MEMORY.md), merge duplicate entries, and prune outdated items. No library, no script required — just a specialised prompt that activates when the user says "consolidate memory" or "clean up my memory files".
A Skill as a Standard Operating Procedure
If your team already has a workflow written up as an SOP PDF, you can package it almost verbatim as a Skill. The only change: replace "review the print-out" with "read SKILL.md", and swap manual steps for tool calls (Read, Bash, Edit). The agent inherits your team's vocabulary directly.
7. Building a Custom Skill for a Team: a Real-World Case Study
Suppose your team builds a .NET 10 SaaS with a Vue 3 frontend. Every time you add a new entity, the manual workflow includes: create an EF Core migration, produce a DTO, write a FastEndpoints endpoint, add a Vue composable that calls the API, and update the OpenAPI spec. This repeats often, costs 20 minutes each time, and it's easy to forget a step. That's an ideal Skill candidate. Call it add-entity:
.claude/skills/add-entity/
├── SKILL.md
├── references/
│   ├── fastendpoints-pattern.md
│   └── vue-composable-template.md
└── scripts/
    └── validate-migration.sh
SKILL.md frontmatter:
---
name: add-entity
description: Use when the user asks to add a new domain entity (table,
  model, resource) to the SaaS. Covers EF Core migration, DTO, FastEndpoints
  endpoint, OpenAPI update, and Vue composable. Triggers on phrases like
  "add entity", "add a new table", "new resource for", "scaffold CRUD".
---
# Add Entity Workflow
## Context
Our codebase uses:
- .NET 10 with FastEndpoints (see `references/fastendpoints-pattern.md`)
- EF Core 10 with per-module schema (ordering, billing, catalog)
- Vue 3.6 with Pinia store and composables (see `references/vue-composable-template.md`)
## Checklist
1. **Confirm module ownership**: ask the user which module owns the entity
2. **Domain class**: create `Modules/{Module}/Domain/Entities/{Name}.cs`
3. **Migration**: run `dotnet ef migrations add Add{Name} -c {Module}DbContext`
4. **DTO**: `{Module}.Contracts/Dtos/{Name}Dto.cs` - keep minimal, no internal fields
5. **Endpoint**: `{Module}.Api/Endpoints/{Name}/` with List/Get/Create/Update/Delete
6. **OpenAPI**: run `dotnet run --project tools/openapi` to regenerate
7. **Vue composable**: `src/composables/use{Name}.ts` following template
8. **Pinia store**: only if the entity has cross-page shared state
## Validation
After generation, run `./scripts/validate-migration.sh` to ensure:
- Migration file exists
- DTO has no EF navigation properties
- Vue composable uses TanStack Query, not raw fetch
When a developer types "I want to add a Subscription entity to the Billing module", Claude sees a description match, loads the SKILL.md, and follows the checklist. A few material differences compared with scattered ad-hoc prompts:
- Knowledge is versioned in git: the team edits the Skill, reviews the PR, merges — the whole team benefits.
- Newcomers don't need to memorise a chain of prompts — the Skill self-activates when they speak natural language.
- You can include a validation script at the final step, invoked by the Skill through the Bash tool, so the output is checked against team constraints.
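The case study names the validator validate-migration.sh but does not show its contents. As a rough idea of what the three checks involve, here is a Python sketch. The detection heuristics (`virtual` or `ICollection<` for navigation properties, an import from `@tanstack/vue-query` for the composable) and the function names are assumptions for illustration.

```python
import re
from pathlib import Path


def dto_has_navigation_property(dto_source: str) -> bool:
    """Heuristic (assumption): navigation properties appear as `virtual` members or ICollection<T>."""
    return bool(re.search(r"\bvirtual\b|\bICollection<", dto_source))


def composable_uses_tanstack(ts_source: str) -> bool:
    """True when the composable imports @tanstack/vue-query and never calls fetch( directly."""
    imports_query = "@tanstack/vue-query" in ts_source
    raw_fetch = re.search(r"(?<!\w)fetch\(", ts_source) is not None
    return imports_query and not raw_fetch


def validate(migrations_dir: Path, name: str, dto_source: str, ts_source: str) -> list[str]:
    """Return human-readable failures for the three checks; an empty list means pass."""
    failures = []
    if not list(migrations_dir.glob(f"*_Add{name}.cs")):
        failures.append(f"no migration file for Add{name}")
    if dto_has_navigation_property(dto_source):
        failures.append("DTO contains EF navigation properties")
    if not composable_uses_tanstack(ts_source):
        failures.append("composable should use TanStack Query, not raw fetch")
    return failures
```

Returning a list of plain-language failures, rather than a bare exit code, gives the agent something it can act on in the next step.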
8. Discovery: How Claude Finds the Right Skill
An important question: if Claude only sees metadata, how does it pick the right Skill when the user's question is ambiguous? The actual mechanism combines three signals:
- Description match: Claude reads every description and performs an implicit match between the question's intent and the keywords/context in the descriptions. The more specific the description is about the task and trigger, the stronger the match.
- Trigger phrases: some Skills explicitly include "Triggers on phrases like ..." in their description. A blunt way to "teach" Claude how to recognise them.
- Negative signals: a description can list anti-triggers ("do NOT use for X"). Very handy when two Skills are similar (e.g. pdf-reader vs pdf-writer).
Skills have no hard "priority" — Claude picks based on semantic match. When two Skills both match, the agent generally picks the tighter match, or loads both if the task is complex. In ambiguous cases, it may ask the user to confirm.
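Claude's matching is semantic, so no script can reproduce it. What authors can do is lint their own descriptions for explicit trigger phrases shared by two Skills, which is where ambiguity tends to start. A minimal sketch, assuming trigger phrases are written in double quotes as in the examples in this article (the skill descriptions below are made up):

```python
import re
from itertools import combinations


def trigger_phrases(description: str) -> set[str]:
    """Collect the double-quoted phrases in a description, lowercased."""
    return {p.lower() for p in re.findall(r'"([^"]+)"', description)}


def shared_triggers(descriptions: dict[str, str]) -> dict[tuple[str, str], set[str]]:
    """Map each pair of skill names to the trigger phrases they have in common."""
    clashes = {}
    for (a, da), (b, db) in combinations(sorted(descriptions.items()), 2):
        common = trigger_phrases(da) & trigger_phrases(db)
        if common:
            clashes[(a, b)] = common
    return clashes


skills = {
    "pdf-reader": 'Use to extract text from PDFs. Triggers on "read pdf", "pdf".',
    "pdf-writer": 'Use to generate PDFs. Triggers on "export pdf", "pdf". Do NOT use for reading.',
}
print(shared_triggers(skills))  # {('pdf-reader', 'pdf-writer'): {'pdf'}}
```

Here the bare "pdf" trigger is the problem: dropping it from both descriptions leaves "read pdf" and "export pdf" to do the disambiguation.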
The over-trigger trap: descriptions that are too broad
A description like "Use this skill for any code task" will trigger on every question, breaking progressive disclosure entirely. Always narrow the scope and include explicit trigger keywords. Rule of thumb: if your Skill activates on 80% of questions, it's doing CLAUDE.md's job.
9. Skills in the 2026 Plugins Architecture
By 2026, Claude Code supports Plugins — a distribution unit that bundles Skills + Commands + Sub-Agents + Hooks + MCP servers into a single git or marketplace package. A typical Plugin layout:
my-team-plugin/
├── .claude-plugin/
│   └── plugin.json (metadata, version, author)
├── skills/
│   ├── add-entity/
│   ├── review-security/
│   └── deploy-staging/
├── commands/
│   └── daily-standup.md
├── agents/
│   └── research-agent.md
├── hooks/
│   └── pre-edit-format.sh
└── mcp/
    └── internal-api.json (MCP server config)
The big win: a team can publish a plugin to GitHub or an internal registry, and a single install command gives teammates the complete set of conventions. A 200-engineer company might maintain 3–5 plugins across areas (backend, frontend, infra, security, data), with each engineer installing the plugins matching their role.
graph TB
subgraph REGISTRY[Plugin Registry]
P1[team-backend-plugin]
P2[team-frontend-plugin]
P3[security-compliance-plugin]
end
subgraph DEV1[Dev A: Full-stack]
I1[.claude/plugins/backend]
I2[.claude/plugins/frontend]
end
subgraph DEV2[Dev B: Frontend-only]
J1[.claude/plugins/frontend]
end
subgraph DEV3[Dev C: Security]
K1[.claude/plugins/security-compliance]
end
P1 --> I1
P2 --> I2
P2 --> J1
P3 --> K1
REGISTRY -. "/plugin install" .-> DEV1
REGISTRY -. "/plugin install" .-> DEV2
REGISTRY -. "/plugin install" .-> DEV3
10. Best Practices for High-Quality Skills
Short but specific descriptions. Aim for 50–100 words. State three things: the task, representative input/context, and trigger keywords. Avoid marketing language. Write from the "onboarding notes for a new colleague" perspective, not a "pitch to the CEO".
Structured body. Use clear sections: "When to use", "Context", "Workflow", "Validation", "Examples". Claude navigates by headings — clear structure helps it pull the right part.
Move long knowledge to references/. If you have long API docs, an internal glossary, or a bulky PDF template, don't cram them into SKILL.md. Put them in references/ and link from SKILL.md. Claude will read the supporting file when it reaches the step that needs it — saving tokens.
Scripts are tools, not the backbone. Scripts in scripts/ should do mechanical, deterministic work (validate, format, run pre-existing CLIs). Leave reasoning to the agent — if everything is already in the script, there's no point involving Claude.
Test before you ship. Use Anthropic's skill-creator, or test manually by posing 5 variant questions near the trigger and seeing whether Claude activates correctly. Check especially for false positives (the Skill activating on unrelated questions).
Version like code. Keep Skills in git with semantic versioning. A breaking change (rename, drop a section) is major. Adding a step is minor. Wording fixes are patch. Developers can pin a version if they need stability.
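The bump rules above fit in a lookup table. A minimal sketch, where the change labels ("rename", "drop-section", and so on) are hypothetical names a team might use in PR templates:

```python
BUMP_FOR_CHANGE = {
    "rename": "major",
    "drop-section": "major",
    "add-step": "minor",
    "wording": "patch",
}


def next_version(current: str, change: str) -> str:
    """Return the next semantic version for a change kind listed in BUMP_FOR_CHANGE."""
    major, minor, patch = (int(part) for part in current.split("."))
    bump = BUMP_FOR_CHANGE[change]
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


print(next_version("1.4.2", "add-step"))  # 1.5.0
```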
Practical size limits
From experience, a SKILL.md under 2000 words is comfortable. 2000–5000 words is heavy but acceptable if the body must be detailed. Above 5000 words is almost certainly a sign you should split content into references/. Metadata per skill (name + description) should stay under 150 words so that having many skills doesn't bloat the system prompt.
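These thresholds are easy to enforce in CI. A small sketch that counts whitespace-separated words, which is a rough proxy and not a token count (`size_notes` is a hypothetical helper):

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def size_notes(metadata: str, body: str) -> list[str]:
    """Apply the thresholds above: 2000 / 5000 words for the body, 150 for metadata."""
    notes = []
    body_words = word_count(body)
    if body_words > 5000:
        notes.append(f"body is {body_words} words: split into references/")
    elif body_words >= 2000:
        notes.append(f"body is {body_words} words: heavy but acceptable")
    if word_count(metadata) >= 150:
        notes.append("metadata should stay under 150 words")
    return notes


print(size_notes("pptx Use this skill for .pptx files", "word " * 2500))
# ['body is 2500 words: heavy but acceptable']
```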
11. Governance, Security, and Risk Control
A Skill is a form of "letting the agent read and execute prepared logic". That carries an attack surface worth considering:
- Prompt injection via SKILL.md: if the Skill repository allows free contributions, a malicious actor could add a Skill with a benign description but malicious body ("exfiltrate .env", "disable hook"). Mitigation: all internal Skills must go through PR review; public Skills must come from trusted sources or be pinned to a commit SHA.
- Tool permission scope: a Skill inherits the agent's permissions; it can't grant itself more. But if the agent has broad permissions (unrestricted Bash), a malicious Skill can abuse them. Defence: add a PreToolUse hook that blocks dangerous commands regardless of caller.
- Data exfiltration: a Skill can encourage the agent to read more files than needed. Control via policy: limit readable paths in a hook, or use restricted permission modes.
- Knowledge drift: a stale Skill (e.g. one written against an outdated internal API version) leads the agent to produce wrong code. You need a periodic review process, an owner for each Skill, and CI alerts when referenced APIs change.
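The PreToolUse defence mentioned in the list can be sketched as a small script. This assumes the hook contract as I understand it from the Claude Code hooks documentation: the event arrives as JSON on stdin with `tool_name` and `tool_input` fields, and exit code 2 blocks the call with stderr passed back to the agent. Verify this against the current docs before relying on it. The deny-list patterns are illustrative only.

```python
import json
import re
import sys
from typing import Optional

# Illustrative patterns only; a real deny-list would be agreed by the team.
BLOCKED_PATTERNS = [
    re.compile(r"\brm\s+-rf\s+/"),
    re.compile(r"\.env\b"),
]


def block_reason(event: dict) -> Optional[str]:
    """Return a reason string if the Bash command in a PreToolUse event should be blocked."""
    if event.get("tool_name") != "Bash":
        return None
    command = event.get("tool_input", {}).get("command", "")
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(command):
            return f"blocked by policy: {pattern.pattern}"
    return None


def main() -> int:
    """Read the hook event from stdin; exit code 2 asks the harness to block the call."""
    reason = block_reason(json.load(sys.stdin))
    if reason:
        print(reason, file=sys.stderr)
        return 2
    return 0

# When installed as a hook script, end the file with: sys.exit(main())
```

Because the hook runs in the harness and not in the model's context, it applies no matter which Skill (or prompt) produced the command.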
Some organisations have adopted a "Skill Council" model — a group of 3–5 engineers who review every internally published Skill, checking that the description is accurate, the workflow aligns with security standards, and that minimum tests exist. This is analogous to reviewing infra code or major design documents.
12. Rollout Patterns for 30–100-Engineer Organisations
From early-adopter organisations, a few architectural patterns recur:
Pattern 1 — Skills per role. Split the Skill repository by role: plugins/backend-dev, plugins/frontend-dev, plugins/sre, plugins/data. Each developer installs the plugin matching their role. Pro: compact, no bloat. Con: full-stack developers have to install multiple plugins.
Pattern 2 — Skills per domain. Fits modular monoliths: each bounded context has its own Skills (billing-add-invoice, catalog-import-sku). Skills live alongside the module's code. Pro: Skills and code evolve together; PRs that change code can change the Skill when the workflow shifts.
Pattern 3 — A meta-skill for onboarding. A single Skill with the description "Use when a new engineer needs help finding which skill/doc to consult for a task" — it contains a table of contents pointing to child Skills, helping newcomers navigate the team's "forest of tools".
Pattern 4 — Skills as incident playbooks. SRE teams have runbooks for incident response. Turn each runbook into a Skill ("respond-to-cache-miss-storm", "rollback-failed-deploy"). When the agent is activated in an alert context, it loads the right Skill and walks the on-call through the steps.
Pattern 5 — Skill + MCP combination. MCP provides tools (reading data from internal systems); Skills provide procedures (how to use those tools per team standards). A good plugin often ships both together: an MCP server to connect to Linear/Jira/Jenkins, and a Skill to guide the agent in using those MCP tools the right way.
The actual ROI of investing in Skills
Experience from early-adopter teams: investing 2–4 hours writing a Skill for a recurring workflow (migration, review, deploy) saves 10–20 minutes each time a developer uses it. In a 30-person team where a good Skill runs 5 times/week/person, payoff arrives in the first week. More importantly: output quality becomes uniform because every developer follows the same workflow, instead of 30 personal variations.
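The break-even claim is worth checking with the most pessimistic ends of the ranges quoted above (4 hours to write, 10 minutes saved per use):

```python
team_size = 30
uses_per_person_per_week = 5
minutes_saved_per_use = 10   # low end of the 10-20 minute range
authoring_hours = 4          # high end of the 2-4 hour range

hours_saved_per_week = team_size * uses_per_person_per_week * minutes_saved_per_use / 60
weeks_to_break_even = authoring_hours / hours_saved_per_week

print(hours_saved_per_week)  # 25.0
print(weeks_to_break_even)   # 0.16
```

Even on these conservative inputs the Skill pays for itself within the first working day of the week, provided it really is used 150 times.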
13. Limits and Anti-Patterns to Avoid
Skills that are too small. A Skill with only 50 words of guidance for a trivial task is usually better expressed as a Command, or even a single line in CLAUDE.md. A Skill carries fixed overhead (metadata costs tokens, governance costs effort) — use one only when the value is clear.
Skills that replace code. If the logic can be packaged as a pure function/script, write a script. A Skill should not contain complex algorithms — that's code's job. A Skill only guides the agent to use code/tools correctly.
Overlapping Skills. Two Skills both triggered by "code review" but with slightly different workflows will confuse Claude. If you spot overlap, merge into one Skill with clear branches in the body, or split along another dimension (review-security vs review-performance, not review-a vs review-b).
No tests. An untested Skill is a latent bug. At minimum: 3 example questions that should activate it, 3 that shouldn't. Automate with a simple harness built on the Agent SDK.
No owner. Every Skill needs a person/team responsible for updates. Without an owner, a Skill fossilises within months, causing more harm than good.
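For the "No tests" anti-pattern above, the minimum suite (3 prompts that should activate, 3 that shouldn't) can be written as a table plus a loop. In this sketch the `activates` callable is a hypothetical stand-in: in a real harness it would run the prompt through the agent and report whether the Skill loaded. The keyword stub exists only to make the example runnable.

```python
from typing import Callable

# (prompt, should_activate) pairs for a hypothetical add-entity skill
CASES = [
    ("add a Subscription entity to the Billing module", True),
    ("scaffold CRUD for invoices", True),
    ("we need a new table for coupons", True),
    ("fix the padding on the login button", False),
    ("explain this stack trace", False),
    ("bump the Vue version", False),
]


def run_trigger_tests(activates: Callable[[str], bool]) -> list[str]:
    """Return a description of every case where activation differed from the expectation."""
    failures = []
    for prompt, expected in CASES:
        if activates(prompt) != expected:
            kind = "false negative" if expected else "false positive"
            failures.append(f"{kind}: {prompt}")
    return failures


def keyword_stub(prompt: str) -> bool:
    """Stand-in for a real agent call; replace with code that checks whether the skill loaded."""
    return any(word in prompt.lower() for word in ("entity", "table", "crud"))


print(run_trigger_tests(keyword_stub))  # []
```

Keeping the cases in the Skill's own folder means a description change and its test update land in the same PR.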
14. The Future of Skills and the 2026+ Landscape
Skills are moving in three clear directions. First, public skill registries: marketplaces where developers publish Skills the way they publish npm packages. Quality, reviews, and download counts will become signals. Anthropic and the community plan to launch an official registry in the second half of 2026.
Second, self-generated skills: the agent observes user habits, recognises repeated workflows, and proposes creating a Skill. This is a form of meta-learning — the agent learns beyond its one-shot training by encoding operational experience into new Skills.
Third, cross-agent skills: a standard Skill usable by Claude Code, the Agent SDK, Claude Desktop, and other frameworks via an open standard (much like MCP does for tools). If this materialises, writing a Skill once makes it portable across agents, the way writing a JavaScript library works for both Node and the browser.
For engineering teams that want to get ahead, the pragmatic answer isn't reading more — it's writing your first 3 Skills. Pick the three most repeated workflows, package them in the structure described, and measure how output quality and time-to-finish change. Most teams realise after the first week: Skills aren't some exotic toy to try out, but the simplest way to convert collective knowledge into an agent's executable capability.
15. Conclusion: Skills Are the Standard Library of Agentic Engineering
In 2026, the line between "writing code" and "writing instructions for an agent" is blurring. Skills are the clearest expression of that trend: a knowledge-packaging unit that can be versioned, tested, reviewed, and distributed — every quality of modern code — yet whose content is natural language that a new colleague can read. Progressive disclosure is the elegant engineering principle behind it, but the core philosophy is simple: load only what's relevant, exactly when needed, and nothing extra.
For an individual, Skills cut repetitive time and ensure every pass follows best practice. For a team, Skills are how knowledge flows from one head to another without manual onboarding. For a large organisation, Skills are infrastructure — like the standard library, CI/CD, or a design system — something you don't notice daily but whose absence slows the whole org down. Investing in Skills in 2026 isn't about chasing a trend; it's the shortest path to your team operating at the new cadence.
References
- Anthropic — Claude Code Skills documentation
- GitHub: anthropics/skills — first-party skill set (pptx, docx, pdf, xlsx, skill-creator)
- Anthropic Engineering Blog — Equipping agents for the real world with Agent Skills
- Anthropic — Agent SDK Skills reference
- Anthropic — Claude Code Plugins documentation