Install
Copy this and paste it into Claude Code, Cursor, or any AI assistant:
I want to install the "audit-agents-skills" skill in my project. Please run this command in my terminal:

```shell
# Install skill into your project (2 files)
mkdir -p .claude/skills/audit-agents-skills && \
  curl --retry 3 --retry-delay 2 --retry-all-errors \
    -o .claude/skills/audit-agents-skills/SKILL.md \
    "https://raw.githubusercontent.com/FlorianBruniaux/claude-code-ultimate-guide/main/examples/skills/audit-agents-skills/SKILL.md" && \
  mkdir -p .claude/skills/audit-agents-skills/scoring && \
  curl --retry 3 --retry-delay 2 --retry-all-errors \
    -o .claude/skills/audit-agents-skills/scoring/criteria.yaml \
    "https://raw.githubusercontent.com/FlorianBruniaux/claude-code-ultimate-guide/main/examples/skills/audit-agents-skills/scoring/criteria.yaml"
```

Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.
Description
Audit Claude Code agents, skills, and commands for quality and production readiness. Use when evaluating skill quality, checking production readiness scores, or comparing agents against best-practice templates.
Audit Agents/Skills/Commands (Advanced Skill)
Comprehensive quality audit system for Claude Code agents, skills, and commands. Provides quantitative scoring, comparative analysis, and production readiness grading based on industry best practices.
Purpose
Problem: Manual validation of agents/skills is error-prone and inconsistent. According to the LangChain Agent Report 2026, 29.5% of organizations deploy agents without systematic evaluation, making "agent bugs" the top reported challenge (cited by 18% of teams).

Solution: Automated quality scoring across 16 weighted criteria with production readiness thresholds (80% = Grade B minimum for production deployment).

Key Features:

• Quantitative scoring (32 points for agents/skills, 20 for commands)
• Weighted criteria (Identity 3x, Prompt 2x, Validation 1x, Design 2x)
• Production readiness grading (A-F scale with 80% threshold)
• Comparative analysis vs. reference templates
• JSON/Markdown dual output for programmatic integration
• Fix suggestions for failing criteria

---
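The weighted scoring and grading scheme can be sketched as follows. This is a minimal illustration only: the criterion names, the per-letter thresholds other than the stated 80% = Grade B line, and the `grade` helper are assumptions for the sketch, not the skill's actual implementation (the real criteria and weights live in `scoring/criteria.yaml`).

```python
# Hedged sketch of weighted scoring + A-F grading. Criterion names and
# weights below are illustrative; only the 80% = Grade B production
# threshold comes from the skill's documentation.
CRITERIA = {
    "identity": 3,    # Identity weighted 3x
    "prompt": 2,      # Prompt weighted 2x
    "validation": 1,  # Validation weighted 1x
    "design": 2,      # Design weighted 2x
}

def grade(results: dict[str, bool]) -> tuple[float, str]:
    """Return (percentage, letter grade) for a set of pass/fail results."""
    max_points = sum(CRITERIA.values())
    earned = sum(w for name, w in CRITERIA.items() if results.get(name))
    pct = 100 * earned / max_points
    # Assumed A-F cutoffs; 80% (Grade B) is the documented production bar.
    for threshold, letter in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if pct >= threshold:
            return pct, letter
    return pct, "F"

# A file passing 3 of 4 hypothetical criteria: 7/8 points = 87.5% -> Grade B,
# i.e. clears the production threshold.
pct, letter = grade({"identity": True, "prompt": True,
                     "validation": False, "design": True})
```

The same loop extends to the full 16-criteria set by adding entries to `CRITERIA`; the 80% cutoff is what gates production deployment.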
Modes
| Mode | Usage | Output |
|------|-------|--------|
| Quick Audit | Top-5 critical criteria only | Fast pass/fail (3-5 min for 20 files) |
| Full Audit | All 16 criteria per file | Detailed scores + recommendations (10-15 min) |
| Comparative | Full + benchmark vs. templates | Analysis + gap identification (15-20 min) |

Default: Full Audit (recommended for first run)

---
Why These Criteria?
The 16-criteria framework is derived from:

• Claude Code Best Practices (Ultimate Guide line 4921: Agent Validation Checklist)
• Industry Data (LangChain Agent Report 2026: evaluation gaps)
• Production Failures (community feedback on hardcoded paths, missing error handling)
• Composition Patterns (skills should reference other skills; agents should be modular)