Skill

audit-agents-skills

by FlorianBruniaux


Install

Copy this and paste it into Claude Code, Cursor, or any AI assistant:

I want to install the "audit-agents-skills" skill in my project.

Please run this command in my terminal:
```shell
# Install skill into your project (2 files)
mkdir -p .claude/skills/audit-agents-skills \
  && curl --retry 3 --retry-delay 2 --retry-all-errors \
       -o .claude/skills/audit-agents-skills/SKILL.md \
       "https://raw.githubusercontent.com/FlorianBruniaux/claude-code-ultimate-guide/main/examples/skills/audit-agents-skills/SKILL.md" \
  && mkdir -p .claude/skills/audit-agents-skills/scoring \
  && curl --retry 3 --retry-delay 2 --retry-all-errors \
       -o .claude/skills/audit-agents-skills/scoring/criteria.yaml \
       "https://raw.githubusercontent.com/FlorianBruniaux/claude-code-ultimate-guide/main/examples/skills/audit-agents-skills/scoring/criteria.yaml"
```

Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.
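If the install succeeds, exactly two files land at the paths used in the command above. A quick sanity check you can run before restarting — a minimal sketch; the `check_installed` helper is not part of the skill, just a local convenience:

```shell
# Report whether each expected skill file exists; returns nonzero if any is missing.
check_installed() {
  missing=0
  for f in "$@"; do
    if [ -f "$f" ]; then
      echo "OK      $f"
    else
      echo "MISSING $f"
      missing=1
    fi
  done
  return "$missing"
}

check_installed \
  .claude/skills/audit-agents-skills/SKILL.md \
  .claude/skills/audit-agents-skills/scoring/criteria.yaml \
  || echo "Re-run the install command above before restarting Claude Code."
```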

Description

Audit Claude Code agents, skills, and commands for quality and production readiness. Use when evaluating skill quality, checking production readiness scores, or comparing agents against best-practice templates.

Audit Agents/Skills/Commands (Advanced Skill)

Comprehensive quality audit system for Claude Code agents, skills, and commands. Provides quantitative scoring, comparative analysis, and production readiness grading based on industry best practices.

Purpose

Problem: Manual validation of agents/skills is error-prone and inconsistent. According to the LangChain Agent Report 2026, 29.5% of organizations deploy agents without systematic evaluation, making "agent bugs" the top reported challenge (cited by 18% of teams).

Solution: Automated quality scoring across 16 weighted criteria with production readiness thresholds (80% = Grade B minimum for production deployment).

Key Features:
• Quantitative scoring (32 points for agents/skills, 20 for commands)
• Weighted criteria (Identity 3x, Prompt 2x, Validation 1x, Design 2x)
• Production readiness grading (A-F scale with 80% threshold)
• Comparative analysis vs reference templates
• JSON/Markdown dual output for programmatic integration
• Fix suggestions for failing criteria

---
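The A-F scale and the 80%-for-Grade-B production threshold come from the description above; how the remaining letter bands are cut is not stated here. A minimal sketch of the score-to-grade mapping, with the 90/70/60 cutoffs as illustrative assumptions:

```shell
# Map a raw audit score to a letter grade.
# Only the 80% "Grade B = production-ready" threshold is stated by the skill;
# the 90/70/60 cutoffs below are illustrative assumptions.
grade() {
  score=$1; max=$2
  pct=$(( score * 100 / max ))
  if   [ "$pct" -ge 90 ]; then echo "A"
  elif [ "$pct" -ge 80 ]; then echo "B"   # production-ready threshold
  elif [ "$pct" -ge 70 ]; then echo "C"
  elif [ "$pct" -ge 60 ]; then echo "D"
  else echo "F"
  fi
}

grade 26 32   # 81% of the 32-point agent/skill scale -> B
grade 15 20   # 75% of the 20-point command scale     -> C
```

Note the two scales: agents and skills are scored out of 32 points, commands out of 20, so the same raw score maps to different grades depending on artifact type.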

Modes

| Mode | Usage | Output |
|------|-------|--------|
| Quick Audit | Top-5 critical criteria only | Fast pass/fail (3-5 min for 20 files) |
| Full Audit | All 16 criteria per file | Detailed scores + recommendations (10-15 min) |
| Comparative | Full + benchmark vs templates | Analysis + gap identification (15-20 min) |

Default: Full Audit (recommended for first run)

---

Why These Criteria?

The 16-criteria framework is derived from:

• Claude Code Best Practices (Ultimate Guide line 4921: Agent Validation Checklist)
• Industry Data (LangChain Agent Report 2026: evaluation gaps)
• Production Failures (community feedback on hardcoded paths, missing error handling)
• Composition Patterns (skills should reference other skills, agents should be modular)


Health Signals

• Maintenance: Committed today — Active
• Adoption: 1K+ stars on GitHub — 3.2k ★, Popular
• Docs: README + description — Well-documented

GitHub Signals

• Stars: 3.2k
• Forks: 439
• Issues: 4
• Updated: Today
• License: CC-BY-SA-4.0


Works With

Claude Code