How do I install skill-optimizer?

skill-optimizer is a Skill hosted on GitHub at https://github.com/fastxyz/skill-optimizer. Visit the ImAiFox page at https://imaifox.com/boosters/fastxyz-skill-optimizer-skill-optimizer for the AI-ready install prompt you can copy directly into Claude Code, Cursor, or Windsurf.

How popular is skill-optimizer?

skill-optimizer has 66 GitHub stars and 10 forks. It is actively maintained: the most recent commit was 23 days ago at the time of this listing.

Is skill-optimizer free?

Yes. skill-optimizer is open source and free to use under the MIT license. The source code is publicly available on GitHub at https://github.com/fastxyz/skill-optimizer.

Skill

skill-optimizer

Name: skill-optimizer
Author: fastxyz


AI Summary

skill-optimizer is an eval workbench for agent skills. It runs a model in an isolated Docker /work directory, provides skills/references as normal workspace files, captures an agent trace, and grades deterministic local outcomes. Use this skill as the source of truth for authoring eval suites in this repo. Detailed schema and patterns are in references/workbench.md.

Install

Copy this and paste it into Claude Code, Cursor, or any AI assistant:

I want to install the "skill-optimizer" skill in my project.

Please run this command in my terminal:
# Install skill into your project
mkdir -p .claude/skills/skill-optimizer && curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/skill-optimizer/SKILL.md "https://raw.githubusercontent.com/fastxyz/skill-optimizer/development/skills/skill-optimizer/SKILL.md"

Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.

ai ai-agent ai-skill benchmark cli eval evals evaluation evaluation-framework llm llm-eval llm-evals llm-evaluation-framework mcp openrouter

Description

Use when creating, running, debugging, or documenting skill-optimizer workbench evals; working with agent skill cases, suites, graders, traces, Docker workspaces, OpenRouter model matrices, or the skill-optimizer SDK/CLI.

Examples

Tracked demos live in examples/ (the same repo path users may refer to as @examples/). Read these alongside the skill docs when building or debugging evals:

| Path | Why It Matters |
|------|----------------|
| `examples/workbench/README.md` | Short command walkthrough for demos |
| `examples/workbench/pdf/README.md` | Explains the PDF demo cases and expected outputs |
| `examples/workbench/pdf/suite.yml` | Concrete suite using models, setup, env, graders, and append prompt |
| `examples/workbench/pdf/references/pdf-skill/SKILL.md` | Example skill copied into /work for the agent |
| `examples/workbench/pdf/checks/*.mjs` | Deterministic grader and setup helper patterns |
| `examples/workbench/mcp/suite.yml` | Hidden-service MCP calculator example |
| `examples/workbench/mcp/mcp/calculator-server.mjs` | Example MCP server with add/subtract/multiply/divide tools |

```bash
npx tsx src/cli.ts run-suite examples/workbench/pdf/suite.yml --trials 1
npx tsx src/cli.ts run-suite examples/workbench/mcp/suite.yml --trials 1
```

The PDF demo covers setup, suite models, positive output grading, and trace-based negative grading.
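To make "positive output grading" and "trace-based negative grading" concrete, here is a minimal sketch of the two kinds of check written as pure functions. It is not taken from examples/workbench/pdf/checks/; the function names, the `{ pass, reason }` result shape, and the sample strings are all hypothetical, and the real grader contract (what a grader receives and how it reports a result) should be read from references/workbench.md and the tracked examples.

```javascript
// Positive check: pass only if the agent's output contains the expected text.
// (Hypothetical helper; the real graders may compare files differently.)
function gradeOutput(outputText, expected) {
  const pass = outputText.includes(expected);
  return { pass, reason: pass ? "expected text found" : `missing: ${expected}` };
}

// Negative check: pass only if no trace line mentions the forbidden text.
// The trace is treated as plain text here because its event schema is not
// described on this page.
function gradeTrace(traceText, forbidden) {
  const hit = traceText.split("\n").find((line) => line.includes(forbidden));
  return hit === undefined
    ? { pass: true, reason: "forbidden text absent from trace" }
    : { pass: false, reason: `trace mentions ${forbidden}` };
}

// Hypothetical sample data, not taken from the PDF demo.
const results = [
  gradeOutput("pages: 3\n", "pages: 3"),
  gradeTrace('{"text":"reading /work/input.pdf"}\n', "/case"),
];
console.log(results.every((r) => r.pass) ? "PASS" : "FAIL");
```

Both checks are deterministic: the same output and trace always produce the same verdict, which is the property the workbench relies on when comparing models.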

skill-optimizer

skill-optimizer is an eval workbench for agent skills. It runs a model in an isolated Docker /work directory, provides skills/references as normal workspace files, captures an agent trace, and grades deterministic local outcomes. Use this skill as the source of truth for authoring eval suites in this repo. Detailed schema and patterns are in references/workbench.md.

Core Model

• A case is one user-like task plus one or more deterministic graders.
• A suite is a set of cases and OpenRouter models to run as a matrix.
• references are copied into /work before the agent starts; this is where eval skills live.
• The agent phase sees /work only. It cannot see /case, /results, graders, hidden answers, or hidden metadata.
• Cases can define mcpServers; these are exposed through a workbench mcp command during the agent phase.
• Graders run after the agent with /case, /work, and /results mounted.
• trace.jsonl is the debugging source for what the agent saw, said, and did.
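Since trace.jsonl is the debugging source, a first step when a case misbehaves is often just to summarize what the trace contains. The sketch below parses JSONL text (one JSON object per line) and counts events by a field. The `type` field and the sample events are hypothetical, because the trace's event schema is not described here; inspect a real trace.jsonl before relying on any field name.

```javascript
// Parse JSONL text: one JSON object per non-empty line.
function parseJsonl(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Count events by the value of one field.
// "type" below is a hypothetical field name used only for illustration.
function countByField(events, field) {
  const counts = new Map();
  for (const event of events) {
    const key = String(event[field]);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Hypothetical sample standing in for the contents of a trace.jsonl file.
const sample =
  '{"type":"message"}\n{"type":"tool_call"}\n{"type":"message"}\n';
const counts = countByField(parseJsonl(sample), "type");
for (const [key, n] of counts) {
  console.log(`${key}: ${n}`);
}
```

To run this against a real trace, read the file's contents into a string first and pass it to `parseJsonl` in place of `sample`.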

Commands

| Goal | Command |
|------|---------|
| Install deps | `npm install` |
| Build CLI | `npm run build` |
| Run one case | `npx tsx src/cli.ts run-case <case.yml>` |
| Run one case across models | `npx tsx src/cli.ts run-case <case.yml> --models openrouter/google/gemini-2.5-flash,openrouter/openai/gpt-5.4` |
| Run a suite | `npx tsx src/cli.ts run-suite <suite.yml>` |
| CLI help | `npx tsx src/cli.ts --help` |

Rules:

• Use only openrouter/... model refs.
• OPENROUTER_API_KEY is required for real model runs.
• run-suite uses models: from suite.yml; it has no model override flag.
• run-case can use its case model: or --model / --models.
• Docker image default is skill-optimizer-workbench:local.
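Two of the rules above (openrouter/... model refs only, and OPENROUTER_API_KEY required for real runs) are easy to check before starting a run. The following is a minimal pre-flight sketch, assuming a comma-separated model list like the one passed to --models. The `preflight` function is hypothetical and is not part of the skill-optimizer CLI, which may perform its own validation.

```javascript
// Return a list of problems found in a comma-separated model list and an
// environment object. An empty list means both rules are satisfied.
// (Hypothetical helper, not part of the skill-optimizer CLI.)
function preflight(modelsArg, env) {
  const problems = [];
  const models = modelsArg
    .split(",")
    .map((m) => m.trim())
    .filter((m) => m.length > 0);

  if (models.length === 0) {
    problems.push("no models given");
  }
  for (const model of models) {
    if (!model.startsWith("openrouter/")) {
      problems.push(`not an openrouter/ ref: ${model}`);
    }
  }
  if (!env.OPENROUTER_API_KEY) {
    problems.push("OPENROUTER_API_KEY is not set");
  }
  return problems;
}

// Check the model list from the Commands table against the current environment.
const problems = preflight(
  "openrouter/google/gemini-2.5-flash,openrouter/openai/gpt-5.4",
  process.env
);
console.log(problems.length === 0 ? "ok" : problems.join("\n"));
```

Passing the environment as a parameter instead of reading `process.env` inside the function keeps the check easy to exercise with a fake environment.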


Health Signals

Maintenance: Active (committed 23d ago)
Adoption: Niche (66 ★, under 100 stars)
Docs: Well-documented (README + description)

GitHub Signals

Stars: 66
Forks: 10
Issues: 14
Updated: 23d ago
License: MIT


Works With

Claude Code

Cursor

Related Skills

• Openclaw MCP Server (MCP Server)
• using-superpowers (Skill)
• finishing-a-development-branch (Skill)
• test-driven-development (Skill)