Install
Copy this and paste it into Claude Code, Cursor, or any AI assistant:
I want to install the "huggingface-community-evals" skill in my project. Please run this command in my terminal:

```shell
# Install skill into your project (5 files)
mkdir -p .claude/skills/huggingface-community-evals && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/huggingface-community-evals/SKILL.md "https://raw.githubusercontent.com/huggingface/skills/main/skills/huggingface-community-evals/SKILL.md" && \
mkdir -p .claude/skills/huggingface-community-evals/examples && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/huggingface-community-evals/examples/USAGE_EXAMPLES.md "https://raw.githubusercontent.com/huggingface/skills/main/skills/huggingface-community-evals/examples/USAGE_EXAMPLES.md" && \
mkdir -p .claude/skills/huggingface-community-evals/scripts && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/huggingface-community-evals/scripts/inspect_eval_uv.py "https://raw.githubusercontent.com/huggingface/skills/main/skills/huggingface-community-evals/scripts/inspect_eval_uv.py" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/huggingface-community-evals/scripts/inspect_vllm_uv.py "https://raw.githubusercontent.com/huggingface/skills/main/skills/huggingface-community-evals/scripts/inspect_vllm_uv.py" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/huggingface-community-evals/scripts/lighteval_vllm_uv.py "https://raw.githubusercontent.com/huggingface/skills/main/skills/huggingface-community-evals/scripts/lighteval_vllm_uv.py"
```

Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.
Description
Run evaluations for Hugging Face Hub models using inspect-ai and lighteval on local hardware. Use for backend selection, local GPU evals, and choosing between vLLM / Transformers / accelerate. Not for HF Jobs orchestration, model-card PRs, .eval_results publication, or community-evals automation.
Overview
This skill is for running evaluations against models on the Hugging Face Hub on local hardware.

It covers:

- inspect-ai with local inference
- lighteval with local inference
- choosing between vllm, Hugging Face Transformers, and accelerate
- smoke tests, task selection, and backend fallback strategy

It does not cover:

- Hugging Face Jobs orchestration
- model-card or model-index edits
- README table extraction
- Artificial Analysis imports
- .eval_results generation or publishing
- PR creation or community-evals automation

If the user wants to run the same eval remotely on Hugging Face Jobs, hand off to the hugging-face-jobs skill and pass it one of the local scripts in this skill. If the user wants to publish results into the community evals workflow, stop after generating the evaluation run and hand off that publishing step to ~/code/community-evals.

> All paths below are relative to the directory containing this SKILL.md.
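The backend fallback strategy mentioned above can be sketched in a few lines. This is an illustrative helper, not part of the skill's scripts; the backend names and fallback order are assumptions based on the overview (try vllm first, then Transformers, then a provider-backed run).

```python
import importlib.util

def pick_backend() -> str:
    # Prefer heavier local backends when installed; fall back to
    # provider-backed inference when neither vllm nor transformers
    # is importable in the current environment.
    for module in ("vllm", "transformers"):
        if importlib.util.find_spec(module) is not None:
            return module
    return "inference-providers"

print(pick_backend())
```

`importlib.util.find_spec` checks importability without actually importing the (potentially heavy) package, which keeps the probe cheap.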
Prerequisites
- Prefer `uv run` for local execution.
- Set `HF_TOKEN` for gated/private models.
- For local GPU runs, verify GPU access before starting:

```bash
uv --version
printenv HF_TOKEN >/dev/null
nvidia-smi
```

If `nvidia-smi` is unavailable, either:

- use scripts/inspect_eval_uv.py for lighter provider-backed evaluation, or
- hand off to the hugging-face-jobs skill if the user wants remote compute.
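The same preflight checks can be run programmatically, which is convenient when an agent decides between local and remote execution. This is a hypothetical helper mirroring the three shell checks above, not part of the skill itself.

```python
import os
import shutil

def preflight() -> dict:
    # Mirror the shell checks: uv on PATH, HF_TOKEN set, GPU driver visible.
    return {
        "uv": shutil.which("uv") is not None,
        "hf_token": bool(os.environ.get("HF_TOKEN")),
        "nvidia_smi": shutil.which("nvidia-smi") is not None,
    }

print(preflight())
```

If `nvidia_smi` comes back `False`, follow the fallback above: provider-backed evaluation or a remote-compute handoff.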
Examples
See:

- examples/USAGE_EXAMPLES.md for local command patterns
- scripts/inspect_eval_uv.py
- scripts/inspect_vllm_uv.py
- scripts/lighteval_vllm_uv.py
When To Use Which Script
| Use case | Script |
|---|---|
| Local inspect-ai eval on a Hub model via inference providers | scripts/inspect_eval_uv.py |
| Local GPU eval with inspect-ai using vllm or Transformers | scripts/inspect_vllm_uv.py |
| Local GPU eval with lighteval using vllm or accelerate | scripts/lighteval_vllm_uv.py |
| Extra command patterns | examples/USAGE_EXAMPLES.md |
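The table above is effectively a two-factor decision: GPU availability and harness choice. A small illustrative dispatch function (hypothetical, not shipped with the skill) makes the rule explicit:

```python
def choose_script(has_gpu: bool, harness: str = "inspect-ai") -> str:
    # No GPU: the provider-backed inspect-ai script is the local option.
    if not has_gpu:
        return "scripts/inspect_eval_uv.py"
    # GPU available: pick by evaluation harness.
    if harness == "lighteval":
        return "scripts/lighteval_vllm_uv.py"
    return "scripts/inspect_vllm_uv.py"

print(choose_script(has_gpu=True, harness="lighteval"))
```

For extra command patterns around each script, see examples/USAGE_EXAMPLES.md.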