Skill

huggingface-community-evals

by huggingface


Install

Copy this and paste it into Claude Code, Cursor, or any AI assistant:

I want to install the "huggingface-community-evals" skill in my project.

Please run this command in my terminal:
```bash
# Install skill into your project (5 files)
mkdir -p .claude/skills/huggingface-community-evals/examples \
         .claude/skills/huggingface-community-evals/scripts
curl --retry 3 --retry-delay 2 --retry-all-errors \
  -o .claude/skills/huggingface-community-evals/SKILL.md \
  "https://raw.githubusercontent.com/huggingface/skills/main/skills/huggingface-community-evals/SKILL.md"
curl --retry 3 --retry-delay 2 --retry-all-errors \
  -o .claude/skills/huggingface-community-evals/examples/USAGE_EXAMPLES.md \
  "https://raw.githubusercontent.com/huggingface/skills/main/skills/huggingface-community-evals/examples/USAGE_EXAMPLES.md"
curl --retry 3 --retry-delay 2 --retry-all-errors \
  -o .claude/skills/huggingface-community-evals/scripts/inspect_eval_uv.py \
  "https://raw.githubusercontent.com/huggingface/skills/main/skills/huggingface-community-evals/scripts/inspect_eval_uv.py"
curl --retry 3 --retry-delay 2 --retry-all-errors \
  -o .claude/skills/huggingface-community-evals/scripts/inspect_vllm_uv.py \
  "https://raw.githubusercontent.com/huggingface/skills/main/skills/huggingface-community-evals/scripts/inspect_vllm_uv.py"
curl --retry 3 --retry-delay 2 --retry-all-errors \
  -o .claude/skills/huggingface-community-evals/scripts/lighteval_vllm_uv.py \
  "https://raw.githubusercontent.com/huggingface/skills/main/skills/huggingface-community-evals/scripts/lighteval_vllm_uv.py"
```

Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.

Description

Run evaluations for Hugging Face Hub models using inspect-ai and lighteval on local hardware. Use for backend selection, local GPU evals, and choosing between vLLM / Transformers / accelerate. Not for HF Jobs orchestration, model-card PRs, .eval_results publication, or community-evals automation.

Overview

This skill is for running evaluations against models on the Hugging Face Hub on local hardware.

It covers:

- inspect-ai with local inference
- lighteval with local inference
- choosing between vllm, Hugging Face Transformers, and accelerate
- smoke tests, task selection, and backend fallback strategy

It does not cover:

- Hugging Face Jobs orchestration
- model-card or model-index edits
- README table extraction
- Artificial Analysis imports
- .eval_results generation or publishing
- PR creation or community-evals automation

If the user wants to run the same eval remotely on Hugging Face Jobs, hand off to the hugging-face-jobs skill and pass it one of the local scripts in this skill. If the user wants to publish results into the community evals workflow, stop after generating the evaluation run and hand off that publishing step to ~/code/community-evals.

> All paths below are relative to the directory containing this SKILL.md.

Prerequisites

- Prefer `uv run` for local execution.
- Set HF_TOKEN for gated/private models.
- For local GPU runs, verify GPU access before starting:

```bash
uv --version
printenv HF_TOKEN >/dev/null
nvidia-smi
```

If nvidia-smi is unavailable, either:

- use scripts/inspect_eval_uv.py for lighter provider-backed evaluation, or
- hand off to the hugging-face-jobs skill if the user wants remote compute.
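The manual checks above can also be scripted. A minimal sketch in Python, assuming only the standard library; the `preflight` helper and its return shape are illustrative, not part of the skill's scripts:

```python
import os
import shutil
import subprocess

def preflight() -> dict:
    """Report which local prerequisites are available.

    Mirrors the manual checks above: uv on PATH, HF_TOKEN set,
    and a reachable NVIDIA GPU.
    """
    checks = {
        "uv": shutil.which("uv") is not None,
        "hf_token": bool(os.environ.get("HF_TOKEN")),
        "gpu": False,
    }
    if shutil.which("nvidia-smi"):
        # nvidia-smi exits non-zero when the driver can't reach a GPU
        checks["gpu"] = subprocess.run(
            ["nvidia-smi"], capture_output=True
        ).returncode == 0
    return checks

if __name__ == "__main__":
    print(preflight())
```

If `gpu` comes back False, fall back to provider-backed evaluation or remote compute as described above.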

Examples

See:

- examples/USAGE_EXAMPLES.md for local command patterns
- scripts/inspect_eval_uv.py
- scripts/inspect_vllm_uv.py
- scripts/lighteval_vllm_uv.py
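Since the scripts above are run directly with `uv run`, they presumably follow uv's self-contained inline-metadata pattern (PEP 723). A sketch of what such a header looks like; the dependency list here is an assumption, not the actual contents of any script:

```python
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "inspect-ai",  # assumed; check each script's own header
# ]
# ///
# uv reads the comment block above, builds an ephemeral environment
# with the listed dependencies, and then runs the script -- so
# `uv run scripts/inspect_vllm_uv.py` needs no manual installs.
```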

When To Use Which Script

| Use case | Script |
|---|---|
| Local inspect-ai eval on a Hub model via inference providers | scripts/inspect_eval_uv.py |
| Local GPU eval with inspect-ai using vllm or Transformers | scripts/inspect_vllm_uv.py |
| Local GPU eval with lighteval using vllm or accelerate | scripts/lighteval_vllm_uv.py |
| Extra command patterns | examples/USAGE_EXAMPLES.md |
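The decision table above can be sketched as a small selection function. This is a hypothetical helper for illustration; in practice the user or agent makes this choice directly:

```python
def pick_backend(has_gpu: bool, prefer_lighteval: bool = False) -> str:
    """Map the decision table onto a script choice.

    has_gpu: whether nvidia-smi succeeded locally.
    prefer_lighteval: use lighteval instead of inspect-ai for GPU runs.
    """
    if not has_gpu:
        # Provider-backed inference; no local GPU required.
        return "scripts/inspect_eval_uv.py"
    if prefer_lighteval:
        # lighteval with vllm or accelerate on local GPU.
        return "scripts/lighteval_vllm_uv.py"
    # inspect-ai with vllm or Transformers on local GPU.
    return "scripts/inspect_vllm_uv.py"
```

For example, `pick_backend(has_gpu=False)` selects the lighter provider-backed path, matching the fallback advice in Prerequisites.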

Health Signals

- Maintenance: committed today (active)
- Adoption: 10.0k ★ on GitHub (popular)
- Docs: README + description (well-documented)

GitHub Signals

- Stars: 10.0k
- Forks: 610
- Issues: 26
- Updated: today
- License: Apache-2.0


Works With

Claude Code