AI Summary
Train object detection, image classification, and SAM/SAM2 segmentation models on managed cloud GPUs. No local GPU setup is required, and results are automatically saved to the Hugging Face Hub. Use this skill when users want to train or fine-tune vision models on Hugging Face Jobs. Helper scripts declare their dependencies inline using PEP 723, so they can be run with any PEP 723-aware runner (for example, uv run script.py).
Install
Copy this and paste it into Claude Code, Cursor, or any AI assistant:
I want to install the "huggingface-vision-trainer" skill in my project. Please run this command in my terminal:

```shell
# Install skill into your project (12 files)
BASE="https://raw.githubusercontent.com/huggingface/skills/main/skills/huggingface-vision-trainer"
DEST=".claude/skills/huggingface-vision-trainer"
mkdir -p "$DEST/references" "$DEST/scripts"
for f in \
  SKILL.md \
  references/finetune_sam2_trainer.md \
  references/hub_saving.md \
  references/image_classification_training_notebook.md \
  references/object_detection_training_notebook.md \
  references/reliability_principles.md \
  references/timm_trainer.md \
  scripts/dataset_inspector.py \
  scripts/estimate_cost.py \
  scripts/image_classification_training.py \
  scripts/object_detection_training.py \
  scripts/sam_segmentation_training.py
do
  curl --retry 3 --retry-delay 2 --retry-all-errors -o "$DEST/$f" "$BASE/$f" || break
done
```

Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.
Description
Trains and fine-tunes vision models for object detection (D-FINE, RT-DETR v2, DETR, YOLOS), image classification (timm models — MobileNetV3, MobileViT, ResNet, ViT/DINOv3 — plus any Transformers classifier), and SAM/SAM2 segmentation using Hugging Face Transformers on Hugging Face Jobs cloud GPUs. Covers COCO-format dataset preparation, Albumentations augmentation, mAP/mAR evaluation, accuracy metrics, SAM segmentation with bbox/point prompts, DiceCE loss, hardware selection, cost estimation, Trackio monitoring, and Hub persistence. Use when users mention training object detection, image classification, SAM, SAM2, segmentation, image matting, DETR, D-FINE, RT-DETR, ViT, timm, MobileNet, ResNet, bounding box models, or fine-tuning vision models on Hugging Face Jobs.
Prerequisites Checklist
Before starting any training job, verify:
Dataset Requirements — Object Detection
• Dataset must exist on the Hub
• Annotations must use the objects column with bbox and category (and optionally area) sub-fields
• Bboxes can be in xywh (COCO) or xyxy (Pascal VOC) format — the format is auto-detected and converted
• Categories can be integers or strings — strings are auto-remapped to integer IDs
• The image_id column is optional — it is generated automatically if missing
• ALWAYS validate unknown datasets before GPU training (see Dataset Validation section)
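The conversion and validation steps above can be sketched as follows. This is illustrative, not the skill's actual implementation: the record below is hypothetical and assumed to be in COCO xywh format, and the checks are the kind of cheap per-record invariants worth running before paying for GPU time.

```python
def xywh_to_xyxy(bbox):
    """Convert a COCO-style [x, y, width, height] box to Pascal VOC [x0, y0, x1, y1]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]

def validate_objects(objects, image_width, image_height):
    """Sanity-check one record's `objects` column (assumed xywh) before training."""
    assert len(objects["bbox"]) == len(objects["category"]), "bbox/category length mismatch"
    for bbox in objects["bbox"]:
        assert len(bbox) == 4, f"expected 4 values, got {bbox}"
        x0, y0, x1, y1 = xywh_to_xyxy(bbox)
        assert 0 <= x0 < x1 <= image_width, f"x out of range: {bbox}"
        assert 0 <= y0 < y1 <= image_height, f"y out of range: {bbox}"

# Hypothetical record in xywh (COCO) format:
objects = {"bbox": [[10, 20, 30, 40]], "category": [2]}
validate_objects(objects, image_width=640, image_height=480)
print(xywh_to_xyxy([10, 20, 30, 40]))  # [10, 20, 40, 60]
```

A degenerate box (zero width or height) or a box that extends past the image edge fails fast here, rather than as an opaque loss spike mid-job.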
Dataset Requirements — Image Classification
• Dataset must exist on the Hub
• Must have an image column (PIL images) and a label column (integer class IDs or strings)
• The label column can be ClassLabel type (with names) or plain integers/strings — strings are auto-remapped
• Common column names are auto-detected: label, labels, class, fine_label
• ALWAYS validate unknown datasets before GPU training (see Dataset Validation section)
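The string-to-ID remapping can be sketched like this — a minimal, hypothetical version, not the skill's actual code. Sorting the unique names first keeps the mapping deterministic across runs:

```python
def build_label_mapping(labels):
    """Pass integer labels through unchanged; map string labels to stable integer IDs."""
    if all(isinstance(label, int) for label in labels):
        return labels, None
    # Sort unique names so the same dataset always produces the same label2id mapping.
    names = sorted(set(labels))
    label2id = {name: i for i, name in enumerate(names)}
    return [label2id[label] for label in labels], label2id

# Hypothetical label column containing strings:
ids, label2id = build_label_mapping(["cat", "dog", "cat", "bird"])
print(ids)       # [1, 2, 1, 0]
print(label2id)  # {'bird': 0, 'cat': 1, 'dog': 2}
```

The inverse mapping (id2label) is what ends up in the trained model's config so predictions decode back to human-readable class names.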
Dataset Requirements — SAM/SAM2 Segmentation
• Dataset must exist on the Hub
• Must have an image column (PIL images) and a mask column (binary ground-truth segmentation mask)
• Must have a prompt, supplied in one of three ways:
  • A prompt column with JSON containing {"bbox": [x0,y0,x1,y1]} or {"point": [x,y]}
  • A dedicated bbox column with [x0,y0,x1,y1] values
  • A dedicated point column with [x,y] or [[x,y],...] values
• Bboxes should be in xyxy format (absolute pixel coordinates)
• Example dataset: merve/MicroMat-mini (image matting with bbox prompts)
• ALWAYS validate unknown datasets before GPU training (see Dataset Validation section)
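The three prompt forms above can be normalized into a single structure before training. The function below is a hypothetical sketch of that normalization, not the skill's actual loader:

```python
import json

def resolve_prompt(example):
    """Normalize the three supported prompt forms into {'bbox': ...} or {'point': ...}."""
    if "prompt" in example:  # form 1: a JSON prompt column
        prompt = json.loads(example["prompt"])
        if "bbox" in prompt:
            return {"bbox": prompt["bbox"]}
        if "point" in prompt:
            return {"point": prompt["point"]}
        raise ValueError("prompt JSON must contain 'bbox' or 'point'")
    if "bbox" in example:    # form 2: a dedicated bbox column (xyxy, absolute pixels)
        return {"bbox": example["bbox"]}
    if "point" in example:   # form 3: a dedicated point column
        point = example["point"]
        # Accept a single [x, y] point or a list of points [[x, y], ...].
        return {"point": point if isinstance(point[0], list) else [point]}
    raise ValueError("no prompt found: need a 'prompt', 'bbox', or 'point' column")

print(resolve_prompt({"prompt": '{"bbox": [5, 5, 50, 50]}'}))  # {'bbox': [5, 5, 50, 50]}
print(resolve_prompt({"point": [12, 34]}))                     # {'point': [[12, 34]]}
```

Downstream code then only ever sees one prompt shape, regardless of which of the three dataset layouts the user provided.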