Skill

hugging-face-datasets

by huggingface

AI Summary

This skill enables AI assistants to create, configure, and manage datasets on Hugging Face Hub with SQL-based querying and transformation capabilities. It's valuable for developers building data workflows and ML projects that require programmatic dataset management.

Install

# Add to your project root as SKILL.md
curl -o SKILL.md "https://raw.githubusercontent.com/huggingface/skills/main/skills/hugging-face-datasets/SKILL.md"

Description

Create and manage datasets on Hugging Face Hub. Supports initializing repos, defining configs/system prompts, streaming row updates, and SQL-based dataset querying/transformation. Designed to work alongside HF MCP server for comprehensive dataset workflows.

Overview

This skill provides tools to manage datasets on the Hugging Face Hub with a focus on creation, configuration, content management, and SQL-based data manipulation. It is designed to complement the existing Hugging Face MCP server by providing dataset editing and querying capabilities.

Scripts auto-install requirements when run with: uv run scripts/script_name.py

• uv (Python package manager)
• Getting Started: See "Usage Instructions" below for PEP 723 usage
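For readers unfamiliar with PEP 723, the sketch below shows what an inline metadata block looks like and how a tool can read it, which is the mechanism that lets `uv run` install requirements before executing a script. The regular expression follows the reference implementation in PEP 723. The example header, the `huggingface_hub` dependency, and the helper name `read_script_metadata` are illustrative assumptions, not taken from the skill's actual scripts.

```python
import re
import textwrap
from typing import Optional

# Regular expression from the PEP 723 reference implementation.
PEP723_BLOCK = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)

# Hypothetical script text; the dependency listed is an assumption for illustration.
EXAMPLE_SCRIPT = textwrap.dedent(
    '''\
    # /// script
    # requires-python = ">=3.10"
    # dependencies = [
    #     "huggingface_hub",
    # ]
    # ///
    print("script body runs after the dependencies above are resolved")
    '''
)


def read_script_metadata(script_text: str) -> Optional[str]:
    """Return the TOML text of the `script` metadata block, or None if absent."""
    matches = [
        m for m in PEP723_BLOCK.finditer(script_text) if m.group("type") == "script"
    ]
    if len(matches) > 1:
        raise ValueError("Multiple script blocks found")
    if not matches:
        return None
    # Strip the leading "# " (or bare "#") comment prefix from each line.
    return "".join(
        line[2:] if line.startswith("# ") else line[1:]
        for line in matches[0].group("content").splitlines(keepends=True)
    )


toml_text = read_script_metadata(EXAMPLE_SCRIPT)
print(toml_text)
```

The returned text is plain TOML, so a runner can parse it and build an isolated environment with the listed packages before running the script body.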

4. Quality Assurance Features

• JSON Validation: Ensures data integrity during uploads
• Batch Processing: Efficient handling of large datasets
• Error Recovery: Graceful handling of upload failures and conflicts
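The three features above can be sketched with the standard library alone. This is a minimal illustration of the general pattern (validate, batch, retry), not the skill's implementation: the function names, the `ConnectionError` failure type, and the simulated `flaky_upload` callback are all assumptions made for the example.

```python
import json
import time
from typing import Callable, Iterator


def validate_rows(lines: list[str]) -> tuple[list[dict], list[str]]:
    """Split raw lines into parsed JSON objects and rejected lines."""
    valid, rejected = [], []
    for line in lines:
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            rejected.append(line)
            continue
        # Only JSON objects are treated as dataset rows in this sketch.
        if isinstance(row, dict):
            valid.append(row)
        else:
            rejected.append(line)
    return valid, rejected


def batched(rows: list[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most batch_size rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def upload_with_retry(
    batch: list[dict],
    upload: Callable[[list[dict]], None],
    max_attempts: int = 3,
    base_delay: float = 0.0,
) -> int:
    """Call upload(batch), retrying on ConnectionError with exponential backoff.

    Returns the number of attempts used; re-raises after the final failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            upload(batch)
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


# Simulated uploader (hypothetical): fails on the first call, then succeeds.
uploaded: list[dict] = []
calls = {"count": 0}


def flaky_upload(batch: list[dict]) -> None:
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("simulated transient failure")
    uploaded.extend(batch)


raw_lines = ['{"text": "a"}', '{"text": "b"}', "not json", '{"text": "c"}']
rows, rejected = validate_rows(raw_lines)
for batch in batched(rows, batch_size=2):
    upload_with_retry(batch, flaky_upload)
```

Validating before batching keeps malformed rows out of every upload attempt, and retrying per batch means a transient failure only repeats a small slice of the data.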

Usage Instructions

The skill includes two Python scripts that use PEP 723 inline dependency management:

> **All paths are relative to the directory containing this SKILL.md file.**
> Scripts are run with: uv run scripts/script_name.py [arguments]

• scripts/dataset_manager.py - Dataset creation and management
• scripts/sql_manager.py - SQL-based dataset querying and transformation
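Because script paths are relative to the directory containing SKILL.md, a caller that launches the scripts programmatically needs to set the working directory accordingly. The sketch below shows one way to do that. The helper names `build_uv_command` and `run_skill_script` are hypothetical, and no script arguments are shown because the accepted arguments are not listed here.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def build_uv_command(script_name: str, args: Sequence[str] = ()) -> list[str]:
    """Build the `uv run` command for a script in the skill's scripts/ folder."""
    script_path = (Path("scripts") / script_name).as_posix()
    return ["uv", "run", script_path, *args]


def run_skill_script(
    skill_dir: str, script_name: str, args: Sequence[str] = ()
) -> subprocess.CompletedProcess:
    """Run a skill script with the SKILL.md directory as the working directory.

    Requires `uv` on PATH; raises CalledProcessError on a non-zero exit code.
    """
    return subprocess.run(
        build_uv_command(script_name, args), cwd=skill_dir, check=True
    )


command = build_uv_command("dataset_manager.py")
print(" ".join(command))
```

Passing the command as a list rather than a shell string avoids quoting problems when arguments contain spaces.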

Quality Score

C

Acceptable

72/100

Standard Compliance: 45
Documentation Quality: 65
Usefulness: 72
Maintenance Signal: 100
Community Signal: 100
Scored Today

GitHub Signals

Stars: 7.5k
Forks: 438
Issues: 19
Updated: Yesterday

Trust & Transparency

Open Source — Apache-2.0

Source code publicly auditable

Works With

Claude Code