AI Summary
This skill enables AI assistants to create, configure, and manage datasets on Hugging Face Hub with SQL-based querying and transformation capabilities. It's valuable for developers building data workflows and ML projects that require programmatic dataset management.
Install
# Add to your project root as SKILL.md
curl -o SKILL.md "https://raw.githubusercontent.com/huggingface/skills/main/skills/hugging-face-datasets/SKILL.md"
Description
Create and manage datasets on Hugging Face Hub. Supports initializing repos, defining configs/system prompts, streaming row updates, and SQL-based dataset querying/transformation. Designed to work alongside HF MCP server for comprehensive dataset workflows.
Overview
This skill provides tools to manage datasets on the Hugging Face Hub with a focus on creation, configuration, content management, and SQL-based data manipulation. It is designed to complement the existing Hugging Face MCP server by providing dataset editing and querying capabilities.
Scripts auto-install requirements when run with: uv run scripts/script_name.py
• uv (Python package manager)
• Getting Started: See "Usage Instructions" below for PEP 723 usage
4. Quality Assurance Features
• JSON Validation: Ensures data integrity during uploads
• Batch Processing: Efficient handling of large datasets
• Error Recovery: Graceful handling of upload failures and conflicts
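The validation and batching ideas above can be sketched in a few lines of standard-library Python. This is only an illustration of the general pattern, not the skill's actual implementation: the helper names `parse_rows` and `batched` are hypothetical, and the real scripts may validate and chunk rows differently.

```python
import json
from typing import Iterator


def parse_rows(raw_rows: str) -> list[dict]:
    """Parse a JSON array of row objects (hypothetical helper).

    Raises ValueError if the text is not valid JSON or is not a list of objects.
    """
    try:
        rows = json.loads(raw_rows)
    except json.JSONDecodeError as exc:
        raise ValueError(f"rows are not valid JSON: {exc}") from exc
    if not isinstance(rows, list) or not all(isinstance(r, dict) for r in rows):
        raise ValueError("expected a JSON array of objects")
    return rows


def batched(rows: list[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive slices of at most `size` rows (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


if __name__ == "__main__":
    rows = parse_rows('[{"text": "a"}, {"text": "b"}, {"text": "c"}]')
    for batch in batched(rows, 2):
        # A real workflow would upload each batch here; this sketch only prints it.
        print(len(batch), batch)
```

Rejecting malformed input before any upload starts means a failure leaves nothing half-written, and fixed-size batches bound how much work has to be retried after an error.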
Usage Instructions
The skill includes two Python scripts that use PEP 723 inline dependency management:
> **All paths are relative to the directory containing this SKILL.md file.**
> Scripts are run with: uv run scripts/script_name.py [arguments]
• scripts/dataset_manager.py - Dataset creation and management
• scripts/sql_manager.py - SQL-based dataset querying and transformation
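For readers unfamiliar with PEP 723, the sketch below shows what an inline metadata block looks like and how a tool can locate it. The example script text and its dependency list are assumptions made for illustration; they are not copied from `dataset_manager.py` or `sql_manager.py`. The regular expression is the reference pattern given in PEP 723, and `read_inline_metadata` is a hypothetical helper name.

```python
import re
from typing import Optional

# Reference regex from PEP 723 for locating inline metadata blocks.
PEP723_BLOCK = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)

# Hypothetical script source; the dependency list is illustrative only.
EXAMPLE_SCRIPT = '''\
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "huggingface_hub",
# ]
# ///

print("hello")
'''


def read_inline_metadata(source: str) -> Optional[str]:
    """Return the TOML text of the `script` metadata block, or None if absent."""
    for match in PEP723_BLOCK.finditer(source):
        if match.group("type") == "script":
            lines = match.group("content").splitlines()
            # Strip the leading "# " (or a bare "#") from each comment line.
            return "\n".join(
                line[2:] if line.startswith("# ") else line[1:] for line in lines
            )
    return None


if __name__ == "__main__":
    print(read_inline_metadata(EXAMPLE_SCRIPT))
```

A runner such as uv reads a block like this, prepares an environment with the listed dependencies, and then executes the script, which is why no separate install step is needed.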
Quality Score
Acceptable
72/100
Trust & Transparency
Open Source — Apache-2.0
Source code publicly auditable
Verified Open Source
Hosted on GitHub — publicly auditable
Actively Maintained
Last commit Yesterday
7.5k stars — Strong Community
438 forks