5 boosters for "grpo" — open source, verified from GitHub, ready to install
Train language models using TRL (Transformer Reinforcement Learning) on fully managed Hugging Face infrastructure. No local GPU setup is required: models train on cloud GPUs and results are automatically saved to the Hugging Face Hub. Use this skill when users want to: Use Unsloth instead of standard…
A skill for fine-tuning and training language models on Hugging Face's cloud GPU infrastructure using TRL. It supports the SFT, DPO, and GRPO methods, plus GGUF conversion for local deployment. Aimed at developers and ML engineers who train models in the cloud.
An orchestrator booster that automatically fetches GitHub issues, spawns AI sub-agents to implement fixes, opens pull requests, and manages review feedback. Ideal for teams looking to automate bug triage and fix workflows.
A system prompt for training AI agents on terminal/coding tasks using GRPO, enabling autonomous task completion in containerized Linux environments. Ideal for developers building advanced AI coding assistants and automation systems.
Use this skill when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers the SFT, DPO, GRPO, and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts with PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Invoke it for tasks involving cloud GPU training or GGUF conversion, or when users mention training on Hugging Face Jobs without local GPU setup.