How do I install hqq-quantization?

hqq-quantization is a Skill hosted on GitHub at https://github.com/Orchestra-Research/AI-Research-SKILLs. Visit the ImAiFox page at https://imaifox.com/boosters/orchestra-research-ai-research-skills-hqq-quantization for the AI-ready install prompt you can copy directly into Claude Code, Cursor, or Windsurf.

How popular is hqq-quantization?

hqq-quantization has 9,889 GitHub stars and 739 forks. It is actively maintained with recent commits.

Is hqq-quantization free?

Yes. hqq-quantization is open source and free to use under the MIT license. The source code is publicly available on GitHub at https://github.com/Orchestra-Research/AI-Research-SKILLs.

Skill

hqq-quantization

Name: hqq-quantization
Author: Orchestra-Research


AI Summary

Fast, calibration-free weight quantization supporting 8/4/3/2/1-bit precision with multiple optimized backends.

Install

Copy this and paste it into Claude Code, Cursor, or any AI assistant:

I want to install the "hqq-quantization" skill in my project.

Please run this command in my terminal:
# Install skill into your project (3 files)
mkdir -p .claude/skills/hqq/references && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/hqq/SKILL.md "https://raw.githubusercontent.com/Orchestra-Research/AI-Research-SKILLs/main/10-optimization/hqq/SKILL.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/hqq/references/advanced-usage.md "https://raw.githubusercontent.com/Orchestra-Research/AI-Research-SKILLs/main/10-optimization/hqq/references/advanced-usage.md" && \
curl --retry 3 --retry-delay 2 --retry-all-errors -o .claude/skills/hqq/references/troubleshooting.md "https://raw.githubusercontent.com/Orchestra-Research/AI-Research-SKILLs/main/10-optimization/hqq/references/troubleshooting.md"

Then restart Claude Code (or reload the window in Cursor) so the skill is picked up.
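To confirm the download succeeded before restarting, a small check like the following may help. It is a minimal sketch that assumes the three file paths used in the command above; the function name `missing_skill_files` is hypothetical and not part of the skill.

```python
from pathlib import Path

# Relative paths written by the curl commands in the install prompt.
EXPECTED_FILES = (
    "SKILL.md",
    "references/advanced-usage.md",
    "references/troubleshooting.md",
)


def missing_skill_files(project_root=".", skill_dir=".claude/skills/hqq"):
    """Return the expected skill files that are absent or empty.

    Hypothetical helper for illustration; an empty file usually means
    a download failed partway through.
    """
    base = Path(project_root) / skill_dir
    missing = []
    for rel in EXPECTED_FILES:
        path = base / rel
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    gaps = missing_skill_files()
    print("all skill files present" if not gaps else f"missing or empty: {gaps}")
```

Run it from the project root; any path it reports can be fetched again with the matching curl command.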

ai ai-research claude claude-code claude-skills codex gemini gpt-5 grpo huggingface machine-learning megatron skills vllm llm

Description

Half-Quadratic Quantization for LLMs without calibration data. Use when quantizing models to 4/3/2-bit precision without needing calibration datasets, for fast quantization workflows, or when deploying with vLLM or HuggingFace Transformers.

HQQ - Half-Quadratic Quantization

Fast, calibration-free weight quantization supporting 8/4/3/2/1-bit precision with multiple optimized backends.

When to use HQQ

Use HQQ when:
• Quantizing models without calibration data (no dataset needed)
• Need fast quantization (minutes vs hours for GPTQ/AWQ)
• Deploying with vLLM or HuggingFace Transformers
• Fine-tuning quantized models with LoRA/PEFT
• Experimenting with extreme quantization (2-bit, 1-bit)

Key advantages:
• No calibration: Quantize any model instantly without sample data
• Multiple backends: PyTorch, ATEN, TorchAO, Marlin, BitBlas for optimized inference
• Flexible precision: 8/4/3/2/1-bit with configurable group sizes
• Framework integration: Native HuggingFace and vLLM support
• PEFT compatible: Fine-tune quantized models with LoRA

Use alternatives instead:
• AWQ: Need calibration-based accuracy, production serving
• GPTQ: Maximum accuracy with calibration data available
• bitsandbytes: Simple 8-bit/4-bit without custom backends
• llama.cpp/GGUF: CPU inference, Apple Silicon deployment
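To make "n-bit precision with configurable group sizes" concrete, here is a simplified sketch of group-wise affine quantization in plain Python. It is not the HQQ algorithm and does not use the hqq API: HQQ refines its quantization parameters with a half-quadratic solver, whereas this sketch uses plain min/max round-to-nearest. All function names are hypothetical, and `effective_bits` assumes one 16-bit scale and one 16-bit zero-point are stored per group.

```python
def quantize_group(values, nbits):
    """Affine round-to-nearest quantization of one group of floats.

    Returns (codes, scale, zero) so that value ~= code * scale + zero.
    """
    levels = (1 << nbits) - 1          # highest integer code, e.g. 15 for 4-bit
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, zero):
    """Map integer codes back to approximate float values."""
    return [c * scale + zero for c in codes]


def quantize_weights(weights, nbits=4, group_size=64):
    """Quantize a flat list of weights, one (codes, scale, zero) per group."""
    return [
        quantize_group(weights[start:start + group_size], nbits)
        for start in range(0, len(weights), group_size)
    ]


def dequantize_weights(groups):
    """Reassemble the approximate weights from all groups."""
    restored = []
    for codes, scale, zero in groups:
        restored.extend(dequantize_group(codes, scale, zero))
    return restored


def effective_bits(nbits, group_size, meta_bits=16):
    """Storage per weight, assuming one scale and one zero-point per group,
    each stored in meta_bits bits (an assumption of this sketch)."""
    return nbits + 2 * meta_bits / group_size


if __name__ == "__main__":
    weights = [0.12, -0.40, 0.33, 0.05, -0.21, 0.48, -0.07, 0.19]
    groups = quantize_weights(weights, nbits=4, group_size=4)
    restored = dequantize_weights(groups)
    worst = max(abs(w - r) for w, r in zip(weights, restored))
    print(f"max reconstruction error: {worst:.4f}")
    print(f"effective bits per weight (4-bit, group 64): {effective_bits(4, 64)}")
```

Smaller groups track local weight ranges more closely but store more scales and zero-points, which is the trade-off the group-size setting controls.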

Installation

```bash
pip install hqq

# With specific backend
pip install "hqq[torch]"    # PyTorch backend
pip install "hqq[torchao]"  # TorchAO int4 backend
pip install "hqq[bitblas]"  # BitBlas backend
pip install "hqq[marlin]"   # Marlin backend
```
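After installing, you may want to check that the package is visible to the Python environment you intend to use. The sketch below relies only on the standard library's `importlib.metadata` and assumes the distribution name is `hqq`, as in the pip command above; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package="hqq"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("hqq is not installed in this environment")
    else:
        print(f"hqq {version} is installed")
```

If it reports the package as missing, the install likely went into a different interpreter or virtual environment than the one running the check.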


Health Signals

Maintenance: Active (committed 4d ago)
Adoption: Popular (9.9k stars on GitHub)
Docs: Well-documented (README + description)

GitHub Signals

Stars: 9.9k
Forks: 739
Issues: 6
Updated: 4d ago
License: MIT


Works With

Claude Code
