
Add OneComp/QEP to Quantization reading list#5

Open
Yuma-Ichikawa wants to merge 1 commit into gpu-mode:main from Yuma-Ichikawa:add-onecomp

Conversation

@Yuma-Ichikawa

Summary

This PR adds OneComp (Fujitsu Research, arXiv:2603.28845) and its accompanying NeurIPS 2025 paper QEP to this awesome list.

OneComp is an open-source, Apache-2.0 licensed post-training quantization (PTQ) framework for LLMs that unifies several recent research directions behind a single Runner.auto_run() API:

  • QEP (Quantization Error Propagation, NeurIPS 2025) — layer-wise PTQ with propagated error compensation.
  • AutoBit — ILP-based mixed-precision bitwidth assignment derived from available VRAM.
  • JointQ — joint weight–scale optimization (group-wise 4-bit, etc.).
  • Rotation preprocessing — SpinQuant/OstQuant-style learned rotations absorbed into weights, with online Hadamard hooks at load time.
  • LoRA SFT post-process — accuracy recovery / knowledge injection after quantization.
  • vLLM plugin — first-class DBF & Mixed-GPTQ serving of OneComp checkpoints.

Verified on Llama (TinyLlama / Llama-2 / Llama-3) and Qwen3 (0.6B–32B). PyPI: pip install onecomp.
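For readers of the list, a minimal usage sketch of the unified entry point described above. Only `Runner.auto_run()` and `pip install onecomp` appear in this PR; every argument name below (the model identifier, `vram_gb`) is a hypothetical illustration, not the package's documented signature:

```python
# Hedged sketch of OneComp's advertised Runner.auto_run() entry point.
# Only the Runner.auto_run name and `pip install onecomp` come from the
# PR text; all arguments below are assumptions for illustration.
try:
    from onecomp import Runner  # requires: pip install onecomp
    ONECOMP_AVAILABLE = True
except ImportError:
    Runner = None
    ONECOMP_AVAILABLE = False  # sketch still runs without the package

if ONECOMP_AVAILABLE:
    # Hypothetical call: quantize a checkpoint under a VRAM budget,
    # letting AutoBit assign mixed-precision bitwidths and QEP
    # compensate propagated layer-wise quantization error.
    runner = Runner("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
    runner.auto_run(vram_gb=8)
```

The single-call design matches the PR's claim that rotation preprocessing, bitwidth assignment, and error compensation are composed behind one API rather than invoked as separate tools.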

Links:

Happy to adjust section placement, wording, or formatting to better match the list's conventions — thanks for maintaining this resource!
