AWQ: Activation-Aware Weight Quantization for On-Device LLM Compression and Acceleration

Authors: Ji Lin, Jiaming Tang, Haotian Tang

Year: 2024

Venue: MLSys

Type: article

URL: https://arxiv.org/abs/2306.00978

arXiv: 2306.00978

Cite as: [@lin2024awq]

Raw Files

No raw files yet. Run node scripts/fetch-bibliography-raw.mjs --only lin2024awq to populate, or drop files into raw/bibliography/lin2024awq/.

BibTeX

@inproceedings{lin2024awq,
  title = {AWQ: Activation-Aware Weight Quantization for On-Device LLM Compression and Acceleration},
  author = {Ji Lin and Jiaming Tang and Haotian Tang},
  year = {2024},
  booktitle = {MLSys},
  url = {https://arxiv.org/abs/2306.00978}
}

Notes

No notes yet. Create notes/lin2024awq.md to add notes.