Authors: Tianyi Zhang, Mohsen Hariri, Shaochen Zhong, Vipin Chaudhary, Yang Sui, Xia Hu, Anshumali Shrivastava
Year: 2025
Venue: NeurIPS
Type: article
URL: https://arxiv.org/abs/2504.11651
arXiv: 2504.11651
Cite as: [@zhang2025dfloat11]
No raw files yet. Run node scripts/fetch-bibliography-raw.mjs --only zhang2025dfloat11 to populate, or drop files into raw/bibliography/zhang2025dfloat11/.
@inproceedings{zhang2025dfloat11,
title = {70\% Size, 100\% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11)},
author = {Tianyi Zhang and Mohsen Hariri and Shaochen Zhong and Vipin Chaudhary and Yang Sui and Xia Hu and Anshumali Shrivastava},
year = {2025},
booktitle = {NeurIPS},
url = {https://arxiv.org/abs/2504.11651}
}
No notes yet. Create notes/zhang2025dfloat11.md to add notes.