Authors: Zichang Liu, Aditya Desai, Fangshuo Liao, Weitao Wang, Victor Xie, Zhaozhuo Xu, Anastasios Kyrillidis, Anshumali Shrivastava
Year: 2024
Venue: NeurIPS
Type: article
URL: https://arxiv.org/abs/2305.17118
arXiv: 2305.17118
Cite as: [@liu2024scissorhands]
No raw files yet. Run node scripts/fetch-bibliography-raw.mjs --only liu2024scissorhands to populate, or drop files into raw/bibliography/liu2024scissorhands/.
@inproceedings{liu2024scissorhands,
title = {Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time},
author = {Zichang Liu and Aditya Desai and Fangshuo Liao and Weitao Wang and Victor Xie and Zhaozhuo Xu and Anastasios Kyrillidis and Anshumali Shrivastava},
year = {2024},
booktitle = {NeurIPS},
url = {https://arxiv.org/abs/2305.17118}
}
No notes yet. Create notes/liu2024scissorhands.md to add notes.