Authors: Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D. Manning, Stefano Ermon, Chelsea Finn
Year: 2023
Venue: NeurIPS
Type: article
URL: https://arxiv.org/abs/2305.18290
arXiv: 2305.18290
Cite as: [@rafailov2023direct]
No raw files yet. Run node scripts/fetch-bibliography-raw.mjs --only rafailov2023direct to populate, or drop files into raw/bibliography/rafailov2023direct/.
@inproceedings{rafailov2023direct,
title = {Direct Preference Optimization: Your Language Model is Secretly a Reward Model},
author = {Rafael Rafailov and Archit Sharma and Eric Mitchell and Christopher D. Manning and Stefano Ermon and Chelsea Finn},
year = {2023},
booktitle = {NeurIPS},
url = {https://arxiv.org/abs/2305.18290}
}
No notes yet. Create notes/rafailov2023direct.md to add notes.