Authors: Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn
Year: 2023
Venue: NeurIPS
Type: article
URL: https://arxiv.org/abs/2305.18290
arXiv: 2305.18290
Cite as: [@rafailov2023dpo]
No raw files yet. Run node scripts/fetch-bibliography-raw.mjs --only rafailov2023dpo to populate, or drop files into raw/bibliography/rafailov2023dpo/.
@inproceedings{rafailov2023dpo,
title = {Direct Preference Optimization: Your Language Model is Secretly a Reward Model},
author = {Rafael Rafailov and Archit Sharma and Eric Mitchell and Stefano Ermon and Christopher D. Manning and Chelsea Finn},
year = {2023},
booktitle = {NeurIPS},
url = {https://arxiv.org/abs/2305.18290}
}
No notes yet. Create notes/rafailov2023dpo.md to add notes.