Authors: Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker
Year: 2024
Venue: ACL
Type: article
URL: https://arxiv.org/abs/2402.14740
arXiv: 2402.14740
Cite as: [@ahmadian2024rloo]
No raw files yet. Run node scripts/fetch-bibliography-raw.mjs --only ahmadian2024rloo to populate, or drop files into raw/bibliography/ahmadian2024rloo/.
@inproceedings{ahmadian2024rloo,
title = {Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs},
author = {Arash Ahmadian and Chris Cremer and Matthias Gall{\'e} and Marzieh Fadaee and Julia Kreutzer and Olivier Pietquin and Ahmet {\"U}st{\"u}n and Sara Hooker},
year = {2024},
booktitle = {ACL},
url = {https://arxiv.org/abs/2402.14740}
}
No notes yet. Create notes/ahmadian2024rloo.md to add notes.