Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs

Authors: Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker

Year: 2024

Venue: ACL

Type: article

URL: https://arxiv.org/abs/2402.14740

arXiv: 2402.14740

Cite as: [@ahmadian2024rloo]

Raw Files

No raw files yet. Run node scripts/fetch-bibliography-raw.mjs --only ahmadian2024rloo to populate, or drop files into raw/bibliography/ahmadian2024rloo/.

BibTeX

@inproceedings{ahmadian2024rloo,
  title = {Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs},
  author = {Arash Ahmadian and Chris Cremer and Matthias Gall{\'e} and Marzieh Fadaee and Julia Kreutzer and Olivier Pietquin and Ahmet {\"U}st{\"u}n and Sara Hooker},
  year = {2024},
  booktitle = {ACL},
  url = {https://arxiv.org/abs/2402.14740}
}

Notes

No notes yet. Create notes/ahmadian2024rloo.md to add notes.