Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM

Authors: Deepak Narayanan, Mohammad Shoeybi, Jared Casper, et al.

Year: 2021

Venue: SC

Type: article

URL: https://arxiv.org/abs/2104.04473

Cite as: [@narayanan2021efficient]

Raw Files

No raw files yet. Run node scripts/fetch-bibliography-raw.mjs --only narayanan2021efficient to populate, or drop files into raw/bibliography/narayanan2021efficient/.

BibTeX

@article{narayanan2021efficient,
  title = {Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM},
  author = {Deepak Narayanan and Mohammad Shoeybi and Jared Casper and others},
  year = {2021},
  journal = {SC},
  url = {https://arxiv.org/abs/2104.04473}
}

Notes

No notes yet. Create notes/narayanan2021efficient.md to add notes.