V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning

Authors: Mido Assran, Adrien Bardes, David Fan, Quentin Garrido, Russell Howes, Matthew Muckley, Ammar Rizvi, Claire Roberts, Koustuv Sinha, Artem Zholus, Sergio Arnaud, Abha Gejji, Ada Martin, Francois Robert Hogan, Daniel Dugas, Piotr Bojanowski, Vasil Khalidov, Patrick Labatut, Francisco Massa, Marc Szafraniec, Kapil Krishnakumar, Yong Li, Xiaodong Ma, Sarath Chandar, Franziska Meier, Yann LeCun, Michael Rabbat, Nicolas Ballas

Year: 2025

Venue: arXiv preprint arXiv:2506.09985

Type: article

URL: https://arxiv.org/abs/2506.09985

arXiv: 2506.09985

Cite as: [@assran2025vjepa2]

Raw Files

No raw files yet. Run node scripts/fetch-bibliography-raw.mjs --only assran2025vjepa2 to populate, or drop files into raw/bibliography/assran2025vjepa2/.

BibTeX

@article{assran2025vjepa2,
  title = {V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning},
  author = {Mido Assran and Adrien Bardes and David Fan and Quentin Garrido and Russell Howes and Matthew Muckley and Ammar Rizvi and Claire Roberts and Koustuv Sinha and Artem Zholus and Sergio Arnaud and Abha Gejji and Ada Martin and Francois Robert Hogan and Daniel Dugas and Piotr Bojanowski and Vasil Khalidov and Patrick Labatut and Francisco Massa and Marc Szafraniec and Kapil Krishnakumar and Yong Li and Xiaodong Ma and Sarath Chandar and Franziska Meier and Yann LeCun and Michael Rabbat and Nicolas Ballas},
  year = {2025},
  journal = {arXiv preprint arXiv:2506.09985},
  url = {https://arxiv.org/abs/2506.09985}
}

Notes

No notes yet. Create notes/assran2025vjepa2.md to add notes.