AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

Jan 1, 2025
Ahmed Masry, Juan A. Rodriguez, Tianyu Zhang, Suyuchen Wang, Chao Wang, Aarash Feizi, Akshay Kalkunte Suresh, Abhay Puri, Xiangru Jian, Pierre-André Noël, Sathwik Tejaswi Madhusudhan, Marco Pedersoli, Bang Liu, Nicolas Chapados, Yoshua Bengio, Enamul Hoque, Christopher Pal, Issam H. Laradji, David Vázquez, Perouz Taslakian, Spandana Gella, Sai Rajeswar
Publication: CoRR