Background
Since 2024, I have been an Associate Professor at the University of Modena and Reggio Emilia, where I work on Deep Learning, Vision-and-Language integration, Large-Scale models and Multimedia. I teach the courses Computer Vision and Cognitive Systems, Scalable AI, and Computer Architecture. My research interests include Vision-and-Language integration, Multimodal Retrieval, Image and Video Captioning, Visual-Semantic alignment, Large-Scale model development, HPC and Embodied AI.
I have authored more than 140 publications in international journals and conferences. I currently serve as an Associate Editor for the journals Computer Vision and Image Understanding and Pattern Recognition, and as an Area Chair for ICCV and major multimedia conferences. I am also a Scholar in the ELLIS society (European Laboratory for Learning and Intelligent Systems), where I coordinate the Modena ELLIS Unit.
Since 2021, I have been deputy director of the Interdepartmental Center on Digital Humanities at the University of Modena and Reggio Emilia. Earlier in my career, in 2017, I worked at the Facebook AI Research laboratory in Paris under the supervision of Hervé Jégou. There, I helped develop a video-matching algorithm that was adopted in production on the social network to detect abusive content.
News
We have open research collaborator and post-doc positions within the MINERVA EU project and the PRIN projects "MUCES - a Multimedia platform for Content Enrichment and Search in audiovisual archives" and "MUSMA: Multimedia Understanding meets Social Media Analysis". If you are interested, please get in touch!
Full Professor Habilitation
🎉 Happy to share that I've obtained the Italian National Scientific Habilitation (ASN - Abilitazione Scientifica Nazionale) as Full Professor (Fascia I) in Information Processing Systems (09/H1) 🎉
EuroHPC Extreme Scale grant
We are pleased to announce that our project "VISTA - Versatile Intelligent Systems for Tailored and Adaptive Next-Generation Multimodal AI" was accepted for the EuroHPC Extreme Scale grant, with an allocation of almost 1M GPU hours.
Paper accepted to NeurIPS 2025
Our paper, "vHector and HeisenVec: Scalable Vector Graphics Generation Through Large Language Models", has been accepted to the NeurIPS 2025 Datasets and Benchmarks track!
Two papers accepted at BMVC 2025
Our papers "Mitigating Hallucinations in Multimodal LLMs via Object-aware Preference Optimization" and "Verifier Matters: Enhancing Inference-Time Scaling for Video Diffusion Models" have been accepted to BMVC 2025!
Three papers accepted at ICCV 2025!
Our papers "MissRAG: Addressing the Missing Modality Challenge in Multimodal Large Language Models", "What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models" and "Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation" have been accepted to ICCV 2025!
Area Chair for WACV 2026
Happy to share that I will serve as an Area Chair for WACV 2026!
Tutorial on AI and HPC at ICIAP 2025
We are glad to announce that we will organize, together with NVIDIA, a tutorial on AI and HPC at ICIAP 2025. The tutorial is organized as part of the MINERVA European Project.
Three papers accepted at CVPR 2025!
Our papers "Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering", "Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval" and "Hyperbolic Safety-Aware Vision-Language Models" (joint collaboration with UvA) have been accepted to CVPR 2025!
Update: "Hyperbolic Safety-Aware Vision-Language Models" has been selected as a highlight paper!
IRCDL 2026 in Modena
Happy to share that we will host the 2026 edition of IRCDL, the Conference on Information and Research Science Connecting to Digital and Library Science. See the website.
Featured publications
vHector and HeisenVec: Scalable Vector Graphics Generation Through Large Language Models
Leonardo Zini, Elia Frigieri, Sebastiano Aloscari, Lorenzo Baraldi
NeurIPS 2025, Datasets and Benchmarks track
What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models
Lorenzo Baraldi, Davide Bucciarelli, Federico Betti, Marcella Cornia, Lorenzo Baraldi, Nicu Sebe, Rita Cucchiara
ICCV 2025
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation
Luca Barsellotti, Lorenzo Bianchi, Nicola Messina, Fabio Carrara, Marcella Cornia, Lorenzo Baraldi, Fabrizio Falchi, Rita Cucchiara
ICCV 2025
MissRAG: Addressing the Missing Modality Challenge in Multimodal Large Language Models
Vittorio Pipoli, Alessia Saporita, Federico Bolelli, Marcella Cornia, Lorenzo Baraldi, Costantino Grana, Rita Cucchiara, Elisa Ficarra
ICCV 2025
Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval
Davide Caffagni, Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
CVPR 2025
Hyperbolic Safety-Aware Vision-Language Models
Tobia Poppi, Tejaswi Kasarla, Pascal Mettes, Lorenzo Baraldi, Rita Cucchiara
CVPR 2025 Highlight
Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering
Federico Cocchi, Nicholas Moratelli, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
CVPR 2025
Causal Graphical Models for Vision-Language Compositional Understanding
Fiorenzo Parascandolo, Nicholas Moratelli, Enver Sangineto, Lorenzo Baraldi, Rita Cucchiara
ICLR 2025
Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments
Luca Barsellotti, Roberto Bigazzi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
NeurIPS 2024, Datasets and Benchmarks track
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models
Samuele Poppi, Tobia Poppi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
ECCV 2024
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
Lorenzo Baraldi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Alessandro Nicolosi, Rita Cucchiara
ECCV 2024
BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues
Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
ECCV 2024
Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation
Luca Barsellotti, Roberto Amoroso, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
CVPR 2024
The Revolution of Multimodal Large Language Models: A Survey
Davide Caffagni, Federico Cocchi, Luca Barsellotti, Nicholas Moratelli, Sara Sarto, Lorenzo Baraldi, Lorenzo Baraldi, Marcella Cornia, Rita Cucchiara
Findings of ACL 2024
Courses
Computer Vision and Cognitive Systems (2025/2026)
Laurea Magistrale in Artificial Intelligence Engineering
Lorenzo Baraldi, Vittorio Cuculo
Architettura dei Calcolatori (2025/2026)
Course material · Upcoming exams
Architettura dei Calcolatori
Rita Cucchiara, Lorenzo Baraldi
Scalable AI (2024/2025)
Course material · Upcoming exams
Laurea Magistrale in Ingegneria Informatica
Lorenzo Baraldi, Giuseppe Fiameni
Computer Vision and Cognitive Systems (2024/2025)
Course material · Upcoming exams
Laurea Magistrale in Ingegneria Informatica
Rita Cucchiara, Lorenzo Baraldi
AI for Automotive (2024/2025)
Electronic Engineering for Intelligent Vehicles
Rita Cucchiara, Lorenzo Baraldi
Architettura dei Calcolatori (2024/2025)
Course material · Upcoming exams
Ingegneria Informatica
Rita Cucchiara, Lorenzo Baraldi