Traditional Computer Vision in the Deep Learning Era
Workshop at the 6th Italian Conference for Robotics and Intelligent Machines (I-RIM)
Gazometro Ostiense – Rome, Italy
October 26, 2024 (17:30 – 19:00)
Hall 21
Workshop Organizers
Gabriele Costante (gabriele.costante@unipg.it) 1
Matteo Matteucci (matteo.matteucci@polimi.it) 2
1 Department of Engineering, Università degli Studi di Perugia
2 Department of Electronics, Information and Bioengineering – Politecnico di Milano
Abstract
The ability to extract information from images is a cornerstone in the development of intelligent robotic solutions. Over the past 50 years, we have witnessed a shift from traditional geometric computer vision to deep learning-based approaches. Recently, this transition has accelerated rapidly towards AI-based methods, raising the question of whether there is still room in industry and research for non-data-driven approaches.
This workshop aims to bring together experts from both traditional computer vision and deep learning fields to discuss this question. Specifically, deep learning and traditional computer vision experts will be invited to present cutting-edge AI-based solutions and applications, as well as to highlight how traditional computer vision can still make a significant impact in specific domains. The goal is to foster a discussion and determine the role, if any, that traditional computer vision still plays in the era of deep learning.
Keywords: Deep Learning for perception – Traditional Computer Vision – Applications of Computer Vision – Perception for Robotics
Schedule and list of Speakers (Title and Abstract)
17:30 - 17:40: Introduction by Matteo Matteucci and Gabriele Costante
Workshop Introduction
17:40 - 17:55: Federica Arrigoni (Department of Electronics, Information and Bioengineering, Politecnico di Milano, Italy)
Title
Theoretical Results on Viewing Graphs
Abstract
Structure from Motion (SfM) is a fundamental task in Computer Vision that aims at recovering both the cameras and the 3D scene from multiple images. The problem can be conveniently represented as a “viewing graph”: each node corresponds to a camera/image, and an edge is present between two nodes if the fundamental (or essential) matrix is available. While several research efforts on SfM have focused on devising accurate and efficient algorithms, much less attention has been devoted to investigating theoretical aspects. In particular, a relevant question is establishing whether a viewing graph is “solvable”, i.e., whether it uniquely determines a configuration of cameras. This talk will give an overview of existing formulations and different notions of viewing graph solvability.
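To make the viewing-graph representation concrete, the minimal Python sketch below (an illustrative addition, not material from the talk; it assumes the networkx and numpy libraries) builds a viewing graph from pairwise fundamental matrices and checks connectivity, which is a necessary but far from sufficient condition for solvability; the refined solvability notions discussed in the talk go well beyond this check.

```python
# Minimal sketch: a viewing graph as an undirected graph whose edges carry
# (placeholder) fundamental matrices, plus one necessary condition for solvability.
import networkx as nx
import numpy as np

def build_viewing_graph(pairwise_fundamentals):
    """pairwise_fundamentals: dict mapping (i, j) camera-index pairs to 3x3 F matrices."""
    G = nx.Graph()
    for (i, j), F in pairwise_fundamentals.items():
        G.add_edge(i, j, F=F)
    return G

def passes_necessary_check(G, n_cameras):
    """Connectivity over all cameras is necessary (but not sufficient) for solvability:
    cameras in different connected components share no constraints."""
    return G.number_of_nodes() == n_cameras and nx.is_connected(G)

# Toy example: three cameras, two pairwise constraints (random placeholder matrices).
F = {(0, 1): np.random.rand(3, 3), (1, 2): np.random.rand(3, 3)}
G = build_viewing_graph(F)
print(passes_necessary_check(G, n_cameras=3))  # True: connected, yet not necessarily solvable
```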
17:55 - 18:10: Massimiliano Mancini (Department of Information Engineering and Computer Science, Università degli Studi di Trento, Italy)
Title
Vocabulary-free Image Classification
Abstract
Image classification is one of the most traditional computer vision problems, aiming to assign a semantic label to images. Standard approaches for this task require a labeled training set and a list of target class names available beforehand. This talk will show how recent vision-language models can be re-purposed to sidestep both requirements. Specifically, we will discuss how contrastive vision-language models, coupled with a large textual dataset, allow us to perform this task without training and without needing a list of classes beforehand. We will conclude the talk by linking this result with the emerging vision-by-language paradigm, generalizing these large models beyond the scope they were designed for.
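As an illustration of the zero-shot matching step behind this idea (not the specific method presented in the talk), the sketch below assumes the Hugging Face transformers library and a public CLIP checkpoint, and scores an image against candidate class names that, in a vocabulary-free setting, would be mined from a large text corpus rather than fixed in advance.

```python
# Illustrative sketch (assumes transformers, Pillow, and the openai/clip-vit-base-patch32
# checkpoint; the method presented in the talk may differ in how candidates are obtained).
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def classify(image_path, candidate_names):
    """Score an image against candidate names; no training and no predefined label set."""
    image = Image.open(image_path)
    inputs = processor(text=candidate_names, images=image, return_tensors="pt", padding=True)
    probs = model(**inputs).logits_per_image.softmax(dim=-1)[0]
    return candidate_names[probs.argmax().item()]

# In a vocabulary-free pipeline, the candidates would be extracted from retrieved captions,
# not written by hand as in this toy call.
print(classify("photo.jpg", ["golden retriever", "tabby cat", "mountain bike"]))
```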
18:10 - 18:25: Adriano Mancini (Department of Information Engineering, Università Politecnica delle Marche, Italy)
Title
Innovation in Computer Vision: from Feature-Based to Deep Learning. The VRAI Research Group’s Journey through Past, Present and Future.
18:25 - 18:40: Vito Renò (Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing – National Research Council of Italy, Italy)
Title
Indoor AGV navigation and localization for logistics using both traditional computer vision and modern approaches
Abstract
In recent years, with the rise of Industry 4.0 and thanks to the widespread adoption of intelligent systems that enable real-time decision-making, new paradigms and concepts for autonomous robot navigation and localization have been established, especially in the context of logistics and transport. With reference to the manufacturing industry, a new generation of autonomous robots (AGVs – Automated Guided Vehicles) can be effectively used to move payloads (heavy or otherwise) within a warehouse and to enable autonomous and intelligent navigation through smart interaction with unstructured environments, people and/or other devices. This talk discusses different AGV modules designed and developed using either traditional computer vision and image processing techniques or deep learning models. The case study is an example of integrating traditional approaches with AI-based solutions. The aim of the talk is to open a discussion on when to prefer one approach over the other, based on the specific domain requirements.
18:40 - 19:00: Round Table and Discussion