Computer Graphics, Computer Vision and Pattern Recognition

Research lines:
- Saliency Prediction, Visual Attention, Action Recognition
- Dense Map Merging and Fusion, Meshing, 3D Surface Reconstruction
- Scene Representation, Interpretation and Understanding
- Terrain Traversability in Rescue Environments
- Recognition of Peri-Urban Areas in X Band SAR Images
- Patterns for Zooming Camera Calibration
- Learning of Visual Object Categories
- Control for Polyarticulated Self-Powered Hand Prostheses
- Adaptive, Flexible Cognitive Control under Task Switching for Rescue Robots
- 3D Motion Planning for Articulated Unmanned Tracked Vehicles
- Visual Media Analysis, Indexing, Classification and Retrieval
- Management of Digital Resources
- Augmented Reality and Computer Animated Virtualization

Members: 
Barbara Caputo, Luca Iocchi, Fiora Pirri (leader), Marco Schaerf.

PhD Students:
Bruno Cafaro, Federico Ferri, Matteo Menna, Valsamis Ntouskos, Manuel Ruiz, Federico Nardi, Novi Patricia, Ilja Kuzborskij.

We investigate the problem of Human Action Recognition within Motion Capture sequences, using methods based on Gaussian Process Latent Variable Models and Alignment Kernels. We propose a new discriminative latent variable model with back-constraints induced by the similarity of the original sequences. We compare the proposed method with methods based on Dynamic Time Warping and with V-GPDS models, which are able to model high-dimensional dynamical systems.
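As a point of reference for the Dynamic Time Warping baseline mentioned above, the following is a minimal sketch of the classic DTW distance between two sequences of frames. It is illustrative only and is not the group's implementation; the function name `dtw_distance`, the Euclidean frame cost, and the toy sequences are assumptions made for this example.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic O(n*m) dynamic time warping distance.

    Each sequence is a list of frames; each frame is a tuple of floats
    (for example, joint coordinates from a motion capture recording).
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning seq_a[:i] with seq_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two frames (an assumed local cost)
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy data (hypothetical): the same motion performed at two different speeds
slow = [(0.0, 0.0), (0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
fast = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(dtw_distance(slow, fast))
```

Because DTW allows one frame to be matched with several frames of the other sequence, two executions of the same motion at different speeds can be aligned at low cost, which is why it is a common baseline for sequence comparison.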

In the coherence theory of attention, introduced by Rensink, O’Regan, and Clark (2000), a coherence field is defined by a hierarchy of structures supporting the activities taking place across the different stages of visual attention. At the interface between the low-level and mid-level attention processing stages are the proto-objects; these are generated in parallel and collect features of the scene at a specific location and time. These structures fade away if the region is no longer attended. This research work aims to build methods to computationally model these structures, on the basis of data collected in dynamic 3D environments via the Gaze Machine, a gaze measurement framework.

3D terrain understanding and structure estimation are crucial for robots navigating rescue scenarios. Unfortunately, large-scale 3D point clouds provide no information about what is ground and what is top, what can be surmounted and what cannot, what can be crossed and what is too deep to be traversed. In this context, this research work concentrates on providing methods for point cloud structuring that can lead to a definition of traversability cost maps.

The research activities concerning the analysis of Synthetic Aperture Radar (SAR) images in X-band aim to classify different zones in peri-urban forests by integrating information from different sources. An integration of image segmentation and machine learning methods is studied to classify these zones (e.g., tree canopies, lawns, water ponds, roads), exploiting the relation between the gray-level signal properties of X-band images and the smoothness and roughness of the ground.

Camera calibration is a necessary step in developing applications that need to establish a relationship between image pixels and real-world points. Usually, for non-zooming cameras, the calibration is carried out by using a grid pattern of known dimensions (e.g., a chessboard). However, for cameras with zoom functions, a grid pattern alone is not sufficient, because the calibration has to be effective at multiple zoom levels and some features (e.g., corners) may not be detectable. This research activity focuses on developing calibration methods based on novel calibration patterns, specifically designed for zooming cameras.
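The relationship between image pixels and real-world points that calibration recovers can be illustrated with the standard pinhole camera model. The sketch below assumes an ideal camera without lens distortion; the function name `project_point`, the focal lengths, and the principal point are hypothetical values chosen for illustration, not parameters of any camera used in this research.

```python
def project_point(point_cam, focal_px, principal_point):
    """Project a 3D point given in camera coordinates (X, Y, Z) to pixel
    coordinates (u, v) with an ideal pinhole model (no lens distortion)."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    cx, cy = principal_point
    u = focal_px * x / z + cx
    v = focal_px * y / z + cy
    return u, v


# Hypothetical pattern corner, 2 m in front of the camera
corner = (0.1, -0.05, 2.0)
center = (320.0, 240.0)  # assumed principal point, in pixels

# Zooming changes the focal length, so the same 3D point lands on
# different pixels at each zoom level
for focal in (500.0, 1000.0, 2000.0):
    print(focal, project_point(corner, focal, center))
```

Since the projection depends on the focal length, intrinsic parameters estimated at one zoom level do not carry over to another, which is why the calibration has to hold across the whole zoom range.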

Learning a visual object category from few samples is a compelling and challenging problem. In several real-world applications, collecting large amounts of annotated data is costly and not always possible. However, a small training set cannot cover the high intraclass variability typical of visual objects. In this condition, machine learning methods provide very few guarantees. This research activity concentrates on discriminative model adaptation algorithms able to learn a target object effectively from few examples, relying on other previously learned source categories.

The main means of control for polyarticulated self-powered hand prostheses is surface electromyography (sEMG). In the clinical setting, data collected from two electrodes are used to guide the hand movements, selecting among a finite number of postures. Machine learning has been applied in the past to the sEMG signal (not in the clinical setting) with interesting results, which provide more insight into how these data could be used to improve prosthetic functionality. However, developing finer control requires a longer training period, and it would be desirable to shorten the time a patient needs to learn how to use the prosthesis. To this end, our research work focuses on methods that reuse past experience, in the form of models synthesized from previous subjects, to boost the adaptivity of the prosthesis.

Modeling cognitive control is a major issue in robot control: it is about deciding when a task cannot succeed and a new task needs to be initiated. These decisions are induced by incoming stimuli signaling events that take place while the robot is executing its duties. The research work on modeling adaptive robot behaviors under salient stimuli exploits the human-inspired paradigm of shifting and inhibition that underlies task switching.

Tracked vehicles are currently used in search and rescue, military, agricultural and planetary exploration applications where terrain conditions are difficult and unpredictable. They are better suited for such tasks than wheeled vehicles due to the larger contact area of tracks with the ground, which provides better traction on harsh terrains. These environments are often inaccessible or considered too dangerous for humans to operate in, thus requiring the tracked vehicle to be endowed with autonomous navigation, safe locomotion and human-robot interaction capabilities to assist humans in complex tasks such as rescue, scouting or transportation. To cope with this challenging task, our research activities aim to develop control models that allow articulated tracked vehicles to autonomously follow 3D paths within cluttered environments, adapting their morphology to the complexity of the terrain.

The research work concerning the management of digital resources explores the applicability of the Sapienza Digital Library (SDL) metadata framework to support preservation, management and dissemination of SDL resources. The applicability study has proved useful for improving SDL interoperability in managing differences in information granularity, and for filling information gaps or avoiding the waste of information.

Within the context of our research activities, Augmented Reality has become a compelling technology, mainly for the interactive 3D visualization of archaeological sites on hand-held devices and for building complex planning scenarios for robots, eliminating the need to model the dynamics of both the robot and the real environment, as would be required by full simulation environments. The latter application constitutes an important research test-bed for robots, meeting the need to test and experiment with complex robot behaviors in such a dynamic and rich perceptual domain.

Projects: 
TRADR - Long-Term Human-Robot Teaming for Robot Assisted Disaster Response - 2014, 2018 FP7 ICT 609763

NIFTi - Natural human-robot cooperation in dynamic environments - 2010 - 2014 EU FP7-IP

SecondHands - A Robot Assistant for Industrial Maintenance - H2020-ICT-2014-1

© Università degli Studi di Roma "La Sapienza" - Piazzale Aldo Moro 5, 00185 Roma