The Gaze Machine

Description and short history

The Gaze Machine (patent number rm2007a000526: Gaze machine) is a head-mounted device that estimates, at each instant, where the wearer is looking in the 3D environment. It is composed of four cameras. Two high-speed infrared cameras track the pupils using specialized software. The other two cameras point towards the scene that the wearer sees and are used to reconstruct it. As a result, specialized software modules recover the positions the user has looked at, both in the images of the scene and in its 3D reconstruction.
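One common way to turn two tracked eye directions into a single 3D point of regard is to take the midpoint of the shortest segment joining the two gaze lines. The sketch below illustrates that idea only; it is not taken from the Gaze Machine software. The function name `point_of_regard`, the eye positions and the units (metres) are all assumptions made for the example.

```python
def _dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def point_of_regard(origin_l, dir_l, origin_r, dir_r, eps=1e-12):
    """Midpoint of the shortest segment joining two gaze lines.

    Each line is origin + t * direction. Returns None when the lines
    are (almost) parallel, since no unique closest point exists then.
    Hypothetical helper: an illustration, not the device's algorithm.
    """
    w = tuple(p - q for p, q in zip(origin_l, origin_r))
    a = _dot(dir_l, dir_l)
    b = _dot(dir_l, dir_r)
    c = _dot(dir_r, dir_r)
    d = _dot(dir_l, w)
    e = _dot(dir_r, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None
    # Line parameters of the mutually closest points.
    t_l = (b * e - c * d) / denom
    t_r = (a * e - b * d) / denom
    p_l = tuple(o + t_l * v for o, v in zip(origin_l, dir_l))
    p_r = tuple(o + t_r * v for o, v in zip(origin_r, dir_r))
    return tuple((x + y) / 2.0 for x, y in zip(p_l, p_r))


# Assumed setup: eyes 6 cm apart, both looking at a target 1 m ahead.
left_eye, right_eye = (-0.03, 0.0, 0.0), (0.03, 0.0, 0.0)
target = (0.0, 0.0, 1.0)
gaze_l = tuple(t - o for t, o in zip(target, left_eye))
gaze_r = tuple(t - o for t, o in zip(target, right_eye))
estimate = point_of_regard(left_eye, gaze_l, right_eye, gaze_r)
```

With noisy directions the two lines no longer intersect, which is why the midpoint of the closest approach is used rather than an exact intersection.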

The initial idea came up around 1999 when we, at ALCOR Lab, began to study attention in order to solve exploration problems with the robot. The first design of the Gaze Machine was created by Fiora Pirri, Anna Belardinelli and Andrea Carbone. A new design for calibration and localization was realized together with Matia Pizzoli, and the current design (2014) has been improved by Valsamis Ntouskos.

The Gaze Machine can be used in many applications. Examples include research and scientific studies (e.g. computational attention, psychology, psychophysiology), medical diagnosis of various conditions and syndromes, as well as augmented reality and robotics.


Research directions


Ergonomics and hardware

The goal is to make the Gaze Machine an easy-to-use wearable device, improving human factors and keeping operator fatigue and discomfort under control. In the early design (see left picture and movie below), all the devices required to track head motion, eye movements and the scene were located on a helmet.



Saliency and eye movements

We have been working on the eye model for the Gaze Machine design, and have then used it to study eye movements and well-known prediction models based on either saliency maps or eye directions.



3D projection and 3D reconstruction

The aim is to improve the 3D reconstruction of the environment from the stereo camera, in terms of both accuracy and processing time. To obtain the current results we have been working on the eye model, including eye movements, and on all the reconstruction problems related to the calibration of the four cameras (two for the eyes and two for the scene) and to the bundle adjustment used to recover the subject's localization and a good reconstruction.
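Bundle adjustment refines camera and scene parameters by minimizing the reprojection error, that is, the distance between where a 3D point is predicted to appear in an image and where it was actually observed. The sketch below computes that error for a simple pinhole camera. It is a minimal illustration under stated assumptions: points are already expressed in the camera frame, lens distortion is ignored, and the names `project` and `reprojection_error` as well as the intrinsic values are hypothetical.

```python
def project(point, focal, cx, cy):
    """Pinhole projection of a 3D point given in the camera frame.

    focal is the focal length in pixels, (cx, cy) the principal point.
    Assumed model: no lens distortion, square pixels.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (focal * x / z + cx, focal * y / z + cy)


def reprojection_error(points_3d, observations, focal, cx, cy):
    """Sum of squared pixel distances between predictions and observations.

    This is the cost a bundle adjustment would minimize; a full solver
    would also vary the camera poses and the 3D points, omitted here.
    """
    total = 0.0
    for point, (u, v) in zip(points_3d, observations):
        pu, pv = project(point, focal, cx, cy)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total


# Assumed intrinsics for a 640x480 scene camera.
FOCAL, CX, CY = 500.0, 320.0, 240.0
points = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
observed = [(320.0, 240.0), (572.0, 240.0)]  # second one is 2 px off
error = reprojection_error(points, observed, FOCAL, CX, CY)
```

In practice the cost is minimized with a nonlinear least-squares solver over all four cameras jointly; the function above only evaluates it for one camera.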



Search tasks

Saliency prediction helps to identify which regions deserve closer evaluation, so as to avoid reconstructing useless regions.
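As a rough illustration of this idea, a saliency map can be thresholded so that only the cells above a chosen score are passed on to reconstruction. The sketch below assumes a saliency map stored as a list of rows with scores in [0, 1]; the function name `salient_cells` and the threshold value are hypothetical and not part of the Gaze Machine software.

```python
def salient_cells(saliency, threshold):
    """Return (row, col) indices of cells whose saliency meets the threshold.

    saliency is a list of rows of floats; cells below the threshold are
    skipped, so a later reconstruction step can ignore them.
    """
    return [
        (r, c)
        for r, row in enumerate(saliency)
        for c, score in enumerate(row)
        if score >= threshold
    ]


# Toy 2x2 saliency map (assumed values, for illustration only).
saliency_map = [
    [0.1, 0.8],
    [0.6, 0.2],
]
to_reconstruct = salient_cells(saliency_map, threshold=0.5)
```

Here only two of the four cells would be reconstructed, which halves the work on this toy map; the right threshold depends on the task and on how the saliency scores are normalized.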



Gaze prediction



Rapid vision processing

  • Distinct processing channels feed an array of specialized GPU- and FPGA-based processing units, which perform global dense operations based on energy minimization principles and compute primitive perceptual features.
  • Iconic and visual representation architecture: proto-objects are generated and stored in iconic memory arrays. Depth, motion and shape information is combined for grouping and continuous update, discarding incoherent or irrelevant information.
  • Attention for proto-object stabilization (based on Rensink's coherence theory) provides the means of selecting and consolidating proto-objects based on the procedural and episodic first-order response. Taking into account time, depth and saliency information, focused attention drives the eye motion required to augment the information on selectable proto-objects. The selected proto-objects supply the higher-level vision tasks with detailed information.
  • Bimodal attention: the visual information channels are integrated with auditory attention and provide suitable motion suggestions in support of the task being performed. In this way the scene acts as a huge memory space, accessed by the system only when needed, which avoids cluttering its own memory resources.
© 2018 Alcor