Why use stereo vision for mobile robots?
We describe a multiresolution stereoscopic vision system for a Pioneer mobile robot, built on the low-cost STH-V1 Integrated Stereo Head and SRI's stereo algorithm running on a conventional PC architecture.
The objective of this work is to implement a stereo vision system and to integrate its range information into the Local Perceptual Space of the robot (in Saphira).
 



 
Reconstruction of the world seen through stereo cameras can be divided into two steps:
The design and the implementation of a stereo vision system must take into account two factors:
The following tables show the results of disparity and range computation in the normal case, and when lens distortion has been removed and a multiresolution stereo technique has been applied.
| | Calibration OFF | Calibration ON |
| Multiresolution OFF | ![]() | ![]() |
| Multiresolution ON | ![]() | ![]() |
| | Calibration OFF | Calibration ON |
| Multiresolution OFF | | |
| Multiresolution ON | | |
Calibration is needed to remove the distortion caused by the lenses, which bends the disparity line of the background (this line should be straight, since all the points on the wall are at the same distance from the cameras). The multiresolution technique, in turn, is applied to extend the range of depths that the cameras can measure.
Multiresolution is often used to improve the performance of the matching
algorithm by first looking for correspondences in a low-resolution
pair of images and then refining them with a local search in a high-resolution
pair.
This method, however, requires a special matching algorithm.
Here we propose a slightly different multiresolution technique that
applies the same matching algorithm to three different pairs of images,
with resolutions 320x60, 160x60 and 80x60, and then combines the disparity
results.
Disparities from high resolution images are used for farther objects,
while disparities from low resolution ones are used for closer objects.
In the following picture, the disparity maps from the high, medium and low resolutions are displayed in the left column, and the corresponding (red, green and blue) disparity lines in the right one. The black plot represents the combined line.
Observe the errors in the high-resolution disparity map when looking at a close object; the lower-resolution images recover them.
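The source does not spell out the exact combination rule, so the following is only a minimal sketch of one way to merge the three disparity lines. It assumes that the matcher searches the same number of disparities at every level (the hypothetical `search_range` parameter), that unmatched pixels are marked with `None`, and that the medium and low lines are 2 and 4 times narrower than the high one. The function name `combine_disparity_lines` is illustrative, not part of SRI's software.

```python
from typing import List, Optional

Disparity = Optional[float]  # None marks a pixel with no reliable match


def combine_disparity_lines(
    high: List[Disparity],
    mid: List[Disparity],
    low: List[Disparity],
    search_range: float = 16.0,  # hypothetical per-level search range, in pixels
) -> List[Disparity]:
    """Merge three 1D disparity lines into one, in high-resolution pixel units.

    Assumes mid and low are 2x and 4x narrower than high, and that the matcher
    can only measure disparities below `search_range` at every level.
    """
    combined: List[Disparity] = []
    for x, d_high in enumerate(high):
        d_mid = mid[x // 2]
        d_low = low[x // 4]
        # Express the coarser disparities in high-resolution pixels.
        full_mid = None if d_mid is None else 2.0 * d_mid
        full_low = None if d_low is None else 4.0 * d_low

        if full_low is not None and full_low >= 2.0 * search_range:
            # Very close object: out of reach of the medium level as well.
            combined.append(full_low)
        elif full_mid is not None and full_mid >= search_range:
            # Close object: out of reach of the high-resolution level.
            combined.append(full_mid)
        elif d_high is not None:
            # Far object: the high-resolution disparity is the most precise.
            combined.append(float(d_high))
        else:
            # No fine match: fall back to whichever coarser value exists.
            combined.append(full_mid if full_mid is not None else full_low)
    return combined
```

With this rule, a coarse level overrides a finer one only when its disparity says the object is too close for the finer level to match correctly, which is the situation where the high-resolution map shows errors.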
Let us first show a motivating example. The following pictures show a stereo camera placed in front of a straight wall, and the disparities computed by the matching algorithm (i.e. before triangulation). The last picture plots the disparities along one row of the disparity map.
| ![]() | ![]() | ![]() | ![]() |
Observe that the radial distortion (which is quite high for these lenses)
affects both the correct determination of point correspondences and
the computation of disparities between points.
In fact, the disparity line, which should be straight since we are looking at a straight
wall, is curved because of the radial distortion of the lenses.
Furthermore, this error grows when range values are extracted from disparities.
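A small numeric sketch can illustrate why a disparity error is amplified in range. It uses the standard relation for rectified, parallel cameras, Z = f·b/d, and the focal length and baseline below are hypothetical values chosen for round numbers, not the STH-V1 parameters.

```python
def range_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Range Z = f * b / d for rectified, parallel cameras."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def range_error(disparity_px: float, disparity_error_px: float,
                focal_px: float, baseline_m: float) -> float:
    """Range error caused by underestimating the disparity by a fixed amount."""
    true_range = range_from_disparity(disparity_px, focal_px, baseline_m)
    measured = range_from_disparity(disparity_px - disparity_error_px,
                                    focal_px, baseline_m)
    return measured - true_range


# Hypothetical camera: 200-pixel focal length, 10 cm baseline (not STH-V1 values).
FOCAL_PX = 200.0
BASELINE_M = 0.1

for d in (10.0, 5.0, 2.0):
    z = range_from_disparity(d, FOCAL_PX, BASELINE_M)
    err = range_error(d, 0.5, FOCAL_PX, BASELINE_M)
    print(f"disparity {d:4.1f} px -> range {z:5.2f} m, error {err:5.2f} m")
```

Under these assumed parameters, the same half-pixel disparity error costs about 0.1 m at a range of 2 m but more than 3 m at a range of 10 m, because the range error grows roughly with the square of the range.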
A calibration method is needed in order to find the internal parameters of the camera and then undistort the images before applying the stereo algorithm.
| ![]() | ![]() | ![]() | ![]() |
An easy semi-automatic calibration procedure for stereo cameras is presented here.
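As an illustration of the undistortion step only, the sketch below assumes a single-coefficient radial model in normalized image coordinates, x_d = x_u (1 + k1 r²). This page does not state which distortion model the calibration uses, so the model, the coefficient `k1`, and the function names are assumptions.

```python
from typing import Tuple

Point = Tuple[float, float]


def distort(point: Point, k1: float) -> Point:
    """Apply the assumed radial model to an undistorted, normalized point."""
    x, y = point
    scale = 1.0 + k1 * (x * x + y * y)
    return (x * scale, y * scale)


def undistort(point: Point, k1: float, iterations: int = 20) -> Point:
    """Invert the radial model by fixed-point iteration.

    Starts from the distorted point and repeatedly divides by the
    distortion factor evaluated at the current estimate.
    """
    xd, yd = point
    xu, yu = xd, yd
    for _ in range(iterations):
        scale = 1.0 + k1 * (xu * xu + yu * yu)
        xu, yu = xd / scale, yd / scale
    return (xu, yu)


# Round trip with a hypothetical barrel-distortion coefficient.
original = (0.5, 0.3)
recovered = undistort(distort(original, k1=-0.2), k1=-0.2)
print(original, recovered)
```

Warping a whole image would apply the forward model to every output pixel to find where to sample the distorted input; the point-wise version here is only meant to show the model and its inversion.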
The following picture shows the architecture of the implemented system.

The images coming from the cameras are first warped to remove lens distortion. They are then split into three pairs with different resolutions, and only a horizontal portion of the images is considered, since we are mapping range data in a 2D environment. The three pairs of images are processed by the stereo algorithm, which returns three disparity maps. These are sampled and integrated to obtain a single 1D disparity line. Finally, triangulation is applied to retrieve range information.
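The splitting step described above can be sketched as follows. This is a minimal illustration that assumes a 320-pixel-wide input image, a vertically centered strip of 60 rows, and horizontal-only shrinking by averaging neighbouring pixels; the source gives only the three resulting resolutions (320x60, 160x60, 80x60), so the strip position and the averaging filter are assumptions, as are the function names.

```python
from typing import List, Sequence

Image = List[List[float]]  # rows of grey-level pixels


def horizontal_strip(image: Image, strip_height: int = 60) -> Image:
    """Keep a vertically centered band of `strip_height` rows."""
    top = (len(image) - strip_height) // 2
    return image[top:top + strip_height]


def shrink_width(rows: Image, factor: int) -> Image:
    """Reduce the width by `factor`, averaging each group of adjacent pixels."""
    return [
        [sum(row[x:x + factor]) / factor
         for x in range(0, len(row) - factor + 1, factor)]
        for row in rows
    ]


def multiresolution_strips(image: Image, strip_height: int = 60,
                           factors: Sequence[int] = (1, 2, 4)) -> List[Image]:
    """Return the strip at full, half and quarter width (same height)."""
    strip = horizontal_strip(image, strip_height)
    return [shrink_width(strip, f) for f in factors]


# Synthetic 320x240 image whose pixel value equals its column index.
image = [[float(x) for x in range(320)] for _ in range(240)]
for level in multiresolution_strips(image):
    print(len(level[0]), "x", len(level))
```

The same function would be applied to the left and right images, giving the three pairs that are passed to the stereo algorithm.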
The following table reports timing measurements obtained on a Pentium II 233 MHz processor. At 80 ms per cycle, the system runs at about 12.5 Hz, above 10 Hz even without heavy code optimization.
| Function | Time |
| Acquisition and warping | 19 ms |
| Multiresolution stereo | 53 ms |
| Triangulation and Display | 8 ms |
| Overall time | 80 ms |
    

| Project | Organization | Robot model | Vision hardware | Performance | Calibration |
| Spinoza (1997) | Univ. of British Columbia | RWI B12 | 3 b/w cameras, 2 DSPs, 2 transputers | 2 Hz (128x120x20) | Extension of Tsai method |
| RobotVis (1993) | INRIA | | 3 cameras, 3 DSPs | 1 - 5 Hz | No radial distortion correction (1997). |
Thanks to SRI Int. AI Center
for welcoming me, and in particular to
Kurt Konolige
for his valuable support.
I would never have learned so much about stereo vision and camera calibration
without attending a course at Stanford given by
Carlo Tomasi.
 