Università di Roma "La Sapienza", Italy.

The pinhole camera model is often used for 3D reconstruction.

The relation between the world coordinates of a point P(X,Y,Z) and the coordinates on the image plane (x,y) in a pinhole camera is

x = f * X / Z

y = f * Y / Z

where f is the focal length of the lens.
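
The projection above can be sketched in a few lines of Python. The function name `project_pinhole`, the check that the point lies in front of the camera, and the sample values (in metres) are assumptions made for illustration; they are not part of the original text.

```python
def project_pinhole(X, Y, Z, f):
    """Project a 3D point (X, Y, Z), given in camera coordinates,
    onto the image plane of a pinhole camera with focal length f."""
    if Z <= 0:
        # Assumption: only points in front of the camera are projected.
        raise ValueError("point must lie in front of the camera (Z > 0)")
    x = f * X / Z
    y = f * Y / Z
    return x, y


# Hypothetical example: a point 2 m away, seen through an 8 mm lens.
x, y = project_pinhole(0.5, 0.25, 2.0, f=0.008)
print(x, y)
```

Both image coordinates come out in the same unit as f, because X / Z and Y / Z are dimensionless ratios.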
A digitized image is usually stored in a frame buffer, which can be seen as
a matrix of pixels with W columns and H rows.
Let (i,j) be the discrete frame coordinates of the image with origin in the
upper-left corner, (Ox,Oy) be the principal point (the intersection
between the optical axis and the image plane) in frame coordinates,
and (x,y) be the image coordinates.

Image coordinates relate to frame coordinates in the following way:

x = ( i - Ox ) * Sx

y = ( j - Oy ) * Sy

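where Sx,Sy are the horizontal and vertical distances between two adjacent pixels in the frame buffer.

In the presence of radial lens distortion, the undistorted image coordinates are obtained from the distorted ones as

x = xd * ( 1 + k1 * rd^2 )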

y = yd * ( 1 + k1 * rd^2 )

where (xd,yd) are the image coordinates of the distorted image, rd = SQRT(xd^2 + yd^2), and k1 is a constant depending on the distortion of the lens.

The relation between image and frame coordinates in the presence of lens distortion is:

x = ( id - Ox ) * Sx * ( 1 + k1 * rd^2 )

y = ( jd - Oy ) * Sy * ( 1 + k1 * rd^2 )
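
As a minimal sketch, the conversion from distorted frame coordinates to undistorted image coordinates could be written as follows. The function name `frame_to_image` and the numeric values (a 640 x 480 frame, 10 micrometre pixel spacing, k1 = 1000 per square metre) are hypothetical and chosen only to make the example concrete.

```python
def frame_to_image(i_d, j_d, Ox, Oy, Sx, Sy, k1=0.0):
    """Map distorted frame coordinates (i_d, j_d) to undistorted
    image coordinates (x, y) using a single radial coefficient k1."""
    # Distorted image coordinates, relative to the point (Ox, Oy).
    xd = (i_d - Ox) * Sx
    yd = (j_d - Oy) * Sy
    # Squared radial distance rd^2 = xd^2 + yd^2.
    rd2 = xd * xd + yd * yd
    scale = 1.0 + k1 * rd2
    return xd * scale, yd * scale


# Hypothetical values: pixel (420, 340) with (Ox, Oy) = (320, 240).
x, y = frame_to_image(420, 340, Ox=320, Oy=240, Sx=1e-5, Sy=1e-5, k1=1000.0)
print(x, y)
```

With k1 = 0 the function reduces to the distortion-free relation x = ( i - Ox ) * Sx, y = ( j - Oy ) * Sy.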

If the distances (Px,Py) between two adjacent cells in the camera digitizing device are known, then the distances between adjacent pixels in the image are

Sx = ax * Px

Sy = Py

where ax is a scale factor due to the mismatch between the horizontal scanning and sampling frequencies. By contrast, there is no such mismatch in vertical sampling.

We choose the 3D world reference system to be the left camera reference system.
The right camera is translated and rotated with respect to the left one,
so six parameters (three for the translation and three for the rotation) describe this transformation.

The simplest case arises when the optical axes of the two cameras are parallel and
the translation of the right camera is only along the X axis.

Let us consider the optical setting in the figure, which is also called the
*standard model*.

- L and R are two pinhole cameras with parallel optical axes. Let f be the focal length of both cameras.
- The baseline (that is, the line connecting the two lens centers) is perpendicular to the optical axes. Let b be the distance between the two lens centers.
- XZ is the plane where the optical axes lie, the XY plane is parallel to the image plane of both cameras, the X axis lies along the baseline, and the origin O of the (X,Y,Z) world reference system is the lens center of the left camera.

In this setting the equations of stereo triangulation are:

Z = ( b * f ) / ( x1 - x2 )

X = x1 * Z / f

Y = y1 * Z / f
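
These equations follow from the pinhole model: the left camera sees x1 = f * X / Z and the right camera, shifted by b along X, sees x2 = f * ( X - b ) / Z, so x1 - x2 = f * b / Z. A minimal Python sketch is given below; the function name `triangulate_standard`, the rejection of non-positive disparities, and the sample values (in metres) are assumptions made for illustration.

```python
def triangulate_standard(x1, y1, x2, b, f):
    """Reconstruct (X, Y, Z) in the left camera reference system from
    image coordinates (x1, y1) in the left image and x2 in the right image."""
    disparity = x1 - x2
    if disparity <= 0:
        # Assumption: zero disparity (point at infinity) and negative
        # disparity (inconsistent match) are both rejected.
        raise ValueError("disparity must be positive")
    Z = (b * f) / disparity
    X = x1 * Z / f
    Y = y1 * Z / f
    return X, Y, Z


# Hypothetical setup: 10 cm baseline, 8 mm focal length.
X, Y, Z = triangulate_standard(x1=0.002, y1=0.001, x2=0.0016, b=0.1, f=0.008)
print(X, Y, Z)
```

Because Z is inversely proportional to the disparity x1 - x2, a fixed matching error produces a depth error that grows with distance.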

**Rotation around Y axis (theta)**

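If the right camera is rotated by an angle *theta* around the Y axis, the optical axes are no longer parallel; they intersect at a point (0,0,Z0), with Z0 = b / tan(*theta*).

Under the small-angle approximation, we can still assume the right image plane to be parallel to the left image plane and hence to the XY plane.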

In this case we have:

Z = ( b * f ) / ( x1 - x2 + f * b / Z0 )

X = x1 * Z / f

Y = y1 * Z / f
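
The corrected depth formula can be sketched as below, taking Z0 as a given constant exactly as it appears in the equation. The function name `triangulate_verged` and the numeric values are hypothetical. One property that follows directly from the formula is that a point with x1 = x2 = 0 is reconstructed at depth Z0.

```python
def triangulate_verged(x1, y1, x2, b, f, Z0):
    """Triangulation with the extra disparity term f * b / Z0 that
    accounts for a small rotation of the right camera around Y."""
    denom = x1 - x2 + f * b / Z0
    if denom <= 0:
        # Assumption: only points with positive corrected disparity are handled.
        raise ValueError("corrected disparity must be positive")
    Z = (b * f) / denom
    X = x1 * Z / f
    Y = y1 * Z / f
    return X, Y, Z


# Hypothetical setup: b = 10 cm, f = 8 mm, Z0 = 2 m.
# A point imaged at the center of both images lies at depth Z0.
X, Y, Z = triangulate_verged(x1=0.0, y1=0.0, x2=0.0, b=0.1, f=0.008, Z0=2.0)
print(X, Y, Z)
```

As Z0 grows, the term f * b / Z0 vanishes and the result approaches that of the standard model.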

**Rotation around X axis (phi)**

A rotation around the X axis only affects the Y coordinate in the reconstruction.
Let *phi* be the rotation angle; the stereo triangulation equations then become

Z = ( b * f ) / ( x1 - x2 )

X = x1 * Z / f

Y = y1 * Z / f + tan(*phi*) * Z
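
A short sketch of this case is shown below. The function name `triangulate_tilted` and the sample values are assumptions for illustration; *phi* is expected in radians because `math.tan` works in radians.

```python
import math


def triangulate_tilted(x1, y1, x2, b, f, phi):
    """Triangulation when the camera is rotated by phi around the X axis:
    Z and X are unchanged, Y gains the offset tan(phi) * Z."""
    disparity = x1 - x2
    if disparity <= 0:
        # Assumption: only positive disparities are handled.
        raise ValueError("disparity must be positive")
    Z = (b * f) / disparity
    X = x1 * Z / f
    Y = y1 * Z / f + math.tan(phi) * Z
    return X, Y, Z


# Hypothetical setup: b = 10 cm, f = 8 mm, phi = 0.01 rad.
X, Y, Z = triangulate_tilted(0.002, 0.001, 0.0016, b=0.1, f=0.008, phi=0.01)
print(X, Y, Z)
```

The Y offset is proportional to Z, so even a small uncorrected *phi* displaces distant points noticeably.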

**Rotation around Z axis (psi)**

Rotation around the optical axis is usually dealt with by rotating the image
before applying matching and triangulation.

In the following, the rotation angle of the right camera around its optical axis
will be called *psi*.
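
As a rough illustration, the sketch below rotates a single point given in image coordinates around the image origin. It does not resample a whole image, and the function name `rotate_image_point` is hypothetical. The text does not fix a sign convention for *psi*, so whether to pass *psi* or -*psi* to compensate the camera rotation is left as an assumption for the reader to settle.

```python
import math


def rotate_image_point(x, y, angle):
    """Rotate image coordinates (x, y) by `angle` radians around the
    image origin (the intersection of the optical axis with the image plane)."""
    c, s = math.cos(angle), math.sin(angle)
    return c * x - s * y, s * x + c * y


# Rotating the point (1, 0) by a quarter turn moves it onto the y axis.
xr, yr = rotate_image_point(1.0, 0.0, math.pi / 2)
print(xr, yr)
```

Rotating by an angle and then by its opposite returns the original coordinates, which gives a quick sanity check of the chosen sign.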

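In the general case, the coordinates of a point P in the two camera reference systems are related by

p' = R^T ( p - T )

where p and p' are the coordinates of P in the left and right camera reference systems respectively, R is the rotation matrix and T the translation vector of the right camera with respect to the left one, and R^T is the transpose (equivalently the inverse, since R is a rotation matrix) of R.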
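
The change of reference system above can be sketched in plain Python, assuming R is a 3x3 rotation matrix stored as nested lists and T is a 3-element translation vector. The helper names `rotation_y` and `left_to_right`, and the sample values, are hypothetical.

```python
import math


def rotation_y(theta):
    """Rotation matrix for an angle theta (radians) around the Y axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]


def left_to_right(p, R, T):
    """Compute p' = R^T (p - T) for 3-element sequences p and T."""
    d = [p[k] - T[k] for k in range(3)]
    # Row r of R^T is column r of R, hence the R[k][r] indexing.
    return [sum(R[k][r] * d[k] for k in range(3)) for r in range(3)]


# Standard model: no rotation, right camera shifted by b = 0.1 m along X.
p_right = left_to_right([0.5, 0.25, 2.0], rotation_y(0.0), [0.1, 0.0, 0.0])
print(p_right)
```

Since R is orthogonal, the transform preserves distances: the length of p' always equals the length of p - T.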
