POSIT tutorial
Pose Estimation
In this tutorial we will see how to estimate the pose of a 3D object from a single image using the function cvPOSIT, which implements the POSIT algorithm (DeMenthon & Davis 1995). We will also run a test and display the result of the algorithm using OpenGL.
The pose M of a 3D object is the combination of its orientation R (a 3x3 rotation matrix) and its position T (a 3D translation vector) with respect to the camera. The pose M = [ R | T ] is therefore a 3x4 matrix.
Given at least four non-coplanar 3D points of the object (expressed in the object coordinate system) and their corresponding 2D projections in the image, the algorithm is able to estimate the pose.
We will estimate the pose of a virtual cube. As the real pose of the cube is already known, we can calculate the projections of its corners, estimate the pose with POSIT, and compare the estimate with the real pose.
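To make the role of M = [ R | T ] concrete, here is a minimal sketch of how a pose maps a point from object coordinates to camera coordinates. The names Pose3x4, Point3 and applyPose are hypothetical and are not part of OpenCV. Storing the 3x4 matrix row by row is an assumption made only for this example.

```cpp
#include <array>

// A 3x4 pose M = [ R | T ] stored row by row (an assumption of this sketch).
using Pose3x4 = std::array<float, 12>;

struct Point3 { float x, y, z; };

// Returns R * p + T, i.e. the point p expressed in camera coordinates.
Point3 applyPose(const Pose3x4& m, const Point3& p)
{
    return {
        m[0] * p.x + m[1] * p.y + m[2]  * p.z + m[3],
        m[4] * p.x + m[5] * p.y + m[6]  * p.z + m[7],
        m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11]
    };
}
```

With an identity rotation, the result is simply the input point shifted by T.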
Model Points
First of all, the POSIT object must be created from the model points. We will use the eight corners of the cube. The first point of the array passed to cvCreatePOSITObject must be ( 0, 0, 0 ). This point is known as the reference point of the object, and POSIT returns the translation from the camera to this point.
float cubeSize = 10.0;
std::vector<CvPoint3D32f> modelPoints;
modelPoints.push_back(cvPoint3D32f(0.0f, 0.0f, 0.0f));
modelPoints.push_back(cvPoint3D32f(0.0f, 0.0f, cubeSize));
modelPoints.push_back(cvPoint3D32f(0.0f, cubeSize, cubeSize));
modelPoints.push_back(cvPoint3D32f(0.0f, cubeSize, 0.0f));
modelPoints.push_back(cvPoint3D32f(cubeSize, 0.0f, 0.0f));
modelPoints.push_back(cvPoint3D32f(cubeSize, cubeSize, 0.0f));
modelPoints.push_back(cvPoint3D32f(cubeSize, cubeSize, cubeSize));
modelPoints.push_back(cvPoint3D32f(cubeSize, 0.0f, cubeSize));
CvPOSITObject *positObject = cvCreatePOSITObject( &modelPoints[0], static_cast<int>(modelPoints.size()) );
Image Points
We must create an array with the corresponding 2D image points. The image points must be placed in the array in the same order as the model points. In other words, the first point of this array must be the projection of the first model point. The origin of the image coordinates is located at the center of the image.
For each model point, its coordinates in camera space are calculated, i.e. the point is transformed by the real pose. Then its projection is calculated using the perspective model.
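In this tutorial the image points are synthetic, so they are already centered. If the points came from pixel positions instead, they would first have to be moved to the centered coordinate system. The sketch below shows one way to do that. The names Point2 and toCenteredCoordinates are hypothetical, and the sketch assumes that the pixel origin is the top-left corner with v growing downward, that the centered y axis grows upward, and that the principal point is the exact center of the image.

```cpp
struct Point2 { float x, y; };

// Converts a pixel position (origin at the top-left corner, v growing
// downward) into coordinates centered on the image, with y growing upward.
// Assumes the principal point is the geometric center of the image.
Point2 toCenteredCoordinates(float u, float v, float imageWidth, float imageHeight)
{
    const float centerX = imageWidth  / 2.0f;
    const float centerY = imageHeight / 2.0f;
    return { u - centerX, centerY - v };
}
```

If the y axis of your setup points the other way, the second component becomes v - centerY instead.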
std::vector<CvPoint2D32f> imagePoints;
for ( size_t p=0; p<modelPoints.size(); ++p )
{
CvPoint3D32f point3D;
//Transform the 3D points with the real pose
//apply the rotation
point3D.x = poseReal[0] * modelPoints[p].x
+ poseReal[4]*modelPoints[p].y
+ poseReal[8]*modelPoints[p].z;
//add the translation
point3D.x = point3D.x + poseReal[12];
point3D.y = poseReal[1] * modelPoints[p].x
+ poseReal[5]*modelPoints[p].y
+ poseReal[9]*modelPoints[p].z;
point3D.y = point3D.y + poseReal[13];
point3D.z = poseReal[2] * modelPoints[p].x
+ poseReal[6]*modelPoints[p].y
+ poseReal[10]*modelPoints[p].z;
point3D.z = point3D.z + poseReal[14];
//Project the transformed 3D points
CvPoint2D32f point2D;
//The central point is not added because POSIT needs the image point coordinates relative to the center of the image
point2D.x = focalLength * point3D.x / (-point3D.z); //z negative
point2D.y = focalLength * point3D.y / (-point3D.z);
imagePoints.push_back( point2D );
}
The real pose is a float[16] array representing a 4x4 matrix in OpenGL format (column-major order).
The rotation matrix is:
poseReal[0] | poseReal[4] | poseReal[8]
poseReal[1] | poseReal[5] | poseReal[9]
poseReal[2] | poseReal[6] | poseReal[10]
and the translation vector is:
poseReal[12]
poseReal[13]
poseReal[14]
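The transformation and projection performed in the loop above can also be written as a small helper. This is only a sketch: Point3, Point2 and projectPoint are hypothetical names, and the code keeps the conventions used in this tutorial (column-major pose, camera looking down the negative z axis, image coordinates centered on the image).

```cpp
#include <cassert>

struct Point3 { float x, y, z; };
struct Point2 { float x, y; };

// Transforms p by a column-major 4x4 pose and projects it with the
// perspective model used in this tutorial (division by -z).
Point2 projectPoint(const float pose[16], const Point3& p, float focalLength)
{
    // Rotation (indices 0..10) followed by translation (indices 12..14).
    const float x = pose[0] * p.x + pose[4] * p.y + pose[8]  * p.z + pose[12];
    const float y = pose[1] * p.x + pose[5] * p.y + pose[9]  * p.z + pose[13];
    const float z = pose[2] * p.x + pose[6] * p.y + pose[10] * p.z + pose[14];

    // With an OpenGL-style camera, visible points have negative z.
    assert(z < 0.0f);

    return { focalLength * x / (-z), focalLength * y / (-z) };
}
```

Calling it once per model point and pushing the result into imagePoints reproduces the loop.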
Pose Estimation
Now that we have the model and image points we can compute the pose:
CvMatr32f rotation_matrix = new float[9];
CvVect32f translation_vector = new float[3];
//set posit termination criteria: 100 max iterations, convergence epsilon 1.0e-5
//both flags are needed so that the iteration limit and the epsilon are taken into account
CvTermCriteria criteria = cvTermCriteria( CV_TERMCRIT_EPS | CV_TERMCRIT_ITER, 100, 1.0e-5 );
//FOCAL_LENGTH must be the same focal length that was used to project the image points
cvPOSIT( positObject, &imagePoints[0], FOCAL_LENGTH, criteria, rotation_matrix, translation_vector );
createOpenGLMatrixFrom( rotation_matrix, translation_vector );
OpenGL
In order to draw the model using OpenGL we must build the modelView (pose) matrix and the projection matrix.
OpenGL ModelView Matrix
for (int f=0; f<3; f++)
{
for (int c=0; c<3; c++)
{
posePOSIT[c*4+f] = rotation_matrix[f*3+c]; //transposed
}
}
posePOSIT[3] = 0.0;
posePOSIT[7] = 0.0;
posePOSIT[11] = 0.0;
posePOSIT[12] = translation_vector[0];
posePOSIT[13] = translation_vector[1];
posePOSIT[14] = -translation_vector[2]; //negative
posePOSIT[15] = 1.0; //homogeneous
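Once posePOSIT has been filled in, it can be compared with the real pose, as announced at the start of the tutorial. A minimal sketch of such a comparison follows. The function name maxAbsDifference is hypothetical, and the sketch assumes that both matrices are stored as arrays of 16 floats in the same column-major layout.

```cpp
#include <algorithm>
#include <cmath>

// Largest absolute element-wise difference between two 4x4 matrices
// stored as 16 floats each.
float maxAbsDifference(const float a[16], const float b[16])
{
    float worst = 0.0f;
    for (int i = 0; i < 16; ++i)
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    return worst;
}
```

A value of maxAbsDifference( poseReal, posePOSIT ) close to zero indicates that the estimate matches the real pose. Which tolerance counts as "close" depends on the scale of the scene and is left to the reader.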
OpenGL Projection Matrix
This is a standard perspective projection matrix built from the intrinsic parameters (focal length (focalX, focalY) and principal point (pX, pY)), the image resolution (imageWidth, imageHeight), and the near and far plane values.
Remember that it must be stored in column-major order.
2.0 * focalX / imageWidth | 0 | 2.0 * ( pX / imageWidth ) - 1.0 | 0
0 | 2.0 * focalY / imageHeight | 2.0 * ( pY / imageHeight ) - 1.0 | 0
0 | 0 | -( farPlane+nearPlane ) / ( farPlane - nearPlane ) | -2.0 * farPlane * nearPlane / ( farPlane - nearPlane )
0 | 0 | -1 | 0
Drawing the Model
glClear( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );
glViewport( 0, 0, imageWidth, imageHeight );
glMatrixMode( GL_PROJECTION );
glLoadMatrixd( projectionMatrix );
glMatrixMode( GL_MODELVIEW );
glLoadMatrixd( posePOSIT );
drawModel();
References
The algorithm is described in
D. DeMenthon and L.S. Davis, "Model-Based Object Pose in 25 Lines of Code", International Journal of Computer Vision, 15, pp. 123-141, June 1995. (see http://www.cfar.umd.edu/~daniel/)
Code
The code doesn't work correctly, and I don't know where the problem is. Please help!
As you can see, the model (red) is not projected correctly.
