CV Reference Manual
Image Processing
Note: The chapter describes functions for image processing and analysis. Most of the functions work with 2d arrays of pixels. We refer to the arrays as "images"; however, they do not have to be of type IplImage, they may be CvMat or CvMatND as well.
Gradients, Edges and Corners
Sobel
Calculates first, second, third or mixed image derivatives using extended Sobel operator
void cvSobel( const CvArr* src, CvArr* dst, int xorder, int yorder, int aperture_size=3 );
- src
Source image of type CvArr*.
- dst
- Destination image.
- xorder
- Order of the derivative in x.
- yorder
- Order of the derivative in y.
- aperture_size
- Size of the extended Sobel kernel, must be 1, 3, 5 or 7.
For every value except 1, an aperture_size×aperture_size separable kernel is used to calculate the derivative. For aperture_size=1, a 3x1 or 1x3 kernel is used (no Gaussian smoothing is done). There is also the special value CV_SCHARR (=-1), which corresponds to the 3x3 Scharr filter and may give more accurate results than the 3x3 Sobel. The Scharr aperture is:
| -3 0  3|
|-10 0 10|
| -3 0  3|
for the x-derivative, or transposed for the y-derivative.
The function cvSobel calculates the image derivative by convolving the image with the appropriate kernel:
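$$\mathrm{dst}(x,y) = \left. \frac{\partial^{\,\mathrm{xorder}+\mathrm{yorder}}\, \mathrm{src}}{\partial x^{\mathrm{xorder}}\, \partial y^{\mathrm{yorder}}} \right|_{(x,y)}$$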
The Sobel operators combine Gaussian smoothing and differentiation, so the result is more or less robust to noise. Most often, the function is called with (xorder=1, yorder=0, aperture_size=3) or (xorder=0, yorder=1, aperture_size=3) to calculate the first x- or y- image derivative. The first case corresponds to the kernel
|-1 0 1|
|-2 0 2|
|-1 0 1|
and the second one corresponds to the kernel
|-1 -2 -1|      | 1  2  1|
| 0  0  0|  or  | 0  0  0|
| 1  2  1|      |-1 -2 -1|
depending on the image origin (the origin field of the IplImage structure). No scaling is done, so the destination image usually contains values of larger absolute magnitude than the source image. To avoid overflow, the function requires a 16-bit destination image if the source image is 8-bit. The result can be converted back to 8-bit using the cvConvertScale or cvConvertScaleAbs functions. Besides 8-bit images, the function can process 32-bit floating-point images. Both source and destination must be single-channel images of equal size or ROI size.
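As a rough illustration of the first-derivative case described above, the following plain C sketch applies the 3x3 x-derivative kernel to an 8-bit image and stores the unscaled result in a 16-bit buffer. It does not call OpenCV: the names sobel_x_3x3 and clamp_index are hypothetical helpers, and the replicated-border handling is an assumption made for this example, not a statement about the library's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp an index into [lo, hi]; used to replicate border pixels. */
static int clamp_index(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical helper (not an OpenCV function): 3x3 Sobel x-derivative
   (xorder=1, yorder=0) of an 8-bit single-channel image. The result is
   not scaled, so it is written to a 16-bit signed destination. */
void sobel_x_3x3(const uint8_t *src, int16_t *dst, int width, int height)
{
    static const int kernel[3][3] = {
        { -1, 0, 1 },
        { -2, 0, 2 },
        { -1, 0, 1 }
    };
    assert(src && dst && width > 0 && height > 0);
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int acc = 0;
            for (int ky = -1; ky <= 1; ky++) {
                for (int kx = -1; kx <= 1; kx++) {
                    int sy = clamp_index(y + ky, 0, height - 1);
                    int sx = clamp_index(x + kx, 0, width - 1);
                    acc += kernel[ky + 1][kx + 1] * src[sy * width + sx];
                }
            }
            /* |acc| is at most 4*255 = 1020, which needs more than 8 bits */
            dst[y * width + x] = (int16_t)acc;
        }
    }
}
```

For a 3x3 image whose right column is 255 and whose other pixels are 0, the center response is 255 + 2*255 + 255 = 1020, which shows why an 8-bit destination would overflow.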
Laplace
Calculates Laplacian of the image
void cvLaplace( const CvArr* src, CvArr* dst, int aperture_size=3 );
- src
- Source image.
- dst
- Destination image.
- aperture_size
Aperture size (it has the same meaning as in cvSobel).
The function cvLaplace calculates Laplacian of the source image by summing second x- and y- derivatives calculated using Sobel operator:
$$\mathrm{dst}(x,y) = \frac{\partial^2 \mathrm{src}}{\partial x^2} + \frac{\partial^2 \mathrm{src}}{\partial y^2}$$
Specifying aperture_size=1 gives the fastest variant that is equal to convolving the image with the following kernel:
|0  1 0|
|1 -4 1|
|0  1 0|
Similar to cvSobel function, no scaling is done and the same combinations of input and output formats are supported.
Canny
Implements Canny algorithm for edge detection
void cvCanny( const CvArr* image, CvArr* edges, double threshold1,
              double threshold2, int aperture_size=3 );
- image
- Input image.
- edges
- Image to store the edges found by the function.
- threshold1
- The first threshold.
- threshold2
- The second threshold.
- aperture_size
Aperture parameter for Sobel operator (see cvSobel).
The function cvCanny finds the edges in the input image and marks them in the output image edges using the Canny algorithm. The smaller of threshold1 and threshold2 is used for edge linking; the larger is used to find the initial segments of strong edges.
PreCornerDetect
Calculates feature map for corner detection
void cvPreCornerDetect( const CvArr* image, CvArr* corners, int aperture_size=3 );
- image
- Input image.
- corners
- Image to store the corner candidates.
- aperture_size
Aperture parameter for Sobel operator (see cvSobel).
The function cvPreCornerDetect calculates the function $D_x^2 D_{yy} + D_y^2 D_{xx} - 2 D_x D_y D_{xy}$, where $D_?$ denotes one of the first image derivatives and $D_{??}$ denotes a second image derivative. The corners can be found as local maxima of the function:
// assume that the image is floating-point
IplImage* corners = cvCloneImage(image);
IplImage* dilated_corners = cvCloneImage(image);
IplImage* corner_mask = cvCreateImage( cvGetSize(image), 8, 1 );
cvPreCornerDetect( image, corners, 3 );
cvDilate( corners, dilated_corners, 0, 1 );
// cvSub rather than cvSubS: both operands are images, not a scalar
cvSub( corners, dilated_corners, corners, 0 );
cvCmpS( corners, 0, corner_mask, CV_CMP_GE );
cvReleaseImage( &corners );
cvReleaseImage( &dilated_corners );
CornerEigenValsAndVecs
Calculates eigenvalues and eigenvectors of image blocks for corner detection
void cvCornerEigenValsAndVecs( const CvArr* image, CvArr* eigenvv,
                               int block_size, int aperture_size=3 );
- image
- Input image.
- eigenvv
- Image to store the results. It must be 6 times wider than the input image.
- block_size
- Neighborhood size (see discussion).
- aperture_size
Aperture parameter for Sobel operator (see cvSobel).
For every pixel p, the function cvCornerEigenValsAndVecs considers a block_size × block_size neighborhood S(p). It calculates the covariation matrix of derivatives over the neighborhood as:
$$M = \begin{bmatrix} \sum_{S(p)} (dI/dx)^2 & \sum_{S(p)} (dI/dx \cdot dI/dy) \\ \sum_{S(p)} (dI/dx \cdot dI/dy) & \sum_{S(p)} (dI/dy)^2 \end{bmatrix}$$
After that it finds the eigenvectors and eigenvalues of the matrix and stores them in the destination image in the form (λ1, λ2, x1, y1, x2, y2), where
λ1, λ2 - eigenvalues of M; not sorted
(x1, y1) - eigenvector corresponding to λ1
(x2, y2) - eigenvector corresponding to λ2
CornerMinEigenVal
Calculates minimal eigenvalue of gradient matrices for corner detection
void cvCornerMinEigenVal( const CvArr* image, CvArr* eigenval, int block_size, int aperture_size=3 );
- image
- Input image.
- eigenval
Image to store the minimal eigen values. Should have the same size as image
- block_size
Neighborhood size (see discussion of cvCornerEigenValsAndVecs).
- aperture_size
Aperture parameter for Sobel operator (see cvSobel). In the case of floating-point input format this parameter is the number of the fixed float filter used for differencing.
The function cvCornerMinEigenVal is similar to cvCornerEigenValsAndVecs but it calculates and stores only the minimal eigenvalue of the derivative covariation matrix for every pixel, i.e. min(λ1, λ2) in terms of the previous function.
CornerHarris
Harris edge detector
void cvCornerHarris( const CvArr* image, CvArr* harris_responce,
                     int block_size, int aperture_size=3, double k=0.04 );
- image
- Input image.
- harris_responce
Image to store the Harris detector responses. Should have the same size as image.
- block_size
Neighborhood size (see discussion of cvCornerEigenValsAndVecs).
- aperture_size
Aperture parameter for Sobel operator (see cvSobel). In the case of floating-point input format this parameter is the number of the fixed float filter used for differencing.
- k
- Harris detector free parameter. See the formula below.
The function cvCornerHarris runs the Harris edge detector on image. Similarly to cvCornerMinEigenVal and cvCornerEigenValsAndVecs, for each pixel it calculates 2x2 gradient covariation matrix M over block_size×block_size neighborhood. Then, it stores
$$\det(M) - k \cdot \mathrm{trace}(M)^2$$
to the destination image. Corners in the image can be found as local maxima of the destination image.
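The response formula above can be sketched in plain C as follows. This is only an illustration under stated assumptions: harris_response is a hypothetical helper, and it takes the x- and y-derivative samples of one neighborhood as two flat arrays instead of computing them with a Sobel operator.

```c
#include <assert.h>

/* Hypothetical helper (not an OpenCV function): builds the 2x2 matrix
   M = | sum(dx*dx)  sum(dx*dy) |
       | sum(dx*dy)  sum(dy*dy) |
   from n derivative samples of one neighborhood and returns
   det(M) - k*trace(M)^2. */
double harris_response(const double *dx, const double *dy, int n, double k)
{
    double sxx = 0.0, sxy = 0.0, syy = 0.0;
    assert(dx && dy && n > 0);
    for (int i = 0; i < n; i++) {
        sxx += dx[i] * dx[i];
        sxy += dx[i] * dy[i];
        syy += dy[i] * dy[i];
    }
    double det = sxx * syy - sxy * sxy;
    double trace = sxx + syy;
    return det - k * trace * trace;
}
```

With k=0.04, a neighborhood containing gradients in two orthogonal directions (a corner) gives a positive response, while gradients along a single direction (a straight edge) make det(M) zero and the response negative.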
FindCornerSubPix
Refines corner locations
void cvFindCornerSubPix( const CvArr* image, CvPoint2D32f* corners,
                         int count, CvSize win, CvSize zero_zone,
                         CvTermCriteria criteria );
- image
- Input image.
- corners
- Initial coordinates of the input corners and refined coordinates on output.
- count
- Number of corners.
- win
Half sizes of the search window. For example, if win=(5,5) then a (5*2+1) × (5*2+1) = 11 × 11 search window is used.
- zero_zone
- Half size of the dead region in the middle of the search zone over which the summation in the formulae below is not done. It is sometimes used to avoid possible singularities of the autocorrelation matrix. The value of (-1,-1) indicates that there is no such region.
- criteria
Criteria for termination of the iterative process of corner refinement. That is, the process of corner position refinement stops either after a certain number of iterations or when the required accuracy is achieved. The criteria may specify either or both of the maximum number of iterations and the required accuracy.
The function cvFindCornerSubPix iterates to find the sub-pixel accurate location of corners, or radial saddle points.
The sub-pixel accurate corner locator is based on the observation that every vector from the center q to a point p located within a neighborhood of q is orthogonal to the image gradient at p, subject to image and measurement noise. Consider the expression:
$$e_i = DI_{p_i}^T \cdot (q - p_i)$$
where $DI_{p_i}$ is the image gradient at one of the points $p_i$ in a neighborhood of q. The value of q is to be found such that $e_i$ is minimized. A system of equations may be set up with $e_i$ set to zero:
$$\sum_i \left( DI_{p_i} \cdot DI_{p_i}^T \right) \cdot q - \sum_i \left( DI_{p_i} \cdot DI_{p_i}^T \cdot p_i \right) = 0$$
where the gradients are summed within a neighborhood ("search window") of q. Calling the first gradient term G and the second gradient term b gives:
$$q = G^{-1} \cdot b$$
The algorithm sets the center of the neighborhood window at this new center q and then iterates until the center moves by less than a set threshold.
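A single solve of q = G^-1 * b from the derivation above might look like the sketch below. It is an illustration only: the Vec2 type and refine_corner_step are hypothetical names, the gradients are passed in directly rather than computed from an image, and the search window weighting and zero_zone handling are left out.

```c
#include <assert.h>

typedef struct { double x, y; } Vec2;   /* hypothetical 2D point/vector type */

/* Hypothetical helper (not an OpenCV function): accumulates
   G = sum(g_i * g_i^T) and b = sum(g_i * g_i^T * p_i) over n samples and
   solves the 2x2 system G*q = b. Returns 1 on success, 0 if G is
   (numerically) singular. */
int refine_corner_step(const Vec2 *grad, const Vec2 *pts, int n, Vec2 *q)
{
    double gxx = 0.0, gxy = 0.0, gyy = 0.0, bx = 0.0, by = 0.0;
    assert(grad && pts && q && n > 0);
    for (int i = 0; i < n; i++) {
        double gx = grad[i].x, gy = grad[i].y;
        gxx += gx * gx;
        gxy += gx * gy;
        gyy += gy * gy;
        bx += gx * gx * pts[i].x + gx * gy * pts[i].y;
        by += gx * gy * pts[i].x + gy * gy * pts[i].y;
    }
    double det = gxx * gyy - gxy * gxy;
    if (det > -1e-12 && det < 1e-12)
        return 0;                       /* singular autocorrelation matrix */
    q->x = ( gyy * bx - gxy * by) / det;
    q->y = (-gxy * bx + gxx * by) / det;
    return 1;
}
```

For an ideal corner at (2,3), a point (2,5) on the vertical edge with gradient (1,0) and a point (6,3) on the horizontal edge with gradient (0,1) both satisfy the orthogonality condition, and the solve returns q = (2,3).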
GoodFeaturesToTrack
Determines strong corners on image
void cvGoodFeaturesToTrack( const CvArr* image, CvArr* eig_image, CvArr* temp_image,
                            CvPoint2D32f* corners, int* corner_count,
                            double quality_level, double min_distance,
                            const CvArr* mask=NULL, int block_size=3,
                            int use_harris=0, double k=0.04 );
- image
- The source 8-bit or floating-point 32-bit, single-channel image.
- eig_image
Temporary floating-point 32-bit image of the same size as image.
- temp_image
Another temporary image of the same size and same format as eig_image.
- corners
- Output parameter. Detected corners.
- corner_count
- Output parameter. Number of detected corners.
- quality_level
- Multiplier for the maximum of the per-pixel minimal eigenvalues; specifies the minimal accepted quality of image corners.
- min_distance
- Limit specifying the minimum possible distance between returned corners; Euclidean distance is used.
- mask
- Region of interest. The function selects points either in the specified region or in the whole image if the mask is NULL.
- block_size
Size of the averaging block, passed to underlying cvCornerMinEigenVal or cvCornerHarris used by the function.
- use_harris
If nonzero, Harris operator (cvCornerHarris) is used instead of default cvCornerMinEigenVal.
- k
Free parameter of Harris detector; used only if use_harris≠0.
The function cvGoodFeaturesToTrack finds corners with big eigenvalues in the image. The function first calculates the minimal eigenvalue for every source image pixel using the cvCornerMinEigenVal function and stores them in eig_image. Then it performs non-maxima suppression (only local maxima in a 3x3 neighborhood remain). The next step is rejecting the corners with the minimal eigenvalue less than quality_level•max(eig_image(x,y)). Finally, the function ensures that all the corners found are far enough from one another by considering the corners in order of strength (the strongest first) and checking that the distance between the newly considered feature and the features considered earlier is larger than min_distance. So, the function removes the features that are too close to stronger features.
Sampling, Interpolation and Geometrical Transforms
SampleLine
Reads raster line to buffer
int cvSampleLine( const CvArr* image, CvPoint pt1, CvPoint pt2,
                  void* buffer, int connectivity=8 );
- image
- Image to sample the line from.
- pt1
- Starting point of the line.
- pt2
- Ending point of the line.
- buffer
Buffer to store the line points; must have enough size to store max( |pt2.x-pt1.x|+1, |pt2.y-pt1.y|+1 ) points in case of 8-connected line and |pt2.x-pt1.x|+|pt2.y-pt1.y|+1 in case of 4-connected line.
- connectivity
- The line connectivity, 4 or 8.
The function cvSampleLine implements a particular case of application of line iterators. The function reads all the image points lying on the line between pt1 and pt2, including the ending points, and stores them into the buffer.
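The buffer size rule given for the buffer parameter can be written as a small helper. The function name sample_line_point_count is hypothetical and the sketch only counts points; the caller still has to multiply by the pixel size in bytes when allocating the buffer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not an OpenCV function): number of points on the
   raster line between (x1,y1) and (x2,y2), end points included, following
   the sizing rule stated for the buffer parameter. */
int sample_line_point_count(int x1, int y1, int x2, int y2, int connectivity)
{
    int dx = abs(x2 - x1);
    int dy = abs(y2 - y1);
    assert(connectivity == 4 || connectivity == 8);
    if (connectivity == 8)
        return (dx > dy ? dx : dy) + 1;   /* max(|dx|+1, |dy|+1) */
    return dx + dy + 1;                   /* 4-connected line */
}
```

For the line from (0,0) to (3,2) this gives 4 points with 8-connectivity and 6 points with 4-connectivity.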
GetRectSubPix
Retrieves pixel rectangle from image with sub-pixel accuracy
void cvGetRectSubPix( const CvArr* src, CvArr* dst, CvPoint2D32f center );
- src
- Source image.
- dst
- Extracted rectangle.
- center
- Floating point coordinates of the extracted rectangle center within the source image. The center must be inside the image.
The function cvGetRectSubPix extracts pixels from src:
dst(x, y) = src(x + center.x - (width(dst)-1)*0.5, y + center.y - (height(dst)-1)*0.5)
where the values of pixels at non-integer coordinates are retrieved using bilinear interpolation. Every channel of multiple-channel images is processed independently. Whereas the rectangle center must be inside the image, the whole rectangle may be partially occluded. In this case, the replication border mode is used to get pixel values beyond the image boundaries.
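The extraction formula, bilinear interpolation and replication border described above can be sketched for a single-channel 8-bit source as follows. All names here (get_rect_subpix, sample_bilinear_replicate and the small helpers) are hypothetical and the code is a plain C illustration, not the library's implementation.

```c
#include <assert.h>

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* floor() for doubles that fit into int, without needing math.h */
static int floor_to_int(double v)
{
    int i = (int)v;
    return ((double)i > v) ? i - 1 : i;
}

/* Bilinear sample at (x, y); coordinates outside the image are clamped,
   which replicates the border pixels. */
double sample_bilinear_replicate(const unsigned char *img, int width, int height,
                                 double x, double y)
{
    assert(img && width > 0 && height > 0);
    int x0 = floor_to_int(x), y0 = floor_to_int(y);
    double fx = x - x0, fy = y - y0;
    int xa = clamp_int(x0, 0, width - 1), xb = clamp_int(x0 + 1, 0, width - 1);
    int ya = clamp_int(y0, 0, height - 1), yb = clamp_int(y0 + 1, 0, height - 1);
    double top    = (1.0 - fx) * img[ya * width + xa] + fx * img[ya * width + xb];
    double bottom = (1.0 - fx) * img[yb * width + xa] + fx * img[yb * width + xb];
    return (1.0 - fy) * top + fy * bottom;
}

/* dst(x, y) = src(x + cx - (dst_w-1)*0.5, y + cy - (dst_h-1)*0.5) */
void get_rect_subpix(const unsigned char *src, int src_w, int src_h,
                     double *dst, int dst_w, int dst_h, double cx, double cy)
{
    assert(dst && dst_w > 0 && dst_h > 0);
    for (int y = 0; y < dst_h; y++)
        for (int x = 0; x < dst_w; x++)
            dst[y * dst_w + x] = sample_bilinear_replicate(
                src, src_w, src_h,
                x + cx - (dst_w - 1) * 0.5,
                y + cy - (dst_h - 1) * 0.5);
}
```

Extracting a 2x2 rectangle centered at (1,1) from a 3x3 source samples the source at the half-pixel positions (0.5,0.5), (1.5,0.5), (0.5,1.5) and (1.5,1.5), so each output value is the average of four neighboring source pixels.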
GetQuadrangleSubPix
Retrieves pixel quadrangle from image with sub-pixel accuracy
void cvGetQuadrangleSubPix( const CvArr* src, CvArr* dst, const CvMat* map_matrix );
- src
- Source image.
- dst
- Extracted quadrangle.
- map_matrix
The transformation 2 × 3 matrix [A|b] (see the discussion).
The function cvGetQuadrangleSubPix extracts pixels from src at sub-pixel accuracy and stores them to dst as follows:
$$\mathrm{dst}(x,y) = \mathrm{src}(A_{11}x' + A_{12}y' + b_1,\; A_{21}x' + A_{22}y' + b_2),$$
where `A` and `b` are taken from `map_matrix`:
$$\mathrm{map\_matrix} = \begin{bmatrix} A_{11} & A_{12} & b_1 \\ A_{21} & A_{22} & b_2 \end{bmatrix},$$
$$x' = x - (\mathrm{width}(\mathrm{dst}) - 1) \cdot 0.5, \qquad y' = y - (\mathrm{height}(\mathrm{dst}) - 1) \cdot 0.5,$$
and the values of pixels at non-integer coordinates $A \cdot (x',y')^T + b$ are retrieved using bilinear interpolation. When the function needs pixels outside of the image, it uses replication border mode to reconstruct the values. Every channel of multiple-channel images is processed independently.
Resize
Resizes image
void cvResize( const CvArr* src, CvArr* dst, int interpolation=CV_INTER_LINEAR );
- src
- Source image.
- dst
- Destination image.
- interpolation
- Interpolation method:
- CV_INTER_NN - nearest-neighbor interpolation,
- CV_INTER_LINEAR - bilinear interpolation (used by default),
- CV_INTER_AREA - resampling using pixel area relation. It is the preferred method for image decimation, as it gives moire-free results. In case of zooming it is similar to the CV_INTER_NN method.
- CV_INTER_CUBIC - bicubic interpolation.
The function cvResize resizes the image src so that it fits exactly into dst. If an ROI is set, the function takes it into account, as usual.
WarpAffine
Applies affine transformation to the image
void cvWarpAffine( const CvArr* src, CvArr* dst, const CvMat* map_matrix,
                   int flags=CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS,
                   CvScalar fillval=cvScalarAll(0) );
- src
- Source image.
- dst
- Destination image.
- map_matrix
- 2×3 transformation matrix.
- flags
- A combination of interpolation method and the following optional flags:
CV_WARP_FILL_OUTLIERS - fill all the destination image pixels. If some of them correspond to outliers in the source image, they are set to fillval.
CV_WARP_INVERSE_MAP - indicates that matrix is inverse transform from destination image to source and, thus, can be used directly for pixel interpolation. Otherwise, the function finds the inverse transform from map_matrix.
- fillval
- A value used to fill outliers.
The function cvWarpAffine transforms source image using the specified matrix:
$$\mathrm{dst}(x',y') \leftarrow \mathrm{src}(x,y)$$
$$(x',y')^T = \mathrm{map\_matrix} \cdot (x,y,1)^T \quad \text{if CV\_WARP\_INVERSE\_MAP is not set,}$$
$$(x,y)^T = \mathrm{map\_matrix} \cdot (x',y',1)^T \quad \text{otherwise.}$$
The function is similar to cvGetQuadrangleSubPix but they are not exactly the same. cvWarpAffine requires the input and output images to have the same data type, has larger overhead (so it is not quite suitable for small images) and can leave part of the destination image unchanged, whereas cvGetQuadrangleSubPix may extract quadrangles from 8-bit images into a floating-point buffer, has smaller overhead and always changes the whole destination image content.
To transform a sparse set of points, use cvTransform function from cxcore.
2DRotationMatrix
Calculates affine matrix of 2d rotation
CvMat* cv2DRotationMatrix( CvPoint2D32f center, double angle,
                           double scale, CvMat* map_matrix );
- center
- Center of the rotation in the source image.
- angle
- The rotation angle in degrees. Positive values mean counter-clockwise rotation (the coordinate origin is assumed to be at the top-left corner).
- scale
- Isotropic scale factor.
- map_matrix
- Pointer to the destination 2×3 matrix.
The function cv2DRotationMatrix calculates matrix:
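$$\begin{bmatrix} \alpha & \beta & (1-\alpha) \cdot \mathrm{center.x} - \beta \cdot \mathrm{center.y} \\ -\beta & \alpha & \beta \cdot \mathrm{center.x} + (1-\alpha) \cdot \mathrm{center.y} \end{bmatrix}$$
where $\alpha = \mathrm{scale} \cdot \cos(\mathrm{angle})$, $\beta = \mathrm{scale} \cdot \sin(\mathrm{angle})$.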
The transformation maps the rotation center to itself. If this is not the purpose, the shift should be adjusted.
WarpPerspective
Applies perspective transformation to the image
void cvWarpPerspective( const CvArr* src, CvArr* dst, const CvMat* map_matrix,
                        int flags=CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS,
                        CvScalar fillval=cvScalarAll(0) );
- src
- Source image.
- dst
- Destination image.
- map_matrix
- 3×3 transformation matrix.
- flags
- A combination of interpolation method and the following optional flags:
CV_WARP_FILL_OUTLIERS - fill all the destination image pixels. If some of them correspond to outliers in the source image, they are set to fillval.
CV_WARP_INVERSE_MAP - indicates that matrix is inverse transform from destination image to source and, thus, can be used directly for pixel interpolation. Otherwise, the function finds the inverse transform from map_matrix.
- fillval
- A value used to fill outliers.
The function cvWarpPerspective transforms source image using the specified matrix:
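$$\mathrm{dst}(x',y') \leftarrow \mathrm{src}(x,y)$$
$$(t \cdot x',\; t \cdot y',\; t)^T = \mathrm{map\_matrix} \cdot (x,y,1)^T \quad \text{if CV\_WARP\_INVERSE\_MAP is not set,}$$
$$(t \cdot x,\; t \cdot y,\; t)^T = \mathrm{map\_matrix} \cdot (x',y',1)^T \quad \text{otherwise.}$$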
For a sparse set of points use cvPerspectiveTransform function from cxcore.
WarpPerspectiveQMatrix
Calculates perspective transform from 4 corresponding points
CvMat* cvWarpPerspectiveQMatrix( const CvPoint2D32f* src,
                                 const CvPoint2D32f* dst,
                                 CvMat* map_matrix );
- src
- Coordinates of 4 quadrangle vertices in the source image.
- dst
- Coordinates of the 4 corresponding quadrangle vertices in the destination image.
- map_matrix
- Pointer to the destination 3×3 matrix.
The function cvWarpPerspectiveQMatrix calculates matrix of perspective transform such that:
$$(t_i \cdot x'_i,\; t_i \cdot y'_i,\; t_i)^T = \mathrm{map\_matrix} \cdot (x_i, y_i, 1)^T$$
where $\mathrm{dst}(i) = (x'_i, y'_i)$, $\mathrm{src}(i) = (x_i, y_i)$, $i = 0..3$.
Remap
Applies generic geometrical transformation to the image
void cvRemap( const CvArr* src, CvArr* dst,
              const CvArr* mapx, const CvArr* mapy,
              int flags=CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS,
              CvScalar fillval=cvScalarAll(0) );
- src
- Source image.
- dst
- Destination image.
- mapx
- The map of x-coordinates (32fC1 image).
- mapy
- The map of y-coordinates (32fC1 image).
- flags
- A combination of interpolation method and the following optional flag(s):
CV_WARP_FILL_OUTLIERS - fill all the destination image pixels. If some of them correspond to outliers in the source image, they are set to fillval.
- fillval
- A value used to fill outliers.
The function cvRemap transforms source image using the specified map:
dst(x,y)<-src(mapx(x,y),mapy(x,y))
Similar to other geometrical transformations, some interpolation method (specified by user) is used to extract pixels with non-integer coordinates.
LogPolar
Remaps image to log-polar space
void cvLogPolar( const CvArr* src, CvArr* dst,
                 CvPoint2D32f center, double M,
                 int flags=CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS );
- src
- Source image.
- dst
- Destination image.
- center
- The transformation center, where the output precision is maximal.
- M
- Magnitude scale parameter. See below.
- flags
- A combination of interpolation method and the following optional flags:
- CV_WARP_FILL_OUTLIERS - fill all the destination image pixels. If some of them correspond to outliers in the source image, they are set to zeros.
- CV_WARP_INVERSE_MAP - indicates that the inverse transformation (from log-polar space back to Cartesian coordinates) is applied; see below.
The function cvLogPolar transforms source image using the following transformation:
Forward transformation (`CV_WARP_INVERSE_MAP` is not set):
dst(phi,rho)<-src(x,y)
Inverse transformation (`CV_WARP_INVERSE_MAP` is set):
dst(x,y)<-src(phi,rho),
where
$$\rho = M \cdot \log\left(\sqrt{x^2 + y^2}\right), \qquad \phi = \operatorname{atan}(y/x)$$
The function emulates the human "foveal" vision and can be used for fast scale and rotation-invariant template matching, for object tracking etc.
Example. Log-polar transformation.
#include <cv.h>
#include <highgui.h>

int main(int argc, char** argv)
{
    IplImage* src;
    /* load the image as 3-channel color; the assignment is parenthesized
       so that its result is compared with 0 */
    if( argc == 2 && (src=cvLoadImage(argv[1],1)) != 0 )
    {
        IplImage* dst = cvCreateImage( cvSize(256,256), 8, 3 );
        IplImage* src2 = cvCreateImage( cvGetSize(src), 8, 3 );
        /* forward transform, then the inverse transform back to the source size */
        cvLogPolar( src, dst, cvPoint2D32f(src->width/2,src->height/2), 40,
                    CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS );
        cvLogPolar( dst, src2, cvPoint2D32f(src->width/2,src->height/2), 40,
                    CV_INTER_LINEAR+CV_WARP_FILL_OUTLIERS+CV_WARP_INVERSE_MAP );
        cvNamedWindow( "log-polar", 1 );
        cvShowImage( "log-polar", dst );
        cvNamedWindow( "inverse log-polar", 1 );
        cvShowImage( "inverse log-polar", src2 );
        cvWaitKey();
    }
    return 0;
}
And this is what the program displays when opencv/samples/c/fruits.jpg is passed to it (the output images are not reproduced here).
Morphological Operations
CreateStructuringElementEx
Creates structuring element
IplConvKernel* cvCreateStructuringElementEx( int cols, int rows, int anchor_x, int anchor_y,
                                             int shape, int* values=NULL );
- cols
- Number of columns in the structuring element.
- rows
- Number of rows in the structuring element.
- anchor_x
- Relative horizontal offset of the anchor point.
- anchor_y
- Relative vertical offset of the anchor point.
- shape
- Shape of the structuring element; may have the following values:
CV_SHAPE_RECT, a rectangular element;
CV_SHAPE_CROSS, a cross-shaped element;
CV_SHAPE_ELLIPSE, an elliptic element;
CV_SHAPE_CUSTOM, a user-defined element. In this case the parameter values specifies the mask, that is, which neighbors of the pixel must be considered.
- values
Pointer to the structuring element data, a plain array representing a row-by-row scan of the element matrix. Non-zero values indicate points that belong to the element. If the pointer is NULL, then all values are considered non-zero, that is, the element is of a rectangular shape. This parameter is considered only if the shape is CV_SHAPE_CUSTOM.
The function cvCreateStructuringElementEx allocates and fills the structure IplConvKernel, which can be used as a structuring element in the morphological operations.
ReleaseStructuringElement
Deletes structuring element
void cvReleaseStructuringElement( IplConvKernel** element );
- element
- Pointer to the deleted structuring element.
The function cvReleaseStructuringElement releases the structure IplConvKernel that is no longer needed. If *element is NULL, the function has no effect.
Erode
Erodes image by using arbitrary structuring element
void cvErode( const CvArr* src, CvArr* dst, IplConvKernel* element=NULL, int iterations=1 );
- src
- Source image.
- dst
- Destination image.
- element
Structuring element used for erosion. If it is NULL, a 3×3 rectangular structuring element is used.
- iterations
- Number of times erosion is applied.
The function cvErode erodes the source image using the specified structuring element that determines the shape of a pixel neighborhood over which the minimum is taken:
$$\mathrm{dst} = \mathrm{erode}(\mathrm{src}, \mathrm{element}): \quad \mathrm{dst}(x,y) = \min_{(x',y') \in \mathrm{element}} \mathrm{src}(x+x',\, y+y')$$
The function supports the in-place mode. Erosion can be applied several (iterations) times. For color images, each channel is processed independently.
Dilate
Dilates image by using arbitrary structuring element
void cvDilate( const CvArr* src, CvArr* dst, IplConvKernel* element=NULL, int iterations=1 );
- src
- Source image.
- dst
- Destination image.
- element
Structuring element used for dilation. If it is NULL, a 3×3 rectangular structuring element is used.
- iterations
- Number of times dilation is applied.
The function cvDilate dilates the source image using the specified structuring element that determines the shape of a pixel neighborhood over which the maximum is taken:
$$\mathrm{dst} = \mathrm{dilate}(\mathrm{src}, \mathrm{element}): \quad \mathrm{dst}(x,y) = \max_{(x',y') \in \mathrm{element}} \mathrm{src}(x+x',\, y+y')$$
The function supports the in-place mode. Dilation can be applied several (iterations) times. For color images, each channel is processed independently.
MorphologyEx
Performs advanced morphological transformations
void cvMorphologyEx( const CvArr* src, CvArr* dst, CvArr* temp,
                     IplConvKernel* element, int operation, int iterations=1 );
- src
- Source image.
- dst
- Destination image.
- temp
- Temporary image, required in some cases.
- element
- Structuring element.
- operation
Type of morphological operation, one of:
CV_MOP_OPEN - opening
CV_MOP_CLOSE - closing
CV_MOP_GRADIENT - morphological gradient
CV_MOP_TOPHAT - "top hat"
CV_MOP_BLACKHAT - "black hat"
- iterations
- Number of times erosion and dilation are applied.
The function cvMorphologyEx can perform advanced morphological transformations using erosion and dilation as basic operations.
Opening: dst=open(src,element)=dilate(erode(src,element),element)
Closing: dst=close(src,element)=erode(dilate(src,element),element)
Morphological gradient: dst=morph_grad(src,element)=dilate(src,element)-erode(src,element)
"Top hat": dst=tophat(src,element)=src-open(src,element)
"Black hat": dst=blackhat(src,element)=close(src,element)-src
The temporary image temp is required for morphological gradient and, in case of in-place operation, for "top hat" and "black hat".
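To make the compositions above concrete, here is a simplified one-dimensional sketch in plain C. It assumes a 3-sample flat structuring element and replicated borders, works on a single row instead of an image, and uses hypothetical function names (erode3, dilate3, open3, tophat3); it is not how the library implements these operations.

```c
#include <assert.h>

/* Min (take_max == 0) or max (take_max != 0) over a 3-sample
   neighborhood with replicated borders. Not in-place. */
static void morph3(const unsigned char *src, unsigned char *dst, int n, int take_max)
{
    assert(src && dst && src != dst && n > 0);
    for (int i = 0; i < n; i++) {
        unsigned char best = src[i];
        for (int d = -1; d <= 1; d++) {
            int j = i + d;
            if (j < 0) j = 0;
            if (j >= n) j = n - 1;
            if (take_max ? (src[j] > best) : (src[j] < best))
                best = src[j];
        }
        dst[i] = best;
    }
}

void erode3(const unsigned char *src, unsigned char *dst, int n)  { morph3(src, dst, n, 0); }
void dilate3(const unsigned char *src, unsigned char *dst, int n) { morph3(src, dst, n, 1); }

/* open(src) = dilate(erode(src)); temp holds the intermediate erosion */
void open3(const unsigned char *src, unsigned char *dst, unsigned char *temp, int n)
{
    erode3(src, temp, n);
    dilate3(temp, dst, n);
}

/* tophat(src) = src - open(src); never negative because open(src) <= src */
void tophat3(const unsigned char *src, unsigned char *dst, unsigned char *temp, int n)
{
    open3(src, dst, temp, n);
    for (int i = 0; i < n; i++)
        dst[i] = (unsigned char)(src[i] - dst[i]);
}
```

On the row {0,0,9,0,0,5,5,5,0}, opening removes the isolated spike of width 1 and keeps the plateau of width 3, so "top hat" returns only the spike.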
Filters and Color Conversion
Smooth
Smooths the image in one of several ways
void cvSmooth( const CvArr* src, CvArr* dst,
               int smoothtype=CV_GAUSSIAN,
               int param1=3, int param2=0, double param3=0 );
- src
- The source image.
- dst
- The destination image.
- smoothtype
- Type of the smoothing:
CV_BLUR_NO_SCALE (simple blur with no scaling) - summation over a pixel param1×param2 neighborhood. If the neighborhood size may vary, one may precompute integral image with cvIntegral function.
CV_BLUR (simple blur) - summation over a pixel param1×param2 neighborhood with subsequent scaling by 1/(param1•param2).
CV_GAUSSIAN (gaussian blur) - convolving image with param1×param2 Gaussian kernel.
CV_MEDIAN (median blur) - finding median of param1×param1 neighborhood (i.e. the neighborhood is square).
CV_BILATERAL (bilateral filter) - applying bilateral 3x3 filtering with color sigma=param1 and space sigma=param2. Information about bilateral filtering can be found at http://www.dai.ed.ac.uk/CVonline/LOCAL_COPIES/MANDUCHI1/Bilateral_Filtering.html
- param1
- The first parameter of smoothing operation.
- param2
The second parameter of smoothing operation. In case of simple scaled/non-scaled and Gaussian blur if param2 is zero, it is set to param1.
- param3
In case of Gaussian blur this parameter may specify the Gaussian sigma (standard deviation). If it is zero, it is calculated from the kernel size:
sigma = (n/2 - 1)*0.3 + 0.8, where n=param1 for horizontal kernel, n=param2 for vertical kernel.
Using standard sigma for small kernels (3×3 to 7×7) gives better speed. If param3 is not zero, while param1 and param2 are zeros, the kernel size is calculated from the sigma (to provide accurate enough operation).
The function cvSmooth smooths the image using one of several methods. Each method has some features and restrictions, listed below:
Blur with no scaling works with single-channel images only and supports accumulation of 8-bit to 16-bit format (similar to cvSobel and cvLaplace) and 32-bit floating point to 32-bit floating-point format.
Simple blur and Gaussian blur support 1- or 3-channel, 8-bit and 32-bit floating point images. These two methods can process images in-place.
Median and bilateral filters work with 1- or 3-channel 8-bit images and can not process images in-place.
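The default sigma rule given for param3 is easy to tabulate. The sketch below uses a hypothetical helper, default_gaussian_sigma, and assumes that n is odd and that n/2 is an integer division; for odd n this equals (n-1)/2, but the source text does not state which reading is intended.

```c
#include <assert.h>

/* Hypothetical helper (not an OpenCV function): default Gaussian sigma
   for an odd kernel size n, following sigma = (n/2 - 1)*0.3 + 0.8.
   Assumption: n/2 is evaluated as an integer division. */
double default_gaussian_sigma(int n)
{
    assert(n > 0 && n % 2 == 1);
    return (n / 2 - 1) * 0.3 + 0.8;
}
```

Under that assumption, kernel sizes 3, 5 and 7 give sigma values of 0.8, 1.1 and 1.4 respectively.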
Filter2D
Convolves image with the kernel
void cvFilter2D( const CvArr* src, CvArr* dst,
                 const CvMat* kernel,
                 CvPoint anchor=cvPoint(-1,-1));
- src
- The source image.
- dst
- The destination image.
- kernel
Convolution kernel, single-channel floating point matrix. If you want to apply different kernels to different channels, split the image using cvSplit into separate color planes and process them individually.
- anchor
- The anchor of the kernel that indicates the relative position of a filtered point within the kernel. The anchor should lie within the kernel. The special default value (-1,-1) means that it is at the kernel center.
The function cvFilter2D applies an arbitrary linear filter to the image. In-place operation is supported. When the aperture is partially outside the image, the function interpolates outlier pixel values from the nearest pixels that are inside the image.
CopyMakeBorder
Copies image and makes border around it
void cvCopyMakeBorder( const CvArr* src, CvArr* dst, CvPoint offset,
                       int bordertype, CvScalar value=cvScalarAll(0) );
- src
- The source image.
- dst
- The destination image.
- offset
- Coordinates of the top-left corner (or bottom-left in case of images with bottom-left origin) of the destination image rectangle where the source image (or its ROI) is copied. The size of the rectangle matches the source image size/ROI size.
- bordertype
Type of the border to create around the copied source image rectangle:
IPL_BORDER_CONSTANT - border is filled with the fixed value, passed as last parameter of the function.
IPL_BORDER_REPLICATE - the pixels from the top and bottom rows, the left-most and right-most columns are replicated to fill the border.
(The other two border types from IPL, IPL_BORDER_REFLECT and IPL_BORDER_WRAP, are currently unsupported).
- value
Value of the border pixels if bordertype=IPL_BORDER_CONSTANT.
The function cvCopyMakeBorder copies the source 2D array into interior of destination array and makes a border of the specified type around the copied area. The function is useful when one needs to emulate border type that is different from the one embedded into a specific algorithm implementation. For example, morphological functions, as well as most of other filtering functions in OpenCV, internally use replication border type, while the user may need zero border or a border, filled with 1's or 255's.
Integral
Calculates integral images
void cvIntegral( const CvArr* image, CvArr* sum, CvArr* sqsum=NULL, CvArr* tilted_sum=NULL );
- image
The source image, W×H, 8-bit or floating-point (32f or 64f) image.
- sum
The integral image, (W+1)×(H+1), 32-bit integer or double precision floating-point (64f).
- sqsum
The integral image for squared pixel values, (W+1)×(H+1), double precision floating-point (64f).
- tilted_sum
The integral for the image rotated by 45 degrees, (W+1)×(H+1), the same data type as sum.
The function cvIntegral calculates one or more integral images for the source image as follows:
$$\mathrm{sum}(X,Y) = \sum_{x<X,\; y<Y} \mathrm{image}(x,y)$$
$$\mathrm{sqsum}(X,Y) = \sum_{x<X,\; y<Y} \mathrm{image}(x,y)^2$$
$$\mathrm{tilted\_sum}(X,Y) = \sum_{y<Y,\; |x-X|<y} \mathrm{image}(x,y)$$
Using these integral images, one may calculate the sum, mean and standard deviation over an arbitrary upright or rotated rectangular region of the image in constant time, for example:
$$\sum_{x_1 \le x < x_2,\; y_1 \le y < y_2} \mathrm{image}(x,y) = \mathrm{sum}(x_2,y_2) - \mathrm{sum}(x_1,y_2) - \mathrm{sum}(x_2,y_1) + \mathrm{sum}(x_1,y_1)$$
This makes it possible to do fast blurring or fast block correlation with a variable window size, etc. In case of multi-channel images, sums for each channel are accumulated independently.
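The definition of sum and the constant-time rectangle formula can be sketched in plain C for a single-channel 8-bit image. The function names integral_image and rect_sum are hypothetical, and only the plain sum (not sqsum or tilted_sum) is computed.

```c
#include <assert.h>

/* Hypothetical helper (not an OpenCV function): fills sum, an array of
   (width+1)*(height+1) ints, so that sum(X,Y) is the total of all
   image(x,y) with x < X and y < Y. Row 0 and column 0 are zero. */
void integral_image(const unsigned char *img, int width, int height, int *sum)
{
    int stride = width + 1;
    assert(img && sum && width > 0 && height > 0);
    for (int x = 0; x <= width; x++)
        sum[x] = 0;
    for (int y = 0; y < height; y++) {
        int row_total = 0;                 /* running sum of the current row */
        sum[(y + 1) * stride] = 0;
        for (int x = 0; x < width; x++) {
            row_total += img[y * width + x];
            sum[(y + 1) * stride + (x + 1)] = sum[y * stride + (x + 1)] + row_total;
        }
    }
}

/* Sum over x1 <= x < x2, y1 <= y < y2 using four lookups:
   sum(x2,y2) - sum(x1,y2) - sum(x2,y1) + sum(x1,y1). */
int rect_sum(const int *sum, int width, int x1, int y1, int x2, int y2)
{
    int stride = width + 1;
    assert(sum && x1 <= x2 && y1 <= y2);
    return sum[y2 * stride + x2] - sum[y2 * stride + x1]
         - sum[y1 * stride + x2] + sum[y1 * stride + x1];
}
```

For a 3x3 image holding the values 1 to 9 in row order, the rectangle 1 <= x < 3, 1 <= y < 3 contains 5, 6, 8 and 9, and the four-lookup formula gives 45 - 12 - 6 + 1 = 28.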
CvtColor
Converts image from one color space to another
void cvCvtColor( const CvArr* src, CvArr* dst, int code );
- src
- The source 8-bit (8u), 16-bit (16u) or single-precision floating-point (32f) image.
- dst
- The destination image of the same data type as the source one. The number of channels may be different.
- code
Color conversion operation that can be specified using CV_<src_color_space>2<dst_color_space> constants (see below).
The function cvCvtColor converts the input image from one color space to another. The function ignores the colorModel and channelSeq fields of the IplImage header, so the source image color space should be specified correctly (including the order of the channels in case of RGB space, e.g. BGR means 24-bit format with B0 G0 R0 B1 G1 R1 ... layout, whereas RGB means 24-bit format with R0 G0 B0 R1 G1 B1 ... layout).
The conventional range for R,G,B channel values is:
- 0..255 for 8-bit images
- 0..65535 for 16-bit images and
- 0..1 for floating-point images.
Of course, for linear transformations the range can be arbitrary, but to get correct results for non-linear transformations the input image should be scaled to the conventional range if necessary.
The function can do the following transformations:
- Transformations within RGB space like adding/removing the alpha channel, reversing the channel order, conversion to/from 16-bit RGB color (R5:G6:B5 or R5:G5:B5), as well as conversion to/from grayscale using:
RGB[A]->Gray: Y <- 0.299*R + 0.587*G + 0.114*B
Gray->RGB[A]: R <- Y, G <- Y, B <- Y, A <- 0
The conversion from an RGB image to grayscale is done with:
cvCvtColor(src, bwsrc, CV_RGB2GRAY);
- RGB<=>CIE XYZ.Rec 709 with D65 white point (CV_BGR2XYZ, CV_RGB2XYZ, CV_XYZ2BGR, CV_XYZ2RGB):
|X|    |0.412453  0.357580  0.180423| |R|
|Y| <- |0.212671  0.715160  0.072169|*|G|
|Z|    |0.019334  0.119193  0.950227| |B|

|R|    | 3.240479  -1.53715   -0.498535| |X|
|G| <- |-0.969256   1.875991   0.041556|*|Y|
|B|    | 0.055648  -0.204043   1.057311| |Z|
X, Y and Z cover the whole value range (in case of floating-point images Z may exceed 1).
- RGB<=>YCrCb JPEG (a.k.a. YCC) (CV_BGR2YCrCb, CV_RGB2YCrCb, CV_YCrCb2BGR, CV_YCrCb2RGB)
Y <- 0.299*R + 0.587*G + 0.114*B
Cr <- (R-Y)*0.713 + delta
Cb <- (B-Y)*0.564 + delta
R <- Y + 1.403*(Cr - delta)
G <- Y - 0.714*(Cr - delta) - 0.344*(Cb - delta)
B <- Y + 1.773*(Cb - delta),
              { 128 for 8-bit images,
where delta = { 32768 for 16-bit images
              { 0.5 for floating-point images
Y, Cr and Cb cover the whole value range.
- RGB<=>HSV (CV_BGR2HSV, CV_RGB2HSV, CV_HSV2BGR, CV_HSV2RGB)
// In case of 8-bit and 16-bit images
// R, G and B are converted to floating-point format and scaled to fit 0..1 range
V <- max(R,G,B)
S <- (V-min(R,G,B))/V if V≠0, 0 otherwise
         (G - B)*60/(V-min(R,G,B)), if V=R
H <- 120+(B - R)*60/(V-min(R,G,B)), if V=G
     240+(R - G)*60/(V-min(R,G,B)), if V=B
if H<0 then H<-H+360
On output 0≤V≤1, 0≤S≤1, 0≤H≤360.
The values are then converted to the destination data type:
8-bit images: V <- V*255, S <- S*255, H <- H/2 (to fit to 0..255)
16-bit images (currently not supported): V <- V*65535, S <- S*65535, H <- H
32-bit images: H, S, V are left as is
- RGB<=>HLS (CV_BGR2HLS, CV_RGB2HLS, CV_HLS2BGR, CV_HLS2RGB)
// In case of 8-bit and 16-bit images
// R, G and B are converted to floating-point format and scaled to fit 0..1 range
Vmax <- max(R,G,B)
Vmin <- min(R,G,B)
L <- (Vmax + Vmin)/2
S <- (Vmax - Vmin)/(Vmax + Vmin) if L < 0.5
     (Vmax - Vmin)/(2 - (Vmax + Vmin)) if L ≥ 0.5
         (G - B)*60/(Vmax - Vmin), if Vmax=R
H <- 120+(B - R)*60/(Vmax - Vmin), if Vmax=G
     240+(R - G)*60/(Vmax - Vmin), if Vmax=B
if H<0 then H<-H+360
On output 0≤L≤1, 0≤S≤1, 0≤H≤360.
The values are then converted to the destination data type:
8-bit images: L <- L*255, S <- S*255, H <- H/2
16-bit images (currently not supported): L <- L*65535, S <- S*65535, H <- H
32-bit images: H, L, S are left as is
- RGB<=>CIE L*a*b* (CV_BGR2Lab, CV_RGB2Lab, CV_Lab2BGR, CV_Lab2RGB)
// In case of 8-bit and 16-bit images
// R, G and B are converted to floating-point format and scaled to fit 0..1 range
// convert R,G,B to CIE XYZ
|X|    |0.412453  0.357580  0.180423| |R|
|Y| <- |0.212671  0.715160  0.072169|*|G|
|Z|    |0.019334  0.119193  0.950227| |B|
X <- X/Xn, where Xn = 0.950456
Z <- Z/Zn, where Zn = 1.088754
L <- 116*Y^(1/3) - 16 for Y>0.008856
L <- 903.3*Y for Y<=0.008856
a <- 500*(f(X)-f(Y)) + delta
b <- 200*(f(Y)-f(Z)) + delta
where f(t)=t^(1/3) for t>0.008856
      f(t)=7.787*t+16/116 for t<=0.008856
where delta = 128 for 8-bit images,
              0 for floating-point images
On output 0≤L≤100, -127≤a≤127, -127≤b≤127
The values are then converted to the destination data type:
8-bit images: L <- L*255/100, a <- a + 128, b <- b + 128
16-bit images are currently not supported
32-bit images: L, a, b are left as is
- RGB<=>CIE L*u*v* (CV_BGR2Luv, CV_RGB2Luv, CV_Luv2BGR, CV_Luv2RGB)
// In case of 8-bit and 16-bit images
// R, G and B are converted to floating-point format and scaled to fit 0..1 range
// convert R,G,B to CIE XYZ
|X|    |0.412453  0.357580  0.180423| |R|
|Y| <- |0.212671  0.715160  0.072169|*|G|
|Z|    |0.019334  0.119193  0.950227| |B|
L <- 116*Y^(1/3) - 16 for Y>0.008856
L <- 903.3*Y for Y<=0.008856
u' <- 4*X/(X + 15*Y + 3*Z)
v' <- 9*Y/(X + 15*Y + 3*Z)
u <- 13*L*(u' - un), where un=0.19793943
v <- 13*L*(v' - vn), where vn=0.46831096
On output 0≤L≤100, -134≤u≤220, -140≤v≤122
The values are then converted to the destination data type:
8-bit images: L <- L*255/100, u <- (u + 134)*255/354, v <- (v + 140)*255/256
16-bit images are currently not supported
32-bit images: L, u, v are left as is
The above formulae for converting RGB to/from various color spaces have been taken from multiple sources on the Web, primarily from the Color Space Conversions ([Ford98]) document at Charles Poynton's site.
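As an illustration of how the 8-bit scaling works for RGB=>HSV, here is a minimal per-pixel sketch that does not call cvCvtColor. The function name rgb_to_hsv_u8, rounding to nearest, and returning H = 0 for gray pixels are assumptions of the example; the hue term is divided by V - min(R,G,B), the usual hexcone definition.

```c
/* Convert one 8-bit RGB pixel to 8-bit HSV.
   H is stored as H/2 (0..180); S and V are scaled to 0..255.
   Rounding to nearest is an assumption of this sketch. */
void rgb_to_hsv_u8(unsigned char r8, unsigned char g8, unsigned char b8,
                   unsigned char *h8, unsigned char *s8, unsigned char *v8)
{
    /* Scale the channels to the 0..1 range first. */
    double r = r8 / 255.0, g = g8 / 255.0, b = b8 / 255.0;
    double v = r > g ? (r > b ? r : b) : (g > b ? g : b);   /* max(R,G,B) */
    double m = r < g ? (r < b ? r : b) : (g < b ? g : b);   /* min(R,G,B) */
    double d = v - m;
    double s = (v != 0.0) ? d / v : 0.0;
    double h = 0.0;                  /* hue is undefined for gray; use 0 */

    if (d != 0.0) {
        if (v == r)      h = (g - b) * 60.0 / d;
        else if (v == g) h = 120.0 + (b - r) * 60.0 / d;
        else             h = 240.0 + (r - g) * 60.0 / d;
        if (h < 0.0)     h += 360.0;
    }

    *h8 = (unsigned char)(h / 2.0 + 0.5);     /* 0..360 -> 0..180 */
    *s8 = (unsigned char)(s * 255.0 + 0.5);
    *v8 = (unsigned char)(v * 255.0 + 0.5);
}
```

Because H is halved to fit into 8 bits, pure green (H = 120 degrees) is stored as 60 and pure blue (H = 240 degrees) as 120, which is easy to overlook when picking hue ranges for 8-bit images.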
- Bayer=>RGB (CV_BayerBG2BGR, CV_BayerGB2BGR, CV_BayerRG2BGR, CV_BayerGR2BGR, CV_BayerBG2RGB, CV_BayerGB2RGB, CV_BayerRG2RGB, CV_BayerGR2RGB)
The Bayer pattern is widely used in CCD and CMOS cameras. It makes it possible to get a color picture out of a single plane where R, G and B pixels (sensors of a particular component) are interleaved like this:
R G R G R
G B G B G
R G R G R
G B G B G
R G R G R
G B G B G
The output RGB components of a pixel are interpolated from 1, 2 or 4 neighbors of the pixel having the same color. There are several modifications of the above pattern that can be achieved by shifting the pattern one pixel left and/or one pixel up. The two letters C1 and C2 in the conversion constants CV_Bayer<C1><C2>2{BGR|RGB} indicate the particular pattern type: these are the components from the second row, second and third columns, respectively. For example, the above pattern has the very popular "BG" type.
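The naming rule can be made concrete with a short sketch that reproduces the pattern above and derives the two-letter type for each shift. The helpers bayer_color and bayer_type are hypothetical names introduced for the example; they are not OpenCV functions.

```c
/* Color of the sensor element at column x, row y (0-based) for the
   pattern shown above, shifted dx pixels left and dy pixels up
   (dx, dy >= 0). Returns 'R', 'G' or 'B'. */
char bayer_color(int x, int y, int dx, int dy)
{
    int px = x + dx, py = y + dy;       /* position in the unshifted pattern */
    if ((px + py) % 2 != 0)
        return 'G';                     /* green fills the checkerboard */
    return (py % 2 == 0) ? 'R' : 'B';   /* red on even rows, blue on odd rows */
}

/* Two-letter pattern type: the components in the second row,
   second and third columns. */
void bayer_type(int dx, int dy, char type[3])
{
    type[0] = bayer_color(1, 1, dx, dy);
    type[1] = bayer_color(2, 1, dx, dy);
    type[2] = '\0';
}
```

With no shift the type is "BG"; shifting one pixel left gives "GB", one pixel up gives "GR", and both shifts together give "RG", which matches the four families of conversion constants.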
Threshold
Applies fixed-level threshold to array elements
void cvThreshold( const CvArr* src, CvArr* dst, double threshold,
                  double max_value, int threshold_type );
- src
- Source array (single-channel, 8-bit or 32-bit floating point).
- dst
Destination array; must be either the same type as src or 8-bit.
- threshold
- Threshold value.
- max_value
Maximum value to use with CV_THRESH_BINARY and CV_THRESH_BINARY_INV thresholding types.
- threshold_type
- Thresholding type (see the discussion).
The function cvThreshold applies fixed-level thresholding to a single-channel array. The function is typically used to get a bi-level (binary) image out of a grayscale image (cvCmpS could also be used for this purpose) or to remove noise, i.e. to filter out pixels with too small or too large values. The function supports several types of thresholding, selected by threshold_type:
threshold_type=CV_THRESH_BINARY:
dst(x,y) = max_value, if src(x,y)>threshold
           0, otherwise
threshold_type=CV_THRESH_BINARY_INV:
dst(x,y) = 0, if src(x,y)>threshold
           max_value, otherwise
threshold_type=CV_THRESH_TRUNC:
dst(x,y) = threshold, if src(x,y)>threshold
           src(x,y), otherwise
threshold_type=CV_THRESH_TOZERO:
dst(x,y) = src(x,y), if src(x,y)>threshold
           0, otherwise
threshold_type=CV_THRESH_TOZERO_INV:
dst(x,y) = 0, if src(x,y)>threshold
           src(x,y), otherwise
[Figure: visual description of the thresholding types]
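The five rules can be compared side by side with a small stand-alone sketch on an 8-bit buffer. It does not call cvThreshold; the function threshold_u8 and the SKETCH_THRESH_* constants are hypothetical stand-ins for the example, and threshold and max_value are assumed to lie in 0..255.

```c
/* Hypothetical stand-ins for the CV_THRESH_* constants, so that the
   sketch compiles without the OpenCV headers. */
enum {
    SKETCH_THRESH_BINARY,
    SKETCH_THRESH_BINARY_INV,
    SKETCH_THRESH_TRUNC,
    SKETCH_THRESH_TOZERO,
    SKETCH_THRESH_TOZERO_INV
};

/* Apply one thresholding rule to n 8-bit elements.
   threshold and max_value are assumed to lie in 0..255. */
void threshold_u8(const unsigned char *src, unsigned char *dst, int n,
                  double threshold, double max_value, int type)
{
    int i;
    for (i = 0; i < n; i++) {
        int above = src[i] > threshold;   /* strict comparison, as in the formulas */
        switch (type) {
        case SKETCH_THRESH_BINARY:
            dst[i] = above ? (unsigned char)max_value : 0;
            break;
        case SKETCH_THRESH_BINARY_INV:
            dst[i] = above ? 0 : (unsigned char)max_value;
            break;
        case SKETCH_THRESH_TRUNC:
            dst[i] = above ? (unsigned char)threshold : src[i];
            break;
        case SKETCH_THRESH_TOZERO:
            dst[i] = above ? src[i] : 0;
            break;
        case SKETCH_THRESH_TOZERO_INV:
            dst[i] = above ? 0 : src[i];
            break;
        default:
            dst[i] = src[i];              /* unknown type: copy unchanged */
            break;
        }
    }
}
```

Note that a pixel exactly equal to the threshold falls into the "otherwise" branch for every type, because the comparison is strict.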
AdaptiveThreshold
Applies adaptive threshold to array
void cvAdaptiveThreshold( const CvArr* src, CvArr* dst, double max_value,
                          int adaptive_method=CV_ADAPTIVE_THRESH_MEAN_C,
                          int threshold_type=CV_THRESH_BINARY,
                          int block_size=3, double param1=5 );
- src
- Source image.
- dst
- Destination image.
- max_value
Maximum value that is used with CV_THRESH_BINARY and CV_THRESH_BINARY_INV.
- adaptive_method
Adaptive thresholding algorithm to use: CV_ADAPTIVE_THRESH_MEAN_C or CV_ADAPTIVE_THRESH_GAUSSIAN_C (see the discussion).
- threshold_type
- Thresholding type; must be one of
CV_THRESH_BINARY,
CV_THRESH_BINARY_INV
- block_size
- The size of a pixel neighborhood that is used to calculate a threshold value for the pixel: 3, 5, 7, ...
- param1
The method-dependent parameter. For the methods CV_ADAPTIVE_THRESH_MEAN_C and CV_ADAPTIVE_THRESH_GAUSSIAN_C it is a constant subtracted from mean or weighted mean (see the discussion), though it may be negative.
The function cvAdaptiveThreshold transforms a grayscale image into a binary image according to the formulae:
threshold_type=CV_THRESH_BINARY:
dst(x,y) = max_value, if src(x,y)>T(x,y)
           0, otherwise
threshold_type=CV_THRESH_BINARY_INV:
dst(x,y) = 0, if src(x,y)>T(x,y)
           max_value, otherwise
where T(x,y) is a threshold calculated individually for each pixel.
For the method CV_ADAPTIVE_THRESH_MEAN_C it is the mean of the block_size × block_size pixel neighborhood, minus param1.
For the method CV_ADAPTIVE_THRESH_GAUSSIAN_C it is a Gaussian-weighted sum of the block_size × block_size pixel neighborhood, minus param1.
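The mean variant can be sketched directly from the formulae on a raw 8-bit buffer. This is an illustration only, not the library implementation: the name adaptive_threshold_mean_u8, the invert flag (0 for the binary rule, non-zero for the inverted one) and the replicated border at the image edges are assumptions of the example, and block_size is assumed to be odd.

```c
/* Clamp an index into [0, n-1]; this replicates the border pixels. */
int clamp_index(int i, int n)
{
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

/* Per-pixel threshold T(x,y) = mean of the block_size x block_size
   neighborhood minus param1; block_size is assumed to be odd.
   invert == 0: dst = max_value where src > T, 0 otherwise.
   invert != 0: dst = 0 where src > T, max_value otherwise. */
void adaptive_threshold_mean_u8(const unsigned char *src, unsigned char *dst,
                                int w, int h, unsigned char max_value,
                                int invert, int block_size, double param1)
{
    int r = block_size / 2;
    int x, y, i, j;
    for (y = 0; y < h; y++) {
        for (x = 0; x < w; x++) {
            double acc = 0.0, t;
            int above;
            for (j = -r; j <= r; j++)
                for (i = -r; i <= r; i++)
                    acc += src[clamp_index(y + j, h) * w + clamp_index(x + i, w)];
            t = acc / (block_size * block_size) - param1;
            above = src[y * w + x] > t;
            dst[y * w + x] = (above != (invert != 0)) ? max_value : 0;
        }
    }
}
```

On a perfectly flat image the local mean equals the pixel value, so a positive param1 marks every pixel as foreground and a negative param1 marks none; this is why param1 acts as a sensitivity offset rather than an absolute level.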
