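Configuring the OpenCV Environment in VS2022
- For a walkthrough of setting up OpenCV in VS2022, see the tutorial "Detailed Guide to Configuring an OpenCV Development Environment in VS2022" (VS2022 配置OpenCV开发环境详细教程).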
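Image Processing
Image processing is a broad field covering techniques for analyzing, modifying, and enhancing image data. The basic operations below can be implemented in languages such as Python or C++ with an image processing library:
- Image reading: use the imread function (available in MATLAB and OpenCV) to load an image file into memory.
- Image display: use imshow (available in MATLAB and OpenCV) to display an image.
- Image conversion:
  - Color space conversion: for example, from RGB to grayscale.
  - Data type conversion: for example, converting image data from uint8 to float.
- Image adjustment:
  - Brightness and contrast adjustment.
  - Color balance adjustment.
- Image filtering:
  - Apply filters such as Gaussian, median, or sharpening filters to improve image quality or remove noise.
- Edge detection:
  - Detect edges in an image with operators such as Sobel or Canny.
- Feature extraction:
  - Identify specific features in an image, such as corners and lines.
- Image segmentation:
  - Split an image into multiple regions or objects.
- Image compositing:
  - Merge several images into one.
- Image transformation:
  - Geometric transformations such as rotation, scaling, and translation.
- Image compression:
  - Reduce the size of image data for easier storage and transmission.
- Machine learning and deep learning:
  - Use machine learning algorithms or deep neural networks for higher-level tasks such as image classification, object detection, and image segmentation.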
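Brightness and contrast adjustment from the list above is not covered by a demo later on, so here is a minimal sketch of the usual linear form g = alpha * f + beta applied to 8-bit samples. It is plain C++ without OpenCV; the helper names adjust_pixel and adjust_image are hypothetical, and the rounding and clamping to [0, 255] are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: linear adjustment g = alpha * f + beta for one
// 8-bit sample. alpha scales contrast, beta shifts brightness.
// The result is rounded and clamped to [0, 255] (an assumption).
std::uint8_t adjust_pixel(std::uint8_t value, double alpha, double beta) {
    const long rounded = std::lround(alpha * value + beta);
    return static_cast<std::uint8_t>(std::max(0L, std::min(255L, rounded)));
}

// Apply the same adjustment to every sample of a flat 8-bit buffer.
std::vector<std::uint8_t> adjust_image(const std::vector<std::uint8_t>& pixels,
                                       double alpha, double beta) {
    std::vector<std::uint8_t> out;
    out.reserve(pixels.size());
    for (std::uint8_t p : pixels) {
        out.push_back(adjust_pixel(p, alpha, beta));
    }
    return out;
}
```

With alpha = 1.5 and beta = 20, a sample of 100 becomes 170, while a sample of 200 would exceed the 8-bit range and is clamped to 255.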
Reading and Displaying an Image
- Functions: imread() to read, imshow() to display
- Declaration of imread():
/** @brief Loads an image from a file.
@anchor imread
The function imread loads an image from the specified file and returns it. If the image cannot be
read (because of missing file, improper permissions, unsupported or invalid format), the function
returns an empty matrix ( Mat::data==NULL ).
Currently, the following file formats are supported:
- Windows bitmaps - \*.bmp, \*.dib (always supported)
- JPEG files - \*.jpeg, \*.jpg, \*.jpe (see the *Note* section)
- JPEG 2000 files - \*.jp2 (see the *Note* section)
- Portable Network Graphics - \*.png (see the *Note* section)
- WebP - \*.webp (see the *Note* section)
- AVIF - \*.avif (see the *Note* section)
- Portable image format - \*.pbm, \*.pgm, \*.ppm \*.pxm, \*.pnm (always supported)
- PFM files - \*.pfm (see the *Note* section)
- Sun rasters - \*.sr, \*.ras (always supported)
- TIFF files - \*.tiff, \*.tif (see the *Note* section)
- OpenEXR Image files - \*.exr (see the *Note* section)
- Radiance HDR - \*.hdr, \*.pic (always supported)
- Raster and Vector geospatial data supported by GDAL (see the *Note* section)
@note
- The function determines the type of an image by the content, not by the file extension.
- In the case of color images, the decoded images will have the channels stored in **B G R** order.
- When using IMREAD_GRAYSCALE, the codec's internal grayscale conversion will be used, if available.
Results may differ to the output of cvtColor()
- On Microsoft Windows\* OS and MacOSX\*, the codecs shipped with an OpenCV image (libjpeg,
libpng, libtiff, and libjasper) are used by default. So, OpenCV can always read JPEGs, PNGs,
and TIFFs. On MacOSX, there is also an option to use native MacOSX image readers. But beware
that currently these native image loaders give images with different pixel values because of
the color management embedded into MacOSX.
- On Linux\*, BSD flavors and other Unix-like open-source operating systems, OpenCV looks for
codecs supplied with an OS image. Install the relevant packages (do not forget the development
files, for example, "libjpeg-dev", in Debian\* and Ubuntu\*) to get the codec support or turn
on the OPENCV_BUILD_3RDPARTY_LIBS flag in CMake.
- In the case you set *WITH_GDAL* flag to true in CMake and @ref IMREAD_LOAD_GDAL to load the image,
then the [GDAL](http://www.gdal.org) driver will be used in order to decode the image, supporting
the following formats: [Raster](http://www.gdal.org/formats_list.html),
[Vector](http://www.gdal.org/ogr_formats.html).
- If EXIF information is embedded in the image file, the EXIF orientation will be taken into account
and thus the image will be rotated accordingly except if the flags @ref IMREAD_IGNORE_ORIENTATION
or @ref IMREAD_UNCHANGED are passed.
- Use the IMREAD_UNCHANGED flag to keep the floating point values from PFM image.
- By default number of pixels must be less than 2^30. Limit can be set using system
variable OPENCV_IO_MAX_IMAGE_PIXELS
@param filename Name of file to be loaded.
@param flags Flag that can take values of cv::ImreadModes
*/
CV_EXPORTS_W Mat imread( const String& filename, int flags = IMREAD_COLOR );
- Declaration of imshow():
/** @brief Displays an image in the specified window.
The function imshow displays an image in the specified window. If the window was created with the
cv::WINDOW_AUTOSIZE flag, the image is shown with its original size, however it is still limited by the screen resolution.
Otherwise, the image is scaled to fit the window. The function may scale the image, depending on its depth:
- If the image is 8-bit unsigned, it is displayed as is.
- If the image is 16-bit unsigned, the pixels are divided by 256. That is, the
value range [0,255\*256] is mapped to [0,255].
- If the image is 32-bit or 64-bit floating-point, the pixel values are multiplied by 255. That is, the
value range [0,1] is mapped to [0,255].
- 32-bit integer images are not processed anymore due to ambiguity of required transform.
Convert to 8-bit unsigned matrix using a custom preprocessing specific to image's context.
If window was created with OpenGL support, cv::imshow also support ogl::Buffer , ogl::Texture2D and
cuda::GpuMat as input.
If the window was not created before this function, it is assumed creating a window with cv::WINDOW_AUTOSIZE.
If you need to show an image that is bigger than the screen resolution, you will need to call namedWindow("", WINDOW_NORMAL) before the imshow.
@note This function should be followed by a call to cv::waitKey or cv::pollKey to perform GUI
housekeeping tasks that are necessary to actually show the given image and make the window respond
to mouse and keyboard events. Otherwise, it won't display the image and the window might lock up.
For example, **waitKey(0)** will display the window infinitely until any keypress (it is suitable
for image display). **waitKey(25)** will display a frame and wait approximately 25 ms for a key
press (suitable for displaying a video frame-by-frame). To remove the window, use cv::destroyWindow.
@note [__Windows Backend Only__] Pressing Ctrl+C will copy the image to the clipboard. Pressing Ctrl+S will show a dialog to save the image.
@param winname Name of the window.
@param mat Image to be shown.
*/
CV_EXPORTS_W void imshow(const String& winname, InputArray mat);
- C++ demo:
#include <iostream>
#include <opencv2/opencv.hpp>

int main() {
    // Load the image as a 3-channel BGR matrix
    cv::Mat image = cv::imread("amy.png", cv::IMREAD_COLOR);
    if (image.empty()) {  // imread returns an empty Mat on failure
        std::cerr << "Could not open or find the image" << std::endl;
        return -1;
    }
    cv::namedWindow("Display Image", cv::WINDOW_AUTOSIZE);  // create the window
    cv::imshow("Display Image", image);                     // show the image
    cv::waitKey(0);                                         // wait for a key press
    return 0;
}
- Result: the image is displayed in a window titled "Display Image".
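The imshow() documentation above describes how pixel values are scaled by depth before display. The sketch below restates that rule in plain C++ without OpenCV. The function names are hypothetical, and the rounding and clamping of out-of-range float values are assumptions, not a copy of OpenCV's internals.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// 16-bit unsigned input: per the documentation, pixels are divided by 256,
// so [0, 255*256] maps to [0, 255].
std::uint8_t display_from_u16(std::uint16_t value) {
    return static_cast<std::uint8_t>(value / 256);
}

// Floating-point input: per the documentation, pixels are multiplied by 255,
// so [0, 1] maps to [0, 255]. Values outside [0, 1] are clamped here
// (an assumption made for this sketch).
std::uint8_t display_from_float(double value) {
    const long scaled = std::lround(value * 255.0);
    return static_cast<std::uint8_t>(std::max(0L, std::min(255L, scaled)));
}
```

One practical consequence: a float image converted from 8-bit data without rescaling still holds values in 0..255, so almost every pixel lands at the top of the display range and the window looks white. Scaling by 1/255 first avoids that.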
Converting the Image Color Space
- Function: cvtColor()
- Declaration of cvtColor():
/** @brief Converts an image from one color space to another.
The function converts an input image from one color space to another. In case of a transformation
to-from RGB color space, the order of the channels should be specified explicitly (RGB or BGR). Note
that the default color format in OpenCV is often referred to as RGB but it is actually BGR (the
bytes are reversed). So the first byte in a standard (24-bit) color image will be an 8-bit Blue
component, the second byte will be Green, and the third byte will be Red. The fourth, fifth, and
sixth bytes would then be the second pixel (Blue, then Green, then Red), and so on.
The conventional ranges for R, G, and B channel values are:
- 0 to 255 for CV_8U images
- 0 to 65535 for CV_16U images
- 0 to 1 for CV_32F images
In case of linear transformations, the range does not matter. But in case of a non-linear
transformation, an input RGB image should be normalized to the proper value range to get the correct
results, for example, for RGB \f$\rightarrow\f$ L\*u\*v\* transformation. For example, if you have a
32-bit floating-point image directly converted from an 8-bit image without any scaling, then it will
have the 0..255 value range instead of 0..1 assumed by the function. So, before calling #cvtColor ,
you need first to scale the image down:
@code
img *= 1./255;
cvtColor(img, img, COLOR_BGR2Luv);
@endcode
If you use #cvtColor with 8-bit images, the conversion will have some information lost. For many
applications, this will not be noticeable but it is recommended to use 32-bit images in applications
that need the full range of colors or that convert an image before an operation and then convert
back.
If conversion adds the alpha channel, its value will set to the maximum of corresponding channel
range: 255 for CV_8U, 65535 for CV_16U, 1 for CV_32F.
@param src input image: 8-bit unsigned, 16-bit unsigned ( CV_16UC... ), or single-precision
floating-point.
@param dst output image of the same size and depth as src.
@param code color space conversion code (see #ColorConversionCodes).
@param dstCn number of channels in the destination image; if the parameter is 0, the number of the
channels is derived automatically from src and code.
@see @ref imgproc_color_conversions
*/
CV_EXPORTS_W void cvtColor( InputArray src, OutputArray dst, int code, int dstCn = 0 );
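- C++ demo:
#include <iostream>
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat image = cv::imread("amy.png", cv::IMREAD_COLOR);
    if (image.empty()) {  // stop early if the file could not be read
        std::cerr << "Could not open or find the image" << std::endl;
        return -1;
    }
    cv::Mat gray_image;
    cv::cvtColor(image, gray_image, cv::COLOR_BGR2GRAY);  // convert BGR to grayscale
    cv::imshow("gray image", gray_image);
    cv::waitKey(0);                                       // wait for a key press
    return 0;
}
- Result: the grayscale version of the image is shown in the "gray image" window.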
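To make the BGR channel order concrete, here is a minimal sketch of the weighted sum commonly used for color-to-gray conversion, Y = 0.299 R + 0.587 G + 0.114 B. It is plain C++ without OpenCV; BgrPixel and bgr_to_gray are hypothetical names, and OpenCV's own 8-bit path may round slightly differently.

```cpp
#include <cmath>
#include <cstdint>

// One pixel stored in OpenCV's default channel order: blue, green, red.
struct BgrPixel {
    std::uint8_t b;
    std::uint8_t g;
    std::uint8_t r;
};

// Weighted sum Y = 0.299 R + 0.587 G + 0.114 B, rounded to the nearest integer.
// The weights sum to 1, so the result stays within [0, 255].
std::uint8_t bgr_to_gray(const BgrPixel& p) {
    const double y = 0.299 * p.r + 0.587 * p.g + 0.114 * p.b;
    return static_cast<std::uint8_t>(std::lround(y));
}
```

Swapping the red and blue weights by mistake (treating the data as RGB) gives a visibly different gray level for saturated colors, which is why the conversion code names the order explicitly (COLOR_BGR2GRAY versus COLOR_RGB2GRAY).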
Image Filtering
- Functions: GaussianBlur(), medianBlur(), bilateralFilter(), boxFilter(), sqrBoxFilter(), and others.
- Declaration of GaussianBlur():
/** @brief Blurs an image using a Gaussian filter.
The function convolves the source image with the specified Gaussian kernel. In-place filtering is
supported.
@param src input image; the image can have any number of channels, which are processed
independently, but the depth should be CV_8U, CV_16U, CV_16S, CV_32F or CV_64F.
@param dst output image of the same size and type as src.
@param ksize Gaussian kernel size. ksize.width and ksize.height can differ but they both must be
positive and odd. Or, they can be zero's and then they are computed from sigma.
@param sigmaX Gaussian kernel standard deviation in X direction.
@param sigmaY Gaussian kernel standard deviation in Y direction; if sigmaY is zero, it is set to be
equal to sigmaX, if both sigmas are zeros, they are computed from ksize.width and ksize.height,
respectively (see #getGaussianKernel for details); to fully control the result regardless of
possible future modifications of all this semantics, it is recommended to specify all of ksize,
sigmaX, and sigmaY.
@param borderType pixel extrapolation method, see #BorderTypes. #BORDER_WRAP is not supported.
@sa sepFilter2D, filter2D, blur, boxFilter, bilateralFilter, medianBlur
*/
CV_EXPORTS_W void GaussianBlur( InputArray src, OutputArray dst, Size ksize,
double sigmaX, double sigmaY = 0,
int borderType = BORDER_DEFAULT );
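- C++ demo:
#include <iostream>
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat image = cv::imread("amy.png", cv::IMREAD_COLOR);
    if (image.empty()) {  // stop early if the file could not be read
        std::cerr << "Could not open or find the image" << std::endl;
        return -1;
    }
    cv::Mat blurred_image;
    // 5x5 Gaussian kernel; sigmaX = 0 lets OpenCV derive sigma from the kernel size
    cv::GaussianBlur(image, blurred_image, cv::Size(5, 5), 0);
    cv::imshow("blurred image", blurred_image);
    cv::waitKey(0);  // wait for a key press
    return 0;
}
- Result: the blurred image is shown in the "blurred image" window.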
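The demo passes sigmaX = 0, which the declaration says means the sigma is computed from the kernel size. The sketch below shows one way that computation and the resulting 1D kernel can look, using the formula given in the getGaussianKernel documentation, sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8. It is plain C++ without OpenCV; the function names are hypothetical, and OpenCV may use precomputed coefficients for small kernels, so its actual values can differ.

```cpp
#include <cmath>
#include <vector>

// Sigma derived from an odd kernel size, following the formula in the
// getGaussianKernel documentation.
double sigma_from_ksize(int ksize) {
    return 0.3 * ((ksize - 1) * 0.5 - 1.0) + 0.8;
}

// Normalized 1D Gaussian kernel of odd length ksize.
// A non-positive sigma means "derive it from ksize".
std::vector<double> gaussian_kernel_1d(int ksize, double sigma) {
    if (sigma <= 0.0) {
        sigma = sigma_from_ksize(ksize);
    }
    std::vector<double> kernel(ksize);
    const double center = (ksize - 1) * 0.5;
    double sum = 0.0;
    for (int i = 0; i < ksize; ++i) {
        const double d = i - center;
        kernel[i] = std::exp(-(d * d) / (2.0 * sigma * sigma));
        sum += kernel[i];
    }
    for (int i = 0; i < ksize; ++i) {
        kernel[i] /= sum;  // weights sum to 1 so overall brightness is preserved
    }
    return kernel;
}
```

A 2D Gaussian blur is separable, so the same 1D kernel can be applied along rows and then along columns.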
Image Edge Detection
- Functions: Canny() and others
- Declaration of Canny():
/** @brief Finds edges in an image using the Canny algorithm @cite Canny86 .
The function finds edges in the input image and marks them in the output map edges using the
Canny algorithm. The smallest value between threshold1 and threshold2 is used for edge linking. The
largest value is used to find initial segments of strong edges. See
<http://en.wikipedia.org/wiki/Canny_edge_detector>
@param image 8-bit input image.
@param edges output edge map; single channels 8-bit image, which has the same size as image .
@param threshold1 first threshold for the hysteresis procedure.
@param threshold2 second threshold for the hysteresis procedure.
@param apertureSize aperture size for the Sobel operator.
@param L2gradient a flag, indicating whether a more accurate \f$L_2\f$ norm
\f$=\sqrt{(dI/dx)^2 + (dI/dy)^2}\f$ should be used to calculate the image gradient magnitude (
L2gradient=true ), or whether the default \f$L_1\f$ norm \f$=|dI/dx|+|dI/dy|\f$ is enough (
L2gradient=false ).
*/
CV_EXPORTS_W void Canny( InputArray image, OutputArray edges,
double threshold1, double threshold2,
int apertureSize = 3, bool L2gradient = false );
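- C++ demo:
#include <iostream>
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat image = cv::imread("amy.png", cv::IMREAD_COLOR);
    if (image.empty()) {  // stop early if the file could not be read
        std::cerr << "Could not open or find the image" << std::endl;
        return -1;
    }
    cv::Mat edges;
    cv::Canny(image, edges, 100, 200);  // Canny edge detection, hysteresis thresholds 100 and 200
    cv::imshow("edges image", edges);
    cv::waitKey(0);                     // wait for a key press
    return 0;
}
- Result: the edge map is shown in the "edges image" window.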
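The Canny() declaration mentions two gradient norms and two hysteresis thresholds. The sketch below illustrates both ideas in plain C++ without OpenCV. The names are hypothetical, and the strict comparisons used to classify a pixel are an assumption for illustration, not OpenCV's exact implementation.

```cpp
#include <algorithm>
#include <cmath>

// L1 norm of the gradient: |dI/dx| + |dI/dy| (the default, L2gradient=false).
double gradient_l1(double dx, double dy) {
    return std::fabs(dx) + std::fabs(dy);
}

// L2 norm of the gradient: sqrt((dI/dx)^2 + (dI/dy)^2) (L2gradient=true).
double gradient_l2(double dx, double dy) {
    return std::sqrt(dx * dx + dy * dy);
}

enum EdgeClass { kNoEdge, kWeakEdge, kStrongEdge };

// The larger threshold selects strong edge seeds; the smaller one is used
// for linking weaker pixels to them. Argument order does not matter.
EdgeClass classify_gradient(double magnitude, double threshold1, double threshold2) {
    const double low = std::min(threshold1, threshold2);
    const double high = std::max(threshold1, threshold2);
    if (magnitude > high) return kStrongEdge;
    if (magnitude > low) return kWeakEdge;  // kept only if connected to a strong edge
    return kNoEdge;
}
```

For a gradient of (3, -4) the L1 norm is 7 and the L2 norm is 5, so the same threshold pair can classify a pixel differently depending on the L2gradient flag.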
Image Resizing
- Function: resize()
- Declaration of resize():
/** @brief Resizes an image.
The function resize resizes the image src down to or up to the specified size. Note that the
initial dst type or size are not taken into account. Instead, the size and type are derived from
the `src`,`dsize`,`fx`, and `fy`. If you want to resize src so that it fits the pre-created dst,
you may call the function as follows:
@code
// explicitly specify dsize=dst.size(); fx and fy will be computed from that.
resize(src, dst, dst.size(), 0, 0, interpolation);
@endcode
If you want to decimate the image by factor of 2 in each direction, you can call the function this
way:
@code
// specify fx and fy and let the function compute the destination image size.
resize(src, dst, Size(), 0.5, 0.5, interpolation);
@endcode
To shrink an image, it will generally look best with #INTER_AREA interpolation, whereas to
enlarge an image, it will generally look best with #INTER_CUBIC (slow) or #INTER_LINEAR
(faster but still looks OK).
@param src input image.
@param dst output image; it has the size dsize (when it is non-zero) or the size computed from
src.size(), fx, and fy; the type of dst is the same as of src.
@param dsize output image size; if it equals zero (`None` in Python), it is computed as:
\f[\texttt{dsize = Size(round(fx*src.cols), round(fy*src.rows))}\f]
Either dsize or both fx and fy must be non-zero.
@param fx scale factor along the horizontal axis; when it equals 0, it is computed as
\f[\texttt{(double)dsize.width/src.cols}\f]
@param fy scale factor along the vertical axis; when it equals 0, it is computed as
\f[\texttt{(double)dsize.height/src.rows}\f]
@param interpolation interpolation method, see #InterpolationFlags
@sa warpAffine, warpPerspective, remap
*/
CV_EXPORTS_W void resize( InputArray src, OutputArray dst,
Size dsize, double fx = 0, double fy = 0,
int interpolation = INTER_LINEAR );
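- C++ demo:
#include <iostream>
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat image = cv::imread("amy.png", cv::IMREAD_COLOR);
    if (image.empty()) {  // stop early if the file could not be read
        std::cerr << "Could not open or find the image" << std::endl;
        return -1;
    }
    cv::Mat resized_image;
    // Empty dsize: the output size is computed from fx and fy (half size here)
    cv::resize(image, resized_image, cv::Size(), 0.5, 0.5, cv::INTER_LINEAR);
    cv::imshow("resized_image", resized_image);
    cv::waitKey(0);  // wait for a key press
    return 0;
}
- Result: the image, scaled to half its original width and height, is shown in the "resized_image" window.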
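The resize() declaration defines how dsize, fx, and fy determine each other. As a rough illustration, the sketch below reproduces those relations in plain C++ without OpenCV. ImageSize and the function names are hypothetical, and std::lround stands in for the rounding OpenCV applies, which may treat exact halves differently.

```cpp
#include <cmath>

// Width corresponds to src.cols and height to src.rows.
struct ImageSize {
    int width;
    int height;
};

// dsize = Size(round(fx * src.cols), round(fy * src.rows)), used when dsize is empty.
ImageSize output_size(ImageSize src, double fx, double fy) {
    ImageSize out;
    out.width = static_cast<int>(std::lround(fx * src.width));
    out.height = static_cast<int>(std::lround(fy * src.height));
    return out;
}

// fx = (double)dsize.width / src.cols, used when fx is 0.
double scale_x(ImageSize src, ImageSize dsize) {
    return static_cast<double>(dsize.width) / src.width;
}

// fy = (double)dsize.height / src.rows, used when fy is 0.
double scale_y(ImageSize src, ImageSize dsize) {
    return static_cast<double>(dsize.height) / src.height;
}
```

For a 640x480 source with fx = fy = 0.5 this gives a 320x240 output, matching the demo's "half size" call.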
Image Rotation
- Function: warpAffine()
- Declaration of warpAffine():
/** @brief Applies an affine transformation to an image.
The function warpAffine transforms the source image using the specified matrix:
\f[\texttt{dst} (x,y) = \texttt{src} ( \texttt{M} _{11} x + \texttt{M} _{12} y + \texttt{M} _{13}, \texttt{M} _{21} x + \texttt{M} _{22} y + \texttt{M} _{23})\f]
when the flag #WARP_INVERSE_MAP is set. Otherwise, the transformation is first inverted
with #invertAffineTransform and then put in the formula above instead of M. The function cannot
operate in-place.
@param src input image.
@param dst output image that has the size dsize and the same type as src .
@param M \f$2\times 3\f$ transformation matrix.
@param dsize size of the output image.
@param flags combination of interpolation methods (see #InterpolationFlags) and the optional
flag #WARP_INVERSE_MAP that means that M is the inverse transformation (
\f$\texttt{dst}\rightarrow\texttt{src}\f$ ).
@param borderMode pixel extrapolation method (see #BorderTypes); when
borderMode=#BORDER_TRANSPARENT, it means that the pixels in the destination image corresponding to
the "outliers" in the source image are not modified by the function.
@param borderValue value used in case of a constant border; by default, it is 0.
@sa warpPerspective, resize, remap, getRectSubPix, transform
*/
CV_EXPORTS_W void warpAffine( InputArray src, OutputArray dst,
InputArray M, Size dsize,
int flags = INTER_LINEAR,
int borderMode = BORDER_CONSTANT,
const Scalar& borderValue = Scalar());
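- C++ demo:
#include <iostream>
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat image = cv::imread("amy.png", cv::IMREAD_COLOR);
    if (image.empty()) {  // stop early if the file could not be read
        std::cerr << "Could not open or find the image" << std::endl;
        return -1;
    }
    // Rotation matrix: 45 degrees about the image center, no scaling
    cv::Point2f center(image.cols / 2.0f, image.rows / 2.0f);
    cv::Mat rotation_matrix = cv::getRotationMatrix2D(center, 45, 1.0);
    cv::Mat rotated_image;
    cv::warpAffine(image, rotated_image, rotation_matrix, image.size());  // apply the rotation
    cv::imshow("rotated_image", rotated_image);
    cv::waitKey(0);  // wait for a key press
    return 0;
}
- Result: the image rotated by 45 degrees about its center is shown in the "rotated_image" window; corners that fall outside the original frame are cut off.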
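The demo relies on getRotationMatrix2D() to build the 2x3 matrix M that warpAffine() consumes. The sketch below builds the same kind of matrix by hand, following the formula in the getRotationMatrix2D documentation (alpha = scale * cos(angle), beta = scale * sin(angle)). It is plain C++ without OpenCV, and the type and function names are hypothetical.

```cpp
#include <array>
#include <cmath>

typedef std::array<std::array<double, 3>, 2> Affine2x3;

// 2x3 matrix for a rotation by angle_deg (degrees, positive = counter-clockwise
// with the origin at the top-left corner) about (cx, cy), with uniform scale.
Affine2x3 rotation_matrix(double cx, double cy, double angle_deg, double scale) {
    const double pi = std::acos(-1.0);
    const double rad = angle_deg * pi / 180.0;
    const double alpha = scale * std::cos(rad);
    const double beta = scale * std::sin(rad);
    Affine2x3 m;
    m[0][0] = alpha;  m[0][1] = beta;   m[0][2] = (1.0 - alpha) * cx - beta * cy;
    m[1][0] = -beta;  m[1][1] = alpha;  m[1][2] = beta * cx + (1.0 - alpha) * cy;
    return m;
}

// Map a point through the matrix: (x, y) -> (M11 x + M12 y + M13, M21 x + M22 y + M23).
std::array<double, 2> transform_point(const Affine2x3& m, double x, double y) {
    std::array<double, 2> out;
    out[0] = m[0][0] * x + m[0][1] * y + m[0][2];
    out[1] = m[1][0] * x + m[1][1] * y + m[1][2];
    return out;
}
```

The translation column is what keeps the rotation center fixed: mapping (cx, cy) through the matrix returns (cx, cy) for any angle when the scale is 1.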
Image Thresholding
- Function: threshold()
- Declaration of threshold():
/** @brief Applies a fixed-level threshold to each array element.
The function applies fixed-level thresholding to a multiple-channel array. The function is typically
used to get a bi-level (binary) image out of a grayscale image ( #compare could be also used for
this purpose) or for removing a noise, that is, filtering out pixels with too small or too large
values. There are several types of thresholding supported by the function. They are determined by
type parameter.
Also, the special values #THRESH_OTSU or #THRESH_TRIANGLE may be combined with one of the
above values. In these cases, the function determines the optimal threshold value using the Otsu's
or Triangle algorithm and uses it instead of the specified thresh.
@note Currently, the Otsu's and Triangle methods are implemented only for 8-bit single-channel images.
@param src input array (multiple-channel, 8-bit or 32-bit floating point).
@param dst output array of the same size and type and the same number of channels as src.
@param thresh threshold value.
@param maxval maximum value to use with the #THRESH_BINARY and #THRESH_BINARY_INV thresholding
types.
@param type thresholding type (see #ThresholdTypes).
@return the computed threshold value if Otsu's or Triangle methods used.
@sa adaptiveThreshold, findContours, compare, min, max
*/
CV_EXPORTS_W double threshold( InputArray src, OutputArray dst,
double thresh, double maxval, int type );
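- C++ demo:
#include <iostream>
#include <opencv2/opencv.hpp>

int main() {
    // Load as single-channel grayscale so the output is a true binary image;
    // on a color image, threshold() would be applied to each channel separately.
    cv::Mat image = cv::imread("amy.png", cv::IMREAD_GRAYSCALE);
    if (image.empty()) {  // stop early if the file could not be read
        std::cerr << "Could not open or find the image" << std::endl;
        return -1;
    }
    cv::Mat thresholded_image;
    cv::threshold(image, thresholded_image, 127, 255, cv::THRESH_BINARY);  // binary thresholding
    cv::imshow("thresholded_image", thresholded_image);
    cv::waitKey(0);  // wait for a key press
    return 0;
}
- Result: the binarized image is shown in the "thresholded_image" window.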
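For reference, the per-pixel rule behind THRESH_BINARY and THRESH_BINARY_INV can be sketched in a few lines. This is plain C++ without OpenCV, and the function names are hypothetical. It mirrors the documented rule that a pixel strictly greater than thresh becomes maxval (or 0 for the inverted type).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// THRESH_BINARY: maxval if the pixel is strictly greater than thresh, else 0.
std::uint8_t threshold_binary(std::uint8_t value, double thresh, std::uint8_t maxval) {
    return value > thresh ? maxval : static_cast<std::uint8_t>(0);
}

// THRESH_BINARY_INV: 0 if the pixel is strictly greater than thresh, else maxval.
std::uint8_t threshold_binary_inv(std::uint8_t value, double thresh, std::uint8_t maxval) {
    return value > thresh ? static_cast<std::uint8_t>(0) : maxval;
}

// Apply THRESH_BINARY to a flat single-channel 8-bit buffer.
std::vector<std::uint8_t> threshold_image(const std::vector<std::uint8_t>& pixels,
                                          double thresh, std::uint8_t maxval) {
    std::vector<std::uint8_t> out(pixels.size());
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        out[i] = threshold_binary(pixels[i], thresh, maxval);
    }
    return out;
}
```

With thresh = 127 and maxval = 255, a pixel of 128 maps to 255 and a pixel of exactly 127 maps to 0, because the comparison is strict.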