How to scan images, lookup tables and time measurement with OpenCV
- Goal
- Our test case
- How is the image matrix stored in memory?
Goal
We'll seek answers to the following questions:
- How to go through each and every pixel of an image?
- How are OpenCV matrix values stored?
- How to measure the performance of our algorithm?
- What are lookup tables and why use them?
Our test case
Let us consider a simple color reduction method. With the C and C++ unsigned char type used to store matrix items, a channel of a pixel may have up to 256 different values. For a three-channel image this allows far too many colors (256^3, about 16.7 million). Working with so many color shades may take a heavy toll on the performance of our algorithm. However, it is sometimes enough to work with far fewer of them to get the same final result.
In these cases it is common to make a color space reduction. This means that we divide the current color value by a new input value to end up with fewer colors. For instance, every value between zero and nine takes the new value zero, every value between ten and nineteen takes the value ten, and so on.
When you divide a uchar (unsigned char, that is, values between zero and 255) value by an int value, C and C++ perform integer division, so any fraction is discarded. Because the values are non-negative, this is the same as rounding down, and for a positive divisor the final result still fits into a uchar. Taking advantage of this fact, the operation above may be expressed in the uchar domain as:
$$I_{new} = \frac{I_{old}}{10} \cdot 10$$
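As a worked instance of the formula, take an illustrative input value of 137 (chosen here only as an example). The floor brackets make the integer division explicit:

$$I_{new} = \left\lfloor \frac{137}{10} \right\rfloor \cdot 10 = 13 \cdot 10 = 130$$

Every input from 130 to 139 lands on the same output, 130, which is how the number of distinct values per channel shrinks.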
A simple color space reduction algorithm would consist of just passing through every pixel of an image matrix and applying this formula. It's worth noting that we do a division and a multiplication operation. These operations are comparatively expensive. If possible, it's worth avoiding them by using cheaper operations such as a few subtractions or additions or, in the best case, a simple assignment. Furthermore, note that we only have a limited number of input values for the operation above: in the case of the uchar type, exactly 256.
Therefore, for larger images it would be wise to calculate all possible values beforehand and, during the scan, just make the assignment by using a lookup table. Lookup tables are simple arrays (having one or more dimensions) that hold the final output value for each possible input value. Their strength is that we do not need to perform the calculation; we just need to read the result.
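A minimal sketch of this idea in plain C++ (no OpenCV required). The names makeReductionTable and applyTable are hypothetical and chosen only for illustration, and a positive divisor is assumed:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: precompute the reduced value for all 256 inputs.
std::array<unsigned char, 256> makeReductionTable(int divisor)
{
    assert(divisor > 0); // assumption: the caller validated the divisor
    std::array<unsigned char, 256> table{};
    for (int i = 0; i < 256; ++i)
        table[static_cast<std::size_t>(i)] =
            static_cast<unsigned char>((i / divisor) * divisor);
    return table;
}

// Replace every value by its table entry: one read and one assignment
// per element, with no division or multiplication in the loop.
void applyTable(std::vector<unsigned char>& values,
                const std::array<unsigned char, 256>& table)
{
    for (unsigned char& v : values)
        v = table[v];
}
```

The division and multiplication are paid 256 times while building the table, regardless of how many pixels are processed afterwards.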
Our test case program (and the code sample below) will do the following: read in an image passed as a command line argument (it may be either color or grayscale) and apply the reduction with the integer value given as a command line argument. In OpenCV, there are at the moment three major ways of going through an image pixel by pixel. To make things a little more interesting, we'll scan the image using each of these methods and print out how long it took.
You can find the full source code in the samples directory of OpenCV, in the cpp tutorial code for the core section. The program first calculates the lookup table:
    int divideWith = 0; // convert our input string to number - C++ style
    stringstream s;
    s << argv[2];
    s >> divideWith;
    if (!s || !divideWith)
    {
        cout << "Invalid number entered for dividing. " << endl;
        return -1;
    }
    uchar table[256];
    for (int i = 0; i < 256; ++i)
        table[i] = (uchar)(divideWith * (i/divideWith));
Here we first use the C++ stringstream class to convert the third command line argument from text to an integer format. Then we use a simple loop and the formula above to calculate the lookup table. There is nothing OpenCV specific here.
Another issue is how to measure time. OpenCV offers two simple functions for this: cv::getTickCount() and cv::getTickFrequency(). The first returns the number of ticks counted since a certain event (for example, since you booted your system). The second returns the number of ticks per second. So, measuring the amount of time elapsed between two operations is as easy as:
    double t = (double)getTickCount();
    // do something ...
    t = ((double)getTickCount() - t)/getTickFrequency();
    cout << "Times passed in seconds: " << t << endl;
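A single measurement can be noisy, so one common approach is to repeat the operation and report the average. The sketch below follows the same start/stop pattern, but uses std::chrono::steady_clock from the C++ standard library as a stand-in for the OpenCV tick functions so that it compiles without OpenCV. The name averageMilliseconds is hypothetical.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs work() the given number of times and returns
// the average wall-clock time per run in milliseconds.
// std::chrono::steady_clock stands in for cv::getTickCount() here.
template <typename Work>
double averageMilliseconds(Work work, int runs)
{
    assert(runs > 0); // assumption: at least one run is requested
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i)
        work();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / runs;
}
```

The work argument can be any callable, for example a lambda that clones the input image and calls one of the scanning functions on the copy.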
How is the image matrix stored in memory?
As you could already read in my Mat - The Basic Image Container tutorial, the size of the matrix depends on the color system used. More accurately, it depends on the number of channels used. In the case of a grayscale image we have something like:
| | col 0 | col 1 | … | col m |
|---|---|---|---|---|
| row 0 | 0,0 | 0,1 | … | 0,m |
| row 1 | 1,0 | 1,1 | … | 1,m |
| row … | …,0 | …,1 | … | …,m |
| row n | n,0 | n,1 | … | n,m |
For multichannel images each column contains as many sub columns as there are channels. For example, in the case of a BGR color system:
| | col 0 - B | col 0 - G | col 0 - R |
|---|---|---|---|
| row 0 | 0,0 | 0,0 | 0,0 |
| row 1 | 1,0 | 1,0 | 1,0 |
| row … | …,0 | …,0 | …,0 |
Note that the order of the channels is inverse: BGR instead of RGB. Because in many cases the memory is large enough to store the rows one after another, the rows may follow each other without gaps, creating a single long row. Because everything is then in a single place, this may help to speed up the scanning process. We can use the cv::Mat::isContinuous() function to ask the matrix whether this is the case. Continue on to the next section to find an example.
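The layout described above can be modelled in a few lines of plain C++. This is a simplified sketch, not OpenCV's actual implementation: the Layout struct and both function names are hypothetical, and step is assumed to be the number of bytes between the starts of two consecutive rows.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified model of an 8-bit image in memory.
struct Layout
{
    std::size_t rows;
    std::size_t cols;
    std::size_t channels; // 1 for grayscale, 3 for BGR
    std::size_t step;     // bytes from the start of one row to the next
};

// Byte offset of channel c of the pixel at (row, col).
std::size_t byteOffset(const Layout& l, std::size_t row, std::size_t col,
                       std::size_t c)
{
    assert(row < l.rows && col < l.cols && c < l.channels);
    return row * l.step + col * l.channels + c;
}

// Rows follow each other without gaps when the distance between row
// starts equals the number of payload bytes in a row.
bool rowsArePacked(const Layout& l)
{
    return l.step == l.cols * l.channels;
}
```

For a BGR pixel, c = 0 addresses blue, c = 1 green and c = 2 red, matching the channel order noted above. When rowsArePacked is false, the bytes between the end of one row's payload and the start of the next row must be skipped during a scan.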
The efficient way
When it comes to performance you cannot beat the classic C style operator[] (pointer) access. Therefore, the most efficient method we can recommend for making the assignment is:
    Mat& ScanImageAndReduceC(Mat& I, const uchar* const table)
    {
        // accept only char type matrices
        CV_Assert(I.depth() == CV_8U);
        int channels = I.channels();
        int nRows = I.rows;
        int nCols = I.cols * channels;
        if (I.isContinuous())
        {
            nCols *= nRows;
            nRows = 1;
        }
        int i,j;
        uchar* p;
        for( i = 0; i < nRows; ++i)
        {
            p = I.ptr<uchar>(i);
            for ( j = 0; j < nCols; ++j)
            {
                p[j] = table[p[j]];
            }
        }
        return I;
    }
Here we basically just acquire a pointer to the start of each row and go through it until it ends. In the special case that the matrix is stored in a continuous manner we only need to request the pointer a single time and go all the way to the end. We need to look out for color images: we have three channels, so we need to pass through three times as many items in each row.
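To see why the per-row pointer matters when rows are not stored back to back, here is a sketch of the same scan over a raw byte buffer, without OpenCV. The function scanRows and its parameters are hypothetical: rowBytes stands for cols * channels and step for the distance in bytes between row starts.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical plain-buffer version of the row-pointer scan.
// Each row holds rowBytes payload bytes followed by step - rowBytes
// padding bytes, which the scan must leave untouched.
void scanRows(unsigned char* data, std::size_t rows, std::size_t rowBytes,
              std::size_t step, const unsigned char* table)
{
    assert(step >= rowBytes);
    if (step == rowBytes) // continuous: treat the image as one long row
    {
        rowBytes *= rows;
        rows = 1;
    }
    for (std::size_t i = 0; i < rows; ++i)
    {
        unsigned char* p = data + i * step; // start of row i
        for (std::size_t j = 0; j < rowBytes; ++j)
            p[j] = table[p[j]];
    }
}
```

In the continuous case the outer loop runs once and the row start is computed a single time, which mirrors the isContinuous() branch in the function above.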
There is another way to do this. The data member of a Mat object named data points to the first row, first column. If this pointer is null, you have no valid input in that object. Checking this is the simplest way to check whether your image loading was a success. If the storage is continuous, we can use this pointer to go through the whole data. In the case of a grayscale image this would look like:
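    uchar* p = I.data;
    const size_t count = I.total(); // rows * cols; grayscale has one uchar per pixel
    for (size_t i = 0; i < count; ++i, ++p)
        *p = table[*p]; // p is advanced in the loop header, not inside this expression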
You would get the same result. However, this code is a lot harder to read later on, and it gets even harder if you use some more advanced technique there. Moreover, in practice I've observed that you get the same performance (most modern compilers will probably make this small optimization automatically for you).
The iterator (safe) method
With the efficient way, making sure that you pass through the right number of uchar fields and skip the gaps that may occur between the rows was your responsibility. The iterator method is considered safer, as it takes over these tasks from the user. All you need to do is ask for the begin and the end of the image matrix and then just increase the begin iterator until you reach the end. To acquire the value pointed to by the iterator, use the * operator (add it before the iterator).
    Mat& ScanImageAndReduceIterator(Mat& I, const uchar* const table)
    {
        // accept only char type matrices
        CV_Assert(I.depth() == CV_8U);
        const int channels = I.channels();
        switch(channels)
        {
        case 1:
            {
                MatIterator_<uchar> it, end;
                for( it = I.begin<uchar>(), end = I.end<uchar>(); it != end; ++it)
                    *it = table[*it];
                break;
            }
        case 3:
            {
                MatIterator_<Vec3b> it, end;
                for( it = I.begin<Vec3b>(), end = I.end<Vec3b>(); it != end; ++it)
                {
                    (*it)[0] = table[(*it)[0]];
                    (*it)[1] = table[(*it)[1]];
                    (*it)[2] = table[(*it)[2]];
                }
            }
        }
        return I;
    }
In the case of color images we have three uchar items per column. This may be considered a short vector of uchar items, which OpenCV calls Vec3b. To access the n-th sub column we use simple operator[] access. It's important to remember that OpenCV iterators go through the columns and automatically skip to the next row. Therefore, in the case of color images, if you use a simple uchar iterator you'll be able to access only the blue channel values.
On-the-fly address calculation with reference returning
The final method isn't recommended for scanning. It was made to acquire or modify elements at arbitrary positions in the image. Its basic usage is to specify the row and column number of the item you want to access. During our earlier scanning methods you could already notice that the type through which we look at the image matters. It's no different here, as you need to manually specify what type to use for the lookup. You can observe this for grayscale images in the following source code (the usage of the cv::Mat::at() function):
    Mat& ScanImageAndReduceRandomAccess(Mat& I, const uchar* const table)
    {
        // accept only char type matrices
        CV_Assert(I.depth() == CV_8U);
        const int channels = I.channels();
        switch(channels)
        {
        case 1:
            {
                for( int i = 0; i < I.rows; ++i)
                    for( int j = 0; j < I.cols; ++j )
                        I.at<uchar>(i,j) = table[I.at<uchar>(i,j)];
                break;
            }
        case 3:
            {
                Mat_<Vec3b> _I = I;
                for( int i = 0; i < I.rows; ++i)
                    for( int j = 0; j < I.cols; ++j )
                    {
                        _I(i,j)[0] = table[_I(i,j)[0]];
                        _I(i,j)[1] = table[_I(i,j)[1]];
                        _I(i,j)[2] = table[_I(i,j)[2]];
                    }
                I = _I;
                break;
            }
        }
        return I;
    }
The function takes your input type and coordinates and calculates the address of the queried item. Then it returns a reference to it. This may be a constant reference when you get the value and a non-constant one when you set the value. As a safety step, in debug mode only, a check is performed that your input coordinates are valid and do exist. If this isn't the case, you'll get a nice message about it on the standard error output stream. Compared to the efficient way, in release mode the only difference is that for every element of the image you'll get a new row pointer, on which the C operator[] is used to acquire the column element.
If you need to do multiple lookups on an image with this method, it may be troublesome and time consuming to enter the type and the at keyword for each access. To solve this problem OpenCV has the cv::Mat_ data type. It's the same as Mat, with the extra requirement that at definition you specify the data type through which to look at the data matrix; in return you can use operator() for fast access to items. To make things even better, it is easily convertible from and to the usual cv::Mat data type. You can see a sample usage of this in the color image case of the function above. Nevertheless, it's important to note that the same operation (with the same runtime speed) could have been done with the cv::Mat::at function. It's just a trick that saves the lazy programmer some typing.
We can conclude a couple of things. If possible, use the already made functions of OpenCV (instead of reinventing these). The fastest method turns out to be the cv::LUT() function. This is because the OpenCV library is multi-thread enabled via Intel Threaded Building Blocks. However, if you need to write a simple image scan, prefer the pointer method. The iterator is a safer bet, however noticeably slower. Using the on-the-fly reference access method for a full image scan is the most costly in debug mode. In release mode it may or may not beat the iterator approach; either way, it gives up the safety that iterators provide.