C Language Function Pointers {Pointers to Functions, Function Pointers}
- 1. Pointers to Functions
- 2. Function Pointers
- 2.1. Declaring Function Pointers
- 2.2. Assigning Function Pointers
- 2.3. Calling Function Pointers
- 3. Jump Tables
- References
1. Pointers to Functions
Pointers to functions are most often used in two ways: in jump tables and as arguments passed to other functions.
Like any other pointer, a pointer to a function must be initialized to point to something before indirection can be performed on it.
The following code fragment illustrates one way to initialize a pointer to a function.
int f(int);
int(*pf)(int) = &f;
The second declaration creates pf, a pointer to a function, and initializes it to point to the function f. The initialization can also be accomplished with an assignment statement.
It is important to have a prototype for f prior to the initialization, for without it the compiler would be unable to check whether the type of f agreed with that of pf.
The ampersand in the initialization is optional, because the compiler converts a function name to a function pointer almost everywhere it is used (the operands of & and sizeof are the exceptions). The ampersand does explicitly what the compiler would have done implicitly anyway.
ampersand /ˈæmpəsænd/:n. &
After the pointer has been declared and initialized, there are three ways to call the function:
int ans;
ans = f(25);
ans = (*pf)(25);
ans = pf(25);
The first statement simply calls the function f by name, though its evaluation is probably not what you expected. The function name f is first converted to a pointer to the function; the pointer specifies where the function is located. The function call operator then invokes the function by executing the code beginning at this address.
The second statement applies indirection to pf, which converts the function pointer to a function name. This conversion is not really necessary, because the compiler converts it back to a pointer before applying the function call operator. Nevertheless, this statement has exactly the same effect as the first one.
The third statement has the same effect as the first two. Indirection is not needed, because the compiler wants a pointer to the function anyway. This example shows how function pointers are usually used.
The two most common uses of pointers to functions are passing a function pointer as an argument in a function call and jump tables.
2. Function Pointers
A function name refers to a fixed function. Sometimes it is useful to call a function to be determined at run time; to do this, you can use a function pointer value that points to the chosen function (see Pointers).
Pointer-to-function types can be used to declare variables and other data, including array elements, structure fields, and union alternatives. They can also be used for function arguments and return values. These types have the peculiarity that they are never converted automatically to void * or vice versa. However, you can do that conversion with a cast, although ISO C itself does not define the result and the conversion is best treated as a common extension.
2.1. Declaring Function Pointers
The declaration of a function pointer variable (or structure field) looks almost like a function declaration, except it has an additional * just before the variable name. Proper nesting requires a pair of parentheses around the two of them. For instance, int (*a) (); says, “Declare a as a pointer such that *a is an int-returning function.”
Contrast these three declarations:
// Declare a function returning char *
char *a (char *);
// Declare a pointer to a function returning char
char (*a) (char *);
// Declare a pointer to a function returning char *
char *(*a) (char *);
The possible argument types of the function pointed to are the same as in a function declaration. You can write a prototype that specifies all the argument types:
rettype (*function) (arguments…);
or one that specifies some and leaves the rest unspecified:
rettype (*function) (arguments…, ...);
or one that says there are no arguments:
rettype (*function) (void);
You can also write a non-prototype declaration that says nothing about the argument types (in C23, an empty parameter list instead means the same as (void)):
rettype (*function) ();
For example, here’s a declaration for a variable that should point to some arithmetic function that operates on two doubles:
double (*binary_op) (double, double);
Structure fields, union alternatives, and array elements can be function pointers; so can parameter variables. The function pointer declaration construct can also be combined with other operators allowed in declarations. For instance,
int **(*foo)();
declares foo as a pointer to a function that returns type int **, and
int **(*foo[30])();
declares foo as an array of 30 pointers to functions that return type int **.
int **(**foo)();
declares foo as a pointer to a pointer to a function that returns type int **.
2.2. Assigning Function Pointers
Assuming we have declared the variable binary_op as in the previous section, giving it a value requires a suitable function to use. So let’s define a function suitable for the variable to point to. Here’s one:
double double_add(double a, double b) {
    return a + b;
}
Now we can give it a value:
binary_op = double_add;
The target type of the function pointer must be upward compatible with the type of the function.
There is no need for & in front of double_add. Using a function name such as double_add as an expression automatically converts it to the function’s address, with the appropriate function pointer type. However, it is ok to use & if you feel that is clearer:
binary_op = &double_add;
2.3. Calling Function Pointers
To call the function specified by a function pointer, just write the function pointer value in a function call. For instance, here’s a call to the function binary_op points to:
binary_op (x, 5)
Since the data type of binary_op explicitly specifies type double for the arguments, the call converts x and 5 to double.
The call conceptually dereferences the pointer binary_op to “get” the function it points to, and calls that function. If you wish, you can explicitly represent the dereference by writing the * operator:
(*binary_op) (x, 5)
The * reminds people reading the code that binary_op is a function pointer rather than the name of a specific function.
3. Jump Tables
The following code fragment is from a program that implements a pocket calculator. Other parts of the program have already read in two numbers (op1 and op2) and an operator (oper). This code tests the operator to determine which function to invoke.
switch (oper) {
case ADD:
    result = add(op1, op2);
    break;
case SUB:
    result = sub(op1, op2);
    break;
case MUL:
    result = mul(op1, op2);
    break;
case DIV:
    result = div(op1, op2);
    break;
...
}
It is good design to separate the operations from the code that chooses among them. The more complex operations will certainly be implemented as separate functions because of their size, but even the simple operations may have side effects, such as saving a constant value for later operations.
In order to use a switch, the codes that represent the operators must be integers. If they are consecutive integers starting with zero, we can use a jump table to accomplish the same thing. A jump table is just an array of pointers to functions.
There are two steps in creating a jump table. First, an array of pointers to functions is declared and initialized. The only trick is to make sure that the prototypes for the functions appear before the array declaration.
double add(double, double);
double sub(double, double);
double mul(double, double);
double div(double, double);
...
double (*oper_func[])(double, double) = {
add, sub, mul, div, ...
};
The proper order for the functions' names in the initializer list is determined by the integer codes used to represent each operator in the program. This example assumes that ADD is zero, SUB is one, MUL is two, and so forth.
The second step is to replace the entire switch statement with this one!
result = oper_func[oper](op1, op2);
oper selects the correct pointer from the array, and the function call operator executes it.
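An out-of-bounds subscript is just as illegal on a jump table as it is on any other array, but it is much more difficult to diagnose. There are three places where the program might terminate when this error occurs. First, if the subscript value is far enough out of bounds, the location that it identifies might be outside of the memory allocated to the program. Some operating systems detect this error and abort the program, but others do not. If the program is terminated, the fault is reported near the jump table statement, and the problem is relatively easy to diagnose.
If the program does not abort, the value identified by the illegal subscript is fetched, and the processor jumps to that location. This unpredictable value may or may not represent a valid address in the program. If it does not, the program may also abort, but the address reported for the fault is essentially a random number. At this point, debugging the problem is extremely difficult.
If the program has not failed yet, the machine begins to execute instructions at the bogus address obtained with the illegal subscript, and finding the root cause becomes even harder. If the random address is in an area in memory that contains data, the program usually aborts very quickly due to an illegal instruction or an illegal operand address (although data values sometimes represent valid instructions, they do not often make any sense). The only clue to how the machine got there is the return address stored on the stack by the function call made through the jump table. If any of the random instructions modified the stack or the stack pointer when they were executed, that clue is lost too.
Worse still, if the random address happens to lie inside a function, that function executes happily, modifying who knows what data, until it finishes. But the return address is not where the function expects it to be on the stack, so another random value is used instead. This value becomes the address of the next instruction to execute, and the computer jumps among random addresses, executing whatever instructions it finds there.
The problem is that these instructions destroy the trail of how the machine reached the place where it finally fails. Without this information, it is very hard to determine the source of the problem. If you suspect a jump table, you can print a message before and after its function call; it then becomes obvious if the called function never returns. The difficulty is realizing that a failure in one part of the program can be caused by a jump table error in a distant, unrelated part of the program.
It is much easier to make sure that the subscript used in a jump table is within range in the first place.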
References
[1] Yongqiang Cheng, https://yongqiang.blog.csdn.net/
[2] Pointers on C (C 和指针), https://www.cs.rit.edu/~kar/pointers.on.c/index.html
[3] 22.5 Function Pointers, https://www.gnu.org/software/c-intro-and-ref/manual/html_node/Function-Pointers.html