经过长期的发展, kprobes/uprobes
机制在事件(events)的基础上分别为内核态和用户态提供了追踪调试的功能, 这也构成了 tracepoint 机制的基础, 后期的很多工具, 比如 perf_events
, ftrace
等都是在其基础上演化而来. 参考由 Brendan Gregg 提供的资料来看, kprobes/uprobes
在 Linux 动态追踪层面起到了基石的作用, 如下所示:
uprobe与krobe对应,动态附加到用户态调用函数的切入点称为uprobe,相比如kprobe 内核函数的稳定性,uprobe 的函数由开发者定义。uprobe是用户态的探针,它和kprobe是相对应的,kprobe是内核态的探针。uprobe需要制定用户态探针在执行文件中的位置,插入探针的原理和kprobe类似。
下面是对uprobe使用方法的简单介绍。比如跟踪下面程序中的callee函数调用:
#include <stdio.h>
int callee(void)
{
printf("hello world.\n");
return 0;
}
int main(void)
{
callee();
return 0;
}
Makefile
分别编译地址加载无关和地址相关两个版本的应用程序,分别抓取其PROBE结果。
all:
gcc -O0 -no-pie -fno-pic -g main.c -o main-nopie
gcc -O0 -g main.c -o main
clean:
rm -fr main main-nopie
增加一个新的uprobe event,命令如下(在可执行文件main的0x63a偏移处增加一个uprobe探针),当callee函数被调用时,截获调用事件信息:
$ readelf -s main|grep callee
64: 000000000000063a 23 FUNC GLOBAL DEFAULT 14 callee
$
# echo 'p:callee /home/zlcao/uprobe/main:0x63a' > /sys/kernel/debug/tracing/uprobe_events
增加一个返回事件信息:
echo 'r:callee_ret /home/zlcao/uprobe/main:0x63a' >> /sys/kernel/debug/tracing/uprobe_events
之后使用如下命令查看注册的EVENT事件
cat /sys/kernel/debug/tracing/uprobe_events
定义以后,使能所有的events:
echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable
之后执行测试用例,执行完毕后,通过如下命令查看探测记录。
cat /sys/kernel/debug/tracing/trace
可以看到,callee函数的调用和返回事件都被触发了,输出显示给我们uprobe被触发时:main-27731、程序PC 0x55f1b22cb63a,uretprobe被触发时:PC从函数入口0x55f1b22cb63a返回0x55f1b22cb65a。
之后,通过如下命令关闭探测。
echo 0 > /sys/kernel/debug/tracing/events/uprobes/enable
最后,使用如下命令清除掉所有注册的events。
echo > /sys/kernel/debug/tracing/uprobe_events
添加探测点的步骤比较麻烦,perf很贴心地添加了一键添加探测点的功能,只需要执行一个简单的命令即可:
# perf probe -x ./main-nopie callee
# perf probe -x ./main-nopie callee%return
之后使能探测,运行用例。
# echo 1 > /sys/kernel/debug/tracing/events/probe_main/enable
# ./main-nopie
或者直接执行如下指令运行:
root@zlcao-Vostro-3268:/home/zlcao/uprobe# perf record -e probe_main:callee -aR ./main-nopie
hello world.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.537 MB perf.data (1 samples) ]
root@zlcao-Vostro-3268:/home/zlcao/uprobe#
使用命令perf report查看生成的perf.data文件
查看当前所有的EVENT 探测点
root@zlcao-Vostro-3268:/home/zlcao/uprobe# perf probe -l
kprobes:myprobe (on _do_fork)
kprobes:myretprobe (on _do_fork%return with arg1)
probe_libc:memcpy (on __new_memcpy_ifunc@x86_64/multiarch/memcpy.c in /lib/x86_64-linux-gnu/libc-2.27.so)
probe_main:callee (on callee@/home/zlcao/uprobe/main.c in /home/zlcao/uprobe/main-nopie)
probe_main:callee__return (on callee%return@/home/zlcao/uprobe/main.c in /home/zlcao/uprobe/main-nopie)
root@zlcao-Vostro-3268:/home/zlcao/uprobe#
PIE和NO-PIE应用的区别
通过上面的例子可以看到,默认编译的PIE应用和非PIE应用都可以被事件机制探测调试,那么它们的区被是什么呢?通过下面几幅图可以看出,PIE应用的符号地址是相对的,加载地址是随机的,而非PIE应用的加载地址和运行地址完全是由应用符号表提供的,运行时加载器必须遵守,不能更改。
通过traceing_pipe查看探测结果,/sys/kernel/debug/tracing/trace_pipe文件为同步阻塞式读取文件,执行cat /sys/kernel/debug/tracing/trace_pipe后控制台处于阻塞状态,当运行测试用例时,实时更新探测结果。
通过内核模块截获uprobe事件
#include <linux/module.h>
#include <linux/ptrace.h>
#include <linux/uprobes.h>
#include <linux/namei.h>
#include <linux/moduleparam.h>
MODULE_AUTHOR("john doe");
MODULE_LICENSE("GPL v2");
static char *filename;
module_param(filename, charp, S_IRUGO);
static long offset;
module_param(offset, long, S_IRUGO);
static int handler_pre(struct uprobe_consumer *self, struct pt_regs *regs)
{
pr_info("handler: arg0 = %d arg1 =%d \n", (int)regs->di, (int)regs->si);
return 0;
}
static int handler_ret(struct uprobe_consumer *self,
unsigned long func,
struct pt_regs *regs)
{
pr_info("ret_handler ret = %d \n", (int)regs->ax);
return 0;
}
static struct uprobe_consumer uc = {
.handler = handler_pre,
.ret_handler = handler_ret,
};
static struct inode *inode;
static int __init uprobe_init(void)
{
struct path path;
int ret;
ret = kern_path(filename, LOOKUP_FOLLOW, &path);
if (ret < 0) {
pr_err("kern_path failed, returned %d\n", ret);
return ret;
}
inode = igrab(path.dentry->d_inode);
path_put(&path);
ret = uprobe_register(inode, offset, &uc);
if (ret < 0) {
pr_err("register_uprobe failed, returned %d\n", ret);
return ret;
}
return 0;
}
static void __exit uprobe_exit(void)
{
uprobe_unregister(inode, offset, &uc);
}
module_init(uprobe_init);
module_exit(uprobe_exit);
测试用例和MAKEFILE
#include <stdio.h>
int callee(int a, int b)
{
printf("hello world %d.\n", a + b);
return a+b;
}
int main(void)
{
callee(8, 9);
return 0;
}
ifneq ($(KERNELRELEASE),)
obj-m:=uprobe.o
else
KERNELDIR:=/lib/modules/$(shell uname -r)/build
PWD:=$(shell pwd)
all:
$(MAKE) -C $(KERNELDIR) M=$(PWD) modules
gcc -O0 -no-pie -fno-pic -g main.c -o main-nopie
gcc -O0 -g main.c -o main
clean:
rm -rf *.o *.mod.c *.mod.o *.ko *.symvers *.mod .*.cmd *.order
format:
astyle --options=linux.astyle *.[ch]
endif
测试
uprobe原理
uprobe基于int3 debug指令实现,以下是回调CALLSTACK。
设置uprobe,通过GDB查看callee函数入口,发现其指令已经变为INT3。
探测过程如下图所示,指定位置上的指令,头部修改为软件中断指令(同时原指令存档他处):
- 当执行到该位置时,触发软件中断,陷入内核
- 在内核,执行以 注入的 Handler
- 单步执行原指令
- 修正寄存器和栈,回到原有指令流
如果指令地址对应的页面还没有映射,会通过:
uprobe_write_opcode->get_user_pages_remote->__get_user_pages_locked->__get_user_pages->faultin_page->handle_mm_fault ...
进行PAGE FAULT的处理,之后在写入INT3指令。
Kprobe实践
root@zlcao-Vostro-3268:~# echo 'p:myprobe _do_fork' > /sys/kernel/debug/tracing/kprobe_events
root@zlcao-Vostro-3268:~# echo 'r:myretprobe _do_fork $retval' >> /sys/kernel/debug/tracing/kprobe_events
root@zlcao-Vostro-3268:~# echo 1 > /sys/kernel/debug/tracing/events/kprobes/myprobe/enable
root@zlcao-Vostro-3268:~# echo 1 > /sys/kernel/debug/tracing/events/kprobes/myretprobe/enable
root@zlcao-Vostro-3268:~# cat /sys/kernel/debug/tracing/kprobe_events
p:kprobes/myprobe _do_fork
r4:kprobes/myretprobe _do_fork arg1=$retval
root@zlcao-Vostro-3268:~# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 4/4 #P:4
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
bash-8783 [002] .... 150455.854930: myprobe: (_do_fork+0x0/0x3e0)
bash-8783 [002] d... 150455.855143: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2322
bash-8783 [001] .... 150464.091940: myprobe: (_do_fork+0x0/0x3e0)
bash-8783 [001] d... 150464.092151: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2323
root@zlcao-Vostro-3268:~# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 80/80 #P:4
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
bash-8783 [002] .... 150455.854930: myprobe: (_do_fork+0x0/0x3e0)
bash-8783 [002] d... 150455.855143: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2322
bash-8783 [001] .... 150464.091940: myprobe: (_do_fork+0x0/0x3e0)
bash-8783 [001] d... 150464.092151: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2323
gsd-media-keys-2620 [003] .... 150466.748575: myprobe: (_do_fork+0x0/0x3e0)
gsd-media-keys-2620 [003] d... 150466.749024: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2324
gsd-media-keys-8996 [001] .... 150466.749133: myprobe: (_do_fork+0x0/0x3e0)
gsd-media-keys-8996 [001] d... 150466.749647: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2325
gsd-media-keys-2620 [003] .... 150466.750599: myprobe: (_do_fork+0x0/0x3e0)
gsd-media-keys-2620 [003] d... 150466.750610: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2326
gdbus-2656 [000] .... 150466.750718: myprobe: (_do_fork+0x0/0x3e0)
gdbus-2656 [000] d... 150466.750743: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2327
gnome-terminal-8997 [001] .... 150466.797071: myprobe: (_do_fork+0x0/0x3e0)
gnome-terminal-8997 [001] d... 150466.797090: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2328
gnome-terminal-8997 [001] .... 150466.797825: myprobe: (_do_fork+0x0/0x3e0)
gnome-terminal-8997 [001] d... 150466.798023: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2329
gnome-terminal.-9001 [003] .... 150466.812982: myprobe: (_do_fork+0x0/0x3e0)
gnome-terminal.-9001 [003] d... 150466.813005: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x232a
gnome-terminal.-9001 [003] .... 150466.813730: myprobe: (_do_fork+0x0/0x3e0)
gnome-terminal.-9001 [003] d... 150466.813742: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x232b
gnome-terminal.-9001 [003] .... 150466.834816: myprobe: (_do_fork+0x0/0x3e0)
gnome-terminal.-9001 [003] d... 150466.834839: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x232c
gnome-terminal--3324 [000] .... 150466.852493: myprobe: (_do_fork+0x0/0x3e0)
gnome-terminal--3324 [000] d... 150466.852528: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x232d
gnome-terminal--3324 [000] .... 150466.852532: myprobe: (_do_fork+0x0/0x3e0)
gnome-terminal--3324 [000] d... 150466.852544: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x232e
pool-9006 [003] .... 150466.853913: myprobe: (_do_fork+0x0/0x3e0)
pool-9006 [003] d... 150466.854893: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x232f
xdg-desktop-por-2796 [003] .... 150466.858576: myprobe: (_do_fork+0x0/0x3e0)
xdg-desktop-por-2796 [003] d... 150466.858605: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2330
xdg-desktop-por-2796 [003] .... 150466.858609: myprobe: (_do_fork+0x0/0x3e0)
xdg-desktop-por-2796 [003] d... 150466.858617: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2331
bash-9007 [001] .... 150466.859319: myprobe: (_do_fork+0x0/0x3e0)
bash-9007 [001] d... 150466.859386: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2332
bash-9010 [002] .... 150466.859621: myprobe: (_do_fork+0x0/0x3e0)
bash-9010 [002] d... 150466.859678: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2334
lesspipe-9012 [001] .... 150466.860217: myprobe: (_do_fork+0x0/0x3e0)
lesspipe-9012 [001] d... 150466.860247: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2335
lesspipe-9012 [002] .... 150466.861003: myprobe: (_do_fork+0x0/0x3e0)
lesspipe-9012 [002] d... 150466.861039: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2336
lesspipe-9014 [003] .... 150466.864089: myprobe: (_do_fork+0x0/0x3e0)
lesspipe-9014 [003] d... 150466.864147: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2337
bash-9007 [001] .... 150466.865518: myprobe: (_do_fork+0x0/0x3e0)
bash-9007 [001] d... 150466.865572: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2338
bash-9016 [002] .... 150466.865788: myprobe: (_do_fork+0x0/0x3e0)
bash-9016 [002] d... 150466.865841: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x2339
bash-9007 [000] .... 150467.633920: myprobe: (_do_fork+0x0/0x3e0)
bash-9007 [000] d... 150467.634127: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x233a
vim-9018 [000] .... 150467.760802: myprobe: (_do_fork+0x0/0x3e0)
vim-9018 [000] d... 150467.760936: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x233b
bash-9019 [001] .... 150467.761927: myprobe: (_do_fork+0x0/0x3e0)
bash-9019 [001] d... 150467.761959: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x233c
vim-9018 [000] .... 150468.007578: myprobe: (_do_fork+0x0/0x3e0)
vim-9018 [000] d... 150468.007916: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x233d
vim-9018 [000] .... 150468.069323: myprobe: (_do_fork+0x0/0x3e0)
vim-9018 [000] d... 150468.069689: myretprobe: (__x64_sys_clone+0x84/0xb0 <- _do_fork) arg1=0x233e
参考资料
Linux内核文档,Documentation/trace/uprobetracer.rst
Uprobe-tracer: Uprobe-based Event Tracing — The Linux Kernel documentation
Linux uprobe: User-Level Dynamic Tracing
在kernel中如何实现应用层程序执行时候的指令监控和打印 - 简书
Linux 系统动态追踪技术介绍 | arstercz’s blog
Linux K/Uprobe 使用指南 · GitBook
深入ftrace uprobe原理和功能介绍_uprobe 原理_奇小葩的博客-CSDN博客