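Preface
In the earlier post on the ls command we saw that ls relies on opendir, readdir, stat, lstat and similar interfaces, which in turn go through system calls provided by the operating system
So here we take a closer look at those system calls
The focus is the readdir function, whose entry system call is getdents
The debugging below is based on the command "ls -l /jerry"
The debugging below is based on linux 4.10
The readdir man page entry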
getdents
It wraps a getdents_callback and then iterates over the entries of f; the getdents_callback carries the callback
The data finally ends up in buf
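To make the hand-off concrete, here is a minimal sketch of the same shape in Java: a bounded buffer that owns a callback, and an iteration step that only knows the callback. All class and method names are hypothetical stand-ins chosen for illustration; this is not the kernel's real types or signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout: this mirrors the shape of the
// getdents_callback / actor hand-off, not the kernel's real types.
class GetdentsSketch {
    /** Plays the role of ctx.actor: returns false when the buffer is full. */
    interface Actor {
        boolean emit(String name, long ino);
    }

    /** Plays the role of getdents_callback: a bounded buffer plus the actor that fills it. */
    static class Callback implements Actor {
        final List<String> buf = new ArrayList<>();
        final int capacity;

        Callback(int capacity) {
            this.capacity = capacity;
        }

        @Override
        public boolean emit(String name, long ino) {
            if (buf.size() >= capacity) {
                return false; // no room left, ask the iterator to stop
            }
            buf.add(ino + " " + name);
            return true;
        }
    }

    /** Plays the role of the file system's iterate step: it only sees the actor. */
    static int iterateDir(List<String> names, List<Long> inodes, Actor actor) {
        int emitted = 0;
        for (int i = 0; i < names.size(); i++) {
            if (!actor.emit(names.get(i), inodes.get(i))) {
                break;
            }
            emitted++;
        }
        return emitted;
    }

    public static void main(String[] args) {
        Callback cb = new Callback(2);
        int n = iterateDir(Arrays.asList("1.txt", "ping", "Test01Sum"),
                Arrays.asList(12L, 13L, 14L), cb);
        System.out.println(n + " entries: " + cb.buf); // 2 entries: [12 1.txt, 13 ping]
    }
}
```

The point of the split is that the directory walker never needs to know where the data goes; it just stops as soon as the actor reports that the buffer is full.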
ext4_dx_readdir
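In the figure below, the first if block fills info for the iteration; roughly, it loads the information for each file in the current directory into info.root
The rest of the function walks the whole info.root tree and calls call_filldir to emit each file's information; call_filldir delegates to the getdents_callback.ctx.actor described above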
htree_dirblock_to_tree
The code below iterates over the entries in dir and calls ext4_htree_store_dirent to put each file's information into info.root
ext4_htree_store_dirent
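Copies the information of the given file into info.root
info.root is carried in file->private_data; the surrounding handling is in the ext4_dx_readdir function
Here the current entry's hash, minor_hash, inode, name, name_len and file_type are packed into an fname, which is then inserted into info.root (a red-black tree) ordered by hash and then minor_hash
We will verify this ordering with a test case later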
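As a rough illustration of that ordering, the sketch below inserts a few entries into a sorted set keyed by (hash, minor_hash). The hash values are copied from the gdb log later in this article; the Fname class here is a hypothetical stand-in with only the fields needed for ordering, and a TreeSet would drop entries whose two hashes both collide, which is a simplification of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class HashOrderSketch {
    /** Hypothetical stand-in for the fname node; only what the ordering needs. */
    static class Fname {
        final long hash;
        final long minorHash;
        final String name;

        Fname(long hash, long minorHash, String name) {
            this.hash = hash;
            this.minorHash = minorHash;
            this.name = name;
        }
    }

    // Order by hash first, then by minor_hash. The values are unsigned 32-bit
    // numbers held in a long, so a plain long comparison is sufficient.
    static final Comparator<Fname> BY_HASH =
            Comparator.comparingLong((Fname f) -> f.hash)
                    .thenComparingLong(f -> f.minorHash);

    /** Inserts all entries into the sorted set and returns the names in tree order. */
    static List<String> sortedNames(List<Fname> entries) {
        TreeSet<Fname> root = new TreeSet<>(BY_HASH);
        root.addAll(entries);
        List<String> names = new ArrayList<>();
        for (Fname f : root) {
            names.add(f.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Fname> entries = Arrays.asList(
                new Fname(1798131950L, 3795156168L, ".."),
                new Fname(2638309314L, 220112255L, "lost+found"),
                new Fname(4147007512L, 1467808689L, "1.txt"),
                new Fname(220227180L, 2471538305L, "Test02Malloc"));
        System.out.println(sortedNames(entries)); // [Test02Malloc, .., lost+found, 1.txt]
    }
}
```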
call_filldir
Next we are back in the outer ext4_dx_readdir, which iterates over the file entries of the directory and invokes the callback to fill the data into buf
filldir
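filldir is called to write each entry of the current directory into buf.current_dir
buf.current_dir is a user-space dirent passed in as an argument, which is why the __put_user function is used here
It fills the dirent with inode_no, record_len, d->name, 0 (the string terminator), file_type, offset and so on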
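The record layout can be sketched as follows. This assumes a 64-bit little-endian layout of 8 bytes for the inode number, 8 bytes for the offset and 2 bytes for the record length, followed by the name, and it assumes the record length is header + name + 2 (terminator and type byte) rounded up to 8 bytes. Those assumptions are consistent with the record length 32 seen for the 12-character name Test02Malloc in this article, but the class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class DirentLayoutSketch {
    // Assumed layout: d_ino (8 bytes), d_off (8 bytes), d_reclen (2 bytes), then d_name.
    static final int NAME_OFFSET = 8 + 8 + 2;

    /** Header + name + terminator + type byte, rounded up to a multiple of 8 (assumption). */
    static int recordLength(int nameLen) {
        int raw = NAME_OFFSET + nameLen + 2;
        return (raw + 7) & ~7;
    }

    /** Builds one record the way the text describes the fill order. */
    static byte[] encode(long ino, String name, byte fileType) {
        byte[] nameBytes = name.getBytes(StandardCharsets.US_ASCII);
        int reclen = recordLength(nameBytes.length);
        ByteBuffer buf = ByteBuffer.allocate(reclen).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(ino);               // inode_no
        buf.putLong(0L);                // offset, left as 0 in this sketch
        buf.putShort((short) reclen);   // record_len
        buf.put(nameBytes);             // d->name
        buf.put((byte) 0);              // string terminator
        buf.put(reclen - 1, fileType);  // file_type, assumed to sit in the last byte
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] rec = encode(18L, "Test02Malloc", (byte) 8); // 8 is DT_REG, a regular file
        System.out.println("reclen=" + rec.length + " first byte=" + rec[0]);
    }
}
```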
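Let's look at the data after it has been filled in
The bytes in memory can be matched to the fields one by one; the details are omitted here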
(gdb) x /30bc 0xaa3a40
0xaa3a40: 18 '\022' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000'
0xaa3a48: 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000' 0 '\000'
0xaa3a50: 32 ' ' 0 '\000' 84 'T' 101 'e' 115 's' 116 't' 48 '0' 50 '2'
0xaa3a58: 77 'M' 97 'a' 108 'l' 108 'l' 111 'o' 99 'c'
Confirming the inode_no of Test02Malloc: it is indeed 18
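For readers who want to match the bytes to the fields, the dump above can be decoded with a few lines of Java. The sketch assumes the same layout as the dump suggests (8-byte inode number, 8-byte offset, 2-byte record length, then the name, all little-endian); the class name is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class DirentDumpDecoder {
    // The 30 bytes printed by gdb at 0xaa3a40, copied from the dump above.
    static final byte[] DUMP = {
        18, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0,
        32, 0, 84, 101, 115, 116, 48, 50,
        77, 97, 108, 108, 111, 99
    };

    /** Decodes the fixed header and the name that follows it. */
    static String decode(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        long ino = buf.getLong();               // bytes 0..7
        long off = buf.getLong();               // bytes 8..15
        int reclen = buf.getShort() & 0xFFFF;   // bytes 16..17
        int start = buf.position();
        int end = start;
        // The name runs until the terminator, or until the dump ends.
        while (end < raw.length && raw[end] != 0) {
            end++;
        }
        String name = new String(raw, start, end - start, StandardCharsets.US_ASCII);
        return "d_ino=" + ino + " d_off=" + off + " d_reclen=" + reclen + " d_name=" + name;
    }

    public static void main(String[] args) {
        // d_ino=18 d_off=0 d_reclen=32 d_name=Test02Malloc
        System.out.println(decode(DUMP));
    }
}
```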
(initramfs) ls -ail /jerry/
29 -rwxr-xr-x 1 8912 Test19UdpClient02
32 -rw-r--r-- 1 860 Test19UdpServer.c
21 -rw-r--r-- 1 1036 Test05SocketClient.c
12 -rw-r--r-- 1 7 1.txt
35 -rw-r--r-- 1 2 4.txt
16 -rwxr-xr-x 1 913944 Test01SumStatic
22 -rwxr-xr-x 1 9232 Test05SocketServer
14 -rwxr-xr-x 1 9784 Test01Sum
11 drwx------ 2 12288 lost+found
2 drwxr-xr-x 3 1024 .
25 -rw-r--r-- 1 2213 Test18UdpClient.c
30 -rw-r--r-- 1 790 Test19UdpClient.c
28 -rwxr-xr-x 1 8912 Test19UdpClient
24 -rwxr-xr-x 1 13656 Test18UdpClient
13 -rwxr-xr-x 1 44168 ping
23 -rw-r--r-- 1 1839 Test05SocketServer.c
34 -rw-r--r-- 1 2 3.txt
17 -rw-r--r-- 1 8828 Test01Sum.txt
19 -rw-r--r-- 1 112 Test02Malloc.c
36 -rw-r--r-- 1 0 2.xml
1 drwxr-xr-x 18 0 ..
33 -rw-r--r-- 1 2 2.txt
31 -rwxr-xr-x 1 9008 Test19UdpServer
20 -rwxr-xr-x 1 9208 Test05SocketClient
15 -rw-r--r-- 1 127 Test01Sum.c
27 -rw-r--r-- 1 2139 Test18UdpServer.c
26 -rwxr-xr-x 1 13656 Test18UdpServer
18 -rwxr-xr-x 1 9898 Test02Malloc
(initramfs)
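Looking back at how ls uses this
It uses the file_type, file_name, inode_no and so on obtained from the system call
File order returned by readdir
Below, the files in /jerry and their hashes are extracted from the gdb log
They are then sorted by hash and printed in that order, so that we can compare the result with the order shown by "ls -l /jerry"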
import java.util.Map;
import java.util.TreeMap;

/**
* Test13ResolveFileAndHash
*
* @author Jerry.X.He <970655147@qq.com>
* @version 1.0
* @date 2022-08-06 10:58
*/
public class Test13ResolveFileAndHash {
// Test13ResolveFileAndHash
public static void main(String[] args) {
String lines = "(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
". 2361201130\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=1798131950, minor_hash=3795156168, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
".. 1798131950\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2638309314, minor_hash=220112255, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
"lost+found 2638309314\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=4147007512, minor_hash=1467808689, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
"1.txt 4147007512\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2218817754, minor_hash=2900089684, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
"ping\u000E 2218817754\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2722591116, minor_hash=3228507950, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
"Test01Sum 2722591116\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=582633220, minor_hash=3262287479, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
"Test01Sum.c 582633220\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=3631608018, minor_hash=2725415301, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
"Test01SumStatic 3631608018\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2142992528, minor_hash=928223017, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
"Test01Sum.txt 2142992528\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=220227180, minor_hash=2471538305, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
"Test02Malloc\u0013 220227180\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=1840751322, minor_hash=137392396, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"(gdb) printf \"%s %ld\", ent_name->name, hash\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=857331192, minor_hash=434642936, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test05SocketClient 857331192\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=4158704484, minor_hash=312643960, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test05SocketClient.c\u0016 4158704484\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=3462659828, minor_hash=2883930437, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test05SocketServer 3462659828\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2205009970, minor_hash=314116339, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test05SocketServer.c\u0018 2205009970\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2263127060, minor_hash=2266183803, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test18UdpClient 2263127060\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2342703042, minor_hash=2944388140, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test18UdpClient.c 2342703042\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=428460190, minor_hash=2134201002, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test18UdpServer 428460190\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=493987328, minor_hash=2169830099, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test18UdpServer.c 493987328\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2302229242, minor_hash=322069301, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test19UdpClient 2302229242\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=4262479554, minor_hash=1848801320, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test19UdpClient02 4262479554\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2304662458, minor_hash=1793028086, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test19UdpClient.c 2304662458\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=888054094, minor_hash=304564512, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test19UdpServer 888054094\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=4198063328, minor_hash=3673609025, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"Test19UdpServer.c 4198063328\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=1673380854, minor_hash=394531314, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"2.txt 1673380854\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=2170292718, minor_hash=758117636, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"3.txt 2170292718\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=4053642864, minor_hash=2642363966, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"4.txt 4053642864\n" +
"Breakpoint 6, ext4_htree_store_dirent (dir_file=<optimized out>, hash=1832402012, minor_hash=2399099147, dirent=<optimized out>, ent_name=0xffffc9000074bcb8) at fs/ext4/dir.c:459\n" +
"459\t\tnew_fn->name[ent_name->len] = 0;\n" +
"2.xml 1832402012 \n";
        // Map each hash to its file name; TreeMap keeps the keys in ascending order.
        Map<Long, String> hash2FileName = new TreeMap<>();
        for (String line : lines.split("\n")) {
            // Skip the breakpoint banner, the echoed source line and the printf command itself.
            if (line.contains("ext4_htree_store_dirent")) {
                continue;
            }
            if (line.contains("new_fn->name")) {
                continue;
            }
            if (line.contains("printf")) {
                continue;
            }
            // What remains has the form "<name> <hash>".
            String[] splits = line.split("\\s+");
            hash2FileName.put(Long.parseLong(splits[1]), splits[0]);
        }
        // Print the file names in ascending hash order.
        for (Map.Entry<Long, String> entry : hash2FileName.entrySet()) {
            System.out.println(entry.getValue());
        }
    }
}
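The resulting file order is listed below. It is nearly the same as the "ls -l /jerry" order, only reversed. (Test02Malloc.c does not appear in the list because its printf output line is missing from the captured gdb log, so the list has 27 entries against 28 in the ls output.)
The order in which readdir returns the files matches the order produced by Test13ResolveFileAndHash above, so the difference is probably introduced by the ls processing around it
For comparison, ls in coreutils can sort by file name, by extension, by file size, by version or by time
The ls in the QEMU virtual machine simply uses a different ordering, and a somewhat odd one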
Test02Malloc
Test18UdpServer
Test18UdpServer.c
Test01Sum.c
Test05SocketClient
Test19UdpServer
2.txt
..
2.xml
Test01Sum.txt
3.txt
Test05SocketServer.c
ping
Test18UdpClient
Test19UdpClient
Test19UdpClient.c
Test18UdpClient.c
.
lost+found
Test01Sum
Test05SocketServer
Test01SumStatic
4.txt
1.txt
Test05SocketClient.c
Test19UdpServer.c
Test19UdpClient02
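The "same order, only reversed" observation can be checked mechanically. The sketch below copies the hash-sorted list above and the "ls -ail /jerry" listing from earlier in this article, drops Test02Malloc.c (which is absent from the gdb log), reverses the ls order and compares the two. The class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class OrderCheck {
    // Output of Test13ResolveFileAndHash, ascending by hash.
    static final List<String> HASH_ORDER = Arrays.asList(
            "Test02Malloc", "Test18UdpServer", "Test18UdpServer.c", "Test01Sum.c",
            "Test05SocketClient", "Test19UdpServer", "2.txt", "..", "2.xml",
            "Test01Sum.txt", "3.txt", "Test05SocketServer.c", "ping",
            "Test18UdpClient", "Test19UdpClient", "Test19UdpClient.c",
            "Test18UdpClient.c", ".", "lost+found", "Test01Sum",
            "Test05SocketServer", "Test01SumStatic", "4.txt", "1.txt",
            "Test05SocketClient.c", "Test19UdpServer.c", "Test19UdpClient02");

    // Names in the order printed by "ls -ail /jerry" in the virtual machine.
    static final List<String> LS_ORDER = Arrays.asList(
            "Test19UdpClient02", "Test19UdpServer.c", "Test05SocketClient.c", "1.txt",
            "4.txt", "Test01SumStatic", "Test05SocketServer", "Test01Sum",
            "lost+found", ".", "Test18UdpClient.c", "Test19UdpClient.c",
            "Test19UdpClient", "Test18UdpClient", "ping", "Test05SocketServer.c",
            "3.txt", "Test01Sum.txt", "Test02Malloc.c", "2.xml", "..", "2.txt",
            "Test19UdpServer", "Test05SocketClient", "Test01Sum.c",
            "Test18UdpServer.c", "Test18UdpServer", "Test02Malloc");

    /** True if the ls order, minus the entry missing from the log, is the hash order reversed. */
    static boolean lsIsReversedHashOrder() {
        List<String> ls = new ArrayList<>(LS_ORDER);
        ls.remove("Test02Malloc.c"); // its output line is missing from the gdb log
        Collections.reverse(ls);
        return ls.equals(HASH_ORDER);
    }

    public static void main(String[] args) {
        System.out.println(lsIsReversedHashOrder()); // true
    }
}
```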
The readdir order, sampling the first two entries
The order from "ls -ail /jerry"
(initramfs) ls -ail /jerry
29 -rwxr-xr-x 1 8912 Test19UdpClient02
32 -rw-r--r-- 1 860 Test19UdpServer.c
21 -rw-r--r-- 1 1036 Test05SocketClient.c
12 -rw-r--r-- 1 7 1.txt
35 -rw-r--r-- 1 2 4.txt
16 -rwxr-xr-x 1 913944 Test01SumStatic
22 -rwxr-xr-x 1 9232 Test05SocketServer
14 -rwxr-xr-x 1 9784 Test01Sum
11 drwx------ 2 12288 lost+found
2 drwxr-xr-x 3 1024 .
25 -rw-r--r-- 1 2213 Test18UdpClient.c
30 -rw-r--r-- 1 790 Test19UdpClient.c
28 -rwxr-xr-x 1 8912 Test19UdpClient
24 -rwxr-xr-x 1 13656 Test18UdpClient
13 -rwxr-xr-x 1 44168 ping
23 -rw-r--r-- 1 1839 Test05SocketServer.c
34 -rw-r--r-- 1 2 3.txt
17 -rw-r--r-- 1 8828 Test01Sum.txt
19 -rw-r--r-- 1 112 Test02Malloc.c
36 -rw-r--r-- 1 0 2.xml
1 drwxr-xr-x 18 0 ..
33 -rw-r--r-- 1 2 2.txt
31 -rwxr-xr-x 1 9008 Test19UdpServer
20 -rwxr-xr-x 1 9208 Test05SocketClient
15 -rw-r--r-- 1 127 Test01Sum.c
27 -rw-r--r-- 1 2139 Test18UdpServer.c
26 -rwxr-xr-x 1 13656 Test18UdpServer
18 -rwxr-xr-x 1 9898 Test02Malloc
The help text of the ls command in the debug virtual machine is as follows
(initramfs) ls -help
ls: invalid option -- 'h'
BusyBox v1.22.1 (Ubuntu 1:1.22.0-15ubuntu1) multi-call binary.
Usage: ls [-1AaCxdLHFplins] [FILE]...
List directory contents
-1 One column output
-a Include entries which start with .
-A Like -a, but exclude . and ..
-C List by columns
-x List by lines
-d List directory entries instead of contents
-L Follow symlinks
-H Follow symlinks on command line
-p Append / to dir entries
-F Append indicator (one of */=@|) to entries
-l Long listing format
-i List inode numbers
-n List numeric UIDs and GIDs instead of names
-s List allocated blocks
The order obtained on the Ubuntu host is different yet again
root@ubuntu:/jerryDisk/linux-4.10.14# ls -ail images/share/
total 1075
2 drwxr-xr-x 3 root root 1024 May 4 00:57 .
414752 drwxr-xr-x 5 root root 4096 May 4 00:58 ..
12 -rw-r--r-- 1 root root 7 May 4 00:57 1.txt
33 -rw-r--r-- 1 root root 2 Jul 30 19:03 2.txt
36 -rw-r--r-- 1 root root 0 Jul 30 20:13 2.xml
34 -rw-r--r-- 1 root root 2 Jul 30 19:03 3.txt
35 -rw-r--r-- 1 root root 2 Jul 30 19:03 4.txt
14 -rwxr-xr-x 1 root root 9784 May 4 00:57 Test01Sum
15 -rw-r--r-- 1 root root 127 May 4 00:57 Test01Sum.c
17 -rw-r--r-- 1 root root 8828 May 4 00:57 Test01Sum.txt
16 -rwxr-xr-x 1 root root 913944 May 4 00:57 Test01SumStatic
18 -rwxr-xr-x 1 root root 9898 Jul 30 19:05 Test02Malloc
19 -rw-r--r-- 1 root root 112 May 4 00:57 Test02Malloc.c
20 -rwxr-xr-x 1 root root 9208 May 4 00:57 Test05SocketClient
21 -rw-r--r-- 1 root root 1036 May 4 00:57 Test05SocketClient.c
22 -rwxr-xr-x 1 root root 9232 May 4 00:57 Test05SocketServer
23 -rw-r--r-- 1 root root 1839 May 4 00:57 Test05SocketServer.c
24 -rwxr-xr-x 1 root root 13656 May 4 00:57 Test18UdpClient
25 -rw-r--r-- 1 root root 2213 May 4 00:57 Test18UdpClient.c
26 -rwxr-xr-x 1 root root 13656 May 4 00:57 Test18UdpServer
27 -rw-r--r-- 1 root root 2139 May 4 00:57 Test18UdpServer.c
28 -rwxr-xr-x 1 root root 8912 May 4 00:57 Test19UdpClient
30 -rw-r--r-- 1 root root 790 May 4 00:57 Test19UdpClient.c
29 -rwxr-xr-x 1 root root 8912 May 4 00:57 Test19UdpClient02
31 -rwxr-xr-x 1 root root 9008 May 4 00:57 Test19UdpServer
32 -rw-r--r-- 1 root root 860 May 4 00:57 Test19UdpServer.c
11 drwx------ 2 root root 12288 May 4 00:56 lost+found
13 -rwxr-xr-x 1 root root 44168 May 4 00:57 ping
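The host listing looks like a plain byte-wise sort by name: digits first, then upper case, then lower case. That is an inference from the output above, not something confirmed against the host's locale settings. A small sketch with a few of the names shows that Java's natural String order, which is byte order for ASCII names, gives the same sequence; the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class HostOrderCheck {
    /** Returns a copy of the names sorted by code unit, i.e. byte order for ASCII. */
    static List<String> byteOrderSort(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "ping", "Test19UdpClient02", "lost+found",
                "Test19UdpClient.c", "2.xml", "Test19UdpClient", "2.txt");
        // [2.txt, 2.xml, Test19UdpClient, Test19UdpClient.c, Test19UdpClient02, lost+found, ping]
        System.out.println(byteOrderSort(sample));
    }
}
```

Note that Test19UdpClient.c sorts before Test19UdpClient02 because '.' has a smaller code than '0', which matches inodes 30 and 29 appearing in that order in the host listing.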
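The hash mentioned above is computed by ext4fs_dirhash in fs/ext4/hash.c
Where the file order of readdir/ls matters
This ties into the question "How does WebappClassloader load classes?" in the post below, where the class loading order depends on File.list
40 How does a classloader load a class when several jars on the classpath contain a class with the same qualified name (蓝风9's blog, CSDN)
File.list is implemented by FileSystem.list
whose implementation in turn depends on the concrete readdir implementation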
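Since File.list makes no promise about the order of the returned names, code that needs a reproducible order has to sort the result itself. A minimal sketch, with a hypothetical class and method name:

```java
import java.io.File;
import java.util.Arrays;

class SortedListing {
    /** Returns the entry names of dir in sorted order, or an empty array if dir cannot be listed. */
    static String[] listSorted(File dir) {
        String[] names = dir.list(); // order is whatever the platform hands back
        if (names == null) {
            return new String[0];    // not a directory, or an I/O error occurred
        }
        Arrays.sort(names);          // impose an order that does not depend on the file system
        return names;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String name : listSorted(dir)) {
            System.out.println(name);
        }
    }
}
```

Sorting at the call site keeps the result the same across file systems, at the cost of no longer reflecting the on-disk order.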
End