实验要求:
1.在内核中先申请一个页面,使用内核提供的函数,按照寻页的步骤一步步的找到物理地址。这些步骤就相当于我们手动的模拟了mmu的寻页过程。(paging_lowmem.c)
2.通过mmap将物理内存的数据映射到一个设备文件中。
通过访问该设备就可以实现访问物理内存的功能。(dram.c)
3.通过和dram内核模块通信,在用户态用想要的格式查看物理内存中的内容。(fileview.c)
实验原理:
1.64位Linux系统虚拟地址转物理地址的原理:
地址转换的过程分为两部分,分段和分页。分段机制简单的来说是将进程的代码、数据、栈分在不同的虚拟地址段上,从而避免进程间的互相影响。分段之前的地址我们称之为逻辑地址,它有两部分组成,高位的段选择符和低位的段内偏移。在分段时先用段选择符在相应的段描述符表中找到段描述符,也就是某一个段的基地址,再加上段内偏移量就得到了对应的线性地址,线性地址也称之为虚拟地址。
而在实际的应用中,Linux为了增加可移植性并没有完整的使用分段机制,它让所有的段都指向相同的段地址范围,段的基地址都为0,这样逻辑地址和线性地址在数值上就相同了。所以,这里分析的重点在分页,也就是由线性地址到物理地址的转换过程。
Linux四级页表:
Linux64位线性地址:
其中,PGD 是页全局目录,PUD 是页上级目录,PMD 是页中间目录,PTE是页表,分别占了9位,而页内偏移占12位,共计48位,剩下的高16位保留下来,作为一个扩展。
在这种情况下,页面的大小都为4kb,每一个页表项大小为8bit,整个页表可以映射的空间是256TB。
在4.15的内核中,Linux已经在页全局目录和页上级目录之间又增加了一个新的页目录,叫做p4d页目录。这个页目录同32位中的情况一样,现在还未被使用,它的页目录项只有一个,线性地址中也没有它的索引位。
下图是实验中得到的结果:
从上图中我们可以看到PGD_SHIFT和PUD_SHIFT都是39,这也就意味着在线性地址中P4D这个字段是空的,我们也可以看到P4D的页目录项是1,这就说明虽然Linux现在使用的5级页表模型,但是实际上使用的页表只有4个。
关于偏移量,由于每个页表大小为8bit,所以我们在计算偏移量的时候首先得把偏移量乘以8,然后再转换为16进制。
所以计算上图的偏移量为:
Cr3寄存器用来保存当前进程的页全局目录的基地址,该地址加上pdg偏移量可以得出pdg页目录地址(1cc029d0),由于页内偏移占12,所以pgd低12位(后三位)取0得到基地址,将pud偏移量拼接到后边就可以得出pud页目录地址(1cc03cf8)。以此类推,将pud页目录地址后三位取0,把pmd偏移量拼接到后边就可以得出pmd页目录地址(2b78ad8),将pmd页目录地址后三位取0,把pte偏移量拼接到后边就可以得出pte页表地址(2b601163),后三位取0得到该页表项物理地址(paddr),其后9位(2b601000)是物理地址,其他位未保护位。
2.内核模块编码
图1
图2
Linux 内核模块(LKM)是一些在启动的操作系统内核需要时可以载入内核执行的代码块,不需要时由操作系统卸载。它们扩展了操作系统内核功能却不需要重新编译内核、启动系统。如果没有内核模块,就不得不反复编译生成操作系统的内核镜像来加入新功能,当附加的功能很多时,还会使内核变得臃肿。一个Linux 内核模块主要由以下几个部分组成:
必须要做:
(1) 模块加载函数:当通过insmod命令加载内核模块时,模块的加载函数会自动被内核执行,完成本模块相关初始化工作。比如,sudo insmod dram.ko
(2) 模块卸载函数:当通过rmmod 命令卸载模块时,模块的卸载函数会自动被内核执行,完成与模块加载函数相反的功能。比如,sudo rmmod dram.ko
(3) 模块许可证声明:模块许可证(LICENCE)声明描述内核模块的许可权限,如果不声明LICENCE,模块被加载时将收到内核被污染的警告。大多数情况下,内核模块应遵循GPL 兼容许可权。比如上图图2的MODULE_LICENSE(“GPL”)语句。
可以选择做:
(4) 模块参数:模块参数是模块被加载的时候可以被传递给他的值,它本身对应模块内部的全局变量。
(5) 模块导出符号:内核模块可以导出符号(symbol,对应于函数或变量),这样其他模块可以使用本模块中的变量或函数。
(6) 模块作者等信息声明。比如MODULE_LICENSE(“a student”)表明作者是一个学生。
一个内核模块至少包含两个函数,模块被加载时执行的初始化函数init_module()和模块被卸载时执行的结束函数cleanup_module()。在最新内核稳定版本2.6 中,两个函数可以起任意的名字,通过宏module_init()和module_exit()注册调用要编译内核模块,把代码嵌进内核空间,首先要获取内核源代码,且版本必需与当前正在运行的版本一致。比如上图图2中的module_init(v2p_init)是注册初始化模块,module_exit(v2p_exit) 是注册卸载模块
各语句的作用:
在Makefile文件中(图1),
obj-m :=产生模块的目标文件
PWD :=和KDIR := 指定内核代码目录(内含顶层makefile)
make -C $(KDIR) M=$(PWD) 中M=:让makefile构Arial之前返回发到模块源代码目录,然后目标指向obj-m设定的模块
clean语句表示执行make clean后删除后缀名为图中的文件
在内核.c源码文件中:
一般要引入基本的内核函数库:
#include <linux/module.h>
#include <linux/kernel.h>
还要编写模块初始化函数(比如上图图2的v2p_init函数)和模块清理和退出函数(v2p_exit)
执行过程:
进入.c文件和Makefile所在目录(两者在同一目录)
首先执行sudo make进行编译,生成.o、.ko等文件,然后执行sudo insmod .ko加载.ko模块到内核中,可以用lsmod命令查看我们编写的模块是否插入成功,然后利用dmesg命令查看内核打印的消息。
如果需要卸载模块,执行sudo rmmod .ko命令。
代码参考:
dram.c:
#include <linux/module.h> // for module_init()
#include <linux/highmem.h> // for kmap(), kunmap()
#include <linux/uaccess.h> // for copy_to_user()
char modname[] = "dram"; // for displaying driver's name
int my_major = 85; // note static major assignment
unsigned long dram_size; // total bytes of system memory
loff_t my_llseek( struct file *file, loff_t offset, int whence );
ssize_t my_read( struct file *file, char *buf, size_t count, loff_t *pos );
struct file_operations
my_fops = {
owner: THIS_MODULE,
llseek: my_llseek,
read: my_read,
};
static int __init dram_init( void )
{
printk( "<1>\nInstalling \'%s\' module ", modname );
printk( "(major=%d)\n", my_major );
dram_size = 0x25f5ffff8;
printk( "<1> ramtop=%08lX (%lu MB)\n", dram_size, dram_size >> 20 );
return register_chrdev( my_major, modname, &my_fops );
}
static void __exit dram_exit( void )
{
unregister_chrdev( my_major, modname );
printk( "<1>Removing \'%s\' module\n", modname );
}
ssize_t my_read( struct file *file, char *buf, size_t count, loff_t *pos )
{
struct page *pp;
void *from;
int page_number, page_indent, more;
// we cannot read beyond the end-of-file
if ( *pos >= dram_size ) return 0;
// determine which physical page to temporarily map
// and how far into that page to begin reading from
page_number = *pos / PAGE_SIZE;
page_indent = *pos % PAGE_SIZE;
// map the designated physical page into kernel space
/*If kerel vesion is 2.6.32 or later, please use pfn_to_page() to get page, and include
asm-generic/memory_model.h*/
pp = pfn_to_page( page_number);
from = kmap( pp ) + page_indent;
// cannot reliably read beyond the end of this mapped page
if ( page_indent + count > PAGE_SIZE ) count = PAGE_SIZE - page_indent;
// now transfer count bytes from mapped page to user-supplied buffer
more = copy_to_user( buf, from, count );
// ok now to discard the temporary page mapping
kunmap( pp );
// an error occurred if less than count bytes got copied
if ( more ) return -EFAULT;
// otherwise advance file-pointer and report number of bytes read
*pos += count;
return count;
}
loff_t my_llseek( struct file *file, loff_t offset, int whence )
{
loff_t newpos = -1;
switch( whence )
{
case 0: newpos = offset; break; // SEEK_SET
case 1: newpos = file->f_pos + offset; break; // SEEK_CUR
case 2: newpos = dram_size + offset; break; // SEEK_END
}
if (( newpos < 0 )||( newpos > dram_size )) return -EINVAL;
file->f_pos = newpos;
return newpos;
}
MODULE_LICENSE("GPL");
module_init( dram_init );
module_exit( dram_exit );
dram.c的Makefile文件:
ifneq ($(KERNELRELEASE),)
obj-m := dram.o
else
KDIR := /lib/modules/$(shell uname -r)/build
PWD := $(shell pwd)
default:
#$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules
$(MAKE) -C $(KDIR) M=$(PWD) modules
rm -r -f .tmp_versions *.mod.c .*.cmd *.o *.symvers
endif
fileview.cpp:
#include <stdio.h> // for printf(), perror(), fflush()
#include <fcntl.h> // for open()
#include <string.h> // for strncpy()
#include <unistd.h> // for read(), lseek64()
#include <stdlib.h> // for exit()
#include <termios.h> // for tcgetattr(), tcsetattr()
#define MAXNAME 80
#define BUFHIGH 16
#define BUFWIDE 16
#define BUFSIZE 256
#define ROW 6
#define COL 2
#define KB_SEEK 0x0000000A
#define KB_QUIT 0x0000001B
#define KB_BACK 0x0000007F
#define KB_HOME 0x00315B1B
#define KB_LNUP 0x00415B1B
#define KB_PGUP 0x00355B1B
#define KB_LEFT 0x00445B1B
#define KB_RGHT 0x00435B1B
#define KB_LNDN 0x00425B1B
#define KB_PGDN 0x00365B1B
#define KB_END 0x00345B1B
#define KB_DEL 0x00335B1B
char progname[] = "FILEVIEW";
char filename[ MAXNAME + 1 ];
char buffer[ BUFSIZE ];
char outline[ 80 ];
int main( int argc, char *argv[] )
{
// setup the filename (if supplied), else terminate
if ( argc > 1 ) strncpy( filename, argv[1], MAXNAME );
else { fprintf( stderr, "argument needed\n" ); exit(1); }
// open the file for reading
int fd = open( filename, O_RDONLY );
if ( fd < 0 ) { perror( filename ); exit(1); }
// obtain the filesize (if possible)
long long filesize = lseek64( fd, 0LL, SEEK_END );
if ( filesize < 0LL )
{
fprintf( stderr, "cannot locate \'end-of-file\' \n" );
exit(1);
}
long long incmin = ( 1LL << 8 );
long long incmax = ( 1LL << 36 );
long long posmin = 0LL;
long long posmax = (filesize - 241LL)&~0xF;
if ( posmax < posmin ) posmax = posmin;
// initiate noncanonical terminal input
struct termios tty_orig;
tcgetattr( STDIN_FILENO, &tty_orig );
struct termios tty_work = tty_orig;
tty_work.c_lflag &= ~( ECHO | ICANON ); // | ISIG );
tty_work.c_cc[ VMIN ] = 1;
tty_work.c_cc[ VTIME ] = 0;
tcsetattr( STDIN_FILENO, TCSAFLUSH, &tty_work );
printf( "\e[H\e[J" );
// display the legend
int i, j, k;
k = (77 - strlen( progname ))/2;
printf( "\e[%d;%dH %s ", 1, k, progname );
k = (77 - strlen( filename ))/2;
printf( "\e[%d;%dH\'%s\'", 3, k, filename );
char infomsg[ 80 ];
sprintf( infomsg, "filesize: %llu (=0x%013llX)", filesize, filesize );
k = (78 - strlen( infomsg ));
printf( "\e[%d;%dH%s", 24, k, infomsg );
fflush( stdout );
// main loop to navigate the file
long long pageincr = incmin;
long long lineincr = 16LL;
long long position = 0LL;
long long location = 0LL;
int format = 1;
int done = 0;
while ( !done )
{
// erase prior buffer contents
for (j = 0; j < BUFSIZE; j++) buffer[ j ] = ~0;
// restore 'pageincr' to prescribed bounds
if ( pageincr == 0LL ) pageincr = incmax;
else if ( pageincr < incmin ) pageincr = incmin;
else if ( pageincr > incmax ) pageincr = incmax;
// get current location of file-pointer position
location = lseek64( fd, position, SEEK_SET );
// try to fill 'buffer[]' with data from the file
char *where = buffer;
int to_read = BUFSIZE;
while ( to_read > 0 )
{
int nbytes = read( fd, where, to_read );
if ( nbytes <= 0 ) break;
to_read -= nbytes;
where += nbytes;
}
int datalen = BUFSIZE - to_read;
// display the data just read into the 'buffer[]' array
unsigned char *bp;
unsigned short *wp;
unsigned int *dp;
unsigned long long *qp;
for (i = 0; i < BUFHIGH; i++)
{
int linelen;
// draw the line-location (13-digit hexadecimal)
linelen = sprintf( outline, "%013llX ", location );
// draw the line in the selected hexadecimal format
switch ( format )
{
case 1: // 'byte' format
bp = (unsigned char*)&buffer[ i*BUFWIDE ];
for (j = 0; j < BUFWIDE; j++)
linelen += sprintf( outline+linelen,
"%02X ", bp[j] );
break;
case 2: // 'word' format
wp = (unsigned short*)&buffer[ i*BUFWIDE ];
for (j = 0; j < BUFWIDE/2; j++)
linelen += sprintf( outline+linelen,
" %04X ", wp[j] );
break;
case 4: // 'dword' format
dp = (unsigned int*)&buffer[ i*BUFWIDE ];
for (j = 0; j < BUFWIDE/4; j++)
linelen += sprintf( outline+linelen,
" %08X ", dp[j] );
break;
case 8: // 'qword' format
qp = (unsigned long long*)&buffer[ i*BUFWIDE ];
for (j = 0; j < BUFWIDE/8; j++)
linelen += sprintf( outline+linelen,
" %016llX ", qp[j] );
break;
case 16: // 'octaword'
qp = (unsigned long long*)&buffer[ i*BUFWIDE ];
linelen += sprintf( outline+linelen, " " );
linelen += sprintf( outline+linelen,
" %016llX%016llX ", qp[1], qp[0] );
linelen += sprintf( outline+linelen, " " );
break;
}
// draw the line in ascii format
for (j = 0; j < BUFWIDE; j++)
{
char ch = buffer[ i*BUFWIDE + j ];
if (( ch < 0x20 )||( ch > 0x7E )) ch = '.';
linelen += sprintf( outline+linelen, "%c", ch);
}
// transfer this output-line to the screen
printf( "\e[%d;%dH%s", ROW+i, COL, outline );
// advance 'location' for the next output-line
location += BUFWIDE;
}
printf( "\e[%d;%dH", 23, COL );
fflush( stdout );
// await keypress
long long inch = 0LL;
read( STDIN_FILENO, &inch, sizeof( inch ) );
printf( "\e[%d;%dH%60s", 23, COL, " " );
// interpret navigation or formatting command
inch &= 0x00FFFFFFLL;
switch ( inch )
{
// move to the file's beginning/ending
case 'H': case 'h':
case KB_HOME: position = posmin; break;
case 'E': case 'e':
case KB_END: position = posmax; break;
// move forward/backward by one line
case KB_LNDN: position += BUFWIDE; break;
case KB_LNUP: position -= BUFWIDE; break;
// move forward/packward by one page
case KB_PGDN: position += pageincr; break;
case KB_PGUP: position -= pageincr; break;
// increase/decrease the page-size increment
case KB_RGHT: pageincr >>= 4; break;
case KB_LEFT: pageincr <<= 4; break;
// reset the hexadecimal output-format
case 'B': case 'b': format = 1; break;
case 'W': case 'w': format = 2; break;
case 'D': case 'd': format = 4; break;
case 'Q': case 'q': format = 8; break;
case 'O': case 'o': format = 16; break;
// seek to a user-specified file-position
case KB_SEEK:
printf( "\e[%d;%dHAddress: ", 23, COL );
fflush( stdout );
{
char inbuf[ 16 ] = {0};
//tcsetattr( STDIN_FILENO, TCSANOW, &tty_orig );
int i = 0;
while ( i < 15 )
{
char ch = 0;
read( STDIN_FILENO, &ch, sizeof( ch ) );
ch &= 0xFFFFFF;
if ( ch == '\n' ) break;
if ( ch == KB_QUIT ) { inbuf[0] = 0; break; }
if ( ch == KB_LEFT ) ch = KB_BACK;
if ( ch == KB_DEL ) ch = KB_BACK;
if (( ch == KB_BACK )&&( i > 0 ))
{
inbuf[--i] = 0;
printf( "\b \b" );
fflush( stdout );
}
if (( ch < 0x20 )||( ch > 0x7E )) continue;
inbuf[ i++ ] = ch;
printf( "%c",ch );
fflush( stdout );
}
printf( "\e[%d;%dH%70s", 23, COL, " " );
fflush( stdout );
position = strtoull( inbuf, NULL, 16 );
position &= ~0xFLL; // paragraph align
}
break;
// program termination
case KB_QUIT: done = 1; break;
default:
printf( "\e[%d;%dHHit <ESC> to quit", 23, 2 );
}
fflush( stdout );
// insure that 'position' remains within bounds
if ( position < posmin ) position = posmin;
if ( position > posmax ) position = posmax;
}
// restore canonical terminal behavior
tcsetattr( STDIN_FILENO, TCSAFLUSH, &tty_orig );
printf( "\e[%d;%dH\e[0J\n", 23, 0 );
}
paging_lowmem.c:
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
#include <linux/mm_types.h>
#include <linux/sched.h>
#include <linux/export.h>
#include <linux/delay.h>
static unsigned long cr0,cr3;
static unsigned long vaddr = 0;
static void get_pgtable_macro(void)
{
cr0 = read_cr0();
cr3 = read_cr3_pa();
printk("cr0 = 0x%lx, cr3 = 0x%lx\n",cr0,cr3);
printk("PGDIR_SHIFT = %d\n", PGDIR_SHIFT);
printk("P4D_SHIFT = %d\n",P4D_SHIFT);
printk("PUD_SHIFT = %d\n", PUD_SHIFT);
printk("PMD_SHIFT = %d\n", PMD_SHIFT);
printk("PAGE_SHIFT = %d\n", PAGE_SHIFT);
printk("PTRS_PER_PGD = %d\n", PTRS_PER_PGD);
printk("PTRS_PER_P4D = %d\n", PTRS_PER_P4D);
printk("PTRS_PER_PUD = %d\n", PTRS_PER_PUD);
printk("PTRS_PER_PMD = %d\n", PTRS_PER_PMD);
printk("PTRS_PER_PTE = %d\n", PTRS_PER_PTE);
printk("PAGE_MASK = 0x%lx\n", PAGE_MASK);
}
static unsigned long vaddr2paddr(unsigned long vaddr)
{
pgd_t *pgd;
p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *pte;
unsigned long paddr = 0;
unsigned long page_addr = 0;
unsigned long page_offset = 0;
pgd = pgd_offset(current->mm,vaddr);
printk("pgd_val = 0x%lx, pgd_index = %lu\n", pgd_val(*pgd),pgd_index(vaddr));
if (pgd_none(*pgd)){
printk("not mapped in pgd\n");
return -1;
}
p4d = p4d_offset(pgd, vaddr);
printk("p4d_val = 0x%lx, p4d_index = %lu\n", p4d_val(*p4d),p4d_index(vaddr));
if(p4d_none(*p4d))
{
printk("not mapped in p4d\n");
return -1;
}
pud = pud_offset(p4d, vaddr);
printk("pud_val = 0x%lx, pud_index = %lu\n", pud_val(*pud),pud_index(vaddr));
if (pud_none(*pud)) {
printk("not mapped in pud\n");
return -1;
}
pmd = pmd_offset(pud, vaddr);
printk("pmd_val = 0x%lx, pmd_index = %lu\n", pmd_val(*pmd),pmd_index(vaddr));
if (pmd_none(*pmd)) {
printk("not mapped in pmd\n");
return -1;
}
pte = pte_offset_kernel(pmd, vaddr);
printk("pte_val = 0x%lx, ptd_index = %lu\n", pte_val(*pte),pte_index(vaddr));
if (pte_none(*pte)) {
printk("not mapped in pte\n");
return -1;
}
page_addr = pte_val(*pte) & PAGE_MASK;
page_offset = vaddr & ~PAGE_MASK;
paddr = page_addr | page_offset;
printk("page_addr = %lx, page_offset = %lx\n", page_addr, page_offset);
printk("vaddr = %lx, paddr = %lx\n", vaddr, paddr);
return paddr;
}
static int __init v2p_init(void)
{
unsigned long vaddr = 0 ;
printk("vaddr to paddr module is running..\n");
get_pgtable_macro();
printk("\n");
vaddr = __get_free_page(GFP_KERNEL);
if (vaddr == 0) {
printk("__get_free_page failed..\n");
return 0;
}
sprintf((char *)vaddr, "hello world from kernel");
printk("get_page_vaddr=0x%lx\n", vaddr);
vaddr2paddr(vaddr);
ssleep(600);
return 0;
}
static void __exit v2p_exit(void)
{
printk("vaddr to paddr module is leaving..\n");
free_page(vaddr);
}
module_init(v2p_init);
module_exit(v2p_exit);
MODULE_LICENSE("GPL");
paging_lowmem.c对应的Makefile文件:
obj-m:= paging_lowmem.o
PWD:= $(shell pwd)
KERNELDIR:= /lib/modules/$(shell uname -r)/build
#CURRENT_PATH:=$(shell pwd)
#LINUX_KERNEL:=$(shell uname -r)
#LINUX_KERNEL_PATH:=/usr/src/linux-headers-$(LINUX_KERNEL)
all:
make -C $(KERNELDIR) M=$(PWD) modules
#make -C $(LINUX_KERNEL_PATH) M=$(CURRENT_PATH) modules
clean:
#make -C $(LINUX_KERNEL_PATH) M=$(CURRENT_PATH) clean
@rm -rf *.o *.mod.c *.mod.o *.ko *.order *.symvers .*.cmd .tmp_versions