Linux System Programming, Day 04: File and Directory Operations
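- 1. File I/O
- 1.1 The open function
- 1.2 The close function
- 1.3 The read function
- 1.4 The write function
- 1.5 The lseek function
- 1.6 The errno variable
- 1.7 File example 1: reading and writing a file
- 1.8 File example 2: computing a file's size
- 1.9 File example 3: extending a file
- 1.10 File example 4: using perror
- 1.11 Testing blocking and non-blocking reads
- 2. Files and directories
- 2.1 File-related functions
- 2.2 Directory-related functions
- 2.3 The dup, dup2, and fcntl functions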
1. File I/O
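In the C language course we learned a family of standard C library functions for file operations, such as fopen, fclose, fread, fwrite, fscanf, and fprintf, all of which start with f. The file I/O functions in this section, by contrast, are Linux system calls. On Linux, fopen is implemented on top of the open system call, and fclose on top of the close system call.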
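Standard C functions and system calls are different things. A system call is a programming interface implemented by the operating system and exposed to applications, so a program that calls Linux system calls directly will generally not compile or run on a system that is not Linux. It becomes less portable and is no longer cross-platform. A program that uses only standard C functions can be built on any platform with a conforming C implementation.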
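When we called fopen earlier, it returned a pointer of type FILE *. The FILE object behind that pointer maintains three important pieces of state: a file descriptor, a file position, and a buffer. With glibc, BUFSIZ is 8192 bytes, although the buffer size a stream actually uses depends on the implementation and on the file. The Linux I/O system calls, by contrast, do no user-space buffering. File descriptors were already mentioned in the previous section: a file descriptor is simply a small integer of type int.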
When a process starts, three file descriptors are normally already open:
#define STDIN_FILENO 0
#define STDOUT_FILENO 1
#define STDERR_FILENO 2
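A newly opened file gets the lowest-numbered descriptor that is not currently in use in the process's file descriptor table. By default a process can usually have up to 1024 descriptors open at once; this is the soft RLIMIT_NOFILE limit, which can be inspected with ulimit -n and may be raised. Calling open opens or creates a file and returns a file descriptor.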
1.1 The open function
Key excerpts from the manual page:
SYNOPSIS
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
DESCRIPTION
The open() system call opens the file specified by pathname. If the specified file does not exist, it may optionally (if
O_CREAT is specified in flags) be created by open().
The return value of open() is a file descriptor, a small, nonnegative integer that is used in subsequent system calls
(read(2), write(2), lseek(2), fcntl(2), etc.) to refer to the open file. The file descriptor returned by a successful
call will be the lowest-numbered file descriptor not currently open for the process.
A call to open() creates a new open file description, an entry in the system-wide table of open files. The open file
description records the file offset and the file status flags (see below). A file descriptor is a reference to an open
file description; this reference is unaffected if pathname is subsequently removed or modified to refer to a different
file. For further details on open file descriptions, see NOTES.
The argument flags must include one of the following access modes: O_RDONLY, O_WRONLY, or O_RDWR. These request opening
the file read-only, write-only, or read/write, respectively.
In addition, zero or more file creation flags and file status flags can be bitwise-or'd in flags. The file creation
flags are O_CLOEXEC, O_CREAT, O_DIRECTORY, O_EXCL, O_NOCTTY, O_NOFOLLOW, O_TMPFILE, and O_TRUNC. The file status flags
are all of the remaining flags listed below. The distinction between these two groups of flags is that the file creation
flags affect the semantics of the open operation itself, while the file status flags affect the semantics of subsequent
I/O operations. The file status flags can be retrieved and (in some cases) modified; see fcntl(2) for details.
The full list of file creation flags and file status flags is as follows:
O_APPEND
The file is opened in append mode. Before each write(2), the file offset is positioned at the end of the file, as
if with lseek(2). The modification of the file offset and the write operation are performed as a single atomic
step.
O_APPEND may lead to corrupted files on NFS filesystems if more than one process appends data to a file at once.
This is because NFS does not support appending to a file, so the client kernel has to simulate it, which can't be
done without a race condition.
O_CREAT
If pathname does not exist, create it as a regular file.
The owner (user ID) of the new file is set to the effective user ID of the process.
The group ownership (group ID) of the new file is set either to the effective group ID of the process (System V
semantics) or to the group ID of the parent directory (BSD semantics). On Linux, the behavior depends on whether
the set-group-ID mode bit is set on the parent directory: if that bit is set, then BSD semantics apply; otherwise,
System V semantics apply. For some filesystems, the behavior also depends on the bsdgroups and sysvgroups mount
options described in mount(8)).
The mode argument specifies the file mode bits to be applied when a new file is created. This argument must be
supplied when O_CREAT or O_TMPFILE is specified in flags; if neither O_CREAT nor O_TMPFILE is specified, then mode is
ignored. The effective mode is modified by the process's umask in the usual way: in the absence of a default ACL,
the mode of the created file is (mode & ~umask). Note that this mode applies only to future accesses of the newly
created file; the open() call that creates a read-only file may well return a read/write file descriptor.
The following symbolic constants are provided for mode:
S_IRWXU 00700 user (file owner) has read, write, and execute permission
S_IRUSR 00400 user has read permission
S_IWUSR 00200 user has write permission
S_IXUSR 00100 user has execute permission
S_IRWXG 00070 group has read, write, and execute permission
S_IRGRP 00040 group has read permission
S_IWGRP 00020 group has write permission
S_IXGRP 00010 group has execute permission
S_IRWXO 00007 others have read, write, and execute permission
S_IROTH 00004 others have read permission
S_IWOTH 00002 others have write permission
S_IXOTH 00001 others have execute permission
According to POSIX, the effect when other bits are set in mode is unspecified. On Linux, the following bits are
also honored in mode:
S_ISUID 0004000 set-user-ID bit
S_ISGID 0002000 set-group-ID bit (see inode(7)).
S_ISVTX 0001000 sticky bit (see inode(7)).
O_TRUNC
If the file already exists and is a regular file and the access mode allows writing (i.e., is O_RDWR or O_WRONLY)
it will be truncated to length 0. If the file is a FIFO or terminal device file, the O_TRUNC flag is ignored.
Otherwise, the effect of O_TRUNC is unspecified.
RETURN VALUE
open(), openat(), and creat() return the new file descriptor, or -1 if an error occurred (in which case, errno is set
appropriately).
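The excerpt above outlines how open is used. As it shows, using open requires three headers: sys/types.h, sys/stat.h, and fcntl.h. The function has two calling forms:
int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);
The function opens a file and returns its file descriptor. The first two parameters are the same in both forms: pathname is the path of the file, and flags is a set of flags. Some of the important flags are: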
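Flag | Meaning |
---|---|
O_RDWR | Open for reading and writing |
O_RDONLY | Open read-only |
O_WRONLY | Open write-only |
O_APPEND | Append: every write goes to the end of the file |
O_CREAT | Create the file if it does not exist; requires the last argument, mode |
O_TRUNC | If the file exists, truncate it to length 0 |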
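When O_CREAT is specified, the third argument mode must be supplied. mode gives the permission bits requested for the new file (the permissions actually applied are mode & ~umask):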
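mode | Permission |
---|---|
S_IRWXU | Owner may read, write, and execute |
S_IRUSR | Owner may read |
S_IWUSR | Owner may write |
S_IXUSR | Owner may execute |
S_IRWXG | Group may read, write, and execute |
S_IRGRP | Group may read |
S_IWGRP | Group may write |
S_IXGRP | Group may execute |
S_IRWXO | Others may read, write, and execute |
S_IROTH | Others may read |
S_IWOTH | Others may write |
S_IXOTH | Others may execute |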
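Several flag values, and likewise several mode values, can be combined with the bitwise OR operator |.
Finally, the return value: on success open returns a new file descriptor; on error it returns -1 and sets errno accordingly.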
1.2 The close function
SYNOPSIS
#include <unistd.h>
int close(int fd);
DESCRIPTION
close() closes a file descriptor, so that it no longer refers to any file and may be reused. Any record locks (see
fcntl(2)) held on the file it was associated with, and owned by the process, are removed (regardless of the file descriptor
that was used to obtain the lock).
If fd is the last file descriptor referring to the underlying open file description (see open(2)), the resources
associated with the open file description are freed; if the file descriptor was the last reference to a file which has been
removed using unlink(2), the file is deleted.
RETURN VALUE
close() returns zero on success. On error, -1 is returned, and errno is set appropriately.
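The prototype is:
int close(int fd);
The function closes an open file descriptor. The parameter fd is the descriptor to close. It returns 0 on success, or -1 with errno set on failure. Note that close is declared in a different header from open: it requires unistd.h.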
1.3 The read function
SYNOPSIS
#include <unistd.h>
ssize_t read(int fd, void *buf, size_t count);
DESCRIPTION
read() attempts to read up to count bytes from file descriptor fd into the buffer starting at buf.
On files that support seeking, the read operation commences at the file offset, and the file offset is incremented by the
number of bytes read. If the file offset is at or past the end of file, no bytes are read, and read() returns zero.
If count is zero, read() may detect the errors described below. In the absence of any errors, or if read() does not check
for errors, a read() with a count of 0 returns zero and has no other effects.
According to POSIX.1, if count is greater than SSIZE_MAX, the result is implementation-defined; see NOTES for the upper
limit on Linux.
RETURN VALUE
On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this
number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example
because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading
from a pipe, or from a terminal), or because read() was interrupted by a signal. See also NOTES.
On error, -1 is returned, and errno is set appropriately. In this case, it is left unspecified whether the file position
(if any) changes.
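read is also declared in unistd.h. Its prototype is:
ssize_t read(int fd, void *buf, size_t count);
The function reads up to count bytes from the file referred to by fd into buf. fd is the file descriptor, buf is the address of the buffer, and count is the maximum number of bytes to read. The return value is the number of bytes actually read, which may be less than count; 0 means end of file. On failure it returns -1 and sets errno.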
1.4 The write function
SYNOPSIS
#include <unistd.h>
ssize_t write(int fd, const void *buf, size_t count);
DESCRIPTION
write() writes up to count bytes from the buffer starting at buf to the file referred to by the file descriptor fd.
The number of bytes written may be less than count if, for example, there is insufficient space on the underlying
physical medium, or the RLIMIT_FSIZE resource limit is encountered (see setrlimit(2)), or the call was interrupted by a
signal handler after having written less than count bytes. (See also pipe(7).)
For a seekable file (i.e., one to which lseek(2) may be applied, for example, a regular file) writing takes place at
the file offset, and the file offset is incremented by the number of bytes actually written. If the file was open(2)ed
with O_APPEND, the file offset is first set to the end of the file before writing. The adjustment of the file offset
and the write operation are performed as an atomic step.
POSIX requires that a read(2) that can be proved to occur after a write() has returned will return the new data. Note
that not all filesystems are POSIX conforming.
According to POSIX.1, if count is greater than SSIZE_MAX, the result is implementation-defined; see NOTES for the upper
limit on Linux.
RETURN VALUE
On success, the number of bytes written is returned. On error, -1 is returned, and errno is set to indicate the cause
of the error.
Note that a successful write() may transfer fewer than count bytes. Such partial writes can occur for various reasons;
for example, because there was insufficient space on the disk device to write all of the requested bytes, or because a
blocked write() to a socket, pipe, or similar was interrupted by a signal handler after it had transferred some, but
before it had transferred all of the requested bytes. In the event of a partial write, the caller can make another
write() call to transfer the remaining bytes. The subsequent call will either transfer further bytes or may result in
an error (e.g., if the disk is now full).
If count is zero and fd refers to a regular file, then write() may return a failure status if one of the errors below
is detected. If no errors are detected, or error detection is not performed, 0 will be returned without causing any
other effect. If count is zero and fd refers to a file other than a regular file, the results are not specified.
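write needs the same header as read. Its prototype is:
ssize_t write(int fd, const void *buf, size_t count);
The function writes up to count bytes from buf to the file referred to by fd. fd is the file descriptor, buf is the buffer holding the data to write, and count is the number of bytes to write. The return value is the number of bytes actually written, which may be less than count; on failure it returns -1 and sets errno.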
1.5 The lseek function
SYNOPSIS
#include <sys/types.h>
#include <unistd.h>
off_t lseek(int fd, off_t offset, int whence);
DESCRIPTION
lseek() repositions the file offset of the open file description associated with the file descriptor fd to the argument
offset according to the directive whence as follows:
SEEK_SET
The file offset is set to offset bytes.
SEEK_CUR
The file offset is set to its current location plus offset bytes.
SEEK_END
The file offset is set to the size of the file plus offset bytes.
lseek() allows the file offset to be set beyond the end of the file (but this does not change the size of the file).
If data is later written at this point, subsequent reads of the data in the gap (a "hole") return null bytes ('\0')
until data is actually written into the gap.
RETURN VALUE
Upon successful completion, lseek() returns the resulting offset location as measured in bytes from the beginning of
the file. On error, the value (off_t) -1 is returned and errno is set to indicate the error.
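The function needs the headers sys/types.h and unistd.h. Its prototype is:
off_t lseek(int fd, off_t offset, int whence);
The function changes the file offset: it sets the offset of the file referred to by fd to offset bytes relative to the position named by whence. fd is the file descriptor, offset is the displacement (it may be negative), and whence is the reference position: SEEK_SET (start of the file), SEEK_CUR (current offset), or SEEK_END (end of the file). On success it returns the resulting offset in bytes from the start of the file; on failure it returns (off_t) -1 and sets errno.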
1.6 The errno variable
ERRNO(3) Linux Programmer's Manual ERRNO(3)
NAME
errno - number of last error
SYNOPSIS
#include <errno.h>
DESCRIPTION
The <errno.h> header file defines the integer variable errno, which is set by system calls and some library functions
in the event of an error to indicate what went wrong.
errno
The value in errno is significant only when the return value of the call indicated an error (i.e., -1 from most system
calls; -1 or NULL from most library functions); a function that succeeds is allowed to change errno. The value of
errno is never set to zero by any system call or library function.
For some system calls and library functions (e.g., getpriority(2)), -1 is a valid return on success. In such cases, a
successful return can be distinguished from an error return by setting errno to zero before the call, and then, if the
call returns a status that indicates that an error may have occurred, checking to see if errno has a nonzero value.
errno is defined by the ISO C standard to be a modifiable lvalue of type int, and must not be explicitly declared;
errno may be a macro. errno is thread-local; setting it in one thread does not affect its value in any other thread.
Error numbers and names
Valid error numbers are all positive numbers. The <errno.h> header file defines symbolic names for each of the
possible error numbers that may appear in errno.
All the error names specified by POSIX.1 must have distinct values, with the exception of EAGAIN and EWOULDBLOCK, which
may be the same. On Linux, these two have the same value on all architectures.
The error numbers that correspond to each symbolic name vary across UNIX systems, and even across different
architectures on Linux. Therefore, numeric values are not included as part of the list of error names below. The perror(3)
and strerror(3) functions can be used to convert these names to corresponding textual error messages.
On any particular Linux system, one can obtain a list of all symbolic error names and the corresponding error numbers
using the errno(1) command (part of the moreutils package):
$ errno -l
EPERM 1 Operation not permitted
ENOENT 2 No such file or directory
ESRCH 3 No such process
EINTR 4 Interrupted system call
EIO 5 Input/output error
...
The errno(1) command can also be used to look up individual error numbers and names, and to search for errors using
strings from the error description, as in the following examples:
$ errno 2
ENOENT 2 No such file or directory
$ errno ESRCH
ESRCH 3 No such process
$ errno -s permission
EACCES 13 Permission denied
List of error names
In the list of the symbolic error names below, various names are marked as follows:
* POSIX.1-2001: The name is defined by POSIX.1-2001, and is defined in later POSIX.1 versions, unless otherwise
indicated.
* POSIX.1-2008: The name is defined in POSIX.1-2008, but was not present in earlier POSIX.1 standards.
* C99: The name is defined by C99.
Below is a list of the symbolic error names that are defined on Linux:
E2BIG Argument list too long (POSIX.1-2001).
EACCES Permission denied (POSIX.1-2001).
EADDRINUSE Address already in use (POSIX.1-2001).
EADDRNOTAVAIL Address not available (POSIX.1-2001).
EAFNOSUPPORT Address family not supported (POSIX.1-2001).
EAGAIN Resource temporarily unavailable (may be the same value as EWOULDBLOCK) (POSIX.1-2001).
EALREADY Connection already in progress (POSIX.1-2001).
EBADE Invalid exchange.
EBADF Bad file descriptor (POSIX.1-2001).
EBADFD File descriptor in bad state.
EBADMSG Bad message (POSIX.1-2001).
EBADR Invalid request descriptor.
EBADRQC Invalid request code.
EBADSLT Invalid slot.
EBUSY Device or resource busy (POSIX.1-2001).
ECANCELED Operation canceled (POSIX.1-2001).
ECHILD No child processes (POSIX.1-2001).
ECHRNG Channel number out of range.
ECOMM Communication error on send.
ECONNABORTED Connection aborted (POSIX.1-2001).
ECONNREFUSED Connection refused (POSIX.1-2001).
ECONNRESET Connection reset (POSIX.1-2001).
EDEADLK Resource deadlock avoided (POSIX.1-2001).
EDEADLOCK On most architectures, a synonym for EDEADLK. On some architectures (e.g., Linux MIPS, PowerPC,
SPARC), it is a separate error code "File locking deadlock error".
EDESTADDRREQ Destination address required (POSIX.1-2001).
EDOM Mathematics argument out of domain of function (POSIX.1, C99).
EDQUOT Disk quota exceeded (POSIX.1-2001).
EEXIST File exists (POSIX.1-2001).
EFAULT Bad address (POSIX.1-2001).
EFBIG File too large (POSIX.1-2001).
EHOSTDOWN Host is down.
EHOSTUNREACH Host is unreachable (POSIX.1-2001).
EHWPOISON Memory page has hardware error.
EIDRM Identifier removed (POSIX.1-2001).
EILSEQ Invalid or incomplete multibyte or wide character (POSIX.1, C99).
The text shown here is the glibc error description; in POSIX.1, this error is described as "Illegal
byte sequence".
EINPROGRESS Operation in progress (POSIX.1-2001).
EINTR Interrupted function call (POSIX.1-2001); see signal(7).
EINVAL Invalid argument (POSIX.1-2001).
EIO Input/output error (POSIX.1-2001).
EISCONN Socket is connected (POSIX.1-2001).
EISDIR Is a directory (POSIX.1-2001).
EISNAM Is a named type file.
EKEYEXPIRED Key has expired.
EKEYREJECTED Key was rejected by service.
EKEYREVOKED Key has been revoked.
EL2HLT Level 2 halted.
EL2NSYNC Level 2 not synchronized.
EL3HLT Level 3 halted.
EL3RST Level 3 reset.
ELIBACC Cannot access a needed shared library.
ELIBBAD Accessing a corrupted shared library.
ELIBMAX Attempting to link in too many shared libraries.
ELIBSCN .lib section in a.out corrupted
ELIBEXEC Cannot exec a shared library directly.
ELNRANGE Link number out of range.
ELOOP Too many levels of symbolic links (POSIX.1-2001).
EMEDIUMTYPE Wrong medium type.
EMFILE Too many open files (POSIX.1-2001). Commonly caused by exceeding the RLIMIT_NOFILE resource limit
described in getrlimit(2).
EMLINK Too many links (POSIX.1-2001).
EMSGSIZE Message too long (POSIX.1-2001).
EMULTIHOP Multihop attempted (POSIX.1-2001).
ENAMETOOLONG Filename too long (POSIX.1-2001).
ENETDOWN Network is down (POSIX.1-2001).
ENETRESET Connection aborted by network (POSIX.1-2001).
ENETUNREACH Network unreachable (POSIX.1-2001).
ENFILE Too many open files in system (POSIX.1-2001). On Linux, this is probably a result of encountering the
/proc/sys/fs/file-max limit (see proc(5)).
ENOANO No anode.
ENOBUFS No buffer space available (POSIX.1 (XSI STREAMS option)).
ENODATA No message is available on the STREAM head read queue (POSIX.1-2001).
ENODEV No such device (POSIX.1-2001).
ENOENT No such file or directory (POSIX.1-2001).
Typically, this error results when a specified pathname does not exist, or one of the components in the
directory prefix of a pathname does not exist, or the specified pathname is a dangling symbolic link.
ENOEXEC Exec format error (POSIX.1-2001).
ENOKEY Required key not available.
ENOLCK No locks available (POSIX.1-2001).
ENOLINK Link has been severed (POSIX.1-2001).
ENOMEDIUM No medium found.
ENOMEM Not enough space/cannot allocate memory (POSIX.1-2001).
ENOMSG No message of the desired type (POSIX.1-2001).
ENONET Machine is not on the network.
ENOPKG Package not installed.
ENOPROTOOPT Protocol not available (POSIX.1-2001).
ENOSPC No space left on device (POSIX.1-2001).
ENOSR No STREAM resources (POSIX.1 (XSI STREAMS option)).
ENOSTR Not a STREAM (POSIX.1 (XSI STREAMS option)).
ENOSYS Function not implemented (POSIX.1-2001).
ENOTBLK Block device required.
ENOTCONN The socket is not connected (POSIX.1-2001).
ENOTDIR Not a directory (POSIX.1-2001).
ENOTEMPTY Directory not empty (POSIX.1-2001).
ENOTRECOVERABLE State not recoverable (POSIX.1-2008).
ENOTSOCK Not a socket (POSIX.1-2001).
ENOTSUP Operation not supported (POSIX.1-2001).
ENOTTY Inappropriate I/O control operation (POSIX.1-2001).
ENOTUNIQ Name not unique on network.
ENXIO No such device or address (POSIX.1-2001).
EOPNOTSUPP Operation not supported on socket (POSIX.1-2001).
(ENOTSUP and EOPNOTSUPP have the same value on Linux, but according to POSIX.1 these error values
should be distinct.)
EOVERFLOW Value too large to be stored in data type (POSIX.1-2001).
EOWNERDEAD Owner died (POSIX.1-2008).
EPERM Operation not permitted (POSIX.1-2001).
EPFNOSUPPORT Protocol family not supported.
EPIPE Broken pipe (POSIX.1-2001).
EPROTO Protocol error (POSIX.1-2001).
EPROTONOSUPPORT Protocol not supported (POSIX.1-2001).
EPROTOTYPE Protocol wrong type for socket (POSIX.1-2001).
ERANGE Result too large (POSIX.1, C99).
EREMCHG Remote address changed.
EREMOTE Object is remote.
EREMOTEIO Remote I/O error.
ERESTART Interrupted system call should be restarted.
ERFKILL Operation not possible due to RF-kill.
EROFS Read-only filesystem (POSIX.1-2001).
ESHUTDOWN Cannot send after transport endpoint shutdown.
ESPIPE Invalid seek (POSIX.1-2001).
ESOCKTNOSUPPORT Socket type not supported.
ESRCH No such process (POSIX.1-2001).
ESTALE Stale file handle (POSIX.1-2001).
This error can occur for NFS and for other filesystems.
ESTRPIPE Streams pipe error.
ETIME Timer expired (POSIX.1 (XSI STREAMS option)).
(POSIX.1 says "STREAM ioctl(2) timeout".)
ETIMEDOUT Connection timed out (POSIX.1-2001).
ETOOMANYREFS Too many references: cannot splice.
ETXTBSY Text file busy (POSIX.1-2001).
EUCLEAN Structure needs cleaning.
EUNATCH Protocol driver not attached.
EUSERS Too many users.
EWOULDBLOCK Operation would block (may be same value as EAGAIN) (POSIX.1-2001).
EXDEV Improper link (POSIX.1-2001).
EXFULL Exchange full.
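Note that to use the errno variable you must include errno.h. When an error occurs, perror prints a message describing it to standard error. To obtain the message string for a given error number, use strerror (declared in string.h), whose prototype is:
char *strerror(int errnum);
Its argument is an error number such as errno, and it returns the message text for that number.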
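1.7 File example 1: reading and writing a file
Here we use Linux system calls to write a program that opens a file, writes data to it with write, and then reads the content back with read.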
// Using open, write, lseek, and read
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
    if(argc < 2)
    {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    printf("filename = [%s]\n", argv[1]);
    // Open the file and get its file descriptor
    //int open(const char *pathname, int flags);
    //int open(const char *pathname, int flags, mode_t mode);
    int fd = open(argv[1], O_RDWR | O_CREAT, S_IRWXU | S_IRWXG | S_IRWXO);
    // open returns -1 on failure
    if(fd < 0)
    {
        perror("file open error");
        return -1;
    }
    printf("fd = [%d]\n", fd);
    // Write to the file
    //ssize_t write(int fd, const void *buf, size_t count);
    ssize_t size = write(fd, "hello world", strlen("hello world"));
    printf("write size = [%zd]\n", size);
    // Move the file offset back to the start
    //off_t lseek(int fd, off_t offset, int whence);
    off_t offset = lseek(fd, 0, SEEK_SET);
    printf("offset = [%ld]\n", (long)offset);
    // Read the file, leaving one byte so buf stays NUL-terminated
    //ssize_t read(int fd, void *buf, size_t count);
    char buf[128];
    memset(buf, 0, sizeof buf);
    size = read(fd, buf, sizeof buf - 1);
    printf("read size = [%zd]\n", size);
    printf("read = [%s]\n", buf);
    close(fd);
    return 0;
}
1.8 File example 2: computing a file's size
Use lseek to compute the size of a file.
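#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
    if(argc < 2)
    {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    // Read access is enough to measure the size
    int fd = open(argv[1], O_RDONLY);
    if(fd < 0)
    {
        perror("file open error");
        return -1;
    }
    // The offset of the end of the file is its size in bytes
    off_t size = lseek(fd, 0, SEEK_END);
    if(size == (off_t)-1)
    {
        perror("lseek error");
        close(fd);
        return -1;
    }
    printf("[%s] size = [%ld]\n", argv[1], (long)size);
    close(fd);
    return 0;
}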
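1.9 File example 3: extending a file
Use lseek to turn a small file into a larger one: move the file offset to the position that gives the desired size, then perform one write.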
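#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char *argv[])
{
    if(argc < 2)
    {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    int fd = open(argv[1], O_RDWR);
    if(fd < 0)
    {
        perror("file open error");
        return -1;
    }
    // Move the offset to byte 200, possibly past the current end of the file
    off_t offset = lseek(fd, 200, SEEK_SET);
    if(offset == (off_t)-1)
    {
        perror("lseek error");
        close(fd);
        return -1;
    }
    // Write one byte there; a shorter file grows to 201 bytes
    if(write(fd, "a", 1) != 1)
    {
        perror("write error");
    }
    close(fd);
    return 0;
}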
1.10 File example 4: using perror
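#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
int main(int argc, char *argv[])
{
    if(argc < 2)
    {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    // Open the file
    int fd = open(argv[1], O_RDWR);
    if(fd < 0)
    {
        // Save errno first: later library calls may overwrite it
        int err = errno;
        perror("open error");
        // The error number can be compared with the symbolic names
        if(err == ENOENT)
        {
            printf("same\n");
        }
        return -1;
    }
    // Print the message text for error numbers 0 to 63
    for(int n = 0; n < 64; n++)
    {
        printf("[%d]:[%s]\n", n, strerror(n));
    }
    close(fd);
    return 0;
}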
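1.11 Testing blocking and non-blocking reads
On Linux, reading a file may or may not block. Is blocking a property of the file, or a property of the read function? To find out, we call read on different types of files. If every type of file behaves the same way, blocking or not, then the behavior belongs to read; if different types of files behave differently, then it is a property of the file and not of read.
First, read a regular file with read.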
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <string.h>
#include <fcntl.h>
// Check whether read blocks on a regular file
int main(int argc, char *argv[])
{
    if(argc < 2)
    {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    // Open the file
    int fd = open(argv[1], O_RDWR);
    if(fd < 0)
    {
        perror("open error");
        return -1;
    }
    // Read the file, leaving one byte so buf stays NUL-terminated
    char buf[1024];
    memset(buf, 0, sizeof(buf));
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    printf("first: n = [%zd], buf = [%s]\n", n, buf);
    // Read again to see whether read blocks at end of file
    memset(buf, 0, sizeof(buf));
    n = read(fd, buf, sizeof(buf) - 1);
    printf("second: n = [%zd], buf = [%s]\n", n, buf);
    // Close the file
    close(fd);
    return 0;
}
Next, read a device file with read:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <string.h>
#include <fcntl.h>
// Check that read blocks on a device file (the terminal)
int main()
{
    // Standard input
    char buf[1024];
    memset(buf, 0, sizeof(buf));
    ssize_t n = read(STDIN_FILENO, buf, sizeof(buf) - 1);
    printf("n = [%zd], buf = [%s]\n", n, buf);
    return 0;
}
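These two tests show that blocking is a property of the file being read, not of the read function. Reading a regular file returns immediately (for a file shorter than the buffer, the second read returns 0 at end of file), while reading from the terminal blocks until input is available.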
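2. Files and directories
The sections above quote a lot of English text. It comes from the manual pages that Linux provides for developers, which can be viewed with the man command in either of these forms:
man name
man section name
The manual is divided into 9 sections. Without a section number, man shows the page from the first section in which the name is found; when the same name appears in several sections, give the section number to select one. The sections that matter most for system programming are section 1 (executable programs and shell commands), section 2 (system calls), and section 3 (C library functions). The remaining sections were covered in the Linux and Unix part of the C basics course.
In Linux system programming we consult the manual with man constantly, so learning how to look things up and how to program from the documentation is essential. From here on the manual text of each function is no longer reproduced; run man on a Linux system to read it.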
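2.1 File-related functions
The file-related functions are:
Function | Prototype | Parameters | Return value | Purpose |
---|---|---|---|---|
stat | int stat(const char *pathname, struct stat *statbuf); | pathname: file path; statbuf: buffer that receives the file status | 0 on success; -1 with errno set on failure | Stores the status of the file pathname in statbuf, following symbolic links |
lstat | int lstat(const char *pathname, struct stat *statbuf); | pathname: file path; statbuf: buffer that receives the file status | 0 on success; -1 with errno set on failure | Same as stat, except that for a symbolic link it stores the status of the link itself |
These functions require the headers sys/types.h, sys/stat.h, and unistd.h. struct stat is defined as follows: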
struct stat {
dev_t st_dev; /* ID of device containing file */
ino_t st_ino; /* Inode number */
mode_t st_mode; /* File type and mode */
nlink_t st_nlink; /* Number of hard links */
uid_t st_uid; /* User ID of owner */
gid_t st_gid; /* Group ID of owner */
dev_t st_rdev; /* Device ID (if special file) */
off_t st_size; /* Total size, in bytes */
blksize_t st_blksize; /* Block size for filesystem I/O */
blkcnt_t st_blocks; /* Number of 512B blocks allocated */
/* Since Linux 2.6, the kernel supports nanosecond
precision for the following timestamp fields.
For the details before Linux 2.6, see NOTES. */
struct timespec st_atim; /* Time of last access */
struct timespec st_mtim; /* Time of last modification */
struct timespec st_ctim; /* Time of last status change */
#define st_atime st_atim.tv_sec /* Backward compatibility */
#define st_mtime st_mtim.tv_sec
#define st_ctime st_ctim.tv_sec
};
As the comments above indicate, the st_mode member stores both the file type and the permission bits, encoded as bit fields. The file types are:
S_IFMT 0170000 bit mask for the file type bit field
S_IFSOCK 0140000 socket
S_IFLNK 0120000 symbolic link
S_IFREG 0100000 regular file
S_IFBLK 0060000 block device
S_IFDIR 0040000 directory
S_IFCHR 0020000 character device
S_IFIFO 0010000 FIFO
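S_IFMT is the bit mask for the file type. To determine the type, compute st_mode & S_IFMT and compare the result with the constants above; for example, (st_mode & S_IFMT) == S_IFDIR tests whether a file is a directory. There is also a second way to test the file type, a set of macros: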
S_ISREG(m) is it a regular file?
S_ISDIR(m) directory?
S_ISCHR(m) character device?
S_ISBLK(m) block device?
S_ISFIFO(m) FIFO (named pipe)?
S_ISLNK(m) symbolic link? (Not in POSIX.1-1996.)
S_ISSOCK(m) socket? (Not in POSIX.1-1996.)
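Here m is st_mode, and each macro evaluates to true or false for the corresponding type. Unlike the first method, which works in a switch statement, these macros are meant for if statements. For example, S_ISBLK(st_mode) tests whether a file is a block device.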
st_mode also holds the permissions for the owner, the group, and others:
S_IRWXU 00700 owner has read, write, and execute permission
S_IRUSR 00400 owner has read permission
S_IWUSR 00200 owner has write permission
S_IXUSR 00100 owner has execute permission
S_IRWXG 00070 group has read, write, and execute permission
S_IRGRP 00040 group has read permission
S_IWGRP 00020 group has write permission
S_IXGRP 00010 group has execute permission
S_IRWXO 00007 others (not in group) have read, write, and
execute permission
S_IROTH 00004 others have read permission
S_IWOTH 00002 others have write permission
S_IXOTH 00001 others have execute permission
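To test a permission, AND st_mode with the corresponding constant using &; a nonzero result means the permission is granted. For example, st_mode & S_IRUSR tests whether the owner has read permission.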
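For timestamps we usually use the macros st_atime, st_mtime, and st_ctime defined at the end of the structure, which keep older code working. The members st_atim, st_mtim, and st_ctim can be used as well. They are struct timespec values, and each macro simply expands to the tv_sec field of the corresponding member, so both give the same number of seconds; tv_nsec additionally holds the nanosecond part. struct timespec is defined as follows: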
struct timespec {
time_t tv_sec; /* seconds */
long tv_nsec; /* nanoseconds */
};
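Finally, although stat and lstat are called in the same way and do nearly the same thing, they differ for symbolic links: stat follows the link and returns the attributes of the file it points to, whereas lstat returns the attributes of the link itself. For a file that is not a symbolic link, the two behave identically.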
Here is an example that uses stat:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
// stat test: print a file's owner, group, size, inode, type, and permissions
int main(int argc, char *argv[])
{
    if(argc < 2)
    {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return -1;
    }
    // int stat(const char *pathname, struct stat *statbuf);
    struct stat st;
    if(stat(argv[1], &st) < 0)
    {
        perror("stat error");
        return -1;
    }
    printf("uid = %u\n", (unsigned)st.st_uid);
    printf("gid = %u\n", (unsigned)st.st_gid);
    printf("size = %ld\n", (long)st.st_size);
    printf("inode = %lu\n", (unsigned long)st.st_ino);
    // Method 1: mask out the file type bits and switch on the result
    switch(st.st_mode & S_IFMT)
    {
    case S_IFSOCK:
        printf("socket\n");
        break;
    case S_IFREG:
        printf("regular file\n");
        break;
    case S_IFLNK:
        // Not reached with stat, which follows symbolic links; lstat would report it
        printf("symbolic link\n");
        break;
    case S_IFBLK:
        printf("block device\n");
        break;
    case S_IFDIR:
        printf("directory\n");
        break;
    case S_IFCHR:
        printf("character device\n");
        break;
    case S_IFIFO:
        printf("FIFO\n");
        break;
    default:
        printf("unknown file\n");
    }
    // Method 2: test the type with the S_ISxxx macros
    if(S_ISREG(st.st_mode))
    {
        printf("regular file\n");
    }
    if(S_ISDIR(st.st_mode))
    {
        printf("directory\n");
    }
    if(S_ISCHR(st.st_mode))
    {
        printf("character device\n");
    }
    if(S_ISBLK(st.st_mode))
    {
        printf("block device\n");
    }
    if(S_ISFIFO(st.st_mode))
    {
        printf("FIFO\n");
    }
    if(S_ISLNK(st.st_mode))
    {
        printf("symbolic link\n");
    }
    if(S_ISSOCK(st.st_mode))
    {
        printf("socket\n");
    }
    // Permissions: print one character per bit, '-' when the bit is clear
    // Owner
    putchar((st.st_mode & S_IRUSR) ? 'r' : '-');
    putchar((st.st_mode & S_IWUSR) ? 'w' : '-');
    putchar((st.st_mode & S_IXUSR) ? 'x' : '-');
    // Group
    putchar((st.st_mode & S_IRGRP) ? 'r' : '-');
    putchar((st.st_mode & S_IWGRP) ? 'w' : '-');
    putchar((st.st_mode & S_IXGRP) ? 'x' : '-');
    // Others
    putchar((st.st_mode & S_IROTH) ? 'r' : '-');
    putchar((st.st_mode & S_IWOTH) ? 'w' : '-');
    putchar((st.st_mode & S_IXOTH) ? 'x' : '-');
    putchar('\n');
    return 0;
}
2.2 目录操作相关函数
目录操作的相关函数如下:
函数名 | 函数原型 | 函数参数 | 函数返回值 | 作用 |
---|---|---|---|---|
opendir | DIR *opendir(const char *name); | name: 目录名 | 成功返回指向目录流的指针,失败返回NULL并设置errno | 打开一个目录 |
readdir | struct dirent *readdir(DIR *dirp); | dirp: 目录流指针 | 返回一个指向目录结构的指针,失败返回NULL并设置errno | 读取目录流的一个目录结构 |
closedir | int closedir(DIR *dirp); | dirp: 目录流指针 | 成功返回0,失败返回-1并设置errno | 关闭目录 |
These functions require the header files sys/types.h and dirent.h. The structure struct dirent is defined as follows:
struct dirent {
ino_t d_ino; /* Inode number */
off_t d_off; /* Not an offset; see below */
unsigned short d_reclen; /* Length of this record */
unsigned char d_type; /* Type of file; not supported
by all filesystem types */
char d_name[256]; /* Null-terminated filename */
};
Here d_name is the name of the entry and d_type is its file type. The possible values of d_type are:
DT_BLK This is a block device.
DT_CHR This is a character device.
DT_DIR This is a directory.
DT_FIFO This is a named pipe (FIFO).
DT_LNK This is a symbolic link.
DT_REG This is a regular file.
DT_SOCK This is a UNIX domain socket.
DT_UNKNOWN The file type could not be determined.
Sample code using the directory functions is shown below.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <dirent.h>
int main(int argc, char *argv[])
{
// Open the directory
DIR *dir = opendir(argv[1]);
if(dir == NULL)
{
perror("opendir error");
return -1;
}
// Read the directory entries one by one
struct dirent *ds = NULL;
while((ds = readdir(dir)) != NULL)
{
printf("filename: [%s] ", ds->d_name);
// Determine the file type
if(ds->d_type == DT_BLK)
{
printf("This is a block device!\n");
}
else if(ds->d_type == DT_CHR)
{
printf("This is a character device!\n");
}
else if(ds->d_type == DT_DIR)
{
printf("This is a directory!\n");
}
else if(ds->d_type == DT_FIFO)
{
printf("This is a named pipe!\n");
}
else if(ds->d_type == DT_LNK)
{
printf("This is a symbolic link!\n");
}
else if(ds->d_type == DT_REG)
{
printf("This is a regular file!\n");
}
else if(ds->d_type == DT_SOCK)
{
printf("This is a UNIX domain socket!\n");
}
else
{
printf("The file type could not be determined!\n");
}
}
// Close the directory stream
closedir(dir);
return 0;
}
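2.3 The dup/dup2/fcntl functions
dup and dup2 are mainly used to duplicate file descriptors, while fcntl can not only duplicate a file descriptor but also get and set the file's flags. Here flags refers to the second argument of the open call that opened the file. The prototypes of these functions are as follows: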
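Function | Prototype | Parameters | Return value | Purpose |
---|---|---|---|---|
dup | int dup(int oldfd); | oldfd: the file descriptor to duplicate | The new file descriptor on success; -1 with errno set on failure | Duplicate a file descriptor |
dup2 | int dup2(int oldfd, int newfd); | oldfd: the existing file descriptor newfd: the descriptor number to use for the copy | newfd on success; -1 with errno set on failure | Duplicate a file descriptor onto the number newfd (if newfd was open, it is closed first) |
fcntl | int fcntl(int fd, int cmd, ... /* arg */ ); | fd: file descriptor cmd: the operation to perform ...: depends on cmd | Depends on cmd | Duplicate a file descriptor, get or set the file flags, and more |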
All of these functions require the header file unistd.h; fcntl additionally requires fcntl.h. The commonly used cmd values for fcntl are listed below.
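cmd | Purpose | Return value |
---|---|---|
F_DUPFD | Duplicate the file descriptor, using the lowest free descriptor number greater than or equal to arg | The new file descriptor on success; -1 with errno set on failure |
F_GETFL | Get the file status flags | The flags on success; -1 with errno set on failure |
F_SETFL | Set the file status flags (on Linux only O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME and O_NONBLOCK can be changed this way) | 0 on success; -1 with errno set on failure |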
fcntl accepts many more cmd values; see man fcntl for the full documentation.
Common fcntl operations are shown below:
// 1. Duplicate the file descriptor
int newfd = fcntl(fd, F_DUPFD, 0);
// 2. Get the file status flags
int flag = fcntl(fd, F_GETFL, 0);
// 3. Set the file status flags
flag = flag | O_APPEND;
fcntl(fd, F_SETFL, flag);
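Duplicating a file descriptor makes several descriptors refer to the same open file. Calling close on just one of them does not actually close the file; the file is only closed once close has been called on every descriptor that refers to it. Because all of these descriptors operate on the same open file, they share a single file offset.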
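Both properties described above, the shared offset and the fact that closing one descriptor leaves the file open, can be checked directly. The following is a minimal sketch; `check_dup_sharing` is a hypothetical function written for this illustration, and path is assumed to name a scratch file that may be created or truncated.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical demo function: returns 0 if the duplicated descriptor behaves
 * as described (shared offset, still usable after the original is closed),
 * -1 otherwise. */
int check_dup_sharing(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
    {
        return -1;
    }
    int newfd = dup(fd);
    if (newfd < 0)
    {
        close(fd);
        return -1;
    }

    int ok = 1;
    /* Writing through fd advances the offset seen through newfd as well. */
    if (write(fd, "hello", 5) != 5) ok = 0;
    if (lseek(newfd, 0, SEEK_CUR) != 5) ok = 0;

    /* Closing fd does not close the file: newfd is still usable. */
    close(fd);
    char buf[6] = {0};
    if (lseek(newfd, 0, SEEK_SET) != 0) ok = 0;
    if (read(newfd, buf, 5) != 5 || strcmp(buf, "hello") != 0) ok = 0;

    close(newfd);
    return ok ? 0 : -1;
}
```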
With dup2 we can choose the number of the new file descriptor, which makes it possible to redirect a program's input or output to a file.
Below are some examples of these functions.
- Using the dup function.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
int main(int argc, char *argv[])
{
    // Open the file
    int fd = open(argv[1], O_RDWR);
    if(fd < 0)
    {
        perror("open error");
        return -1;
    }
    // Duplicate the file descriptor
    int newfd = dup(fd);
    if(newfd < 0)
    {
        perror("dup error");
        return -1;
    }
    // Write to the file through fd
    write(fd, "helloworld", strlen("helloworld"));
    // Move the shared file offset back to the start of the file
    lseek(fd, 0, SEEK_SET);
    // Read the file through newfd
    char buf[1024];
    memset(buf, 0x00, sizeof buf);
    read(newfd, buf, sizeof(buf) - 1);
    printf("%s\n", buf);
    close(fd);
    close(newfd);
    return 0;
}
- Using the dup2 function.
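#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
int main(int argc, char *argv[])
{
    int fd = open(argv[1], O_RDWR);
    if(fd < 0)
    {
        perror("open error");
        return -1;
    }
    // Descriptor number for the copy; it must differ from fd (open() normally
    // returns 3 here) and is assumed not to be in use
    int newfd = 10;
    // dup2(oldfd, newfd): make newfd refer to the same open file as fd
    if(dup2(fd, newfd) < 0)
    {
        perror("dup2 error");
        return -1;
    }
    // Write data through fd
    write(fd, "nihaoya,damahou", strlen("nihaoya,damahou"));
    lseek(fd, 0, SEEK_SET);
    // Read the data back through newfd
    char buf[1024];
    memset(buf, 0x00, sizeof buf);
    read(newfd, buf, sizeof(buf) - 1);
    printf("buf = %s\n", buf);
    close(fd);
    close(newfd);
    return 0;
}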
- Redirecting output with dup2.
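// Redirect standard output to a file
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
int main(int argc, char *argv[])
{
    // Open the file, creating it if necessary
    int fd = open(argv[1], O_RDWR | O_CREAT, 0777);
    if(fd < 0)
    {
        perror("open error");
        return -1;
    }
    // Redirect output: descriptor 1 now refers to the file
    if(dup2(fd, STDOUT_FILENO) < 0)
    {
        perror("dup2 error");
        return -1;
    }
    // These lines end up in the file rather than on the terminal
    printf("老铁6666\n");
    printf("老铁NB Plus\n");
    printf("大马猴,奥利给\n");
    // Closing fd is safe: STDOUT_FILENO still refers to the file
    close(fd);
    return 0;
}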
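The redirection example leaves standard output pointing at the file until the program exits. If the original destination is needed again later, one possible approach is to save a copy of descriptor 1 with dup before the redirection and restore it with a second dup2 afterwards. The sketch below assumes this pattern; `print_to_file` is a hypothetical helper name used only here.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: prints msg into the file at path by temporarily
 * redirecting standard output, then restores the original stdout.
 * Returns 0 on success, -1 on failure. */
int print_to_file(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
    {
        return -1;
    }

    fflush(stdout);                  /* drain text buffered for the old target */
    int saved = dup(STDOUT_FILENO);  /* remember where stdout pointed */
    if (saved < 0)
    {
        close(fd);
        return -1;
    }
    if (dup2(fd, STDOUT_FILENO) < 0)
    {
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                       /* descriptor 1 now refers to the file */

    printf("%s", msg);
    fflush(stdout);                  /* push the text into the file before switching back */

    /* Point descriptor 1 back at the saved destination. */
    int rc = dup2(saved, STDOUT_FILENO) < 0 ? -1 : 0;
    close(saved);
    return rc;
}
```

The two fflush calls matter because printf buffers its output in user space: without them, text could be written to whichever file descriptor 1 happens to refer to at the time the buffer is finally flushed.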
- Using the fcntl function.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
int main(int argc, char *argv[])
{
    // Open the file
    int fd = open(argv[1], O_RDWR);
    if(fd < 0)
    {
        perror("open error");
        return -1;
    }
    // Get the current flags and add O_APPEND
    int flags = fcntl(fd, F_GETFL, 0);
    flags = flags | O_APPEND;
    fcntl(fd, F_SETFL, flags);
    // Write to the file; with O_APPEND set, the data goes to the end of the file
    write(fd, "hello world", strlen("hello world"));
    // Close the file
    close(fd);
    return 0;
}