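Brief introduction
NIO (New I/O, often read as Non-blocking I/O) is an I/O API introduced in Java 1.4 as an alternative to traditional I/O. It introduces the concepts of buffers and channels, and implements multiplexing through selectors.
Traditional I/O distinguishes between the byte streams InputStream and OutputStream and the character streams Reader and Writer, all of which are unidirectional, whereas an NIO channel can be bidirectional.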
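To make the "bidirectional" point concrete, here is a minimal sketch in which one FileChannel is used both to write and to read. The class name, method name, and temporary file prefix are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class BidirectionalChannelDemo {
    // Writes the text through a channel, then reads it back through the same channel.
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("nio-demo", ".txt"); // hypothetical scratch file
            try (FileChannel ch = FileChannel.open(tmp,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                ByteBuffer out = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
                while (out.hasRemaining()) {
                    ch.write(out); // write direction
                }
                ch.position(0); // go back to the start of the file
                ByteBuffer in = ByteBuffer.allocate((int) ch.size());
                while (in.hasRemaining() && ch.read(in) != -1) {
                    // read direction: keep reading until the buffer is full or EOF
                }
                in.flip();
                return StandardCharsets.UTF_8.decode(in).toString();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("abcde"));
    }
}
```

With stream-based I/O the same round trip would need a separate OutputStream and InputStream.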
Buffer
1. JDK documentation comment:
Buffer.class
A container for data of a specific primitive type.
A buffer is a linear, finite sequence of elements of a specific primitive type. Aside from its content, the essential properties of a buffer are its capacity, limit, and position:
A buffer's capacity is the number of elements it contains. The capacity of a buffer is never negative and never changes.
A buffer's limit is the index of the first element that should not be read or written. A buffer's limit is never negative and is never greater than its capacity.
A buffer's position is the index of the next element to be read or written. A buffer's position is never negative and is never greater than its limit.
There is one subclass of this class for each non-boolean primitive type.
Transferring data
Each subclass of this class defines two categories of get and put operations:
Relative operations read or write one or more elements starting at the current position and then increment the position by the number of elements transferred. If the requested transfer exceeds the limit then a relative get operation throws a BufferUnderflowException and a relative put operation throws a BufferOverflowException; in either case, no data is transferred.
Absolute operations take an explicit element index and do not affect the position. Absolute get and put operations throw an IndexOutOfBoundsException if the index argument exceeds the limit.
Data may also, of course, be transferred in to or out of a buffer by the I/O operations of an appropriate channel, which are always relative to the current position.
Marking and resetting
A buffer's mark is the index to which its position will be reset when the reset method is invoked. The mark is not always defined, but when it is defined it is never negative and is never greater than the position. If the mark is defined then it is discarded when the position or the limit is adjusted to a value smaller than the mark. If the mark is not defined then invoking the reset method causes an InvalidMarkException to be thrown.
Invariants
The following invariant holds for the mark, position, limit, and capacity values:
0 <= mark <= position <= limit <= capacity
A newly-created buffer always has a position of zero and a mark that is undefined. The initial limit may be zero, or it may be some other value that depends upon the type of the buffer and the manner in which it is constructed. Each element of a newly-allocated buffer is initialized to zero.
Clearing, flipping, and rewinding
In addition to methods for accessing the position, limit, and capacity values and for marking and resetting, this class also defines the following operations upon buffers:
clear makes a buffer ready for a new sequence of channel-read or relative put operations: It sets the limit to the capacity and the position to zero.
flip makes a buffer ready for a new sequence of channel-write or relative get operations: It sets the limit to the current position and then sets the position to zero.
rewind makes a buffer ready for re-reading the data that it already contains: It leaves the limit unchanged and sets the position to zero.
Read-only buffers
Every buffer is readable, but not every buffer is writable. The mutation methods of each buffer class are specified as optional operations that will throw a ReadOnlyBufferException when invoked upon a read-only buffer. A read-only buffer does not allow its content to be changed, but its mark, position, and limit values are mutable. Whether or not a buffer is read-only may be determined by invoking its isReadOnly method.
Thread safety
Buffers are not safe for use by multiple concurrent threads. If a buffer is to be used by more than one thread then access to the buffer should be controlled by appropriate synchronization.
Invocation chaining
Methods in this class that do not otherwise have a value to return are specified to return the buffer upon which they are invoked. This allows method invocations to be chained; for example, the sequence of statements
b.flip(); b.position(23); b.limit(42);
can be replaced by the single, more compact statement
b.flip().position(23).limit(42);
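The rules described in the Buffer documentation (relative versus absolute operations, mark and reset, and clear, flip, and rewind) can be traced with a small sketch. The class and method names are hypothetical; only standard java.nio calls are used.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.InvalidMarkException;
import java.util.ArrayList;
import java.util.List;

class BufferStateDemo {
    static String state(String label, ByteBuffer b) {
        return label + ": pos=" + b.position() + " lim=" + b.limit() + " cap=" + b.capacity();
    }

    // Records position/limit/capacity after each operation.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        ByteBuffer b = ByteBuffer.allocate(8);
        log.add(state("allocate", b));

        b.put((byte) 10).put((byte) 20).put((byte) 30); // relative puts advance the position
        log.add(state("relative put x3", b));

        b.put(5, (byte) 99); // absolute put leaves the position untouched
        log.add(state("absolute put", b));

        b.flip(); // limit = old position, position = 0
        log.add(state("flip", b));

        b.get();   // position 0 -> 1
        b.mark();  // mark = 1
        b.get();   // position 1 -> 2
        b.reset(); // position back to the mark
        log.add(state("mark/reset", b));

        b.get();
        b.get(); // position is now equal to the limit
        try {
            b.get(); // a relative get past the limit transfers nothing
        } catch (BufferUnderflowException e) {
            log.add(state("underflow", b));
        }

        b.rewind(); // position = 0, limit unchanged, mark discarded
        try {
            b.reset();
        } catch (InvalidMarkException e) {
            log.add(state("rewind (mark discarded)", b));
        }

        b.clear(); // position = 0, limit = capacity; content is not erased
        log.add(state("clear", b));
        return log;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

Note that clear only resets the indices: the bytes written earlier remain in the buffer until they are overwritten.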
2. Usage example:
For concrete usage, I took an example from the CSDN blog post "Java NIO(非阻塞IO)图文详细解析。源码分析_nio非阻塞式网络通信源码" by Hi丶ImViper as a reference:
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class Main
{
    public static void main(String[] args)
    {
        // Allocate a buffer of the given size
        ByteBuffer buf = ByteBuffer.allocate(1024);
        soutBuf(buf, "initial state");
        // Put data into the buffer
        String str = "abcde";
        buf.put(str.getBytes(StandardCharsets.UTF_8));
        soutBuf(buf, "after put");
        // Switch to read mode
        buf.flip();
        soutBuf(buf, "after flip (read mode)");
        // Read the data
        System.out.println("start reading");
        byte[] dst = new byte[buf.limit()];
        buf.get(dst);
        System.out.println(new String(dst, StandardCharsets.UTF_8));
        soutBuf(buf, "after reading");
        // Rewind so the same data can be read again
        buf.rewind();
        soutBuf(buf, "after rewind");
        // clear() resets position and limit; the stored bytes are not erased
        buf.clear();
        soutBuf(buf, "after clear");
    }

    // Print the given label followed by the buffer's position, limit, and capacity
    private static void soutBuf(Buffer buf, String prompt)
    {
        System.out.println(prompt);
        System.out.println("position:" + buf.position());
        System.out.println("limit:" + buf.limit());
        System.out.println("capacity:" + buf.capacity());
    }
}
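3. Direct and non-direct buffers
Non-direct buffer: allocated with the allocate() method; the buffer lives on the JVM heap.
Direct buffer: allocated with the allocateDirect() method; the buffer lives in native memory outside the JVM heap. This can improve I/O efficiency because the JVM tries to perform native I/O directly on the buffer instead of copying its content through an intermediate buffer.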
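The two allocation methods can be compared with a short sketch. The class and helper names are hypothetical; isDirect() and hasArray() are standard ByteBuffer methods.

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {
    static String describe(ByteBuffer b) {
        return "direct=" + b.isDirect() + " hasArray=" + b.hasArray()
                + " capacity=" + b.capacity();
    }

    public static void main(String[] args) {
        // Non-direct: backed by a byte[] on the JVM heap
        ByteBuffer heap = ByteBuffer.allocate(1024);
        // Direct: backed by native memory outside the heap
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);

        System.out.println("heap   -> " + describe(heap));
        System.out.println("direct -> " + describe(direct));

        // Both kinds are used through the same ByteBuffer API
        heap.putInt(42).flip();
        direct.putInt(42).flip();
        System.out.println(heap.getInt() == direct.getInt());
    }
}
```

A heap buffer always has an accessible backing array; for a direct buffer the specification leaves this unspecified, so code should check hasArray() before calling array().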
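4. How do physical memory and JVM memory differ, and how are they related?
JVM memory: bounded by the JVM's memory-size parameters; when usage exceeds the configured size, an OutOfMemoryError (OOM) is raised.
Native memory: not bounded by the JVM heap-size parameters, only by the capacity of physical memory; even without a parameter limit, an OOM is still raised if usage exceeds the available physical memory.
In this simplified view, native memory = direct memory + metaspace (the JVM also uses native memory for other purposes, such as thread stacks).
Direct memory is not part of the JVM runtime data areas; it is a memory region outside the Java heap that is requested directly from the operating system.
Direct memory is used through NIO: a DirectByteBuffer object on the heap refers to the native memory, which avoids an extra copy between the heap and native memory during I/O, so read and write performance is high.
Direct memory is recommended for frequent read and write operations.
NIO allows Java programs to use direct memory for data transfer.
Direct memory is not managed as part of the Java heap, but system memory is finite, so an OOM is raised when physical memory runs out (the total size of direct buffers can also be capped with the -XX:MaxDirectMemorySize option).
Metaspace stores class metadata.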
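One way to observe that direct buffers are accounted for outside the heap is the platform BufferPoolMXBean. The sketch below assumes the JVM exposes a buffer pool named "direct" (as HotSpot does); the class and method names are hypothetical.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class DirectMemoryUsageDemo {
    // Total capacity in bytes of the "direct" buffer pool, or -1 if no such pool is exposed.
    // Assumption: the pool is named "direct", as on HotSpot JVMs.
    static long directPoolCapacity() {
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getTotalCapacity();
            }
        }
        return -1;
    }

    // Allocates a direct buffer and reports the pool capacity while the buffer is alive.
    static long capacityWhileHolding(int bytes) {
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes);
        long capacity = directPoolCapacity();
        direct.put(0, (byte) 1); // keep the buffer reachable until after the measurement
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println("direct pool before: " + directPoolCapacity());
        System.out.println("direct pool while holding 1 MiB: " + capacityWhileHolding(1 << 20));
        System.out.println("max heap: " + Runtime.getRuntime().maxMemory());
    }
}
```

The native memory behind a direct buffer is released only after the owning DirectByteBuffer object becomes unreachable and is collected, so the pool figure may lag behind what the program still references.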
Channel
1. JDK documentation comment:
Channel.class
A nexus for I/O operations.
A channel represents an open connection to an entity such as a hardware device, a file, a network socket, or a program component that is capable of performing one or more distinct I/O operations, for example reading or writing.
A channel is either open or closed. A channel is open upon creation, and once closed it remains closed. Once a channel is closed, any attempt to invoke an I/O operation upon it will cause a ClosedChannelException to be thrown. Whether or not a channel is open may be tested by invoking its isOpen method.
Channels are, in general, intended to be safe for multithreaded access as described in the specifications of the interfaces and classes that extend and implement this interface.
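The open/closed life cycle from the Channel documentation can be checked with a short sketch. It uses a java.nio.channels.Pipe only because that needs no file or socket; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.Pipe;

class ChannelLifecycleDemo {
    // Returns "<open before close>,<open after close>,<result of writing after close>".
    static String lifecycle() {
        try {
            Pipe pipe = Pipe.open();
            Pipe.SinkChannel sink = pipe.sink();
            boolean openBefore = sink.isOpen(); // a channel is open upon creation
            sink.close();
            pipe.source().close();
            boolean openAfter = sink.isOpen(); // once closed it remains closed
            String outcome;
            try {
                sink.write(ByteBuffer.wrap(new byte[] {1}));
                outcome = "no exception";
            } catch (ClosedChannelException e) {
                outcome = "ClosedChannelException";
            }
            return openBefore + "," + openAfter + "," + outcome;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lifecycle());
    }
}
```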
Selector
1. JDK documentation comment:
Selector.class
A multiplexor of SelectableChannel objects.
A selector may be created by invoking the open method of this class, which will use the system's default selector provider to create a new selector. A selector may also be created by invoking the openSelector method of a custom selector provider. A selector remains open until it is closed via its close method.
A selectable channel's registration with a selector is represented by a SelectionKey object. A selector maintains three sets of selection keys:
The key set contains the keys representing the current channel registrations of this selector. This set is returned by the keys method.
The selected-key set is the set of keys such that each key's channel was detected to be ready for at least one of the operations identified in the key's interest set during a prior selection operation. This set is returned by the selectedKeys method. The selected-key set is always a subset of the key set.
The cancelled-key set is the set of keys that have been cancelled but whose channels have not yet been deregistered. This set is not directly accessible. The cancelled-key set is always a subset of the key set.
All three sets are empty in a newly-created selector.
A key is added to a selector's key set as a side effect of registering a channel via the channel's register method. Cancelled keys are removed from the key set during selection operations. The key set itself is not directly modifiable.
A key is added to its selector's cancelled-key set when it is cancelled, whether by closing its channel or by invoking its cancel method. Cancelling a key will cause its channel to be deregistered during the next selection operation, at which time the key will be removed from all of the selector's key sets.
Keys are added to the selected-key set by selection operations. A key may be removed directly from the selected-key set by invoking the set's remove method or by invoking the remove method of an iterator obtained from the set. Keys are never removed from the selected-key set in any other way; they are not, in particular, removed as a side effect of selection operations. Keys may not be added directly to the selected-key set.
Selection
During each selection operation, keys may be added to and removed from a selector's selected-key set and may be removed from its key and cancelled-key sets. Selection is performed by the select(), select(long), and selectNow() methods, and involves three steps:
(1) Each key in the cancelled-key set is removed from each key set of which it is a member, and its channel is deregistered. This step leaves the cancelled-key set empty.
(2) The underlying operating system is queried for an update as to the readiness of each remaining channel to perform any of the operations identified by its key's interest set as of the moment that the selection operation began. For a channel that is ready for at least one such operation, one of the following two actions is performed:
(a) If the channel's key is not already in the selected-key set then it is added to that set and its ready-operation set is modified to identify exactly those operations for which the channel is now reported to be ready. Any readiness information previously recorded in the ready set is discarded.
(b) Otherwise the channel's key is already in the selected-key set, so its ready-operation set is modified to identify any new operations for which the channel is reported to be ready. Any readiness information previously recorded in the ready set is preserved; in other words, the ready set returned by the underlying system is bitwise-disjoined into the key's current ready set.
If all of the keys in the key set at the start of this step have empty interest sets then neither the selected-key set nor any of the keys' ready-operation sets will be updated.
(3) If any keys were added to the cancelled-key set while step (2) was in progress then they are processed as in step (1).
Whether or not a selection operation blocks to wait for one or more channels to become ready, and if so for how long, is the only essential difference between the three selection methods.
Concurrency
Selectors are themselves safe for use by multiple concurrent threads; their key sets, however, are not.
The selection operations synchronize on the selector itself, on the key set, and on the selected-key set, in that order. They also synchronize on the cancelled-key set during steps (1) and (3) above.
Changes made to the interest sets of a selector's keys while a selection operation is in progress have no effect upon that operation; they will be seen by the next selection operation.
Keys may be cancelled and channels may be closed at any time. Hence the presence of a key in one or more of a selector's key sets does not imply that the key is valid or that its channel is open. Application code should be careful to synchronize and check these conditions as necessary if there is any possibility that another thread will cancel a key or close a channel.
A thread blocked in one of the select() or select(long) methods may be interrupted by some other thread in one of three ways:
By invoking the selector's wakeup method,
By invoking the selector's close method, or
By invoking the blocked thread's interrupt method, in which case its interrupt status will be set and the selector's wakeup method will be invoked.
The close method synchronizes on the selector and all three key sets in the same order as in a selection operation.
A selector's key and selected-key sets are not, in general, safe for use by multiple concurrent threads. If such a thread might modify one of these sets directly then access should be controlled by synchronizing on the set itself. The iterators returned by these sets' iterator methods are fail-fast: If the set is modified after the iterator is created, in any way except by invoking the iterator's own remove method, then a java.util.ConcurrentModificationException will be thrown.
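The registration and selection steps described in the Selector documentation can be sketched as a small select loop. To stay self-contained it multiplexes the read end of a java.nio.channels.Pipe instead of a network socket; the class name, method name, and the one-second timeout are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class SelectorLoopDemo {
    // Sends the message into a pipe and receives it through a selector-driven read loop.
    static String receiveViaSelector(String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false); // must be non-blocking before register
                source.register(selector, SelectionKey.OP_READ); // adds a key to the key set

                ByteBuffer out = ByteBuffer.wrap(payload);
                while (out.hasRemaining()) {
                    sink.write(out);
                }

                ByteBuffer in = ByteBuffer.allocate(payload.length);
                while (in.hasRemaining()) {
                    if (selector.select(1000) == 0) {
                        break; // nothing became ready within the timeout
                    }
                    Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        SelectionKey key = it.next();
                        it.remove(); // selected keys are never removed automatically
                        if (key.isReadable()) {
                            ((Pipe.SourceChannel) key.channel()).read(in);
                        }
                    }
                }
                in.flip();
                return StandardCharsets.UTF_8.decode(in).toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(receiveViaSelector("hello selector"));
    }
}
```

The explicit it.remove() matters: as the documentation states, selection operations never remove keys from the selected-key set, so a key left there would be handed back on the next iteration even if its channel had nothing new to read.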
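Non-blocking?
Non-blocking I/O is achieved by putting channels into non-blocking mode and multiplexing them through a Selector. The Selector manages channels through three sets of SelectionKeys: all registered keys, the selected keys, and the cancelled keys. During a selection operation (a call to select), the Selector queries the underlying operating system for channel readiness and updates the ready state of the keys in these sets. The select(), select(long), and selectNow() methods differ only in whether, and for how long, they block while waiting for channels to become ready; the application then handles the ready channels to complete the communication.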
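The difference between "nothing is ready yet" and "data is ready" can be observed directly. In this sketch (hypothetical class and method names, a Pipe standing in for a network connection), selectNow() and a non-blocking read both return immediately with 0 while no data is available.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

class NonBlockingReadDemo {
    // Returns {readyBefore, readBefore, readyAfter, readAfter}.
    static int[] probe() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);
                ByteBuffer in = ByteBuffer.allocate(16);

                // No data yet: both calls return immediately instead of waiting
                int readyBefore = selector.selectNow();
                int readBefore = source.read(in);

                sink.write(ByteBuffer.wrap(new byte[] {1, 2, 3}));

                // Data is pending: select(timeout) reports the ready key, read transfers bytes
                int readyAfter = selector.select(1000);
                int readAfter = source.read(in);
                return new int[] {readyBefore, readBefore, readyAfter, readAfter};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = probe();
        System.out.println("ready before write: " + r[0] + ", bytes read: " + r[1]);
        System.out.println("ready after write:  " + r[2] + ", bytes read: " + r[3]);
    }
}
```

A blocking channel would instead park the calling thread inside read until data arrived, which is why one thread per connection is needed in the traditional model.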
Reference for this article: the CSDN blog post "Java NIO(非阻塞IO)图文详细解析。源码分析_nio非阻塞式网络通信源码" by Hi丶ImViper