<?xml version="1.0" encoding="UTF-8" standalone="no"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"><head><title>The NIO Package</title><link rel="stylesheet" href="core.css" type="text/css"/><meta name="generator" content="DocBook XSL Stylesheets V1.74.0"/></head><body><div class="sect1" title="The NIO Package"><div class="titlepage"><div><div><h1 class="title"><a id="learnjava3-CHP-12-SECT-5"/>The NIO Package</h1></div></div></div><p>We are now going to complete our introduction to core Java I/O facilities by returning to the <a id="I_indexterm12_id761367" class="indexterm"/><code class="literal">java.nio</code> package. The name NIO stands for “New I/O” and, as we saw earlier in this chapter in our discussion of <code class="literal">java.nio.file</code>, one aspect of NIO is simply to update and enhance features of the legacy <code class="literal">java.io</code> package. Much of the general NIO functionality does indeed overlap with existing APIs. However, NIO was first introduced to address specific issues of scalability for large systems, especially in networked applications. The following section outlines the basic elements of NIO, which center on working with <span class="emphasis"><em>buffers</em></span> and <span class="emphasis"><em>channels</em></span>.</p><div class="sect2" title="Asynchronous I/O"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-12-SECT-5.1"/>Asynchronous I/O</h2></div></div></div><p><a id="idx10678" class="indexterm"/> <a id="idx10714" class="indexterm"/> <a id="idx10732" class="indexterm"/>Most of the need for the NIO package was driven by the desire to add <span class="emphasis"><em>nonblocking</em></span> and <span class="emphasis"><em>selectable</em></span> I/O to Java. Prior to NIO, most read and write operations in Java were bound to threads and were forced to block for unpredictable amounts of time. 
Although certain APIs such as Sockets (which we’ll see in <a class="xref" href="ch13.html" title="Chapter 13. Network Programming">Chapter 13</a>) provided specific means to limit how long an I/O call could take, this was a workaround to compensate for the lack of a more general mechanism. In many languages, even those without threading, I/O could still be done efficiently by setting I/O streams to a nonblocking mode and testing them for their readiness to send or receive data. In a nonblocking mode, a read or write does only as much work as can be done immediately, filling or emptying a buffer and then returning. Combined with the ability to test for readiness, this allows a single-threaded application to continuously service many channels efficiently. The main thread “selects” a stream that is ready and works with it until it would block and then moves on to another. On a single-processor system, this is fundamentally equivalent to using multiple threads. It turns out that this style of processing has scalability advantages even when using a pool of threads (rather than just one). We’ll discuss this in detail in <a class="xref" href="ch13.html" title="Chapter 13. Network Programming">Chapter 13</a> when we cover networking and building servers that can handle many clients simultaneously.</p><p>In addition to nonblocking and selectable I/O, the NIO package enables closing and interrupting I/O operations asynchronously. As discussed in <a class="xref" href="ch09.html" title="Chapter 9. Threads">Chapter 9</a>, prior to NIO there was no reliable way to stop or wake up a thread blocked in an I/O operation. With NIO, threads blocked in I/O operations always wake up when interrupted or when the channel is closed by any thread. Additionally, if you interrupt a thread while it is blocked in an NIO operation, its channel is automatically closed. 
(Closing the channel because the thread is interrupted might seem too strong, but usually it’s the right thing to do.)<a id="I_indexterm12_id761505" class="indexterm"/><a id="I_indexterm12_id761512" class="indexterm"/><a id="I_indexterm12_id761519" class="indexterm"/></p></div><div class="sect2" title="Performance"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-12-SECT-5.2"/>Performance</h2></div></div></div><p><a id="I_indexterm12_id761533" class="indexterm"/> <a id="idx10741" class="indexterm"/>Channel I/O is designed around the concept of <span class="emphasis"><em>buffers</em></span>, which are a sophisticated form of array, tailored to working with communications. The NIO package supports the concept of <span class="emphasis"><em>direct buffers</em></span>, which maintain their memory outside the Java VM in the host operating system. Because all real I/O operations ultimately have to work with the host OS, maintaining the buffer space there can make some operations much more efficient. Data moving between two external endpoints can be transferred without first copying it into Java and back out.</p></div><div class="sect2" title="Mapped and Locked Files"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-12-SECT-5.3"/>Mapped and Locked Files</h2></div></div></div><p><a id="I_indexterm12_id761575" class="indexterm"/> <a id="I_indexterm12_id761586" class="indexterm"/> <a id="I_indexterm12_id761592" class="indexterm"/> <a id="idx10740" class="indexterm"/>NIO provides two general-purpose file-related features not found in <code class="literal">java.io</code>: memory-mapped files and file locking. We’ll discuss memory-mapped files later, but suffice it to say that they allow you to work with file data as if it were all magically resident in memory. 
File locking supports the concept of shared and exclusive locks on regions of files—useful for concurrent access by multiple applications.</p></div><div class="sect2" title="Channels"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-12-SECT-5.4"/>Channels</h2></div></div></div><p><a id="idx10684" class="indexterm"/> <a id="idx10716" class="indexterm"/> <a id="idx10735" class="indexterm"/>While <code class="literal">java.io</code> deals with streams, <code class="literal">java.nio</code> works with channels. A <span class="emphasis"><em>channel</em></span> is an endpoint for communication. Although in practice channels are similar to streams, the underlying notion of a channel is more abstract and primitive. Whereas streams in <code class="literal">java.io</code> are defined in terms of input or output with methods to read and write bytes, the basic channel interface says nothing about how communications happen. It simply has the notion of being open or closed, supported via the methods <code class="literal">isOpen()</code> and <code class="literal">close()</code>. Implementations of channels for files, network sockets, or arbitrary devices then add their own methods for operations, such as reading, writing, or transferring data. The following channels are provided by NIO:</p><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><code class="literal">FileChannel</code></p></li><li class="listitem"><p><code class="literal">Pipe.SinkChannel</code>, <code class="literal">Pipe.SourceChannel</code></p></li><li class="listitem"><p><code class="literal">SocketChannel</code>, <code class="literal">ServerSocketChannel</code>, <code class="literal">DatagramChannel</code></p></li></ul></div><p>We’ll cover <code class="literal">FileChannel</code> in this chapter. The <code class="literal">Pipe</code> channels are simply the channel equivalents of the <code class="literal">java.io Pipe</code> facilities. 
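</p><p>To make the channel idea concrete, here is a minimal sketch that pushes a few bytes through a <code class="literal">Pipe</code> within a single thread. The class name <code class="literal">PipeChannelDemo</code>, the helper <code class="literal">roundTrip()</code>, and the sample message are our own illustrative choices, and the sketch assumes the message is small enough to fit in the pipe’s internal buffer so that the write does not block. It also uses <code class="literal">flip()</code>, which is explained in the discussion of buffers.</p>

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

public class PipeChannelDemo {

    // Sends a short ASCII message through a Pipe and returns what comes out.
    static String roundTrip(String message) {
        try {
            Pipe pipe = Pipe.open();
            Pipe.SinkChannel sink = pipe.sink();        // writable end
            Pipe.SourceChannel source = pipe.source();  // readable end

            // Channels read and write ByteBuffers, not plain byte arrays.
            ByteBuffer out = ByteBuffer.wrap(message.getBytes(StandardCharsets.US_ASCII));
            while (out.hasRemaining()) {
                sink.write(out);
            }
            sink.close(); // after this, sink.isOpen() is false and the reader sees end-of-stream

            ByteBuffer in = ByteBuffer.allocate(out.capacity());
            while (in.hasRemaining()) {
                if (source.read(in) == -1) {
                    break; // end of stream
                }
            }
            source.close();

            in.flip(); // switch the buffer from being filled to being drained
            return StandardCharsets.US_ASCII.decode(in).toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello, channel"));
    }
}
```

<p>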
We’ll talk about <code class="literal">Socket</code> and <code class="literal">Datagram</code> channels in <a class="xref" href="ch13.html" title="Chapter 13. Network Programming">Chapter 13</a>. Additionally, Java 7 added asynchronous versions of the file and socket channels: <code class="literal">AsynchronousFileChannel</code>, <code class="literal">AsynchronousSocketChannel</code>, and <code class="literal">AsynchronousServerSocketChannel</code>. These asynchronous versions essentially buffer all of their operations through a thread pool and report results back through an asynchronous API. We’ll talk about the asynchronous file channel later in this chapter.</p><p>Most of these basic channels implement the <code class="literal">ByteChannel</code> interface or one of its two halves, <code class="literal">ReadableByteChannel</code> and <code class="literal">WritableByteChannel</code>, which are designed for channels that have read and write methods like I/O streams. (A <code class="literal">ServerSocketChannel</code> only accepts connections, and each end of a <code class="literal">Pipe</code> supports just one direction.) <code class="literal">ByteChannel</code>s read and write <code class="literal">ByteBuffer</code>s, however, as opposed to plain byte arrays.</p><p>In addition to these channel implementations, you can bridge channels with <code class="literal">java.io</code> I/O streams and readers and writers for interoperability. However, if you mix these features, you may not get the full benefits and performance offered by the NIO package.<a id="I_indexterm12_id761832" class="indexterm"/><a id="I_indexterm12_id761839" class="indexterm"/><a id="I_indexterm12_id761846" class="indexterm"/></p></div><div class="sect2" title="Buffers"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-12-SECT-5.5"/>Buffers</h2></div></div></div><p><a id="idx10715" class="indexterm"/>Most of the utilities of the <code class="literal">java.io</code> and <code class="literal">java.net</code> packages operate on byte arrays. 
The corresponding tools of the NIO package are built around <a id="idx10683" class="indexterm"/><code class="literal">ByteBuffer</code>s (with character-based buffer <a id="I_indexterm12_id761901" class="indexterm"/><code class="literal">CharBuffer</code> for text). Byte arrays are simple, so why are buffers necessary? They serve several purposes:</p><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p>They formalize the usage patterns for buffered data, provide for things like read-only buffers, and keep track of read/write positions and limits within a large buffer space. They also provide a mark/reset facility like that of <code class="literal">java.io.BufferedInputStream</code>.</p></li><li class="listitem"><p>They provide additional APIs for working with raw data representing primitive types. You can create buffers that “view” your byte data as a series of larger primitives, such as <code class="literal">short</code>s, <code class="literal">int</code>s, or <code class="literal">float</code>s. The most general type of data buffer, <code class="literal">ByteBuffer</code>, includes methods that let you read and write all primitive types just like <code class="literal">DataOutputStream</code> does for streams.</p></li><li class="listitem"><p>They abstract the underlying storage of the data, allowing for special optimizations by Java. Specifically, buffers may be allocated as direct buffers that use native buffers of the host operating system instead of arrays in Java’s memory. The NIO <code class="literal">Channel</code> facilities that work with buffers can recognize direct buffers automatically and try to optimize I/O to use them. For example, a read from a file channel into a Java byte array normally requires Java to copy the data for the read from the host operating system into Java’s memory. 
With a direct buffer, the data can remain in the host operating system, outside Java’s normal memory space until and unless it is needed.</p></li></ul></div><div class="sect3" title="Buffer operations"><div class="titlepage"><div><div><h3 class="title"><a id="learnjava3-CHP-12-SECT-5.5.1"/>Buffer operations</h3></div></div></div><p><a id="idx10681" class="indexterm"/> <a id="idx10733" class="indexterm"/>A buffer is a subclass of a <a id="I_indexterm12_id762016" class="indexterm"/><code class="literal">java.nio.Buffer</code> object. The base <code class="literal">Buffer</code> class is something like an array with state. It does not specify what type of elements it holds (that is for subtypes to decide), but it does define functionality that is common to all data buffers. A <code class="literal">Buffer</code> has a fixed size called its <span class="emphasis"><em>capacity</em></span>. Although all the standard <code class="literal">Buffer</code>s provide “random access” to their contents, a <code class="literal">Buffer</code> generally expects to be read and written sequentially, so <code class="literal">Buffer</code>s maintain the notion of a <span class="emphasis"><em>position</em></span> where the next element is read or written. In addition to position, a <code class="literal">Buffer</code> can maintain two other pieces of state information: a <span class="emphasis"><em>limit</em></span>, which is a position that is a “soft” limit to the extent of a read or write, and a <span class="emphasis"><em>mark</em></span>, which can be used to remember an earlier position for future recall.</p><p>Implementations of <code class="literal">Buffer</code> add specific, typed get and put methods that read and write the buffer contents. 
For example, <code class="literal">ByteBuffer</code> is a buffer of bytes and it has <a id="I_indexterm12_id762094" class="indexterm"/><code class="literal">get()</code> and <a id="I_indexterm12_id762107" class="indexterm"/><code class="literal">put()</code> methods that read and write bytes and arrays of bytes (along with many other useful methods we’ll discuss later). Getting from and putting to the <code class="literal">Buffer</code> changes the position marker, so the <code class="literal">Buffer</code> keeps track of its contents somewhat like a stream. Attempting to read or write past the limit marker generates a <a id="I_indexterm12_id762132" class="indexterm"/><code class="literal">BufferUnderflowException</code> or <a id="I_indexterm12_id762144" class="indexterm"/><code class="literal">BufferOverflowException</code>, respectively.</p><p>The mark, position, limit, and capacity values always obey the following formula:</p><a id="I_programlisting12_id762167"/><pre class="programlisting"> <code class="n">mark</code> <code class="o"><=</code> <code class="n">position</code> <code class="o"><=</code> <code class="n">limit</code> <code class="o"><=</code> <code class="n">capacity</code></pre><p>The position for reading and writing the <code class="literal">Buffer</code> is always between the mark, which serves as a lower bound, and the limit, which serves as an upper bound. The capacity represents the physical extent of the buffer space.</p><p>You can set the position and limit markers explicitly with the <a id="I_indexterm12_id762186" class="indexterm"/><code class="literal">position()</code> and <a id="I_indexterm12_id762197" class="indexterm"/><code class="literal">limit()</code> methods. Several convenience methods are provided for common usage patterns. The <a id="I_indexterm12_id762208" class="indexterm"/><code class="literal">reset()</code> method sets the position back to the mark. 
If no mark has been set, an <code class="literal">InvalidMarkException</code> is thrown. The <a id="I_indexterm12_id762225" class="indexterm"/><code class="literal">clear()</code> method resets the position to <code class="literal">0</code> and makes the limit the capacity, readying the buffer for new data (the mark is discarded). Note that the <code class="literal">clear()</code> method does not actually do anything to the data in the buffer; it simply changes the position markers.</p><p>The <code class="literal">flip()</code> method is used for the common pattern of writing data into the buffer and then reading it back out. <a id="I_indexterm12_id762257" class="indexterm"/><code class="literal">flip</code> makes the current position the limit and then resets the current position to <code class="literal">0</code> (any mark is thrown away), which saves having to keep track of how much data was read. Another method, <code class="literal">rewind()</code>, simply resets the position to <code class="literal">0</code>, leaving the limit alone. You might use it to write the same size data again. 
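</p><p>The effect of these methods is easiest to see by watching the position and limit change. The following is a small sketch, not taken from any particular application: the class name <code class="literal">BufferStateDemo</code> and the helpers <code class="literal">state()</code> and <code class="literal">trace()</code> are hypothetical names of our own. It records the buffer’s state as <code class="literal">position/limit/capacity</code> after each step.</p>

```java
import java.nio.ByteBuffer;

public class BufferStateDemo {

    // Formats a buffer's state as "position/limit/capacity".
    static String state(ByteBuffer b) {
        return b.position() + "/" + b.limit() + "/" + b.capacity();
    }

    // Walks one small buffer through put, flip, get, rewind, and clear.
    static String[] trace() {
        ByteBuffer buff = ByteBuffer.allocate(8);
        String fresh = state(buff);              // 0/8/8

        buff.put((byte) 1).put((byte) 2).put((byte) 3);
        String written = state(buff);            // 3/8/8: position advanced by three puts

        buff.flip();
        String flipped = state(buff);            // 0/3/8: limit = old position, position = 0

        buff.get();
        buff.get();
        String partlyRead = state(buff);         // 2/3/8: two of the three bytes consumed

        buff.rewind();
        String rewound = state(buff);            // 0/3/8: position reset, limit untouched

        buff.clear();
        String cleared = state(buff);            // 0/8/8: ready for new data, contents not erased

        return new String[] { fresh, written, flipped, partlyRead, rewound, cleared };
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", trace()));
    }
}
```

<p>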
Here is a snippet of code that uses these methods to read data from a channel and write it to two channels:</p><a id="I_12_tt817"/><pre class="programlisting"><code class="n">ByteBuffer</code> <code class="n">buff</code> <code class="o">=</code> <code class="o">...</code>
<code class="k">while</code> <code class="o">(</code> <code class="n">inChannel</code><code class="o">.</code><code class="na">read</code><code class="o">(</code> <code class="n">buff</code> <code class="o">)</code> <code class="o">></code> <code class="mi">0</code> <code class="o">)</code> <code class="o">{</code> <code class="c1">// position = ?</code>
    <code class="n">buff</code><code class="o">.</code><code class="na">flip</code><code class="o">();</code>    <code class="c1">// limit = position; position = 0;</code>
    <code class="n">outChannel</code><code class="o">.</code><code class="na">write</code><code class="o">(</code> <code class="n">buff</code> <code class="o">);</code>
    <code class="n">buff</code><code class="o">.</code><code class="na">rewind</code><code class="o">();</code>  <code class="c1">// position = 0</code>
    <code class="n">outChannel2</code><code class="o">.</code><code class="na">write</code><code class="o">(</code> <code class="n">buff</code> <code class="o">);</code>
    <code class="n">buff</code><code class="o">.</code><code class="na">clear</code><code class="o">();</code>   <code class="c1">// position = 0; limit = capacity</code>
<code class="o">}</code></pre><p>This might be confusing the first time you look at it because here, the read from the <code class="literal">Channel</code> is actually a write to the <code class="literal">Buffer</code> and vice versa. 
Because this example assumes that the first <code class="literal">write()</code> consumes all the available data up to the limit, <code class="literal">flip()</code> and <code class="literal">rewind()</code> have the same effect in this case.<a id="I_indexterm12_id762328" class="indexterm"/><a id="I_indexterm12_id762335" class="indexterm"/></p></div><div class="sect3" title="Buffer types"><div class="titlepage"><div><div><h3 class="title"><a id="learnjava3-CHP-12-SECT-5.5.2"/>Buffer types</h3></div></div></div><p><a id="idx10682" class="indexterm"/> <a id="idx10734" class="indexterm"/>As stated earlier, various buffer types add get and put methods for reading and writing specific data types. Each of the Java primitive types except <code class="literal">boolean</code> has an associated buffer type: <a id="I_indexterm12_id762377" class="indexterm"/><code class="literal">ByteBuffer</code>, <a id="I_indexterm12_id762390" class="indexterm"/><code class="literal">CharBuffer</code>, <a id="I_indexterm12_id762400" class="indexterm"/><code class="literal">ShortBuffer</code>, <a id="I_indexterm12_id762411" class="indexterm"/><code class="literal">IntBuffer</code>, <a id="I_indexterm12_id762421" class="indexterm"/><code class="literal">LongBuffer</code>, <a id="I_indexterm12_id762432" class="indexterm"/><code class="literal">FloatBuffer</code>, and <a id="I_indexterm12_id762443" class="indexterm"/><code class="literal">DoubleBuffer</code>. Each provides get and put methods for reading and writing its type and arrays of its type. Of these, <code class="literal">ByteBuffer</code> is the most flexible. Because it has the “finest grain” of all the buffers, it has been given a full complement of get and put methods for reading and writing all the other data types as well as <code class="literal">byte</code>. 
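</p><p>As a quick illustration of that flexibility, the following sketch writes three different primitive types into one <code class="literal">ByteBuffer</code> and reads them back in the same order. The class name <code class="literal">TypedAccessDemo</code>, the helper <code class="literal">roundTrip()</code>, and the sample values are ours, chosen only for illustration.</p>

```java
import java.nio.ByteBuffer;

public class TypedAccessDemo {

    // Writes an int, a double, and a char, then reads them back in order.
    static String roundTrip() {
        ByteBuffer buff = ByteBuffer.allocate(16);

        buff.putInt(42);      // 4 bytes
        buff.putDouble(2.5);  // 8 bytes
        buff.putChar('x');    // 2 bytes
        int bytesUsed = buff.position(); // 4 + 8 + 2

        buff.flip();          // prepare to read what was just written
        int i = buff.getInt();
        double d = buff.getDouble();
        char c = buff.getChar();

        return bytesUsed + " " + i + " " + d + " " + c;
    }

    public static void main(String[] args) {
        System.out.println(roundTrip()); // prints "14 42 2.5 x"
    }
}
```

<p>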
Here are some <code class="literal">ByteBuffer</code> methods:</p><a id="I_12_tt818"/><pre class="programlisting"><code class="kt">byte</code> <code class="nf">get</code><code class="o">()</code>
<code class="kt">char</code> <code class="nf">getChar</code><code class="o">()</code>
<code class="kt">short</code> <code class="nf">getShort</code><code class="o">()</code>
<code class="kt">int</code> <code class="nf">getInt</code><code class="o">()</code>
<code class="kt">long</code> <code class="nf">getLong</code><code class="o">()</code>
<code class="kt">float</code> <code class="nf">getFloat</code><code class="o">()</code>
<code class="kt">double</code> <code class="nf">getDouble</code><code class="o">()</code>

<code class="n">ByteBuffer</code> <code class="nf">put</code><code class="o">(</code><code class="kt">byte</code> <code class="n">b</code><code class="o">)</code>
<code class="n">ByteBuffer</code> <code class="nf">put</code><code class="o">(</code><code class="n">ByteBuffer</code> <code class="n">src</code><code class="o">)</code>
<code class="n">ByteBuffer</code> <code class="nf">put</code><code class="o">(</code><code class="kt">byte</code><code class="o">[]</code> <code class="n">src</code><code class="o">,</code> <code class="kt">int</code> <code class="n">offset</code><code class="o">,</code> <code class="kt">int</code> <code class="n">length</code><code class="o">)</code>
<code class="n">ByteBuffer</code> <code class="nf">put</code><code class="o">(</code><code class="kt">byte</code><code class="o">[]</code> <code class="n">src</code><code class="o">)</code>
<code class="n">ByteBuffer</code> <code class="nf">putChar</code><code class="o">(</code><code class="kt">char</code> <code class="n">value</code><code class="o">)</code>
<code class="n">ByteBuffer</code> <code class="nf">putShort</code><code class="o">(</code><code class="kt">short</code> <code class="n">value</code><code class="o">)</code>
<code class="n">ByteBuffer</code> <code class="nf">putInt</code><code class="o">(</code><code class="kt">int</code> <code class="n">value</code><code class="o">)</code>
<code class="n">ByteBuffer</code> <code class="nf">putLong</code><code class="o">(</code><code class="kt">long</code> <code class="n">value</code><code class="o">)</code>
<code class="n">ByteBuffer</code> <code class="nf">putFloat</code><code class="o">(</code><code class="kt">float</code> <code class="n">value</code><code class="o">)</code>
<code class="n">ByteBuffer</code> <code class="nf">putDouble</code><code class="o">(</code><code class="kt">double</code> <code class="n">value</code><code class="o">)</code></pre><p>The put methods return the buffer itself so that calls can be chained. As we said, all the standard buffers also support random access. For each of the single-value methods of <code class="literal">ByteBuffer</code> listed above, an additional form takes an index; for example:</p><a id="I_12_tt819"/><pre class="programlisting"><code class="n">getLong</code><code class="o">(</code> <code class="kt">int</code> <code class="n">index</code> <code class="o">)</code>
<code class="n">putLong</code><code class="o">(</code> <code class="kt">int</code> <code class="n">index</code><code class="o">,</code> <code class="kt">long</code> <code class="n">value</code> <code class="o">)</code></pre><p>But that’s not all. <code class="literal">ByteBuffer</code> can also provide “views” of itself as any of the coarse-grained types. For example, you can fetch a <code class="literal">ShortBuffer</code> view of a <code class="literal">ByteBuffer</code> with the <a id="I_indexterm12_id762525" class="indexterm"/><code class="literal">asShortBuffer()</code> method. The <code class="literal">ShortBuffer</code> view is <span class="emphasis"><em>backed</em></span> by the <code class="literal">ByteBuffer</code>, which means that they work on the same data, and changes to either one affect the other. The view buffer’s extent starts at the <code class="literal">ByteBuffer</code>’s current position, and its capacity is the number of remaining bytes divided by the new type’s size. 
(For example, <code class="literal">short</code>s consume two bytes each, <code class="literal">float</code>s four, and <code class="literal">long</code>s and <code class="literal">double</code>s take eight.) View buffers are convenient for reading and writing large blocks of a contiguous type within a <code class="literal">ByteBuffer</code>.</p><p><code class="literal">CharBuffer</code>s are interesting as well, primarily because of their integration with <code class="literal">String</code>s. Both <code class="literal">CharBuffer</code>s and <code class="literal">String</code>s implement the <a id="I_indexterm12_id762609" class="indexterm"/><code class="literal">java.lang.CharSequence</code> interface. This is the interface that provides the standard <code class="literal">charAt()</code> and <code class="literal">length()</code> methods. Because of this, newer APIs (such as the <code class="literal">java.util.regex</code> package) allow you to use a <code class="literal">CharBuffer</code> or a <code class="literal">String</code> interchangeably. In this case, the <code class="literal">CharBuffer</code> acts like a modifiable <code class="literal">String</code> with user-configurable, logical start and end positions.<a id="I_indexterm12_id762660" class="indexterm"/><a id="I_indexterm12_id762667" class="indexterm"/></p></div><div class="sect3" title="Byte order"><div class="titlepage"><div><div><h3 class="title"><a id="learnjava3-CHP-12-SECT-5.5.3"/>Byte order</h3></div></div></div><p><a id="I_indexterm12_id762681" class="indexterm"/> <a id="I_indexterm12_id762689" class="indexterm"/>Because we’re talking about reading and writing types larger than a byte, the question arises: in what order do the bytes of multibyte values (e.g., <code class="literal">short</code>s and <code class="literal">int</code>s) get written? 
There are two camps in this world: “big endian” and “little endian.”<sup>[<a id="learnjava3-CHP-12-FN-3" href="#ftn.learnjava3-CHP-12-FN-3" class="footnote">36</a>]</sup> Big endian means that the most significant bytes come first; little endian is the reverse. If you’re writing binary data for consumption by some native application, this is important. Intel-compatible computers use little endian, and many workstations that run Unix use big endian. The <a id="I_indexterm12_id762750" class="indexterm"/><code class="literal">ByteOrder</code> class encapsulates the choice. You can specify the byte order to use with the <code class="literal">ByteBuffer order()</code> method, using the identifiers <code class="literal">ByteOrder.BIG_ENDIAN</code> and <code class="literal">ByteOrder.LITTLE_ENDIAN</code> like so:</p><a id="I_12_tt820"/><pre class="programlisting"> <code class="n">byteArray</code><code class="o">.</code><code class="na">order</code><code class="o">(</code> <code class="n">ByteOrder</code><code class="o">.</code><code class="na">BIG_ENDIAN</code> <code class="o">);</code></pre><p>You can retrieve the native ordering for your platform using the static <code class="literal">ByteOrder.nativeOrder()</code> method. (I know you’re curious.)</p></div><div class="sect3" title="Allocating buffers"><div class="titlepage"><div><div><h3 class="title"><a id="learnjava3-CHP-12-SECT-5.5.4"/>Allocating buffers</h3></div></div></div><p><a id="I_indexterm12_id762803" class="indexterm"/> <a id="I_indexterm12_id762812" class="indexterm"/>You can create a buffer either by allocating it explicitly using <a id="I_indexterm12_id762823" class="indexterm"/><code class="literal">allocate()</code> or by wrapping an existing plain Java array type. 
Each buffer type has a static <code class="literal">allocate()</code> method that takes a capacity (size) and also a <code class="literal">wrap()</code> method that takes an existing array:</p><a id="I_12_tt821"/><pre class="programlisting"> <code class="n">CharBuffer</code> <code class="n">cbuf</code> <code class="o">=</code> <code class="n">CharBuffer</code><code class="o">.</code><code class="na">allocate</code><code class="o">(</code> <code class="mi">64</code><code class="o">*</code><code class="mi">1024</code> <code class="o">);</code></pre><p>A direct buffer is allocated in the same way, with the <a id="I_indexterm12_id762858" class="indexterm"/><code class="literal">allocateDirect()</code> method:</p><a id="I_12_tt822"/><pre class="programlisting"> <code class="n">ByteBuffer</code> <code class="n">bbuf</code> <code class="o">=</code> <code class="n">ByteBuffer</code><code class="o">.</code><code class="na">allocateDirect</code><code class="o">(</code> <code class="mi">64</code><code class="o">*</code><code class="mi">1024</code> <code class="o">);</code> <code class="n">ByteBuffer</code> <code class="n">bbuf2</code> <code class="o">=</code> <code class="n">ByteBuffer</code><code class="o">.</code><code class="na">wrap</code><code class="o">(</code> <code class="n">someExistingArray</code> <code class="o">);</code></pre><p>As we described earlier, direct buffers can use operating system memory structures that are optimized for use with some kinds of I/O operations. 
The tradeoff is that allocating a direct buffer is a slightly slower and heavier-weight operation than allocating a plain buffer, so you should try to use direct buffers for longer-lived buffers.<a id="I_indexterm12_id762884" class="indexterm"/></p></div></div><div class="sect2" title="Character Encoders and Decoders"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-12-SECT-5.6"/>Character Encoders and Decoders</h2></div></div></div><p><a id="idx10685" class="indexterm"/> <a id="idx10717" class="indexterm"/> <a id="idx10736" class="indexterm"/>Character encoders and decoders turn characters into raw bytes and vice versa, mapping from the Unicode standard to particular encoding schemes. Encoders and decoders have long existed in Java for use by <code class="literal">Reader</code> and <code class="literal">Writer</code> streams and in the methods of the <code class="literal">String</code> class that work with byte arrays. However, early on there was no API for working with encoding explicitly; you simply referred to encoders and decoders wherever necessary by name as a <code class="literal">String</code>. The <a id="I_indexterm12_id762962" class="indexterm"/><code class="literal">java.nio.charset</code> package formalized the idea of a Unicode character set encoding with the <a id="I_indexterm12_id762973" class="indexterm"/><code class="literal">Charset</code> class.</p><p>The <code class="literal">Charset</code> class is a factory for <code class="literal">Charset</code> instances, which know how to encode character buffers to byte buffers and decode byte buffers to character buffers. 
You can look up a character set by name with the static <code class="literal">Charset.forName()</code> method and use it in conversions:</p><a id="I_12_tt823"/><pre class="programlisting"><code class="n">Charset</code> <code class="n">charset</code> <code class="o">=</code> <code class="n">Charset</code><code class="o">.</code><code class="na">forName</code><code class="o">(</code><code class="s">"US-ASCII"</code><code class="o">);</code>
<code class="n">CharBuffer</code> <code class="n">charBuff</code> <code class="o">=</code> <code class="n">charset</code><code class="o">.</code><code class="na">decode</code><code class="o">(</code> <code class="n">byteBuff</code> <code class="o">);</code>  <code class="c1">// ASCII bytes to characters</code>
<code class="n">ByteBuffer</code> <code class="n">encoded</code> <code class="o">=</code> <code class="n">charset</code><code class="o">.</code><code class="na">encode</code><code class="o">(</code> <code class="n">charBuff</code> <code class="o">);</code>   <code class="c1">// and back to bytes</code></pre><p>You can also test to see if an encoding is available with the static <code class="literal">Charset.isSupported()</code> method.</p><p>The following character sets are guaranteed to be supplied:</p><div class="itemizedlist"><ul class="itemizedlist"><li class="listitem"><p><a id="I_indexterm12_id763029" class="indexterm"/>US-ASCII</p></li><li class="listitem"><p><a id="I_indexterm12_id763037" class="indexterm"/>ISO-8859-1</p></li><li class="listitem"><p><a id="I_indexterm12_id763046" class="indexterm"/> <a id="I_indexterm12_id763053" class="indexterm"/>UTF-8</p></li><li class="listitem"><p><a id="I_indexterm12_id763062" class="indexterm"/>UTF-16BE</p></li><li class="listitem"><p><a id="I_indexterm12_id763070" class="indexterm"/>UTF-16LE</p></li><li class="listitem"><p><a id="I_indexterm12_id763079" class="indexterm"/>UTF-16</p></li></ul></div><p>You can list all the character sets available on your platform using the static <a 
id="I_indexterm12_id763088" class="indexterm"/><code class="literal">availableCharsets()</code> method:</p><a id="I_12_tt824"/><pre class="programlisting"><code class="n">Map</code> <code class="n">map</code> <code class="o">=</code> <code class="n">Charset</code><code class="o">.</code><code class="na">availableCharsets</code><code class="o">();</code>
<code class="n">Iterator</code> <code class="n">it</code> <code class="o">=</code> <code class="n">map</code><code class="o">.</code><code class="na">keySet</code><code class="o">().</code><code class="na">iterator</code><code class="o">();</code>
<code class="k">while</code> <code class="o">(</code> <code class="n">it</code><code class="o">.</code><code class="na">hasNext</code><code class="o">()</code> <code class="o">)</code>
    <code class="n">System</code><code class="o">.</code><code class="na">out</code><code class="o">.</code><code class="na">println</code><code class="o">(</code> <code class="n">it</code><code class="o">.</code><code class="na">next</code><code class="o">()</code> <code class="o">);</code></pre><p>The result of <code class="literal">availableCharsets()</code> is a sorted map keyed by the canonical name of each character set. A character set may also have “aliases” and so be known by more than one name; you can retrieve them with the <code class="literal">aliases()</code> method of <code class="literal">Charset</code>.</p><p>In addition to the buffer-oriented classes of the <code class="literal">java.nio</code> package, the <code class="literal">InputStreamReader</code> and <code class="literal">OutputStreamWriter</code> bridge classes of the <code class="literal">java.io</code> package have been updated to work with <code class="literal">Charset</code> as well. 
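</p><p>As a minimal sketch of the bridge classes in use, the following example wraps an in-memory byte stream in an <code class="literal">InputStreamReader</code>. The class name <code class="literal">CharsetBridgeSketch</code> and the helper <code class="literal">readAll()</code> are hypothetical names chosen for illustration, and the use of <code class="literal">StandardCharsets</code> assumes Java 7 or later:</p><pre class="programlisting">import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetBridgeSketch {   // hypothetical class name

    // Hypothetical helper: decode bytes to a String through the bridge class
    public static String readAll( byte[] bytes, Charset charset )
        throws IOException
    {
        StringBuilder text = new StringBuilder();
        try ( Reader reader = new InputStreamReader(
                new ByteArrayInputStream( bytes ), charset ) )
        {
            int c;
            while ( ( c = reader.read() ) != -1 )
                text.append( (char) c );
        }
        return text.toString();
    }

    public static void main( String[] args ) throws IOException {
        byte[] bytes = "caf\u00e9".getBytes( StandardCharsets.UTF_8 );

        // Encoding given as a Charset object
        System.out.println( readAll( bytes, StandardCharsets.UTF_8 ) );

        // Encoding given by name
        try ( Reader reader = new InputStreamReader(
                new ByteArrayInputStream( bytes ), "UTF-8" ) )
        {
            System.out.println( (char) reader.read() );
        }
    }
}</pre><p>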
You can specify the encoding as a <code class="literal">Charset</code> object or by name.</p><div class="sect3" title="CharsetEncoder and CharsetDecoder"><div class="titlepage"><div><div><h3 class="title"><a id="learnjava3-CHP-12-SECT-5.6.1"/>CharsetEncoder and CharsetDecoder</h3></div></div></div><p><a id="idx10687" class="indexterm"/> <a id="idx10688" class="indexterm"/>You can get more control over the encoding and decoding process by creating an instance of <code class="literal">CharsetEncoder</code> or <code class="literal">CharsetDecoder</code> (a codec) with the <code class="literal">Charset newEncoder()</code> and <code class="literal">newDecoder()</code> methods. In the previous snippet, we assumed that all the data was available in a single buffer. More often, however, we might have to process data as it arrives in chunks. The encoder/decoder API allows for this by providing more general <a id="I_indexterm12_id763207" class="indexterm"/><code class="literal">encode()</code> and <a id="I_indexterm12_id763218" class="indexterm"/><code class="literal">decode()</code> methods that take a flag specifying whether more data is expected. The codec needs to know this because it might have been left hanging in the middle of a multibyte character conversion when the data ran out. If it knows that more data is coming, it does not throw an error on this incomplete conversion. 
In the following snippet, we use a decoder to read from a <code class="literal">ByteBuffer bbuff</code> and accumulate character data into a <code class="literal">CharBuffer cbuff</code>:</p><a id="I_12_tt825"/><pre class="programlisting"><code class="n">CharsetDecoder</code> <code class="n">decoder</code> <code class="o">=</code> <code class="n">Charset</code><code class="o">.</code><code class="na">forName</code><code class="o">(</code><code class="s">"US-ASCII"</code><code class="o">).</code><code class="na">newDecoder</code><code class="o">();</code>

<code class="kt">boolean</code> <code class="n">done</code> <code class="o">=</code> <code class="kc">false</code><code class="o">;</code>
<code class="k">while</code> <code class="o">(</code> <code class="o">!</code><code class="n">done</code> <code class="o">)</code> <code class="o">{</code>
    <code class="n">bbuff</code><code class="o">.</code><code class="na">clear</code><code class="o">();</code>
    <code class="n">done</code> <code class="o">=</code> <code class="o">(</code> <code class="n">in</code><code class="o">.</code><code class="na">read</code><code class="o">(</code> <code class="n">bbuff</code> <code class="o">)</code> <code class="o">==</code> <code class="o">-</code><code class="mi">1</code> <code class="o">);</code>
    <code class="n">bbuff</code><code class="o">.</code><code class="na">flip</code><code class="o">();</code>
    <code class="n">decoder</code><code class="o">.</code><code class="na">decode</code><code class="o">(</code> <code class="n">bbuff</code><code class="o">,</code> <code class="n">cbuff</code><code class="o">,</code> <code class="n">done</code> <code class="o">);</code>
<code class="o">}</code>
<code class="n">cbuff</code><code class="o">.</code><code class="na">flip</code><code class="o">();</code>
<code class="c1">// use cbuff. . .</code></pre><p>Here, we look for the end of input condition on the <code class="literal">in</code> channel to set the flag <code class="literal">done</code>. Note that we take advantage of the <code class="literal">flip()</code> method on <code class="literal">ByteBuffer</code> to set the limit to the amount of data read and reset the position, setting us up for the decode operation in one step. The <code class="literal">encode()</code> and <code class="literal">decode()</code> methods also return a result object, <code class="literal">CoderResult</code>, that reports the progress of the conversion (we do not use it in the previous snippet). The methods <a id="I_indexterm12_id763297" class="indexterm"/><code class="literal">isError()</code>, <a id="I_indexterm12_id763308" class="indexterm"/><code class="literal">isUnderflow()</code>, and <a id="I_indexterm12_id763318" class="indexterm"/><code class="literal">isOverflow()</code> on the <code class="literal">CoderResult</code> specify why the conversion stopped: because of an error, a lack of data in the input buffer, or a full output buffer, respectively.<a id="I_indexterm12_id763336" class="indexterm"/><a id="I_indexterm12_id763343" class="indexterm"/><a id="I_indexterm12_id763350" class="indexterm"/><a id="I_indexterm12_id763357" class="indexterm"/><a id="I_indexterm12_id763364" class="indexterm"/></p></div></div><div class="sect2" title="FileChannel"><div class="titlepage"><div><div><h2 class="title"><a id="learnjava3-CHP-12-SECT-5.7"/>FileChannel</h2></div></div></div><p><a id="idx10718" class="indexterm"/>Now that we’ve covered the basics of channels and buffers, it’s time to look at a real channel type. The <code class="literal">FileChannel</code> is the NIO equivalent of the <code class="literal">java.io.RandomAccessFile</code>, but it provides several core new features in addition to some performance optimizations. 
In particular, use a <code class="literal">FileChannel</code> in place of a plain <code class="literal">java.io</code> file stream if you wish to use file locking, memory-mapped file access, or highly optimized data transfer between files or between file and network channels.</p><p>A <code class="literal">FileChannel</code> can be created for a <code class="literal">Path</code> using the static <code class="literal">FileChannel</code> <code class="literal">open()</code> method.</p><a id="I_programlisting12_id763442"/><pre class="programlisting"><code class="n">FileSystem</code> <code class="n">fs</code> <code class="o">=</code> <code class="n">FileSystems</code><code class="o">.</code><code class="na">getDefault</code><code class="o">();</code>
<code class="n">Path</code> <code class="n">p</code> <code class="o">=</code> <code class="n">fs</code><code class="o">.</code><code class="na">getPath</code><code class="o">(</code> <code class="s">"/tmp/foo.txt"</code> <code class="o">);</code>

<code class="c1">// Open default for reading</code>
<code class="k">try</code> <code class="o">(</code> <code class="n">FileChannel</code> <code class="n">channel</code> <code class="o">=</code> <code class="n">FileChannel</code><code class="o">.</code><code class="na">open</code><code class="o">(</code> <code class="n">p</code> <code class="o">)</code> <code class="o">)</code> <code class="o">{</code>
    <code class="o">...</code>
<code class="o">}</code>

<code class="c1">// Open with options for writing</code>
<code class="kn">import</code> <code class="nn">static</code> <code class="n">java</code><code class="o">.</code><code class="na">nio</code><code class="o">.</code><code class="na">file</code><code class="o">.</code><code class="na">StandardOpenOption</code><code class="o">.*;</code>

<code class="k">try</code> <code class="o">(</code> <code class="n">FileChannel</code> <code class="n">channel</code> <code class="o">=</code> <code class="n">FileChannel</code><code class="o">.</code><code class="na">open</code><code class="o">(</code> <code class="n">p</code><code class="o">,</code> <code class="n">WRITE</code><code class="o">,</code> <code class="n">APPEND</code><code class="o">,</code> <code class="o">...</code> <code class="o">)</code> <code class="o">)</code> <code class="o">{</code>
    <code class="o">...</code>
<code class="o">}</code></pre><p>By default, <code class="literal">open()</code> creates a read-only channel for the file. We can open a channel for writing or appending and control more advanced features, such as atomic file creation and data syncing, by passing additional options as shown in the second part of the previous example. <a class="xref" href="ch12s06.html#t1204" title="Table 12-4. java.nio.file.StandardOpenOption">Table 12-4</a> summarizes these options.<a id="I_indexterm12_id763466" class="indexterm"/><a id="I_indexterm12_id763471" class="indexterm"/><a id="I_indexterm12_id763477" class="indexterm"/><a id="I_indexterm12_id763482" class="indexterm"/><a id="I_indexterm12_id763488" class="indexterm"/><a id="I_indexterm12_id763493" class="indexterm"/><a id="I_indexterm12_id763499" class="indexterm"/><a id="I_indexterm12_id763504" class="indexterm"/><a id="I_indexterm12_id763510" class="indexterm"/><a id="I_indexterm12_id763515" class="indexterm"/></p><div class="table"><a id="t1204"/><p class="title">Table 12-4. 
java.nio.file.StandardOpenOption</p><div class="table-contents"><table summary="java.nio.file.StandardOpenOption" style="border-collapse: collapse;border-top: 0.5pt solid ; border-bottom: 0.5pt solid ; border-left: 0.5pt solid ; border-right: 0.5pt solid ; "><colgroup><col/><col/></colgroup><thead><tr><th style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; ">Option</th><th style="border-bottom: 0.5pt solid ; ">Description</th></tr></thead><tbody><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; "><code class="literal">READ</code>, <code class="literal">WRITE</code></td><td style="border-bottom: 0.5pt solid ; ">Open the file for read-only or write-only access (default is read-only). Use both for read-write.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; "><code class="literal">APPEND</code></td><td style="border-bottom: 0.5pt solid ; ">Open the file for writing; all writes are positioned at the end of the file.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; "><code class="literal">CREATE</code></td><td style="border-bottom: 0.5pt solid ; ">Use with <code class="literal">WRITE</code> to open the file and create it if needed.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; "><code class="literal">CREATE_NEW</code></td><td style="border-bottom: 0.5pt solid ; ">Use with <code class="literal">WRITE</code> to create a file atomically, failing if the file already exists.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; "><code class="literal">DELETE_ON_CLOSE</code></td><td style="border-bottom: 0.5pt solid ; ">Attempt to delete the file when it is closed or, if open, when the VM exits.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; "><code class="literal">SYNC</code>, <code class="literal">DSYNC</code></td><td style="border-bottom: 0.5pt solid ; ">Wherever possible, guarantee that 
write operations block until all data is written to storage. <code class="literal">SYNC</code> does this for all file changes including data and metadata (attributes) whereas <code class="literal">DSYNC</code> only adds this requirement for the data content of the file.</td></tr><tr><td style="border-right: 0.5pt solid ; border-bottom: 0.5pt solid ; "><code class="literal">SPARSE</code></td><td style="border-bottom: 0.5pt solid ; ">Use when creating a new file to request that the file be sparse. On filesystems where this is supported, a sparse file can be very large and mostly empty without allocating as much real storage for the empty portions.</td></tr><tr><td style="border-right: 0.5pt solid ; "><code class="literal">TRUNCATE_EXISTING</code></td><td style="">Use with <code class="literal">WRITE</code> on an existing file to set the file length to zero upon opening it.</td></tr></tbody></table></div></div><p>A <code class="literal">FileChannel</code> can also be constructed from a classic <code class="literal">FileInputStream</code>, <code class="literal">FileOutputStream</code>, or <code class="literal">RandomAccessFile</code>:</p><a id="I_12_tt826"/><pre class="programlisting"><code class="n">FileChannel</code> <code class="n">readOnlyFc</code> <code class="o">=</code> <code class="k">new</code> <code class="n">FileInputStream</code><code class="o">(</code><code class="s">"file.txt"</code><code class="o">).</code><code class="na">getChannel</code><code class="o">();</code>
<code class="n">FileChannel</code> <code class="n">readWriteFc</code> <code class="o">=</code> <code class="k">new</code> <code class="n">RandomAccessFile</code><code class="o">(</code><code class="s">"file.txt"</code><code class="o">,</code> <code class="s">"rw"</code><code class="o">)</code>
    <code class="o">.</code><code class="na">getChannel</code><code class="o">();</code></pre><p><code class="literal">FileChannel</code>s created from these file input and output streams are read-only or write-only, 
respectively. To get a read/write <code class="literal">FileChannel</code>, you must construct a <code class="literal">RandomAccessFile</code> with read/write permissions, as in the previous example.</p><p>Using a <code class="literal">FileChannel</code> is just like a <code class="literal">RandomAccessFile</code>, but it works with <code class="literal">ByteBuffer</code> instead of byte arrays:</p><a id="I_12_tt827"/><pre class="programlisting"> <code class="n">ByteBuffer</code> <code class="n">bbuf</code> <code class="o">=</code> <code class="n">ByteBuffer</code><code class="o">.</code><code class="na">allocate</code><code class="o">(</code> <code class="o">...</code> <code class="o">);</code> <code class="n">bbuf</code><code class="o">.</code><code class="na">clear</code><code class="o">();</code> <code class="n">readOnlyFc</code><code class="o">.</code><code class="na">position</code><code class="o">(</code> <code class="n">index</code> <code class="o">);</code> <code class="n">readOnlyFc</code><code class="o">.</code><code class="na">read</code><code class="o">(</code> <code class="n">bbuf</code> <code class="o">);