Abstract
Java I/O is a fundamental part of the Java platform and the foundation for learning Java NIO, which in turn underpins modern high-performance network frameworks such as Netty. This article tries to understand Java I/O streams from a designer's point of view. It starts from the basic concepts of input and output streams and then builds up the whole of Java I/O step by step through object-oriented extension techniques such as inheritance and design patterns. It concludes with the extensions to Java I/O provided by third-party open source libraries. I hope that reading it gives you a general picture of the Java I/O architecture: you should be able to derive its design conceptually rather than memorize details, and understand not only what Java I/O is but why it is the way it is. The article does not go into the specifics of the code; if you are interested in those details, read the JDK source code.
1. Basic concepts
Any software framework or technology architecture, going back to the beginning, must have a few basic concepts underpinning its development. The most basic concepts of Java I/O are input streams and output streams, namely InputStream and OutputStream.
InputStream
The most basic byte input stream. It is an abstract class that defines all the basic methods for reading raw bytes.
1.1.1, public abstract int read() The basic method for reading a single byte
1.1.2, public int read(byte b[], int off, int len) Reads up to len bytes into a byte array, based on method 1.1.1
1.1.3, public int read(byte b[]) Reads bytes into a byte array, based on method 1.1.2
1.1.4, public long skip(long n) Skips over and discards up to n bytes
1.1.5, public int available() Returns an estimate of the number of bytes that can be read without blocking
1.1.6, public void mark(int readlimit), public void reset(), and public boolean markSupported() Not every subclass supports these methods, so they arguably belong in a separate interface
1.1.7, public void close() Closes the input stream
OutputStream
The most basic byte output stream. It is an abstract class that defines all the basic methods for writing raw bytes.
1.2.1, public abstract void write(int b) Writes a single byte
1.2.2, public void write(byte b[], int off, int len) Writes part of a byte array, based on method 1.2.1
1.2.3, public void write(byte b[]) Writes a whole byte array, based on method 1.2.2
1.2.4, public void close() Closes the output stream
1.2.5, public void flush() Flushes the output stream. Written bytes may be held in a buffer (in a buffering stream class or in the operating system) rather than delivered immediately, so call this method to force buffered bytes out. For a file, flush guarantees only that the bytes are handed to the operating system, not that they have reached the physical disk.
Summary
InputStream and OutputStream define the most basic behavior in the I/O world, which is reading and writing a byte, with template methods that extend the read and write behavior appropriately.
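The template-method idea can be illustrated with a minimal sketch. The class name CountingUpInputStream is hypothetical and exists only for this example: the subclass supplies nothing but the single-byte read(), and the inherited bulk read(byte[]) works on top of it.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stream that yields the bytes 0, 1, ..., limit-1 and then end of stream. */
public class CountingUpInputStream extends InputStream {
    private final int limit; // keep this at 256 or below so every value fits in one byte
    private int next = 0;

    public CountingUpInputStream(int limit) {
        this.limit = limit;
    }

    // The only method a subclass has to supply.
    @Override
    public int read() {
        return next < limit ? next++ : -1; // -1 signals end of stream
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new CountingUpInputStream(5)) {
            byte[] buf = new byte[8];
            // Inherited bulk read: InputStream implements it by calling read() repeatedly.
            int n = in.read(buf);
            System.out.println(n + " bytes read, last value = " + buf[n - 1]);
        }
    }
}
```

Real subclasses usually override the bulk read as well for performance, but they are not required to.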
2. Extension point 1: Inheritance of I/O streams
InputStream and OutputStream are abstract classes that define only the most basic methods in the I/O domain and contain no concrete implementation. There are three implementation pairs for different data sources: ByteArrayInputStream/ByteArrayOutputStream is based on memory, FileInputStream/FileOutputStream is based on disk files, and SocketInputStream/SocketOutputStream is based on the network.
FileInputStream/FileOutputStream
FileInputStream reads files from the operating system's file system. It uses a native method to read the underlying file:
private native int read0()
All other read methods are ultimately implemented based on this native method.
FileOutputStream uses native methods to write underlying files
private native void writeBytes(byte b[], int off, int len, boolean append)
All other write methods are implemented based on this native method.
ByteArrayInputStream/ByteArrayOutputStream
The source or destination of the reads and writes is an in-memory byte array. These classes are used less often.
SocketInputStream/SocketOutputStream
SocketInputStream uses the native method private native int socketRead0(FileDescriptor fd, byte b[], int off, int len, int timeout) to read the data stream from the remote peer. All of its read methods are based on this native method. SocketOutputStream uses the native method private native void socketWrite0(FileDescriptor fd, byte[] b, int off, int len) to write the data stream to the remote peer. All of its write methods are based on this method.
Summary
InputStream and OutputStream are abstractions of streams. Concrete streams for different data sources are implemented through inheritance. On the Java platform, the most basic of these are the file-system-based streams.
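Because every concrete stream is just an InputStream, consumer code does not need to know where the bytes come from. The following is a minimal sketch of that idea; the class name StreamSources and the helper countBytes are made up for illustration, and a temporary file stands in for a real data file.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamSources {
    /** Counts bytes until end of stream; works for any InputStream subclass. */
    public static long countBytes(InputStream in) throws IOException {
        long count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4};

        // Memory-backed source.
        try (InputStream in = new ByteArrayInputStream(data)) {
            System.out.println("memory: " + countBytes(in));
        }

        // File-backed source, using a temporary file so the sketch is self-contained.
        File tmp = File.createTempFile("stream-sources", ".bin");
        tmp.deleteOnExit();
        try (OutputStream out = new FileOutputStream(tmp)) {
            out.write(data);
        }
        try (InputStream in = new FileInputStream(tmp)) {
            System.out.println("file: " + countBytes(in));
        }
    }
}
```

A socket-backed stream obtained from Socket.getInputStream() could be passed to the same helper unchanged.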
3. Extension point 2: Extending the behavior of I/O streams
The decorator pattern extends the behavior of a class without changing its interface. Java implements the decorator pattern through FilterInputStream/FilterOutputStream.
The chain of responsibility pattern defines a unified interface and then has multiple classes that implement the interface cooperate in sequence to accomplish a complex function. Java achieves this by chaining multiple FilterInputStream/FilterOutputStream subclasses together.
FilterInputStream/FilterOutputStream
FilterInputStream does not implement any InputStream functionality itself. Instead, it receives another InputStream through its constructor and delegates every call to it. Subclasses of FilterInputStream can then extend the behavior of the wrapped stream, which is a typical use of the decorator pattern. The chain of responsibility is formed by stacking several decorators, each of which handles one aspect of processing the input stream. FilterOutputStream works in the same way as FilterInputStream.
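As an illustration of the decorator mechanism, here is a minimal sketch of a custom filter. UpperCaseInputStream is a hypothetical class written for this example, not part of the JDK; it changes what the caller sees while leaving the InputStream interface untouched.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical decorator that upper-cases ASCII letters as they are read. */
public class UpperCaseInputStream extends FilterInputStream {
    public UpperCaseInputStream(InputStream in) {
        super(in); // FilterInputStream keeps the wrapped stream in its protected field "in"
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        return (b >= 'a' && b <= 'z') ? b - ('a' - 'A') : b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        for (int i = off; i < off + n; i++) { // the loop body never runs when n == -1
            if (buf[i] >= 'a' && buf[i] <= 'z') {
                buf[i] -= ('a' - 'A');
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        InputStream source = new ByteArrayInputStream("hello, io".getBytes(StandardCharsets.US_ASCII));
        try (InputStream in = new UpperCaseInputStream(source)) {
            byte[] buf = new byte[16];
            int n = in.read(buf);
            System.out.println(new String(buf, 0, n, StandardCharsets.US_ASCII));
        }
    }
}
```

Since the decorator is itself an InputStream, it can be wrapped by further decorators, which is how the chain is built.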
BufferedInputStream/BufferedOutputStream
BufferedInputStream inherits FilterInputStream and adds buffering to input stream processing. Data from the underlying stream is read into a byte array in bulk. When the user reads from a BufferedInputStream, the bytes are served from this array first, and the underlying stream is read again only when the array is exhausted. This improves read performance. BufferedOutputStream inherits FilterOutputStream and adds buffering to output stream processing. When the user writes to a BufferedOutputStream, the bytes go into a byte array, and only when the array is full is the underlying output stream actually called. This improves write performance. When using BufferedOutputStream, it is important to call flush (or close, which flushes) at the end, because the underlying stream is not written to until the buffer is full, and the buffer is usually not full when the last piece of data is written.
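The need for flush can be observed directly. The sketch below (FlushDemo is a made-up name) writes three bytes through a BufferedOutputStream into a ByteArrayOutputStream and checks how many bytes the underlying stream has received before and after flush().

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class FlushDemo {
    /** Returns how many bytes reached the underlying stream before and after flush(). */
    public static int[] sizesBeforeAndAfterFlush() throws IOException {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        BufferedOutputStream out = new BufferedOutputStream(target); // default buffer: 8192 bytes
        out.write(new byte[] {1, 2, 3});
        int before = target.size(); // the 3 bytes are still sitting in the buffer
        out.flush();
        int after = target.size();  // now they have been passed on
        return new int[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesBeforeAndAfterFlush();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

In everyday code, try-with-resources covers this case, because closing a BufferedOutputStream flushes it first.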
PrintStream
The print and println methods of this class convert Java primitive types and objects to their String representation, encode that String to bytes, and write the bytes to the underlying output stream. However, unless a charset is specified in the constructor, PrintStream encodes with the platform default charset, and println uses the platform line separator. Different platforms can therefore write different bytes for the same call, so PrintStream is suitable for console or test output but not for general I/O, especially network streams.
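The effect of the encoding on the bytes written can be shown with a small sketch. PrintStreamEncoding and its encode helper are hypothetical names; the example passes the charset explicitly through the PrintStream(OutputStream, boolean, String) constructor instead of relying on the platform default.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class PrintStreamEncoding {
    /** Prints text through a PrintStream using the named charset and returns the bytes produced. */
    public static byte[] encode(String text, String charsetName) throws UnsupportedEncodingException {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(target, true, charsetName);
        out.print(text); // print, not println, so no platform line separator is added
        out.flush();
        return target.toByteArray();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String text = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        // The same character becomes a different number of bytes under different charsets.
        System.out.println("UTF-8 bytes: " + encode(text, "UTF-8").length);
        System.out.println("ISO-8859-1 bytes: " + encode(text, "ISO-8859-1").length);
    }
}
```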
DataInputStream/DataOutputStream
These two classes inherit FilterInputStream/FilterOutputStream and read and write Java primitive types in binary form. Their readUTF and writeUTF methods use a modified UTF-8 encoding that is specific to Java, so these classes are not recommended for network streams or cross-platform applications where the other end is not a Java program.
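A minimal round trip shows how the two classes are used together. DataStreamRoundTrip and its encode helper are illustrative names only; the data goes through an in-memory byte array rather than a file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class DataStreamRoundTrip {
    /** Writes an int, a long, and a String in DataOutputStream's binary format. */
    public static byte[] encode(int i, long l, String s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(i);  // 4 bytes
            out.writeLong(l); // 8 bytes
            out.writeUTF(s);  // 2-byte length prefix followed by modified UTF-8
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] encoded = encode(42, 7L, "abc");
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded))) {
            // Values must be read back in exactly the order they were written.
            int i = in.readInt();
            long l = in.readLong();
            String s = in.readUTF();
            System.out.println(i + " " + l + " " + s + " (" + encoded.length + " bytes)");
        }
    }
}
```

The format carries no type information, so reader and writer must agree on the field order in advance.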
PushbackInputStream
PushbackInputStream inherits FilterInputStream and provides an unread mechanism that pushes bytes back so they can be read again. It is implemented with a buffer array, so the number of bytes that can be pushed back is limited.
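A typical use of unread is peeking at the next byte without consuming it. The sketch below uses a hypothetical PeekDemo class; it relies on the single-argument PushbackInputStream constructor, whose pushback buffer holds one byte.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;

public class PeekDemo {
    /** Returns the next byte without consuming it, or -1 at end of stream. */
    public static int peek(PushbackInputStream in) throws IOException {
        int b = in.read();
        if (b != -1) {
            in.unread(b); // put it back so the next read() sees it again
        }
        return b;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {'#', 'x'};
        // The default pushback buffer holds a single byte, which is enough for peek().
        try (PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data))) {
            boolean isComment = peek(in) == '#';
            System.out.println("starts with '#': " + isComment);
            System.out.println("first byte is still readable: " + (char) in.read());
        }
    }
}
```

Pushing back more bytes than the buffer can hold raises an IOException, which is the bounded behavior described above.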
Summary
Java I/O designers extend the behavior of I/O through the decorator pattern and the chain of responsibility pattern. The building blocks for this extension are FilterOutputStream and FilterInputStream. Because of these two design patterns, we write Java I/O code that looks like this:
InputStream in = new GZIPInputStream(new BufferedInputStream(new FileInputStream("1.txt")));
The innermost FileInputStream is the real source of the data, while BufferedInputStream and GZIPInputStream both inherit FilterInputStream and change the behavior of I/O.
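The same layering can be tried end to end without a file on disk. This is a minimal sketch under the assumption that an in-memory byte array replaces 1.txt; the class GzipChain and its two helpers are invented for the example.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipChain {
    /** Compresses data through the chain GZIP -> buffer -> in-memory sink. */
    public static byte[] compress(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new GZIPOutputStream(new BufferedOutputStream(sink))) {
            out.write(data);
        } // close() finishes the gzip stream and flushes the buffer
        return sink.toByteArray();
    }

    /** Decompresses data through the chain in-memory source -> buffer -> GZIP. */
    public static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(
                new BufferedInputStream(new ByteArrayInputStream(compressed)))) {
            byte[] buf = new byte[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                result.write(buf, 0, n);
            }
        }
        return result.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "layer upon layer".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

Closing the outermost stream is enough, because each decorator closes the stream it wraps.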
4. Why Reader/Writer exist
InputStream and OutputStream are byte-oriented, whereas people think in characters, so the byte streams are not very convenient for programmers and character-oriented streams are needed. Because DataInputStream/DataOutputStream have cross-platform problems, the Java designers simply modeled a new, character-oriented set of I/O classes on InputStream and OutputStream: Reader/Writer.
Reader
Basic character input stream, an abstract class
4.1.1, public int read() Reads a single character, based on method 4.1.2
4.1.2, public abstract int read(char cbuf[], int off, int len) Reads up to len characters into a char array. Unlike InputStream, this bulk method is the abstract one that subclasses must implement
4.1.3, public int read(char cbuf[]) Reads characters into a char array, based on method 4.1.2
4.1.4, public long skip(long n) Skips up to n characters
4.1.5, Reader has no available() method; ready() (4.1.8) is its closest counterpart
4.1.6, public void mark(int readAheadLimit), public void reset(), and public boolean markSupported() Not every subclass supports these methods, so they arguably belong in a separate interface
4.1.7, public abstract void close() Closes the input stream
4.1.8, public boolean ready() Reports whether the next read is guaranteed not to block
Writer
Basic character output stream, an abstract class
4.2.1, public abstract void write(char cbuf[], int off, int len) Writes part of a char array. This is the basic method that subclasses must implement
4.2.2, public void write(char cbuf[]) Writes a whole char array, based on 4.2.1
4.2.3, public void write(int c) Writes the low 16 bits of an int as a single character, based on 4.2.1
4.2.4, public void write(String str) Writes a String, based on 4.2.1
4.2.5, public void write(String str, int off, int len) Writes part of a String, based on 4.2.1
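The way the other overloads build on write(char[], int, int) can be seen in a minimal Writer subclass. CollectingWriter is a hypothetical class for illustration (the JDK already ships StringWriter for real use); besides the core write method it only has to supply flush() and close(), which Writer also declares abstract.

```java
import java.io.IOException;
import java.io.Writer;

/** Hypothetical Writer that collects everything written into a StringBuilder. */
public class CollectingWriter extends Writer {
    private final StringBuilder collected = new StringBuilder();

    // The abstract core method; the inherited write overloads are built on it.
    @Override
    public void write(char[] cbuf, int off, int len) {
        collected.append(cbuf, off, len);
    }

    @Override
    public void flush() {
        // nothing is buffered, so there is nothing to flush
    }

    @Override
    public void close() {
        // no underlying resource to release
    }

    public String contents() {
        return collected.toString();
    }

    public static void main(String[] args) throws IOException {
        CollectingWriter w = new CollectingWriter();
        w.write("Hello");          // inherited write(String)
        w.write(", world!", 0, 7); // inherited write(String, int, int): writes ", world"
        w.write('!');              // inherited write(int)
        System.out.println(w.contents());
    }
}
```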
Conversion between characters and bytes
Computers ultimately deal only in bytes, so the data behind a Reader/Writer is always bytes, yet Reader/Writer themselves cannot handle bytes directly. An intermediary is needed to bridge Reader/Writer with InputStream/OutputStream. That is the role of InputStreamReader and OutputStreamWriter, which decode bytes into characters and encode characters into bytes using a charset.
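The bridge can be sketched in a few lines. CharsetBridge and its two helpers are made-up names; the example passes the charset explicitly, which avoids depending on the platform default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetBridge {
    /** Characters in, bytes out: OutputStreamWriter encodes with the given charset. */
    public static byte[] toBytes(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } // close() flushes the encoder's internal buffer
        return bytes.toByteArray();
    }

    /** Bytes in, characters out: InputStreamReader decodes with the given charset. */
    public static String toText(byte[] data, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            char[] buf = new char[256];
            int n;
            while ((n = reader.read(buf)) != -1) {
                text.append(buf, 0, n);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String original = "caf\u00e9";
        byte[] utf8 = toBytes(original, StandardCharsets.UTF_8);
        System.out.println(utf8.length + " bytes, decoded back to: " + toText(utf8, StandardCharsets.UTF_8));
    }
}
```

Decoding with a different charset than the one used for encoding is the usual source of garbled text.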
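5. Reader/Writer inheritance
As with the byte streams, different data sources are supported through inheritance. File-based readers and writers inherit InputStreamReader/OutputStreamWriter, while in-memory ones inherit Reader/Writer directly.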
FileReader/FileWriter
FileReader/FileWriter inherit InputStreamReader/OutputStreamWriter and pass a FileInputStream/FileOutputStream to the superclass as the underlying byte stream.
CharArrayReader/CharArrayWriter
CharArrayReader/CharArrayWriter inherit Reader/Writer directly and use a char array as the data source. They are used less often.
6. Extensions to Reader/Writer behavior
As with the byte streams, the decorator pattern and the chain of responsibility pattern are used.
FilterReader/FilterWriter
FilterReader/FilterWriter act as proxies for another Reader/Writer: the wrapped Reader/Writer performs the real operations.
BufferedReader/BufferedWriter
BufferedReader/BufferedWriter follow the same decorator style, although in the JDK they extend Reader/Writer directly rather than FilterReader/FilterWriter. BufferedReader uses a char array as a buffer. Reads are served from the buffer first, and the underlying Reader is read only when the buffer is empty. That Reader is usually backed by an InputStream; avoid wrapping that InputStream in a BufferedInputStream, because two layers of buffering are unnecessary. BufferedWriter writes to its buffer first and writes to the underlying Writer when the buffer is full. That Writer is usually backed by an OutputStream; likewise, avoid using a BufferedOutputStream there, because two layers of buffering are unnecessary.
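A common use of this layering is reading text line by line. In the sketch below, LineReading and readLines are hypothetical names; the raw InputStream is handed straight to InputStreamReader without a BufferedInputStream in between, following the single-buffer advice above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LineReading {
    /** Reads all lines of UTF-8 text from the stream, with one buffering layer only. */
    public static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) { // null signals end of stream
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] text = "first\nsecond\r\nthird".getBytes(StandardCharsets.UTF_8);
        for (String line : readLines(new ByteArrayInputStream(text))) {
            System.out.println(line);
        }
    }
}
```

readLine strips the line terminator and accepts \n, \r, and \r\n alike.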
PushbackReader
PushbackReader inherits FilterReader and implements unread (pushing characters back to be read again). It uses a char array internally, so the number of characters that can be pushed back is bounded.
PrintWriter
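PrintWriter is used instead of PrintStream. It converts Java primitives and objects to character output. Because it writes characters to a Writer, the encoding is determined by the underlying Writer (for example an OutputStreamWriter with an explicit charset), so different character sets are handled correctly.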
Now that we’ve looked at all the relevant classes in the java.io package, let’s take a look at what extensions have been made to Java IO by third-party open source frameworks.
7. Open source library extensions to Java IO
Java I/O extends data sources through inheritance and extends behavior through the decorator pattern. Commons IO, introduced below, extends behavior further and simplifies I/O operations with utility classes and methods. Okio takes a different approach: it scraps the Java I/O architecture and designs its own Source/Sink interfaces.
commons-io
Extending behavior
Commons IO 2.5, the latest version at the time of writing, provides various input and output extensions, many of them implemented by inheriting FilterInputStream/FilterOutputStream.
Input:
AutoCloseInputStream: Closes itself automatically when the end of the stream (EOF) is reached
BOMInputStream: Handles input streams that start with a BOM (byte order mark), such as files saved by Notepad on Windows
BoundedInputStream: An input stream with a size limit; reading stops once the limit is reached
CountingInputStream: An input stream that counts the bytes read
DemuxInputStream: Stores the real stream in a ThreadLocal and delegates reads to it
ProxyInputStream: An abstract class that provides callback methods before and after reads
TaggedInputStream: Tags the exceptions it throws so that they can be traced back to this stream
TeeInputStream: Reads data from a source stream and copies it to a specified output stream, similar to the Unix tee command
UnixLineEndingInputStream: Converts line endings to the Unix format (LF) as the stream is read
WindowsLineEndingInputStream: Converts line endings to the Windows format (CRLF) as the stream is read
Output:
ChunkedOutputStream: Writes data to the underlying stream in chunks
CountingOutputStream: An output stream that counts the bytes written
DemuxOutputStream: Stores the real stream in a ThreadLocal and delegates writes to it
ProxyOutputStream: An abstract class that provides callback methods before and after writes
TaggedOutputStream: Tags the exceptions it throws so that they can be traced back to this stream
TeeOutputStream: Writes data to one output stream and also copies it to a second one, similar to the Unix tee command
Utility methods
The IOUtils utility class provides the following methods:
closeQuietly – Closes a stream and ignores any exception
toXxx/read – Read data from a stream
write – Writes data to a stream
copy – Copies from one stream to another
contentEquals – Compares the contents of two streams
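To make it concrete what such utilities save the caller from writing, here is a minimal plain-JDK sketch of a copy helper and a quiet close. It is not the Commons IO implementation; StreamUtilSketch, its method bodies, and the 4096-byte buffer size are assumptions chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Illustrative stand-in for utility methods of this kind; not the Commons IO source. */
public class StreamUtilSketch {
    /** Copies everything from in to out and returns the number of bytes copied. */
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096]; // buffer size is an arbitrary choice for this sketch
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    /** Closes the resource and deliberately ignores any IOException. */
    public static void closeQuietly(Closeable resource) {
        if (resource == null) {
            return;
        }
        try {
            resource.close();
        } catch (IOException ignored) {
            // swallowed on purpose: the caller has nothing useful to do with it
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[10000]);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            System.out.println("copied " + copy(in, out) + " bytes");
        } finally {
            closeQuietly(in);
            closeQuietly(out);
        }
    }
}
```

On Java 7 and later, try-with-resources usually replaces the quiet-close idiom.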
okio
To read and write basic types with native Java I/O, we need four classes: DataInputStream/DataOutputStream and BufferedReader/BufferedWriter. On top of that, we need to understand InputStreamReader/OutputStreamWriter and how the Java I/O decorator chain fits together. For the average Java developer, this is too complicated. Okio therefore redesigned the interfaces as Source/Sink, providing typed access to basic types and buffering while hiding the complex underlying I/O system. Developers only need to pass in an InputStream or OutputStream.
The specific class relationship is as follows:
The Java code for using Okio is as follows:
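import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;
import okio.BufferedSink;
import okio.BufferedSource;
import okio.Okio;

// try-with-resources closes both ends; closing the sink also flushes its buffer to 2.txt
try (BufferedSource bufferedSource = Okio.buffer(Okio.source(new FileInputStream("1.txt")));
     BufferedSink bufferedSink = Okio.buffer(Okio.sink(new FileOutputStream("2.txt")))) {
    int i = bufferedSource.readInt();
    long l = bufferedSource.readLong();
    // reads all remaining bytes of the source as a UTF-8 string
    String s = bufferedSource.readString(StandardCharsets.UTF_8);
    bufferedSink.writeInt(1);
    bufferedSink.writeLong(2L);
    bufferedSink.writeString("123", StandardCharsets.UTF_8);
} catch (Exception e) {
    // process exception
}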
8. Summary
Finally, let me roughly reconstruct the thinking of the Java designers. The abstract InputStream/OutputStream classes first fixed the two basic concepts of I/O, input streams and output streams, with their read and write behavior, and used template methods to enrich the two most commonly used methods, read and write. Inheritance was then the natural way to implement these two abstract classes. The most obvious input and output streams in the computer world are local files and remote network connections, and the designers also provided a memory-based stream. The next step was to extend the read/write behavior of the streams. With the decorator pattern and the chain of responsibility pattern, the designers could extend behavior without changing the semantics of the interface, which produced the layer-upon-layer wrapping style that is characteristic of Java I/O code. Once inheritance and design patterns had extended I/O, the designers realized that the interface was computer-oriented, that is, byte-oriented, and too hard to use for developers, who needed a character-oriented interface. They first tried to patch the design with DataInputStream/DataOutputStream, but that fell short on platform compatibility. So they went back to the drawing board and designed Reader/Writer with a similar interface hierarchy, plus InputStreamReader/OutputStreamWriter as converters between byte streams and character streams, which finally matched the way developers work.