preface

This paper does not elaborate the causes of Okio, nor does it compare the advantages and disadvantages of Okio and Java native IO. Instead, it simply analyzes the implementation of Okio, thoroughly analyzes each key point, and provides exquisite illustrations.

This article analyzes the point directory

  • Okio class framework diagram
  • The Source and Sink
  • Buffer
  • Segment
  • To analyze the write method of Buffer, see the data flow process

Okio class framework diagram

Okio’s entire framework has a small amount of code, which reflects a highly cohesive design. The class framework is roughly as follows:

The diagram shows the relationships between classes within the framework. The framework does something like this:

The Source and Sink

Source and Sink are the base class interfaces of all input and output streams in Okio, similar to InputStream and OutputStream in Java IO.

I’m the Source of all the data you need.

I read the data from the Source and send it to you through me.

The read and write interfaces of data streams are defined respectively:

public interface Source extends Closeable {
  /**
   * Removes at least 1, and up to {@code byteCount} bytes from this and appends
   * them to {@code sink}. Returns the number of bytes read, or -1 if this
   * source is exhausted.
   */
  long read(Buffer sink, long byteCount) throws IOException;

  /** Returns the timeout for this source. */
  Timeout timeout(a);

  /** * Closes this source and releases the resources held by this source. It is an * error to read a closed source. It is  safe to close a source more than once. */
  @Override void close(a) throws IOException;
}
Copy the code
public interface Sink extends Closeable.Flushable {
  /** Removes {@code byteCount} bytes from {@code source} and appends them to this. */
  void write(Buffer source, long byteCount) throws IOException;

  /** Pushes all buffered bytes to their final destination. */
  @Override void flush(a) throws IOException;

  /** Returns the timeout for this sink. */
  Timeout timeout(a);

  /** * Pushes all buffered bytes to their final destination and releases the * resources held by this sink. It is an error to write a closed sink. It is * safe to close a sink more than once. */
  @Override void close(a) throws IOException;
}
Copy the code

Source, Sink also has a couple of implementation classes in Okio, so let’s just look at RealBufferedSource, RealBufferedSink.

RealBufferedSource

As the name suggests, the buffered data source, but it’s not the real data source, the real data source is the member variable source, which is just wrapped up, and when a method starting with read is called, it reads from the member variable source, Once read, the data is stored in the member variable buffer and then read from buffer.

Look at the read method:

–> RealBufferedSource.java

final class RealBufferedSource implements BufferedSource {
  // Buffer. The data is stored in the buffer first, and read from the buffer when others read it. If there is no data, read from the source
  public final Buffer buffer = new Buffer();
  // The real data source, which is also the upstream data source of this RealBufferedSource, reads from the source when there is no data in the buffer
  public final Source source;
  booleanclosed; .@Override public long read(Buffer sink, long byteCount) throws IOException {
    // Read byteCount from the current Source and save it to sink
    if (sink == null) throw new IllegalArgumentException("sink == null");
    if (byteCount < 0) throw new IllegalArgumentException("byteCount < 0: " + byteCount);
    if (closed) throw new IllegalStateException("closed");

    // If there is no data in the current buffer, read from the member variable source and save it to buffer
    // If there is no data in the current source, return -1
    // If there is data in the current buffer, skip this and read directly from the buffer
    if (buffer.size == 0) {
      long read = source.read(buffer, Segment.SIZE);
      if (read == -1) return -1;
    }

	// There must be some data in the buffer, choose the smaller, because the amount of data in the buffer may be less than byteCount, whichever is smaller
    long toRead = Math.min(byteCount, buffer.size);
    // Read data from buffer to sink
    returnbuffer.read(sink, toRead); }}Copy the code

RealBufferedSink is the buffered output stream from the name, but it’s not the real output stream, the real output stream is the member variable sink, it’s just wrapped up, when the method that starts with write is called, In both cases, data is first written to the member variable buffer, and then written out through the member variable sink.

Look at the write method:

final class RealBufferedSink implements BufferedSink {
  // Buffer: When you need to output data through me, the data will be stored in buffer first and then output by sink
  public final Buffer buffer = new Buffer();
  // The real output stream is also downstream of this RealBufferedSink. When data output is needed, it is output by sink
  public final Sink sink;
  booleanclosed; .@Override public void write(Buffer source, long byteCount)
      throws IOException {
    if (closed) throw new IllegalStateException("closed");
    // Write data from source to buffer
    buffer.write(source, byteCount);
    // Write the data in the buffer by sink. How to write the data internally will be analyzed in the following section of buffer
    emitCompleteSegments();
}
Copy the code

The entire input stream to the output stream can be represented as follows:

Buffer

Buffer. Buffer is the core class in Okio, through which data is transferred.

So why do we need Buffer? We can read InputStream without Buffer in Java IO.

Let’s take a popular example to answer this question. If we have an apple tree in the orchard, we should pick one when we want to eat it. When we want to eat it again, we should pick another tree. I had to run into the garden every time. So why don’t we pick ten or eight of them, put them in baskets and take them home, and when we want to eat them, we just take them from the baskets, so we don’t have to go all the way up the tree.

Buffer plays the role of the upper basket, so the existence of Buffer is very critical to save time and effort.

Take a look at Okio’s Buffer:

–> Buffer.java

public final class Buffer implements BufferedSource.BufferedSink.Cloneable.ByteChannel {
  // Buffer implements BufferedSource, BufferedSink, i.e
  // As a buffer for Source or Sink,
  private static final byte[] DIGITS =
      { '0'.'1'.'2'.'3'.'4'.'5'.'6'.'7'.'8'.'9'.'a'.'b'.'c'.'d'.'e'.'f' };
  static final int REPLACEMENT_CHARACTER = '\ufffd';

  // Key point, store data link header
  @Nullable Segment head;
  // The number of bytes in the current buffer
  longsize; .public Buffer(a) {}}Copy the code

A Buffer is used to Buffer data. To Buffer data, you need a container. Is the Segment. The Buffer mainly relies on segments to store data and forms a circular linked list structure, which can be shown in the following figure:

Let’s take a look at the Segment. Note that Buffer is not finished. After analyzing the Segment, we will analyze specific methods to see how Buffer and Segment work together.

Segment

Class source code:

–> Segment.java

final class Segment {
  // The maximum capacity of each Segment is 8192 bytes
  static final int SIZE = 8192;

  // In Segment segmentation scenarios, if the number of bytes to be split is greater than 1024,
  // Share the data directly instead of copying the array
  static final int SHARE_MINIMUM = 1024;

  // Where the actual data is stored
  final byte[] data;

  // The first readable position in data, such as the current data 1024 bytes, pos = 0, after one byte was read
  // pos = 1
  int pos;

  // The first writable position in data. For example, if there are 1024 bytes in data, the first writable position is 1025
  int limit;

  // Whether this Segment shares data with another Segment or ByteString
  boolean shared;

  // If this Segment has data and can be appended to it, extend limit
  boolean owner;

  // loop the next node in the list
  Segment next;

  // Loop the previous node of the list
  Segment prev;

  Segment() {
    this.data = new byte[SIZE];
    this.owner = true;
    this.shared = false;
  }

  Segment(byte[] data, int pos, int limit, boolean shared, boolean owner) {
    this.data = data;
    this.pos = pos;
    this.limit = limit;
    this.shared = shared;
    this.owner = owner; }... }Copy the code

The Segment is a bidirectional list. The Segment is a bidirectional list. The Segment is a bidirectional list.

The methods for manipulating data and linked lists are defined within the Segment.

  • push
  • pop
  • split
  • compact
  • writeTo

push

// Add a node to the current list after the Segment being called
public final Segment push(Segment segment) {
    segment.prev = this;
    segment.next = next;
    next.prev = segment;
    next = segment;
    return segment;
}
Copy the code

pop

// Remove the node from the current list
public final @Nullable Segment pop(a) { Segment result = next ! =this ? next : null;
    prev.next = next;
    next.prev = prev;
    next = null;
    prev = null;
    return result;
}
Copy the code

The above two methods are simple linked list add and delete operations, do not need comments, just need to pay attention to the following points:

Push:

1. Prev and next of the pushed nodes are settled first

2. Next of the previous node adjacent to the new node is settled

3. Finally, the preV of the next node adjacent to the new node is settled

Pop:

1. Next, which settled its former neighbors first

2. Resettle the neighbor’s prev

3. Finally settle your own Prev and next

In short, in a two-way linked list, the operation of adding and deleting is only concerned with three nodes: oneself, the former neighbor, and the latter neighbor. Oneself solve prev next, the former neighbor solve next, and the latter neighbor solve prev

split


public final Segment split(int byteCount) {
    if (byteCount <= 0 || byteCount > limit - pos) throw new IllegalArgumentException();
    Segment prefix;

    // We have two competing performance goals:
    // We have two competing performance goals
    // - Avoid copying data. We accomplish this by sharing segments.
    // - Avoid copying data. We do this by sharing segments
    // - Avoid short shared segments. These are bad for performance because they are readonly and
    // may lead to long chains of short segments.
    // - Avoid short shared segments, which can lead to poor performance, because segments can only be read but not written, and
    // May cause short segments in the list to become very long
    // To balance these goals we only share segments when the copy will be large.
    // To balance these goals, we only share segments when there is a large amount of data to copy
    if (byteCount >= SHARE_MINIMUM) {
      // If the data to be partitioned is greater than SHARE_MINIMUM (1024), new segments will be created using shared segments.
      // Create a new Segment with the same data
      prefix = sharedCopy();
    } else {
      // If the data to be split is smaller than SHARE_MINIMUM (1024), copy it
      prefix = SegmentPool.take();
      System.arraycopy(data, pos, prefix.data, 0, byteCount);
    }
    
    // At this point, we have created the segmented Segment, but we should always note that the segmented Segment is
    // Data in Segment, data is cut off, so we need to adjust pos and limit in both segments

    // Adjust the limit of newly created Segment to pos + byteCount
    prefix.limit = prefix.pos + byteCount;
    // Change the Segment pos to pos + byteCount
    pos += byteCount;
    // Add the prev to the end of the Segment.
    // You might ask, this method can be called by any Segment object, this prev calls this method
    // The Segment is preceded by the tail node. There's only one place inside the Okio framework
    Write (Buffer source, long byteCount);
    // head prev is tail
    // This method is public and can be accessed externally, but look at Segment
    // It is not public, so it can only be used inside Okio.
    prev.push(prefix);
    return prefix;
}
Copy the code

The process of the split method can be illustrated as follows:

After the split process is completed, the linked list changes in the Buffer are shown as follows:

So what does this split method do? As mentioned in the above comment, the split method only calls write(Buffer source, long byteCount) in the Buffer in Okio. The last comment in this method reads:

/** * Occasionally we write only part of a source buffer to a sink buffer. For * example, given a sink [51%, 91%], we may want to write the first 30% of * a source [92%, 82%] to it. To simplify, we first transform the source to * an equivalent buffer [30%, 62%, 82%] and then move the head segment, * yielding sink [51%, 91%, 30%] and source [62%, 82%]. */
Copy the code

It means: In some scenarios, we only need to write part of data from Source to Sink. For example, we now have a Sink, and the proportion of unread data is [51%, 91%]. We might want to write 30% of the data from a [92%, 82%] Source to Sink. For simplicity, we first convert the Source to an equivalent Buffer [30%, 62%, 82%] and then move the Segment to Sink. Then the Source is left with [62%, 82%].

It is easy to understand that it is nothing more than to split the two segments of Source into three and then move one of them to Sink, so as to avoid copying a relatively large amount of data, but only move the pointer. In split, when the data to be written is less than 1024, the copying operation will be performed. Share data directly, so this is a performance boost.

writeTo

// Write the current Segment to sink
public final void writeTo(Segment sink, int byteCount) {
    // If sink cannot be written, throw an exception
    if(! sink.owner)throw new IllegalArgumentException();
    // If sink's writable space is insufficient
    if (sink.limit + byteCount > SIZE) {
      // We can't fit byteCount bytes at the sink's current position. Shift sink first.
      // If sink shares data with another Segment, throw an exception directly
      // Why not use the read space? Since the data is shared, using read space will affect another shared Segment
      if (sink.shared) throw new IllegalArgumentException();
      // If the sink space is not enough, throw an exception
      if (sink.limit + byteCount - sink.pos > SIZE) throw new IllegalArgumentException();
      System.arraycopy(sink.data, sink.pos, sink.data, 0, sink.limit - sink.pos);
      sink.limit -= sink.pos;
      sink.pos = 0; } System.arraycopy(data, pos, sink.data, sink.limit, byteCount); sink.limit += byteCount; pos += byteCount; }}Copy the code

The writeTo process can be represented as follows:

compact

// Compress data from tail to prev
public final void compact(a) {
    // If the former node is itself, then there is only one node in the list
    if (prev == this) throw new IllegalStateException();
    // If the front node is not exclusive, that is, not writable, return
    if(! prev.owner)return; // Cannot compact: prev isn't writable.
    // The size of unread data on the current node
    int byteCount = limit - pos;
    // The maximum writable space of the former node, remember our writeTo method above, Max writable space = writable space + read space
    int availableByteCount = SIZE - prev.limit + (prev.shared ? 0 : prev.pos);
    // If the maximum writable space of the front node does not contain the data to be written, return
    if (byteCount > availableByteCount) return; // Cannot compact: not enough writable space.
    // writeTo (writeTo, writeTo, writeTo
    writeTo(prev, byteCount);
    // The current node is disconnected from the list
    pop();
    // Don't throw it away, recycle it, use it again
    SegmentPool.recycle(this);
}
Copy the code

In some cases, the Segment utilization in the linked list is not very high (there is still a lot of space to write). Can we compress the data from the next node into this node? In order to improve the utilization, at the same time can recycle the latter node, reduce the length of the linked list, kill two birds with one stone. Note that this method is called by the tail node.

The Compact process can be represented as follows:Segment provides these atomic methods; let others call them.

Owner and share description

Owner indicates whether the Segment is writable or not. If the Segment is false, it has a veto right and cannot be written; if it is true, it can be written, but share is required to participate in judgment under certain conditions. For details, refer to writeTo ()

Share indicates that this Segment shares data with another Segment, and this field is set to true when the split () method is called to participate in some scenarios

Therefore, when writing data from one Segment to another, first judge the owner field of the Segment to be written. If false, the owner field cannot be written directly. If true, judge the share field.

The write and read methods of Buffer

In the previous section of Buffer, we will now analyze the process of reading and writing from one cache to another using the write and read methods of Buffer.

write

// Move byteCount bytes from source to current Buffer (remove many comments)
@Override public void write(Buffer source, long byteCount) {
    // Move bytes from the head of the source buffer to the tail of this buffer
    // Move data from the head of the source to the tail of the current Buffer

    if (source == null) throw new IllegalArgumentException("source == null");
    if (source == this) throw new IllegalArgumentException("source == this");
    // Check the number of offsets and moves to prevent transgressions
    checkOffsetAndCount(source.size, 0, byteCount);
    // Why a while loop, because byteCount can be large and Segment moves data is limited
    while (byteCount > 0) {
      // Is a prefix of the source's head segment all that we need to move?
      // If the unread data in the source header is larger than byteCount
      if (byteCount < (source.head.limit - source.head.pos)) {
        // Since all writes are written to the tail of the current Buffer, find the tail node firstSegment tail = head ! =null ? head.prev : null;
        if(tail ! =null && tail.owner
            && (byteCount + tail.limit - (tail.shared ? 0 : tail.pos) <= Segment.SIZE)) {
          // Our existing segments are sufficient. Move bytes from source's head to our tail.
          // The tail node of the current Buffer has enough writable space. Write data directly to tail
          source.head.writeTo(tail, (int) byteCount);
          source.size -= byteCount;
          size += byteCount;
          return;
        } else {
          // We're going to need another segment. Split the source's head
          // segment in two, then move the first of those two to this buffer.
          // Split the source head in two because the tail of the current Buffer does not hold byteCount bytes
          // If tail is null or does not satisfy any of the above conditions, then tail will walk
          // Else does not affect splitting the head, because the outer if constraintsbytecount to be transferred is smaller than the unread data in the head
          source.head = source.head.split((int) byteCount); }}// Remove the source's head segment and append it to our tail.
      // If the source. Head is split, if the source is not split, nothing is done
      Segment segmentToMove = source.head;
      long movedByteCount = segmentToMove.limit - segmentToMove.pos;
      // Remove segmentToMove from source as it will be added to the current Buffer
      source.head = segmentToMove.pop();
      if (head == null) {
        head = segmentToMove;
        head.next = head.prev = head;
      } else {
        Segment tail = head.prev;
        // Add to the current Buffer
        tail = tail.push(segmentToMove);
        // Compress the segment as fully loaded as possible
        tail.compact();
      }
      
      // Set the source and current Buffer variables, source data is less, this side is more
      source.size -= movedByteCount;
      size += movedByteCount;
      byteCount -= movedByteCount;
      
      // If byteCount data is not moved in one run, the next loop is performed}}Copy the code

read

public long read(Buffer sink, long byteCount) {
    if (sink == null) throw new IllegalArgumentException("sink == null");
    if (byteCount < 0) throw new IllegalArgumentException("byteCount < 0: " + byteCount);
    if (size == 0) return -1L;
    if (byteCount > size) byteCount = size;
    // Write the current Buffer to sink. This seems to be our first reaction when we see this method
    // A little different, we might think: since read, we want to read data from outside, but in this case, we want to read out
    // We need to understand the concepts of source and sink
    sink.write(this, byteCount);
    return byteCount;
  }

Copy the code