io

Description and positioning of the package

The package provides basic interfaces to I/O primitives. Because the implementations wrap lower-level operations, they should not be assumed to be safe for parallel use unless otherwise stated.

Relationships between the interfaces and structures

The first part is the core of the io package: four interfaces, Reader/Writer/Closer/Seeker, corresponding to the I/O operations read, write, close, and seek.

The extension of the first part combines the four core interfaces: ReadWriter/ReadCloser/WriteCloser/ReadWriteCloser/ReadSeeker/WriteSeeker/ReadWriteSeeker. Composition is done by embedding interfaces within interfaces.

So far, that makes 11 general-purpose interfaces.

The second part builds on those 11 interfaces: functional structures and functions that extend or wrap the core interfaces of the first part.

Details (including highlights)

The four core interfaces are analyzed below, and the functional part is briefly reviewed.

Reader interface

It contains only one method, Read, and the interface is named Reader, which perfectly follows Go’s naming convention for single-method interfaces.

The Reader interface encapsulates the basic I/O read operation.

type Reader interface {
  Read(p []byte) (n int, err error)
}

Read method description:

  • Read reads data into slice p, at most len(p) bytes
    • The return values are the number of bytes actually read and any error encountered
    • If fewer than len(p) bytes are available, Read returns what is available instead of waiting
  • The first return value n reports the actual number of bytes read, whether an error occurred or the end of the stream was reached
    • The second return value err takes more care
      • A Read that returns data may come back with err == nil, with a non-nil error, or with EOF
      • Once the stream is exhausted, a subsequent Read returns a fixed pair: 0, EOF
  • The correct way for the caller to handle the return values:
    • Process the n bytes first (check n > 0), then check err
  • For implementers
    • Do not return 0, nil unless len(p) is 0
    • Because 0, nil means nothing happened, not end of input
    • End of input must be reported as EOF; it cannot be replaced by 0, nil

As a bonus: the Read method receives a slice, which is a reference-like header, so the callee can fill it in place. There is a premise for stateful readers: it is the pointer method set of the implementing type that satisfies Reader, because a value receiver could not record reading progress. Take LimitedReader’s implementation of Reader:

func (l *LimitedReader) Read(p []byte) (n int, err error) {
  if l.N <= 0 {
    return 0, EOF
  }
  if int64(len(p)) > l.N {
    p = p[0:l.N]
  }
  n, err = l.R.Read(p)
  l.N -= int64(n)
  return
}

Writer interface

The Writer interface encapsulates the basic I/O write operation.

type Writer interface {
  Write(p []byte) (n int, err error)
}

Analysis:

  • Write writes len(p) bytes from slice p to the underlying stream
    • The return value n is the number of bytes actually written; err is non-nil if n is less than len(p)
  • The method must not modify the data in slice p, not even temporarily
  • For implementers
    • Make sure slice p is never modified

In addition, a value method set can satisfy Writer when the writer holds no mutable state. If a pointer method set is used to satisfy Writer, the same rule applies: the slice data must not be modified.
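A minimal sketch of a conforming Writer: the hypothetical countWriter below discards the data and only counts bytes, never touching p. A pointer receiver is used here because the count mutates:

```go
package main

import (
	"fmt"
	"io"
)

// countWriter is a hypothetical Writer: it discards the data and only
// counts bytes. It never modifies p, as the Writer contract requires.
type countWriter struct {
	total int64
}

func (c *countWriter) Write(p []byte) (int, error) {
	c.total += int64(len(p))
	return len(p), nil // report the full length, so err stays nil
}

var _ io.Writer = (*countWriter)(nil) // compile-time interface check

func main() {
	w := &countWriter{}
	fmt.Fprintf(w, "%d bytes", 42) // Fprintf accepts any io.Writer
	fmt.Println(w.total)
}
```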

Closer interface

Like the read and write operations, Closer encapsulates the close operation.

type Closer interface {
  Close() error
}
  • Calling Close again after the first Close has undefined behavior
  • Implementation-specific behavior, if any, should be documented
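Because a second Close is undefined, a common defensive pattern is to make Close idempotent. The onceCloser below is a hypothetical sketch using sync.Once, not a type from package io:

```go
package main

import (
	"fmt"
	"sync"
)

// onceCloser is a hypothetical Closer whose Close is safe to call more
// than once: sync.Once guarantees the release logic runs a single time.
type onceCloser struct {
	once   sync.Once
	closed bool
}

func (c *onceCloser) Close() error {
	c.once.Do(func() {
		c.closed = true // release the underlying resource exactly once
	})
	return nil
}

func main() {
	c := &onceCloser{}
	fmt.Println(c.Close(), c.Close()) // the second call is a harmless no-op
}
```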

Seeker interface

Encapsulates the I/O seek (offset) operation.

type Seeker interface {
  Seek(offset int64, whence int) (int64, error)
}
  • The offset applies to the next read or write
  • The parameters offset and whence together determine the new position
    • whence has three defined values (SeekStart/SeekCurrent/SeekEnd)
    • The first return value is the new offset relative to the start of the file
  • Seeking to a position before the start is illegal, and err is non-nil
    • Seeking to any non-negative position is legal, even one past the end
  • For implementers
    • An offset far past the end is legal but not necessarily meaningful
    • What subsequent I/O does at such a position is implementation-defined

io.LimitReader analysis

LimitReader is the functional constructor and LimitedReader the functional structure behind it. This is an extension from the second part of the io package.

func LimitReader(r Reader, n int64) Reader { return &LimitedReader{r, n} }

type LimitedReader struct {
  R Reader // underlying reader
  N int64  // max bytes remaining
}

func (l *LimitedReader) Read(p []byte) (n int, err error) {
  if l.N <= 0 {
    return 0, EOF
  }
  if int64(len(p)) > l.N {
    p = p[0:l.N]
  }
  n, err = l.R.Read(p)
  l.N -= int64(n)
  return
}

LimitedReader is a type that implements the io.Reader interface while limiting the number of bytes read. Once the limit is exhausted, (0, EOF) is returned.

Also note that Read uses a pointer receiver here, because each call must decrement the remaining count N; a value receiver could not record that progress.

io.SectionReader analysis

A functional extension from the second part of the io package.

func NewSectionReader(r ReaderAt, off int64, n int64) *SectionReader {
  return &SectionReader{r, off, off, off + n}
}

type SectionReader struct {
  r     ReaderAt
  base  int64
  off   int64
  limit int64
}

For functional structures, there is a big difference between SectionReader and LimitReader:

  • The LimitReader constructor is not named NewXxx
    • The author’s intent is to construct a value that satisfies Reader
    • NewXxx should construct a new concrete type, which is why NewSectionReader returns *SectionReader rather than a Reader
  • The LimitedReader structure’s fields are exported
    • The intent is that after construction, a caller can use a type assertion and access the fields directly
    • SectionReader’s fields are unexported, meaning all access goes through methods

Now let’s focus on SectionReader. The first argument to the constructor is of type ReaderAt, the interface type.

type ReaderAt interface {
  ReadAt(p []byte, off int64) (n int, err error)
}

The ReadAt method differs from Read in its extra off parameter: the read of up to len(p) bytes starts off bytes into the source. Looking at the ReaderAt interface contract:

  • The interface reads from offset off of the input source
  • If fewer than len(p) bytes are read, a non-nil error explains why
    • Its error reporting is stricter than Reader’s about why fewer bytes were returned
  • If ReadAt cannot supply len(p) bytes, it either blocks until they arrive or returns an error
    • This is completely different from Reader, which does not block
  • Even when ReadAt reads exactly len(p) bytes, err may be EOF or nil
  • ReadAt neither observes nor modifies the underlying seek offset, wherever it is (start/current position/end)
  • Parallel calls on the same source are allowed
  • Implementations must not retain a reference to p

From this point of view, ReadAt complements Read. What they share: the return value n is handled first, then err.

Look back at the SectionReader constructor. Because the structure’s fields are unexported, other packages can only construct it correctly through NewSectionReader. As the documentation shows, SectionReader extends Reader in two ways: it adds a starting offset, and it adds a maximum number of readable bytes.

Let’s look at the implementation of Reader:

func (s *SectionReader) Read(p []byte) (n int, err error) {
  if s.off >= s.limit {
    return 0, EOF
  }
  if max := s.limit - s.off; int64(len(p)) > max {
    p = p[0:max]
  }
  n, err = s.r.ReadAt(p, s.off)
  s.off += int64(n)
  return
}

If the byte budget is exhausted, (0, EOF) is returned; otherwise the read is delegated to the constructor’s first argument, the ReaderAt interface value. Note that although ReadAt itself is safe for parallel use, SectionReader.Read updates s.off without synchronization, so concurrent Read calls on one SectionReader still need external coordination.

SectionReader implements Seeker as well as Reader:

func (s *SectionReader) Seek(offset int64, whence int) (int64, error) {
  switch whence {
  default:
    return 0, errWhence
  case SeekStart:
    offset += s.base
  case SeekCurrent:
    offset += s.off
  case SeekEnd:
    offset += s.limit
  }
  if offset < s.base {
    return 0, errOffset
  }
  s.off = offset
  return offset - s.base, nil
}

This implementation is very interesting. Let’s establish two things before discussing it.

ReadAt never touches the offset state, while Read and Seek both update it. The offset state is called a factor rather than a variable here because it is normally maintained by the underlying I/O itself. That is why SectionReader carries three fields (base/off/limit): they simulate the underlying I/O’s offset state.

The first thing to look at is the three fields of the structure: base marks the section start, limit marks the section end, and off is the current offset (the cursor). Their initial values are set in the constructor, and both Read() and Seek() update off.

The second is the Seeker interface. Seeker computes the new offset from its two parameters, and one rule governs the result: the offset cannot move before the start. SectionReader’s implementation therefore rejects any new offset before base.

With these two things in mind, SectionReader’s implementation of Seeker is easy to follow. There is not much to the flow, just two details:

  • The first return value is the offset relative to the section start, which SectionReader handles correctly by subtracting base
  • A switch’s default case can come first, which reads well when it handles the error path

Let’s look at the implementation of ReaderAt:

func (s *SectionReader) ReadAt(p []byte, off int64) (n int, err error) {
  if off < 0 || off >= s.limit-s.base {
    return 0, EOF
  }
  off += s.base
  if max := s.limit - off; int64(len(p)) > max {
    p = p[0:max]
    n, err = s.r.ReadAt(p, off)
    if err == nil {
      err = EOF
    }
    return n, err
  }
  return s.r.ReadAt(p, off)
}

The ReaderAt interface neither observes nor modifies the offset state; it is bounded only by the start and end of the I/O source. SectionReader uses base and limit to simulate that start and end.

SectionReader.ReadAt() must either supply len(p) bytes or return an error. It achieves this by performing a normal read and then, when the request was truncated at the section end, overriding a nil err with EOF.

With its simulated offsets, SectionReader implements the ReaderAt contract cleanly.

io.TeeReader analysis

This is the third functional structure.

As the name suggests, tee splits a stream, like the Unix tee command: whatever is read is also written to a second destination.

func TeeReader(r Reader, w Writer) Reader { return &teeReader{r, w} }

type teeReader struct {
  r Reader
  w Writer
}

func (t *teeReader) Read(p []byte) (n int, err error) {
  n, err = t.r.Read(p)
  if n > 0 {
    if n, err := t.w.Write(p[:n]); err != nil {
      return n, err
    }
  }
  return
}

Let’s talk about the documentation first, and then analyze the details.

teeReader implements the Reader interface. The processing logic: everything read from the Reader is also written to the Writer, and a write failure is returned from Read as the error. From the source, it is purely “read data, then write it to the Writer”.

Detail analysis:

  • The constructor is not NewXxx, because it returns a Reader rather than introducing a new exported type
  • Meanwhile, the structure teeReader is unexported
    • This means the author does not want callers to use type assertions and reach into fields
    • Completely hiding teeReader’s internals is encapsulation
  • When multiple operations may each produce an error, one of them is chosen to report

LimitedReader, SectionReader, and teeReader share the same structural pattern, including how they hold the io interfaces: as named fields rather than embedded.

We can also generalize constructor naming:

  • If it constructs and returns a new concrete type, use NewXxx
  • If it returns an interface type, use a name that describes the behavior instead

io.CopyN

This is a functional function. We will peel it back layer by layer and summarize at the end.

func CopyN(dst Writer, src Reader, n int64) (written int64, err error) {
  written, err = Copy(dst, LimitReader(src, n))
  if written == n {
    return n, nil
  }
  if written < n && err == nil {
    // src stopped early; must have been EOF.
    err = EOF
  }
  return
}

Documented behavior:

  • Copy n bytes from src to dst
    • The first return value is the number of bytes actually copied
    • The second return value err refers specifically to the first error encountered
    • err is nil only when the number of bytes copied equals the parameter n
  • If dst (the Writer) implements ReaderFrom, the internal implementation uses ReaderFrom

Fortunately, the functional structures were analyzed first, so again we can make some guesses. The three functional structures: LimitReader adds a cap on bytes read, SectionReader adds a window (a starting offset plus a cap), and teeReader adds a junction between a Reader and a Writer. CopyN needs both a Reader-to-Writer junction and a byte cap, so it is only natural that LimitReader shows up inside CopyN.

func Copy(dst Writer, src Reader) (written int64, err error) {
  return copyBuffer(dst, src, nil)
}

Copy is the entry point for copying; the unexported copyBuffer function it calls does the concrete work.

func CopyBuffer(dst Writer, src Reader, buf []byte) (written int64, err error) {
  if buf != nil && len(buf) == 0 {
    panic("empty buffer in io.CopyBuffer")
  }
  return copyBuffer(dst, src, buf)
}

The exported CopyBuffer above is likewise built on copyBuffer, so the working function is reused.

func copyBuffer(dst Writer, src Reader, buf []byte) (written int64, err error) {
  if wt, ok := src.(WriterTo); ok {
    return wt.WriteTo(dst)
  }
  if rt, ok := dst.(ReaderFrom); ok {
    return rt.ReadFrom(src)
  }
  if buf == nil {
    size := 32 * 1024
    if l, ok := src.(*LimitedReader); ok && int64(size) > l.N {
      if l.N < 1 {
        size = 1
      } else {
        size = int(l.N)
      }
    }
    buf = make([]byte, size)
  }
  for {
    nr, er := src.Read(buf)
    if nr > 0 {
      nw, ew := dst.Write(buf[0:nr])
      if nw > 0 {
        written += int64(nw)
      }
      if ew != nil {
        err = ew
        break
      }
      if nr != nw {
        err = ErrShortWrite
        break
      }
    }
    if er != nil {
      if er != EOF {
        err = er
      }
      break
    }
  }
  return written, err
}

copyBuffer is the final working function.

  • First check whether the Writer/Reader supports the extended interfaces
    • The check is a type assertion
    • If an extension exists, delegate to it directly
      • In that case the third parameter buf is not used at all
  • If it is a regular Writer/Reader, fall through to the normal flow
    • As you can see, the third parameter buf is a temporary buffer
    • If none is supplied, the default size is 32 KB (shrunk to the remaining limit for a *LimitedReader)
    • Then a for loop reads and writes until the source is drained or an error occurs

This is how the copy family of functions exploits the ReaderFrom/WriterTo interfaces.

io.ReadAtLeast analysis

Functional functions, similar to the Copy series.

func ReadAtLeast(r Reader, buf []byte, min int) (n int, err error) {
  if len(buf) < min {
    return 0, ErrShortBuffer
  }
  for n < min && err == nil {
    var nn int
    nn, err = r.Read(buf[n:])
    n += nn
  }
  if n >= min {
    err = nil
  } else if n > 0 && err == EOF {
    err = ErrUnexpectedEOF
  }
  return
}

Documentation:

  • Read at least min bytes from the Reader into buf
  • The exposed behavior is a single read into buf; if the buffer is shorter than min, the special error ErrShortBuffer is reported
  • If EOF is hit after some bytes but before min have been read, the special error ErrUnexpectedEOF is reported
  • The two possible outcomes:
    • n greater than or equal to min: err is nil
    • n less than min: err is non-nil

In general, n greater than or equal to min is the normal outcome; anything else reports an error, with the error type distinguishing the cause.

The intent of this function is to read at least a given number of bytes; failing that, it returns an error saying why.

func ReadFull(r Reader, buf []byte) (n int, err error) {
  return ReadAtLeast(r, buf, len(buf))
}

This function reads exactly as many bytes as the slice holds. The intent is to fill a slice of the chosen size.

io.WriteString analysis

The function is simple:

func WriteString(w Writer, s string) (n int, err error) {
  if sw, ok := w.(StringWriter); ok {
    return sw.WriteString(s)
  }
  return w.Write([]byte(s))
}

The intent is to write a string to a Writer, while giving implementations a chance to override the write logic. The override is done through an interface. This is a typical application of the dependency inversion principle (DIP), and also an embodiment of the open-closed principle (OCP).

Similar extension-point interfaces include ReaderAt/WriterAt, ByteReader/ByteScanner, ByteWriter, and RuneReader/RuneScanner; StringWriter counts as one too.

What makes the design brilliant

There are two kinds of external exposure: interfaces, and functional structures or functions.

The whole design has the following hierarchy:

  • The most basic interfaces
    • Single-purpose, consistent with SRP (single responsibility principle)
    • Reader/Writer are the most widely used
    • Each has only one method, which maximizes extensibility
  • Interfaces composed from the basic ones
    • Combined method sets of 2-3 methods
    • Their usage scenarios are narrower than the basic interfaces’
    • A combined interface larger than what callers need would violate LSP (Liskov substitution principle)
  • Functional structures built on a basic interface
    • They qualify the basic interface with some input constraints
    • E.g. a maximum number of readable bytes, a Writer that mirrors what a Reader reads, etc.
    • This is the best embodiment of LSP
    • Multiple constraints can be specified at the same time
    • Composing different constraints also follows LKP (least knowledge principle)
    • The functional Copy/ReadFull functions take the same approach
  • Interfaces open to external extension
    • The io package also provides a set of interfaces for other packages to implement
    • These interfaces can be combined or embedded, which conforms to ISP (interface segregation principle)
    • Some functions are implemented internally in terms of these interfaces
    • This is classic dependency inversion (DIP), fully in line with OCP (open-closed principle)

The intent of the entire io package is to cover reading and writing across the widest range of scenarios, and then to serve narrower scenarios with functional structures or functions built by composing or constraining those interfaces. The other approach is to provide extension-point interfaces and functions in a dependency-inversion style.

SRP/OCP/LSP/LKP/ISP/DIP can all be found in this one package.

Two small details:

If the structure implements only one interface, the constructor should return the interface type, in line with LSP (Liskov substitution principle), and its name should not be NewXxx(). If the structure implements more than one interface, the constructor should return the structure type and be named NewXxx().

Core interfaces generally have few methods, so they should carry extensive documentation describing their behavioral contract.

Possible application scenarios

LimitReader suits scenarios that cap the number of bytes read; SectionReader suits scenarios that both cap the byte count and restrict the region being read.