Why avoid ioutil.ReadAll in Go?

ioutil.ReadAll reads all data from an io.Reader until it reaches the end (EOF).

A GitHub search for ioutil.ReadAll with type Code and language Go yielded 637,307 results.

This shows that ioutil.ReadAll is popular, mainly because it’s really easy to use.

However, when encountering large files, this function exposes two obvious drawbacks:

  1. Performance issues. The larger the file, the worse the performance.
  2. If the file is too large, it may exhaust the available memory and crash the program.

Why is that? This article digs into the source code to find the reasons and then suggests a better alternative.

So let’s get started.

ioutil.ReadAll

First, let’s look at a typical use of ioutil.ReadAll. For example, sending a GET request with the net/http package and then reading the response:

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	res, err := http.Get("http://www.google.com/robots.txt")
	if err != nil {
		log.Fatal(err)
	}

	// Read the entire response body into memory
	robots, err := io.ReadAll(res.Body)
	res.Body.Close()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s", robots)
}

The data returned by http.Get() is available through res.Body and is read in full by io.ReadAll, which is the function ioutil.ReadAll calls as of Go 1.16.

On the surface this code looks fine, but a closer look reveals a problem. To find the reason, we need to read the source code.

Here is the source of ioutil.ReadAll:

// src/io/ioutil/ioutil.go

func ReadAll(r io.Reader) ([]byte, error) {
	return io.ReadAll(r)
}

As of Go 1.16, ioutil.ReadAll simply calls io.ReadAll, so let’s look at io.ReadAll:

// src/io/io.go

func ReadAll(r Reader) ([]byte, error) {
	// Create a buf with a capacity of 512 bytes
	b := make([]byte, 0, 512)
	for {
		if len(b) == cap(b) {
			// If buf is full, append one element to force a reallocation
			b = append(b, 0)[:len(b)]
		}
		// Read content into the unused part of buf
		n, err := r.Read(b[len(b):cap(b)])
		b = b[:len(b)+n]
		// Return at the end of the input or on an error
		if err != nil {
			if err == EOF {
				err = nil
			}
			return b, err
		}
	}
}

I have added comments to the code. It executes in three main steps:

  1. Create a 512-byte buf;
  2. Keep reading content into buf; when buf is full, append one element to force a reallocation with a larger capacity;
  3. Return when the end of the input is reached or an error occurs.
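
The three steps can be made visible with a small instrumented copy of the loop. The sketch below is illustrative only: readAllCounting and measure are hypothetical helpers, not standard-library functions, and the input is synthetic data built with strings.Repeat.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readAllCounting mirrors the ReadAll loop and additionally counts how many
// times the buffer had to be reallocated. Hypothetical helper for illustration.
func readAllCounting(r io.Reader) ([]byte, int, error) {
	grows := 0
	b := make([]byte, 0, 512)
	for {
		if len(b) == cap(b) {
			// Buffer is full: append forces a larger backing array and a copy.
			b = append(b, 0)[:len(b)]
			grows++
		}
		n, err := r.Read(b[len(b):cap(b)])
		b = b[:len(b)+n]
		if err != nil {
			if err == io.EOF {
				err = nil
			}
			return b, grows, err
		}
	}
}

// measure reads size bytes of synthetic data and returns the number of bytes
// read and the number of reallocations that were needed.
func measure(size int) (int, int, error) {
	src := strings.NewReader(strings.Repeat("x", size))
	data, grows, err := readAllCounting(src)
	return len(data), grows, err
}

func main() {
	for _, size := range []int{100, 4 << 10, 1 << 20} {
		n, grows, err := measure(size)
		if err != nil {
			fmt.Println("read error:", err)
			return
		}
		fmt.Printf("read %d bytes with %d reallocations\n", n, grows)
	}
}
```

The 100-byte input fits in the initial buffer and needs no reallocation; the number reported for the larger inputs depends on the Go version.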

We now know the execution steps, but to analyze the performance problem we also need to understand how Go grows a slice. In Go releases before 1.18 the strategy is as follows (newer releases switch from doubling to smaller steps earlier and more gradually, but the idea is the same):

  1. If the desired capacity is more than twice the current capacity, the desired capacity is used;
  2. Otherwise, if the current slice length is less than 1024, the capacity is doubled;
  3. Otherwise, the capacity is increased by 25% at a time until it is at least the desired capacity.
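
To watch the growth strategy in action, one can append to a slice one byte at a time and record every capacity it passes through. This is a minimal sketch: capacitySteps is a hypothetical helper, and the exact capacities depend on the Go version and the allocator's size classes, so no particular output is assumed here.

```go
package main

import "fmt"

// capacitySteps appends n bytes one at a time to a slice that starts with a
// capacity of 512 and records each distinct capacity along the way.
// Hypothetical helper for illustration only.
func capacitySteps(n int) []int {
	b := make([]byte, 0, 512)
	steps := []int{cap(b)}
	for i := 0; i < n; i++ {
		b = append(b, 0)
		if c := cap(b); c != steps[len(steps)-1] {
			// append had to allocate a larger backing array.
			steps = append(steps, c)
		}
	}
	return steps
}

func main() {
	steps := capacitySteps(1 << 20)
	fmt.Printf("%d reallocations to hold 1 MiB\n", len(steps)-1)
	fmt.Println(steps)
}
```

Each entry after the first corresponds to one reallocation, and each reallocation copies everything read so far.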

In other words, if the data is smaller than 512 bytes, it fits in the initial buffer and performance is not affected. Once it exceeds 512 bytes, the slice starts to grow, and every growth step allocates a new backing array and copies the existing data into it. The larger the data, the more often this happens and the greater the impact on performance.

If the amount of data is large enough, the program may simply run out of memory, which can have serious consequences.

Is there a better alternative? Of course there is. Let’s move on.

io.Copy

The io.Copy function can be used instead. Its source is as follows:

// src/io/io.go

func Copy(dst Writer, src Reader) (written int64, err error) {
	return copyBuffer(dst, src, nil)
}

It reads data from src and writes it directly to dst.

The main difference from ioutil.ReadAll is that it does not load all the data into memory at once; it reads a chunk, writes it, and repeats.
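
As a rough sketch of what this looks like in practice, the response-reading step from the earlier example could stream the body to its destination rather than collecting it first. The names drain, run and sample are made up for this illustration, and a strings.Reader wrapped with io.NopCloser stands in for res.Body so that the program runs without network access.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// sample stands in for a real response so the sketch runs offline.
const sample = "User-agent: *\nDisallow: /search\n"

// drain streams body to dst and closes body when done. Unlike ReadAll, drain
// itself never collects the whole payload in a single slice.
func drain(dst io.Writer, body io.ReadCloser) (int64, error) {
	defer body.Close()
	return io.Copy(dst, body)
}

// run wires the pieces together. In a real program, body would be the
// res.Body returned by http.Get.
func run() (int64, error) {
	body := io.NopCloser(strings.NewReader(sample))
	return drain(os.Stdout, body)
}

func main() {
	n, err := run()
	if err != nil {
		fmt.Fprintln(os.Stderr, "copy failed:", err)
		os.Exit(1)
	}
	fmt.Fprintf(os.Stderr, "copied %d bytes\n", n)
}
```

The destination can be any io.Writer, for example an *os.File, so a large download can go straight to disk.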

The copy logic is implemented in the copyBuffer function:

// src/io/io.go

func copyBuffer(dst Writer, src Reader, buf []byte) (written int64, err error) {
	// If the source implements the WriteTo method, call WriteTo directly
	if wt, ok := src.(WriterTo); ok {
		return wt.WriteTo(dst)
	}
	// Likewise, if the destination implements the ReadFrom method, call ReadFrom directly
	if rt, ok := dst.(ReaderFrom); ok {
		return rt.ReadFrom(src)
	}
	// If buf is nil, create a 32 KB buf
	if buf == nil {
		size := 32 * 1024
		if l, ok := src.(*LimitedReader); ok && int64(size) > l.N {
			if l.N < 1 {
				size = 1
			} else {
				size = int(l.N)
			}
		}
		buf = make([]byte, size)
	}
	// Loop to read data and write
	for {
		nr, er := src.Read(buf)
		if nr > 0 {
			nw, ew := dst.Write(buf[0:nr])
			if nw < 0 || nr < nw {
				nw = 0
				if ew == nil {
					ew = errInvalidWrite
				}
			}
			written += int64(nw)
			if ew != nil {
				err = ew
				break
			}
			if nr != nw {
				err = ErrShortWrite
				break
			}
		}
		if er != nil {
			if er != EOF {
				err = er
			}
			break
		}
	}
	return written, err
}

This function performs the following steps:

  1. If the source implements the WriteTo method, call WriteTo directly;
  2. Likewise, if the destination implements the ReadFrom method, call ReadFrom directly;
  3. If buf is nil, create a 32 KB buf;
  4. Finally, loop: Read from the source and Write to the destination.
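
The difference between the WriteTo shortcut and the buffer loop can be observed by counting Write calls on the destination. In this sketch, countingWriter and writesFor are hypothetical names introduced for illustration. A *strings.Reader implements WriteTo, and wrapping it in an anonymous struct that only embeds io.Reader hides that method, which pushes io.Copy onto the 32 KB loop.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// countingWriter counts Write calls. It deliberately has no ReadFrom method,
// so io.Copy cannot take the ReaderFrom shortcut on the destination side.
type countingWriter struct{ calls int }

func (w *countingWriter) Write(p []byte) (int, error) {
	w.calls++
	return len(p), nil
}

// writesFor copies size bytes from a strings.Reader and reports how many
// Write calls io.Copy issued. When hideWriterTo is true, the reader is
// wrapped so that only its Read method is visible to io.Copy.
func writesFor(size int, hideWriterTo bool) (int, error) {
	var src io.Reader = strings.NewReader(strings.Repeat("x", size))
	if hideWriterTo {
		src = struct{ io.Reader }{src}
	}
	w := &countingWriter{}
	_, err := io.Copy(w, src)
	return w.calls, err
}

func main() {
	const size = 100 * 1024
	direct, err := writesFor(size, false)
	if err != nil {
		fmt.Println(err)
		return
	}
	looped, err := writesFor(size, true)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("WriteTo path, Write calls:", direct)
	fmt.Println("buffer loop, Write calls:", looped)
}
```

With the copyBuffer source shown above, the first case should issue a single Write for the whole payload, and the second should issue four, because 100 KB is delivered in 32 KB chunks.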

By comparison, io.Copy does not read all the data at once and does not repeatedly grow a slice, so it is the better choice for large amounts of data.

Other ioutil functions

Let’s look at the other functions in the ioutil package:

  • func ReadDir(dirname string) ([]os.FileInfo, error)
  • func ReadFile(filename string) ([]byte, error)
  • func WriteFile(filename string, data []byte, perm os.FileMode) error
  • func TempFile(dir, prefix string) (f *os.File, err error)
  • func TempDir(dir, prefix string) (name string, err error)
  • func NopCloser(r io.Reader) io.ReadCloser

The examples below illustrate each one:

ReadDir

// ReadDir reads the entries (files and directories) of the specified directory, without descending into subdirectories.
// It returns the file information sorted by file name, along with any error encountered.
func ReadDir(dirname string) ([]os.FileInfo, error)

For example:

package main

import (
	"fmt"
	"io/ioutil"
)

func main() {
	dirName := "../"
	fileInfos, err := ioutil.ReadDir(dirName)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(len(fileInfos))
	for i := 0; i < len(fileInfos); i++ {
		// Print the concrete type, then the name and whether it is a directory
		fmt.Printf("%T\n", fileInfos[i])
		fmt.Println(i, fileInfos[i].Name(), fileInfos[i].IsDir())
	}
}

ReadFile

// ReadFile reads all data in the file and returns the read data and errors encountered
// If the read succeeds, err returns nil instead of EOF
func ReadFile(filename string) ([]byte, error)

For example:

package main

import (
	"fmt"
	"io/ioutil"
	"os"
)

func main() {
	data, err := ioutil.ReadFile("./test.txt")
	if err != nil {
		fmt.Println("read error:", err)
		os.Exit(1)
	}
	fmt.Println(string(data))
}

WriteFile

// WriteFile writes data to a file, truncating the file first if it already exists.
// If the file does not exist, it will be created with the specified permissions.
// Return the error encountered.
func WriteFile(filename string, data []byte, perm os.FileMode) error

For example:

package main

import (
	"fmt"
	"io/ioutil"
)

func main() {
	fileName := "./text.txt"
	s := "Hello AlwaysBeta"
	err := ioutil.WriteFile(fileName, []byte(s), 0777)
	fmt.Println(err)
}

TempFile

// TempFile creates a temporary file whose name starts with prefix in the directory dir
// and opens it in read-write mode. It returns the created file object and any error encountered.
// If dir is empty, the file is created in the default temporary directory (see os.TempDir).
// Multiple calls create different temporary files. The caller can get the full path of the file through f.Name().
// Temporary files created by this function should be deleted by the caller.
func TempFile(dir, prefix string) (f *os.File, err error)

For example:

package main

import (
	"fmt"
	"io/ioutil"
	"os"
)

func main() {
	f, err := ioutil.TempFile("./", "Test")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.Remove(f.Name()) // Delete the file when done
	defer f.Close()           // Runs before the Remove above
	fmt.Printf("%s\n", f.Name())
}

TempDir

// TempDir has the same function as TempFile, except that it creates a directory and returns the full path of the directory.
func TempDir(dir, prefix string) (name string, err error)

For example:

package main

import (
	"fmt"
	"io/ioutil"
	"os"
)

func main() {
	dir, err := ioutil.TempDir("./", "Test")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.Remove(dir) // Delete the directory when done
	fmt.Printf("%s\n", dir)
}

NopCloser

// NopCloser wraps r as a ReadCloser type, but the Close method does nothing.
func NopCloser(r io.Reader) io.ReadCloser

The usage scenario for this function is as follows:

Sometimes we need to pass an io.ReadCloser, but all we have is an io.Reader, for example a *strings.Reader.

This is where NopCloser comes in handy. It wraps an io.Reader and returns an io.ReadCloser whose Close method does nothing but return nil.

For example:

package main

import (
	"fmt"
	"io/ioutil"
	"reflect"
	"strings"
)

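func main() {
	// strings.NewReader returns a *strings.Reader
	reader := strings.NewReader("Hello AlwaysBeta")
	r := ioutil.NopCloser(reader)
	defer r.Close()

	fmt.Println(reflect.TypeOf(reader))
	// Read through the wrapper, which now satisfies io.ReadCloser
	data, err := ioutil.ReadAll(r)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(data))
}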
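
Since Go 1.16, the helpers shown above are also available in the os and io packages as os.ReadDir, os.ReadFile, os.WriteFile, os.CreateTemp, os.MkdirTemp and io.NopCloser. The sketch below exercises a few of them inside a throwaway directory. roundTrip is a hypothetical helper written for this illustration, and note that os.ReadDir returns []os.DirEntry rather than []os.FileInfo.

```go
package main

import (
	"fmt"
	"os"
)

// roundTrip writes content to a temporary file, reads it back, and returns
// the data together with the number of entries in the temporary directory.
// Hypothetical helper for illustration only.
func roundTrip(content string) (string, int, error) {
	dir, err := os.MkdirTemp("", "Test") // counterpart of ioutil.TempDir
	if err != nil {
		return "", 0, err
	}
	defer os.RemoveAll(dir) // Remove the directory and everything in it

	f, err := os.CreateTemp(dir, "Test") // counterpart of ioutil.TempFile
	if err != nil {
		return "", 0, err
	}
	name := f.Name()
	if err := f.Close(); err != nil {
		return "", 0, err
	}

	// Counterpart of ioutil.WriteFile
	if err := os.WriteFile(name, []byte(content), 0o644); err != nil {
		return "", 0, err
	}
	data, err := os.ReadFile(name) // counterpart of ioutil.ReadFile
	if err != nil {
		return "", 0, err
	}
	entries, err := os.ReadDir(dir) // counterpart of ioutil.ReadDir
	if err != nil {
		return "", 0, err
	}
	return string(data), len(entries), nil
}

func main() {
	data, n, err := roundTrip("Hello AlwaysBeta")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("read back %q, directory entries: %d\n", data, n)
}
```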

Conclusion

The ioutil package provides several useful utility functions, and the logic behind their implementation is not complicated.

This article starts from a problem and focuses on the ioutil.ReadAll function. With small amounts of data the function is fine, but with large amounts of data it becomes a ticking time bomb: performance may suffer, and the program may even crash.

It then presents a solution: for large amounts of data, it is best to use the io.Copy function.

The article concludes with an introduction to several other ioutil functions, with examples. The related code will be uploaded to GitHub for anyone who wants to download it.

Well, that’s all for this article. Follow me, and we will keep reading the Go source code with questions in mind.


Source code address:

  • Github.com/yongxinz/go…
