This is the 19th day of my participation in the More text Challenge. For details, see more text Challenge

Chan implementation principle sharing in GO

Hi, I’m Nezha, remember when we shared the use of the GO channel and sync pack? Let’s review

  • What are the shared channels? What kind of channels
  • No buffering, buffering, one-way channel what exactly does it correspond to
  • Concrete practices for channels
  • Shared the exception case collation about the channel
  • The use of the Sync package is simply shared

If you’re still interested, check out the Posts on the GO channel and the Sync package

What is Chan?

Is a special type of pipe that connects concurrent Goroutines

Channel A channel is a communication mechanism that allows one Goroutine coroutine to send a specific value to another.

A channel is like a conveyor belt or queue, always following a First In First Out rule to ensure that data is sent and received In the same order as a pipe

One coroutine puts data from one end of the channel, and the other reads data from the other end

Each channel is a specific type of conduit, and when declaring a channel, you need to specify an element type for it.

The main purpose of this article is to share how channels are implemented. For details on how to use channels, check out the GO channel and Sync package in this article

Chan’s underlying data structure in GO

To understand the implementation of each component or each data type, we will look at the source code of the data structure is designed

Similarly, let’s take a look at GO’s Chan data structure

The Chan source implementation of GO is at: SRC /runtime/chan.go

type hchan struct {
   qcount   uint           // total data in the queue
   dataqsiz uint           // size of the circular queue
   buf      unsafe.Pointer // points to an array of dataqsiz elements
   elemsize uint16
   closed   uint32
   elemtype *_type // element type
   sendx    uint   // send index
   recvx    uint   // receive index
   recvq    waitq  // list of recv waiters
   sendq    waitq  // list of send waiters

   // lock protects all fields in hchan, as well as several
   // fields in sudogs blocked on this channel.
   //
   // Do not change another G's status while holding this lock
   // (in particular, do not ready a G), as this can deadlock
   // with stack shrinking.
   lock mutex
}
Copy the code

Hchan is the core data structure of the implementation channel, the corresponding members are also a lot, let’s take a look at a parameter according to the source code annotation

tag instructions
qcount The number of remaining elements in the current queue
dataqsiz The number of elements that a ring queue can hold, which is the length of the ring queue
buf Pointer to the ring queue
elemsize Refers to the size of each element in the queue of
closed Indicates the disabled status
elemtype See name for meaning, element type
sendx The subscript of the send queue, where data is stored in the queue when it is written to the queue
recvx Take the subscript of the queue and start reading from this position in the queue
recvq Coroutine queue, queue of coroutines waiting to read a message
sendq Coroutine queue, a queue of coroutines waiting to send a message
lock Mutexes, in chan, cannot concurrently read and write data

According to the above parameters, we can more or less know what knowledge points are designed for the channel implementation principle in GO:

  • Pointer to the
  • The circular queue
  • coroutines
  • The mutex

Let’s also take a look at the data structure corresponding to the coroutine queue waitQ of the above member

type waitq struct {
   first *sudog
   last  *sudog
}
Copy the code

The sudog structure is in SRC/Runtime /runtime2.go, so let’s learn more about it

// sudog represents a g in a wait list, such as for sending/receiving
// on a channel.
type sudog struct {
   // The following fields are protected by the hchan.lock of the
   // channel this sudog is blocking on. shrinkstack depends on
   // this for sudogs involved in channel ops.

   g *g

   next *sudog
   prev *sudog
   elem unsafe.Pointer // data element (may point to stack)

   // The following fields are never accessed concurrently.
   // For channels, waitlink is only accessed by g.
   // For semaphores, all fields (including the ones above)
   // are only accessed when holding a semaRoot lock.

   acquiretime int64
   releasetime int64
   ticket      uint32

   // isSelect indicates g is participating in a select, so
   // g.selectDone must be CAS'd to win the wake-up race.
   isSelect bool

   // success indicates whether communication over channel c
   // succeeded. It is true if the goroutine was awoken because a
   // value was delivered over channel c, and false if awoken
   // because c was closed.
   success bool

   parent   *sudog // semaRoot binary tree
   waitlink *sudog // g.waiting list or semaRoot
   waittail *sudog // semaRoot
   c        *hchan // channel
}
Copy the code

Based on the source code comments, we have a rough idea of what sudog does

Sudog represents a G in a waiting list, such as send/receive on a channel

Sudog is necessary because the g to synchronization object relationship is many-to-many

A G may be in many waiting queues, so a g may have many sudogs

And many Gs may be waiting for the same synchronization object, so one object may have manysudogs

Let’s grasp the principal contradiction

Sudog’s data structure, the main thing is a G and an ELEm,

G, it says here that he corresponds to Sudog

Elem is required for both read and write channels

  • Read the passage

The data is copied from hchan’s queue to SuDog’s ELEM

  • Write channels

Similar to the read channel, data is copied from suDog’s ELEM to hchan’s queue

Let’s draw a picture

Here let’s draw a structure of hchan, mainly draw recvQ waiting to read the message of the coroutine queue, here the queue, is actually using a linked list to achieve

Recvq corresponds to the WaitQ structure, which is divided into the first head node and the last tail node structure, which is sudog

The elem in sudog holds the specific data, and the next pointer points to the next sudog until it reaches the last sudog

Through the above, you should understand the basic structure of Chan in GO

Let’s take a closer look at what the other parameters in Hchan mean

  • dataqsizWhat the corresponding circular queue looks like
  • writesendqAnd readrecvqWhat the waiting queue looks like
  • elemtypeWhat is the element type information

dataqsizWhat the corresponding circular queue looks like

A ring queue, so the name is a ring queue connected from end to end

A ring queue inside a chan in GO that serves primarily as a buffer

The length of this ring queue is specified when we create the queue, that is, when we create the hchan structure

Dataqsiz, the length of the ring queue

Let’s draw a picture to wake us up

The above queue is circular. By default, the queue is connected first to last:

  • Dataqsiz indicates that the length of the circular queue is 8
  • Qcount indicates that there are five elements in the current queue
  • Buf is a pointer to the circular queue header
  • Sendx is the index of the send queue, where 1 points to the second region of the queue. This parameter is optional in the range of [0, 8].
  • Recvx is the index of the receive queue. If recvx is 4, it points to the fifth region of the queue for reading

By the way, in hchan, to read or write data, you need to get the lock mutex. On the same channel, only one coroutine can read or write data at a time

writesendqAnd readrecvqWhat the waiting queue looks like

There are two coroutine queues in hchan structure. One is used for reading data, and the other is used for sending data. They are both waiting queues

When reading or sending data from a channel:

  • Coroutines are blocked if the channel buffer is empty or if there is no buffer and data is read from the channel
  • If the channel buffer is full, or if there is no buffer, and data is written from the channel, the coroutine is still blocked

The blocked coroutines are then placed on the waiting queue and classified as write sendQ and read RECvQ according to the read and write actions

So when are these blocked coroutines going to wake up?

You should know this from the previous post about the GO channel and the Sync package

Let’s review the table from this article for some of the exceptions that a channel can have:

The channel state Uninitialized channel (nil) The channel is not empty The channel is empty The channel is filled with Channel under
Receive data blocking Receive data blocking Receive data Receive data
To send data blocking To send data To send data blocking To send data
Shut down panic Channel closed successfully

After the data is read

Returns a zero value
Channel closed successfully

Return zero directly
Channel closed successfully

After the data is read

Returns a zero value
Channel closed successfully

After the data is read

Returns a zero value

At this point, we know, exactly when is the blocked coroutine going to wake up

  • Because a read-blocking coroutine is awakened by a writing coroutine in the channel, and vice versa

  • Because coroutines that block by writing are also awakened by coroutines that read data in the channel

elemtypeWhat is the element type information

Elemtype = elemType; elemType = elemType; elemType = elemType; elemType = elemType; elemType = elemType

Elemsize is also a member of hchan, which represents the size of the element type above

So what do these two guys do?

The values of elemType and elemsize can be used to calculate the size of the data of the specified type

The former is used to assign values during data transfer

The latter can be used to locate specific elements in a circular queue

How do I create chan?

Let’s look at the source code implementation of Chan. go again and see the makechan implementation in the source code

func makechan(t *chantype, size int) *hchan {
   elem := t.elem

   // compiler checks this but be safe.
   if elem.size >= 1<<16 {
      throw("makechan: invalid channel element type")}ifhchanSize%maxAlign ! =0 || elem.align > maxAlign {
      throw("makechan: bad alignment")
   }

   mem, overflow := math.MulUintptr(elem.size, uintptr(size))
   if overflow || mem > maxAlloc-hchanSize || size < 0 {
      panic(plainError("makechan: size out of range"))}// Hchan does not contain pointers interesting for GC when elements stored in buf do not contain pointers.
   // buf points into the same allocation, elemtype is persistent.
   // SudoG's are referenced from their owning thread so they can't be collected.
   // TODO(dvyukov,rlh): Rethink when collector can move allocated objects.
   var c *hchan
   switch {
   case mem == 0:
      // Queue or element size is zero.
      c = (*hchan)(mallocgc(hchanSize, nil.true))
      // Race detector uses this location for synchronization.
      c.buf = c.raceaddr()
   case elem.ptrdata == 0:
      // Elements do not contain pointers.
      // Allocate hchan and buf in one call.
      c = (*hchan)(mallocgc(hchanSize+mem, nil.true))
      c.buf = add(unsafe.Pointer(c), hchanSize)
   default:
      // Elements contain pointers.
      c = new(hchan)
      c.buf = mallocgc(mem, elem, true)
   }

   c.elemsize = uint16(elem.size)
   c.elemtype = elem
   c.dataqsiz = uint(size)
   lockInit(&c.lock, lockRankHchan)

   if debugChan {
      print("makechan: chan=", c, "; elemsize=", elem.size, "; dataqsiz=", size, "\n")}return c
}
Copy the code

The size of buf is determined by the type information passed to makechan and the length of the buffer, that is, makechan’s input parameters

This can be seen in three places in the above code

/ / 1
func makechan(t *chantype, size int) *hchan
/ / 2
mem, overflow := math.MulUintptr(elem.size, uintptr(size))
/ / 3
var c *hchan
   switch {
   case mem == 0:
      // Queue or element size is zero.
      c = (*hchan)(mallocgc(hchanSize, nil.true))
      // Race detector uses this location for synchronization.
      c.buf = c.raceaddr()
   case elem.ptrdata == 0:
      // Elements do not contain pointers.
      // Allocate hchan and buf in one call.
      c = (*hchan)(mallocgc(hchanSize+mem, nil.true))
      c.buf = add(unsafe.Pointer(c), hchanSize)
   default:
      // Elements contain pointers.
      c = new(hchan)
      c.buf = mallocgc(mem, elem, true)}Copy the code

Basic process of reading and writing chan

The first diagram illustrates the process of writing data to Chan

Writing data to a channel, we will involve resources for sendq, RecvQ queues, and circular queues

According to the figure, data can be written into the channel in three cases:

  • When writing data, ifrecvqIf the queue is empty and the circular queue is empty, the data can be directly written to the end of the circular queue
  • ifrecvqIf the queue is empty and the circular queue is empty, the current coroutine is placed insendqThe waiting queue blocks, waiting to be woken up, and by the time it is woken up, the data that needs to be written has already been read and the write operation has completed
  • ifrecvqIf the queue is not empty, then there is no data in the circular queue, or the circular queue is empty, that is, there is no buffer (Writes data to an unbuffered channel), at this point, directlyrecvqWait for a G to be retrieved from the queue, write data, wake up G, and complete the write operation

The second diagram illustrates the process of reading data from Chan

To read data into a channel, we will cover the resource issues of sendq, RecvQ queues, and circular queues

According to the figure, reading data into the channel can be divided into four cases:

  • ifsendqIs null, and the circular queue has no elements, then the current coroutine is addedrecvqWait in the queue, putrecvqWait for a coroutine to be retrieved from the queue, wake up, and read
  • ifsendqIf the value is null and the circular queue has elements, the data in the circular queue can be read directly
  • ifsendqWhen there is data, and the circular queue has elements, the data in the circular queue can be read directly, and thesendqThe queue takes a G and puts it in the circular queue to supplement it
  • ifsendqWhen there is data and there are no elements in the circular queue, then fromsendqTake out a G and wake it up for the data read operation

It says channel creation, read and write, so how do channels close?

When we close a channel, we just close it when we apply it, what does the underlying queue do when we close it?

If you close the current channel, then the system will wake up all the coroutines in the waiting queue for recvQ to read, and every single one of those G’s will write nil by default, because the channel is closed, and if you read from the closed channel, you’ll read nil

The system will also wake up every coroutine in the waiting queue for sendq to write data, but this will cause a problem. If you write data to a closed coroutine, panic will be reported

Let’s comb through again, when will the channel operation, panic, let’s now add a wave to the table mentioned before

The channel state Uninitialized channel (nil) The channel is not empty The channel is empty The channel is filled with Channel under Closed channel
Receive data blocking Receive data blocking Receive data Receive data nil
To send data blocking To send data To send data blocking To send data panic
Shut down panic Channel closed successfully

After the data is read

Returns a zero value
Channel closed successfully

Return zero directly
Channel closed successfully

After the data is read

Returns a zero value
Channel closed successfully

After the data is read

Returns a zero value
panic
  • Close a channel that has already been closed, it will reportpanic
  • To close an uninitialized channel isnilThe channel, will also reportpanic
  • An alarm is generated when data is written to a closed channelpanic

You think that’s the end of it?

Chan is usually used with select in GO, so let’s talk briefly about how to use the GO channel with select

Select * from select * from select * from select * from select * from select * from select * from SELECT

  • SELECT
  • POLL
  • EPOLL

Can simulate their own multiplex IO multiplexing, each has advantages and disadvantages, the most commonly used is EPOLL, and C/C++ also has the corresponding network library

That’s pretty cool when we’re writing about multiplex IO in GO, which supports the select keyword by default

Simple use of SELECT

Let’s see how it works, no nonsense, let’s go to the DEMO

package main

import (
   "log"
   "time"
)

func main(a) {

   // Simply set log parameters
   log.SetFlags(log.Lshortfile | log.LstdFlags)

   // Create 2 channels with element data type int and buffer size 5
   var ch1 = make(chan int.5)
   var ch2 = make(chan int.5)

   // Write data to each channel separately
   // Write an anonymous function to add data to the channel
   go func (a){
      var num = 1
      for {
         ch1 <- num
         num += 1
         time.Sleep(1 * time.Second)
      }
   }()

   go func (a){
      var num = 1
      for {
         ch2 <- num
         num += 1
         time.Sleep(1 * time.Second)
      }
   }()

   for {
      select {// Read the data
      case num := <-ch1:
         log.Printf("read ch1 data is %d\n", num)

      case num := <-ch2:
         log.Printf("read ch2 data is: %d\n", num)

      default:
         log.Printf("ch1 and ch2 is empty\n")
          // Rest for 1s and read again
         time.Sleep(1 * time.Second)
      }
   }
}
Copy the code

Running effect

2021/06/18 17:43:06 main.go:54: ch1 and ch2 is empty
2021/06/18 17:43:07 main.go:48: read ch1 data is  1
2021/06/18 17:43:07 main.go:48: read ch1 data is  2
2021/06/18 17:43:07 main.go:51: read ch2 data is: 1
2021/06/18 17:43:07 main.go:51: read ch2 data is: 2
2021/06/18 17:43:07 main.go:54: ch1 and ch2 is empty
2021/06/18 17:43:08 main.go:48: read ch1 data is  3
2021/06/18 17:43:08 main.go:51: read ch2 data is: 3
2021/06/18 17:43:08 main.go:54: ch1 and ch2 is empty
2021/06/18 17:43:09 main.go:48: read ch1 data is  4
2021/06/18 17:43:09 main.go:51: read ch2 data is: 4
2021/06/18 17:43:09 main.go:54: ch1 and ch2 is empty
2021/06/18 17:43:10 main.go:51: read ch2 data is: 5
2021/06/18 17:43:10 main.go:48: read ch1 data is  5
Copy the code

From the run result, select monitoring 2 channels, read data is random

“Case” means “switch”. case… Select (select (select (select (select (select (select))); select (select (select));

If you’re interested, you can dig into it, but let’s stop there for today.

conclusion

  • What is the channel in GO
  • The underlying data structure of the channel is analyzed in detail
  • How are channels implemented in the GO source code
  • Chan’s basic principles of reading and writing
  • What exceptions occur when the channel is closed? Panic
  • Simple application of SELECT

Welcome to like, follow and collect

Dear friends, your support and encouragement are the motivation for me to keep sharing and improve the quality

Ok, that’s all for now, and share the implementation principles of Defer in the next GO

Technology is open, our mentality, should be more open. Embrace change, live in the sun, and strive to move forward.

I am nezha, welcome to like the collection, see you next time ~