Dig101: dig more, simplified more and know more

Today we’re going to talk about the underlying design of interfaces that can do anything.

Interface is defined as the signature of a set of methods.

With it, we can make method contracts to de-abstract and constrain implementation.

The basic type of Go can be thought of as an empty interface that implements no methods, that is, an interface that does everything.

(The Go language has no generics and interfaces can be used as an alternative implementation.)

Interfaces are also expected, with lead developer Russ Cox saying:

From a language design perspective, Go’s interface is static, checked at compile time, and dynamic when needed. If there was one feature of Go that I could export to other languages, it would be the interface. Go Data Structures: Interfaces[1]

So what is the underlying structure of interface design?

How about duck typing[2]?

What happens with type assertions?

With these questions in mind, let’s move on

The article directories

  • 0x01 same underlying structure

    • eface

    • iface

  • How do types 0x02 convert to each other

    • The naming of convXXX

    • Initial convT2{I,E} and convI2I

    • ConvXXX optimized for type

  • How is type 0x03 assertion implemented

    • Check whether the table matches

    • Try to insert updates

    • Dynamic decision efficiency optimization

0x01 same underlying structure

We know that there are two ways to define an interface. Is the underlying structure the same?

// mode 1var a interface{}// mode 2type Stringer interface{String() String}var b StringerCopy the code

The answer is no.

We use GDB to print the corresponding type (see Tips- How to gracefully Debug Go using GDB).

// Null interface type>>> ptype atype = struct runtime.eface { runtime._type *_type; void *data; }// Interface types defined by functions>>> ptype btype = struct runtime.iface { runtime.itab *tab; void *data; }// Related Types itable>>> ptype b.tabtype = struct Runtime. itab {// InterfaceType *inter; // Construct type runtime._type *_type; uint32 hash; [4]uint8 _; [1]uintptr fun; } *>>> > ptype b.tab.intertype = struct runtime.interfacetype {// interfacetype runtime. typ; runtime.name pkgpath; []runtime.imethod MHDR; } *Copy the code

Go internally defines two types of interface (but both are machine words)

eface

An empty interface that has no defined methods

Concrete Type type and data are stored internally

eface

iface

Interfaces with methods

We have a richer ITAB field than eFace’s type, which records the types and methods of the constructed type and the interface type implemented

iface

How do types 0x02 convert to each other

When we do an interface assignment, how does Go fill in the underlying structure?

type Binary uint64func (i Binary) String() string { return strconv.Itoa(int(i))}func conversion() { var b Stringer var i Binary = 1 b = I // <= what happened here println(b.string ())}Copy the code

When GDB goes to b = I, it calls runtime/iface. Go :convT64 to assign iface

If you look at the source code, you’ll find a lot of convXXX functions. What do they do?

The naming of convXXX

ConvFrom2To refers To the conversion of To=From

From and To the type of the three: (see CMD/compile/internal/types/type go: Tie)

  • E (eface)
  • I (iface)
  • T (Type)

This list of functions is dizzyingly large, but a deeper analysis with reference to the submitted convT2x, Don’t Alloc for Zero Vals [3] makes it much clearer

Initial convT2{I,E} and convI2I

Initially there were only convT2{I,E} and convI2I

Allocate memory (newObject) and copy assignment (TypedMemMove)

ConvI2I is also going to have getitab, which is what we’re going to say later on in the type assertion

We also optimized them before calling them (Walkexpr)

  • Reduced value copy

If ToType is pointer-shaped or int, it can be stored directly into the data field of the interface.

Poor-shaped types: PTR, chan, map, func, unsafe.Pointer

With type storage, it’s just a two-word copy

  • Reduce memory allocation

Zero values, bool/byte can be used instead of allocating memory to existing values (Zerobase,staticbytes)

Read-only global variables can be used directly

The stack can be initialized for non-interface variables that do not escape to the heap up to 1KB.

This type of value is finally converted to interface: {type/itab, &value} in the form of an address fetch.

  • Interface Empty interface (EFACE)

Itabs other than Type can be discarded

tmp = i.itabif tmp ! = nil { tmp = tmp.type}e = iface{tmp, i.data}Copy the code

ConvXXX optimized for type

But there are some points that can be optimized, such as:

  • Can allocated memory be cleared?

The type of the class pointer needs to be cleared, otherwise there may be dirty data in memory

However, the pointer-free type is not required if it can directly overwrite the corresponding memory during copying

If int is copied within a machine word, it does not need to be cleared at allocation time. (Without calling convT64 on 32-bit systems, accessing memory is a safe atomic operation.)

  • Can value copying be simplified?

Int, string, slice: *(*Type)(x) = val

  • Can I copy memory without adding GC calls (write barriers)?

ConvT2 {E,I} gc calls when it needs to be copied (typedmemmove)

ConvT2 {E,I}noptr does not need to be copied when GC is called (memmove)

So you can see the purpose of these functions, or targeted to improve conversion efficiency

Finally, combined with the list of convXXX where it was called:

// cmd/compile/internal/gc/walk.go:walkexprcase OCONVIFACE:  ...  fnname, needsaddr := convFuncName(fromType, toType)Copy the code
fnname fromType needsaddr
convI2I iface no
16,32,64 convT {} Integer data (no pointer, machine word) no
convTstring string no
convTslice slice no
convT2E Type is
convT2Enoptr There is no pointer Type is
convT2I Type is
convT2Inoptr There is no pointer Type is

ConvE2E and convE2I will not exist

Needsaddr: Type with no pointer, larger than a 64-bit word or unknown size, stored using the address of the value

How is type 0x03 assertion implemented

Interface supports type assertion to dynamically determine its construction type,

Success returns the corresponding construct type to facilitate the invocation of its methods

Constructible types implement interfaces without a display declaration,

How does the following code determine that interface B implements Stringer?

type Binary uint64func (i Binary) String() string {  return fmt.Sprint(i)}func typeAssert() {  var b interface{} = Binary(1)  v, ok := b.(Stringer)  println(v, ok)}Copy the code

After debugging, it calls assertE2I2

There are two classes of function names, as follows

assertE2I: v := eface1.(iface1)

assertE2I2: v,ok := eface1.(iface1)

There is one thing here, type assertion is not V, OK, assertion failure will panic.

Originally its internal ITAB table (itabTable) query interface and construction type of mapping table, if the match is explained to achieve

The following code analysis is as follows

Start with an initial table of 512 entries

Const itabInitSize = 512Type itabTableType struct {// uintptr // uintptr entries [itabInitSize]*itab}Copy the code

Check whether the table matches

Call getitab(inter, TYp, canfail) in the type assertion to look up the table

  • Atomic read itabTable without lock
  • Lock not found check again, find return
  • Create an ITAB and add it to the table
  • If there is no match, panic (canfail) is returned

Itabtable.find (inter, TYp);

Insert itabAdd(m)

Try to insert updates

  • Use before insertingm.inter/m._type pairInitialize them.funArray, does not matchm.fun[0]==0

[1] Uintptr = uintptr Func (m *itab) init()

  • If count exceeds 75% of the upper limit, capacity expansion is triggered and the size is more than two times (memory alignment must be upward). Updating the itabTable after capacity expansion is an atomic operation

  • The initial offset of the itabTable is calculated using the hash of the interface type and construction type of the ITABM, and then inserted into the first non-empty entry after it. If it already exists, return it directly

The open address detection method is used here, and the formula is:

h(i) = h0 + i*(i+1)/2 mod 2^k

Insert itabtable.add (m)

This is similar to the logic used for map insertion

Dynamic decision efficiency optimization

But there’s a problem, right?

Assuming that interface defines NI methods and the constructor type implements NT methods,

Whether the conventional matching construction type implements all ni methods requires two layers of traversal, complexity O(Ni *nt)

This is less efficient when initializing ITab.fun or type assertion matches.

Go was designed with this in mind, reducing complexity to O(Ni +nt).

This is one of the reasons for using HashTable:

  • First, the interface function definition list itab.inter. MHDR and the constructor type function list itab.fun are both sorted by function name

  • The first time the ITAB is initialized, the list of functions that the constructor type implements can be traversed in O(ni+nt)

  • Then update to itABTable with open address detection method, query can also use the same way to locate the existence of the ITAB.

The traversal matching code for two (ordered) lists is simplified as follows:

// runtime/iface.go:init()j:=0imethods: k < ni; K++ {// iterate over the constructor function list for; j < nt; J++ {// if both types (type), Continue iMethods}} // Not fully matched m.sun [0] = 0} m.sun [0] =  uintptr(fun0)Copy the code

To summarize the underlying interface design:

  • Interface is classified into two types: empty interface (EFACE) and interface (IFACE), but both are two-word storage structures
  • Interface conversion is optimized for different types, focusing on improving memory allocation and value copy efficiency
  • Dynamic determination of interface type assertion, using ordered list traversal + global hash table cache optimization determination efficiency

See More: Official explanation of InterfaceSlice [4] Why not directly convert


One last question:

There is no internal convT64 in this conversion code. Why?

var b Stringer = Binary(1)_ = b.String()Copy the code

This question will be answered in the next article.

See NewbMiao/Dig101-Go for the code for this article [5]

The resources

Go Data Structures: Interfaces: https://research.swtch.com/interfaces

duck typing: https://en.wikipedia.org/wiki/Duck_typing

specialize convT2x, don’t alloc for zero vals: https://go-review.googlesource.com/c/go/+/36476

InterfaceSlice: https://github.com/golang/go/wiki/InterfaceSlice

NewbMiao/Dig101-Go: https://github.com/NewbMiao/Dig101-Go/blob/master/types/interface/interface.go


Recommended reading

  • Dig101: Go for range Drainage guide
  • Dig101: Go’s flexible Slice
  • Dig101: Go string stuff
  • Dig101:Go to understand the underlying design of map
  • Dig101:Go talk about struct memory alignment
  • Tips- How to gracefully debug Go using GDB

Original is not easy, if useful, welcome to forward, point to see, share to more people.

The internal and external links of wechat cannot be jumped


Welcome to pay attention to me, a love of development (dad) : love to move bricks, price investment.