0. Background

I have been using FlatBuffers for quite some time.

Several of my commercial projects have benefited from its fast deserialization.

FlatBuffers is particularly good at transferring small blocks of data that are serialized once and deserialized in many places.

But the Go implementation of FlatBuffers has a few shortcomings:

  1. Go support lags behind the C++ version, and the Go codebase has not been updated for a long time. The Go version lacks some features that C++ has, for example vectors of unions containing structs/strings.
  2. There is no Verifier (which I need).
  3. Go FlatBuffers serialization is slower than gogo protobuf: it takes about twice as long.
  4. Go FlatBuffers does not support Go modules. Imports are unfriendly, especially when auto-generated Go code has cross-references.
  5. The serialization code for Go FlatBuffers is not very elegant and does not fit Go's customary style.

So I came up with the idea of improving Go FlatBuffers.

The FlatBuffers compiler is written in C++, and I have not developed in C++ for years. It could be an interesting adventure for me.

1. My tinkering with Go FlatBuffers

To start, I wrote a FlatBuffers verifier. After it passed local validation, I sent a PR to Google FlatBuffers. The reviewers suggested that I re-read the FlatBuffers design specification. Uh-huh. This is where it gets interesting.

Over the next two weeks or so, I wrote a brand new serialization builder while reading the key specification documents for FlatBuffers (see appendix reference list).

I split the FlatBuffers memory blocks, used goroutines to process the separate blocks concurrently into binary data, and then merged, sorted, and optimized them. When the handwritten serializer seemed to work, I ran into a small problem embedding the handwritten code into the FlatBuffers compiler to support automatic code generation: I had almost forgotten how to write C++.

To that end, I reread several books such as Effective C++ while writing a few lines of code along the way. A week later, I had reacquainted myself with C++ and gained a better understanding of Go's memory management.

I'm still experimenting with how to make Go FlatBuffers serialization faster.
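
As an illustration of the split, encode concurrently, then merge idea mentioned above, here is a minimal sketch using only the standard library. It is not the builder described in this article: the encodeChunk helper and its length-prefixed layout are assumptions made for the example and are not the FlatBuffers wire format. Each chunk is encoded by its own goroutine into a separate buffer, and the buffers are then joined in input order.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sync"
)

// encodeChunk is a hypothetical encoder: a 4-byte little-endian length
// prefix followed by the raw bytes. It is NOT the FlatBuffers wire format.
func encodeChunk(s string) []byte {
	buf := make([]byte, 4+len(s))
	binary.LittleEndian.PutUint32(buf, uint32(len(s)))
	copy(buf[4:], s)
	return buf
}

// encodeConcurrently encodes every chunk in its own goroutine and merges
// the results in input order, so the output is deterministic.
func encodeConcurrently(chunks []string) []byte {
	parts := make([][]byte, len(chunks))
	var wg sync.WaitGroup
	for i, c := range chunks {
		wg.Add(1)
		go func(i int, c string) {
			defer wg.Done()
			parts[i] = encodeChunk(c) // each goroutine writes only its own slot
		}(i, c)
	}
	wg.Wait()
	return bytes.Join(parts, nil)
}

func main() {
	out := encodeConcurrently([]string{"sword", "axe"})
	fmt.Println(len(out), out[:4])
}
```

Because every goroutine writes to its own slot of parts, no lock is needed; whether this is actually faster than a single-threaded builder depends on chunk size and would have to be measured.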

Once familiar with C++ again, I first made the Go FlatBuffers API clear and easy to use.

2. Support vector of unions (C++)

Union is an interesting and useful feature of FlatBuffers, and struct is also useful. But the Go version does not support arrays of unions (called vectors of unions).

IDL

union Character {
  MuLan: Attacker,   // table, equivalent to a protobuf message
  Rapunzel,          // struct, akin to a C++ struct
  Belle: BookReader,
  BookFan: BookReader,
  Other: string,     // string
  Unused: string
}

table Movie {
  main_character: Character;   // union
  characters: [Character];     // vector of unions
}
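
To make the shape of a vector of unions concrete: FlatBuffers stores it as two parallel vectors, one of type tags and one of values. The sketch below models that with plain Go slices. It is only a conceptual illustration, not the generated code; the field names inside Attacker, Rapunzel and BookReader are assumptions, since the IDL excerpt above does not show them.

```go
package main

import "fmt"

// CharacterType mirrors the union's type tag; the names follow the IDL above.
type CharacterType byte

const (
	CharacterNONE CharacterType = iota
	CharacterMuLan
	CharacterRapunzel
	CharacterBelle
	CharacterBookFan
	CharacterOther
	CharacterUnused
)

// Hypothetical stand-ins for the table and struct types named in the IDL.
type Attacker struct{ SwordAttackDamage int32 }
type Rapunzel struct{ HairLength int32 }
type BookReader struct{ BooksRead int32 }

// Movie keeps a vector of unions as two parallel slices: tags and values.
type Movie struct {
	CharactersType []CharacterType
	Characters     []interface{}
}

// describe resolves element j by reading its tag first, then its value.
func describe(m Movie, j int) string {
	switch m.CharactersType[j] {
	case CharacterMuLan:
		return fmt.Sprintf("MuLan attacks for %d", m.Characters[j].(Attacker).SwordAttackDamage)
	case CharacterRapunzel:
		return fmt.Sprintf("Rapunzel with hair length %d", m.Characters[j].(Rapunzel).HairLength)
	case CharacterBelle, CharacterBookFan:
		return fmt.Sprintf("reader with %d books", m.Characters[j].(BookReader).BooksRead)
	case CharacterOther, CharacterUnused:
		return "string: " + m.Characters[j].(string)
	default:
		return "unknown"
	}
}

func main() {
	m := Movie{
		CharactersType: []CharacterType{CharacterMuLan, CharacterBelle, CharacterOther},
		Characters:     []interface{}{Attacker{SwordAttackDamage: 5}, BookReader{BooksRead: 7}, "hello"},
	}
	for j := range m.Characters {
		fmt.Println(describe(m, j))
	}
}
```

Note that Belle and BookFan share one value type, which is why the tag, not the Go type of the value, decides how an element is interpreted.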

3. Support Go modules via an attribute (in the IDL definition).

Each .fbs IDL definition file declares the Go module it belongs to, in a format like this: "go_module:github.com/tsingson/flatbuffers-sample/go-example/";

weapons.fbs

namespace weapons;

attribute "go_module:github.com/tsingson/flatbuffers-sample/samplesNew/";

table Gun {
  damage:short;
  bool:bool;
  name:string;
  names:[string];
}

monster.fbs

include "../weapons.fbs";

namespace Mygame.Example;

attribute "go_module:github.com/tsingson/flatbuffers-sample/go-example/";

enum Color:byte { Red = 0, Green, Blue = 2 }
union Equipment {   MuLan: Weapon, Weapon, Gun:weapons.Gun, SpaceShip,   Other: string } // Optionally add more tables.

......

The generated Go code

package Example

import (
	"strconv"
	flatbuffers "github.com/google/flatbuffers/go"

	weapons "github.com/tsingson/flatbuffers-sample/samplesNew/weapons" // ahem!
)

type Equipment byte

..........


4. Add some APIs and generated code that are clear and easy to use.

	weaponsOffset := flatbuffers.UOffsetT(0)
	if t.Weapons != nil {
		weaponsLength := len(t.Weapons)
		weaponsOffsets := make([]flatbuffers.UOffsetT, weaponsLength)
		for j := weaponsLength - 1; j >= 0; j-- {
			weaponsOffsets[j] = t.Weapons[j].Pack(builder)
		}
		MonsterStartWeaponsVector(builder, weaponsLength) // start
		for j := weaponsLength - 1; j >= 0; j-- {
			builder.PrependUOffsetT(weaponsOffsets[j])
		}
		weaponsOffset = MonsterEndWeaponsVector(builder, weaponsLength) // end
	}

Shortcut for a []string vector

// native object
	Names []string

// builder
	namesOffset := builder.StringsVector(t.Names...)

Getter for a vector of unions

func (rcv *Movie) Characters(j int, obj *flatbuffers.Table) bool {
	o := flatbuffers.UOffsetT(rcv._tab.Offset(10))
	if o != 0 {
		a := rcv._tab.Vector(o)
		obj.Pos = a + flatbuffers.UOffsetT(j*4)
		obj.Bytes = rcv._tab.Bytes
		return true
	}
	return false
}

Then get the struct or table:

// GetStructVectorAsBookReader shortcut to access struct in vector of unions
func GetStructVectorAsBookReader(table *flatbuffers.Table) *BookReader {
	n := flatbuffers.GetUOffsetT(table.Bytes[table.Pos:])
	x := &BookReader{}
	x.Init(table.Bytes, n+table.Pos)
	return x
}

// GetStructAsBookReader shortcut to access struct in single union field
func GetStructAsBookReader(table *flatbuffers.Table) *BookReader {
	x := &BookReader{}
	x.Init(table.Bytes, table.Pos)
	return x
}


For the object API, comments in the generated code make the usage clear:


// UnPack is used for a single union field
func (rcv Character) UnPack(table flatbuffers.Table) *CharacterT {
	switch rcv {
	case CharacterMuLan:
		x := GetTableAsAttacker(&table)
		return &CharacterT{ Type: CharacterMuLan, Value: x.UnPack() }
 .............

// UnPackVector is used for a vector of unions
func (rcv Character) UnPackVector(table flatbuffers.Table) *CharacterT {
	switch rcv {
	case CharacterMuLan:
		x := GetTableVectorAsAttacker(&table)
		return &CharacterT{ Type: CharacterMuLan, Value: x.UnPack() }
	case CharacterRapunzel:
.........

Perhaps more will come later to make Go FlatBuffers... nicer to use.

5. About memory leaks and Go GC

The CI run flagged a memory leak in the C++ code, and I spent a whole day checking it.

Look at the C++ code:

  // Save out the generated code for a Go Table type.
  bool SaveType(const Definition &def, const std::string *classcode,
                const bool needs_imports, const bool is_enum) {
    if (!classcode->length()) return true;

    // fix missing namespace issue
    auto dns = new Namespace();
    if ((parser_.root_struct_def_) &&
        (def.defined_namespace->components.empty())) {
      dns->components.push_back(parser_.root_struct_def_->name);
    } else {
      dns = def.defined_namespace;
    }

    Namespace &ns = go_namespace_.components.empty() ? *dns : go_namespace_;


The statement auto dns = new Namespace(); allocates a Namespace and initializes a pointer variable that the following if branch uses. In the else branch, however, dns is pointed at another, existing Namespace, so the object allocated by new is no longer reachable and is never deleted, which causes the memory leak.

Fix: move the allocation into the if/else blocks, so that the pointer is defined where it is used. (The Namespace allocated in the if branch must still be released, or handed to an owner, once it is no longer needed.)

  // Save out the generated code for a Go Table type.
  bool SaveType(const Definition &def, const std::string *classcode,
                const bool needs_imports, const bool is_enum) {
    if (!classcode->length()) return true;

    // fix missing namespace issue

    if ((parser_.root_struct_def_) &&
        (def.defined_namespace->components.empty())) {
      auto dns = new Namespace();
      dns->components.push_back(parser_.root_struct_def_->name);
      Namespace &ns = go_namespace_.components.empty() ? *dns : go_namespace_;
      ...
    } else {
      Namespace &ns = go_namespace_.components.empty() ? *def.defined_namespace
                                                       : go_namespace_;
      ...
    }

In Go, if the same code is used, for example:

type Namespace struct {
	// Components is a FILO stack that supports Push/Pop, plus Pushback/Popback
	// to add or remove elements at the end of the stack.
	Components Stack
}

func SaveType(def Definition, classcode *string, needs_imports, is_enum bool) bool {
	.......
	dns := new(Namespace)
	if ... {
		dns.Components.Pushback(.........)
	} else {
		dns = def.DefinedNamespace // this is an existing Namespace pointer
	}

In C++ the object allocated by new would leak, but in Go nothing references it after dns is reassigned, so the Go runtime garbage-collects it.

Go is a big C and an enhanced C++, and the presence of GC makes the development process much easier.

However, a lingering reference held through a pointer-typed variable can still cause a memory leak, so you need to know which situations can and cannot be garbage-collected. For example, use Go's unsafe package as carefully as you would C++.
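
One common case of a reference keeping memory alive in Go, chosen here as an illustration rather than taken from the FlatBuffers code: a small slice obtained by re-slicing shares the backing array of the large slice it came from, so the whole array stays reachable. The function names below are made up for the example.

```go
package main

import "fmt"

// headerShared returns the first n bytes of payload by re-slicing. The result
// shares payload's backing array, so the whole array stays reachable for as
// long as the small slice is referenced.
func headerShared(payload []byte, n int) []byte {
	return payload[:n]
}

// headerCopied copies the bytes into a fresh slice, so the large payload can
// be collected once nothing else refers to it.
func headerCopied(payload []byte, n int) []byte {
	out := make([]byte, n)
	copy(out, payload[:n])
	return out
}

func main() {
	payload := make([]byte, 1<<20)
	// The capacities show which result still points into the 1 MiB array.
	fmt.Println(cap(headerShared(payload, 8)), cap(headerCopied(payload, 8)))
}
```

The same reasoning applies to buffers handed out by a deserializer: holding on to a few bytes of a buffer holds on to the buffer.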

6. Happy hacking... the tinkering continues

This article is continuously updated at………..



This article was first published on GolangChina, at gocn.vip/topics/1022…


I wish you health and happiness!


About me

Tsingson (Sanwise)

Formerly technical architect / solutions engineer for the broadcast control product line of Ustarcom's IPTV/OTT business unit (8 years), now a freelancer.

Likes music (harmonica; one of the main planners of the 3rd, 4th, and 5th Guangdong International Harmonica Carnival), photography, cross-country,

and the Golang language (postgres + Golang for commercial projects).

Tsingson (Sanwise), Nanshan, Shenzhen, Xiao Luo Harmonica Music Centre, 2020/04/09