Thrift & IDL

This article is mainly a translation of the official Thrift IDL document. The official document moves from the big picture down to the details and describes the grammar directly in a formal grammar notation, which can be hard to read for anyone unfamiliar with that notation. One goal of this article is therefore to introduce the syntax of Thrift IDL in an easy-to-understand way.

In the grammar of the official document, single quotes mark terminals, which are usually keywords of the language. Anything not enclosed in quotes is a non-terminal, which is eventually expanded into terminals.

Thrift

Thrift is an extensible, cross-language service development framework. Through its code generator, it lets services written in C++, Java, Python, PHP, Ruby, Erlang, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml, Golang, and other languages work together efficiently and seamlessly.

Thrift was originally developed by Facebook as an internal project, was open-sourced in April 2007, and entered the Apache Incubator in May 2008. In November 2010 it became an Apache top-level project (TLP). It has now been around for more than ten years. It is feature-rich and transmits data in a binary format, which makes it fast.

IDL (Interface Description Language)

IDL is the core of Thrift and the source that the Thrift compiler operates on directly. Since IDL is a language, it has its own syntax. The following sections focus on IDL syntax.

Keywords & Identifiers

The keywords in IDL are as follows:

include, cpp_include, namespace, const, typedef, enum, senum, struct, union, exception, service, extends, required, optional, oneway, void, throws, bool, byte, i8, i16, i32, i64, double, string, binary, slist, map, set, list, cpp_type

Some keywords that are used only inside Facebook are not listed, because they are not recommended for non-Facebook services. If you extend the Thrift compiler, it is advisable to define keywords that are specific to your own organization or business.

As in other languages, IDL identifiers are used to name structures, variables, services, and so on.

The definition in the official documentation is as follows:

Identifier ::= ( Letter | '_' ) ( Letter | Digit | '.' | '_' )*

That is, a valid identifier meets the following conditions:

  1. It can contain only letters, digits, underscores (_), and periods (.)
  2. It must start with a letter or an underscore

Letter and Digit are defined as follows:

Letter ::= ['A'-'Z'] | ['a'-'z']
Digit  ::= ['0'-'9']

In other words:

  • A letter can only be an uppercase 'A' to 'Z' or a lowercase 'a' to 'z'
  • A digit can only be 0 to 9
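The Identifier rule above translates almost directly into a regular expression. The following is a minimal sketch in Python; the function name `is_valid_identifier` is made up for illustration, and the check covers only the grammar rule (it does not reject reserved keywords).

```python
import re

# Identifier ::= ( Letter | '_' ) ( Letter | Digit | '.' | '_' )*
# Letter is A-Z or a-z, Digit is 0-9.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9._]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if the whole string matches the Identifier rule."""
    return IDENTIFIER.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical sample names, chosen to exercise each condition.
    for candidate in ["fb_status", "_private", "base.Base", "2fast", "user-id"]:
        print(candidate, is_valid_identifier(candidate))
```

Names that start with a digit or contain a character outside the allowed set (such as `-`) are rejected, while names containing a dot, such as `base.Base`, are accepted.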

Base types

The syntax is defined as follows:

BaseType ::= 'bool' | 'byte' | 'i8' | 'i16' | 'i32' | 'i64' | 'double' | 'string' | 'binary' | 'slist'

That is, IDL has 10 base types. (At the time of writing, the current Thrift version is 0.13.0. Later versions may introduce new types, and a project that uses an older version needs to keep compatibility in mind.)

  • bool: a Boolean value, true or false
  • byte: a byte
  • i8: int8
  • i16: int16
  • i32: int32
  • i64: int64
  • double: a double-precision floating-point number
  • string: a string
  • binary: a byte array (byte[])
  • slist

Container type

The official definition is as follows:

ContainerType ::= MapType | SetType | ListType
MapType       ::= 'map' CppType? '<' FieldType ',' FieldType '>'
SetType       ::= 'set' CppType? '<' FieldType '>'
ListType      ::= 'list' '<' FieldType '>' CppType?
FieldType     ::= BaseType | ContainerType | Identifier
CppType       ::= 'cpp_type' Literal
Literal       ::= ('"' [^"]* '"') | ("'" [^']* "'")

Let's start with Literal. In IDL, a literal is a string enclosed in single quotes (') or double quotes (").

CppType is a type declared with cpp_type:

cpp_type 'test'

cpp_type "test"

The practical use of CppType is not yet clear.

FieldType will come up again in later sections. According to the grammar, a FieldType can be a base type, a container type, or a valid identifier. The identifier here refers to a type declared with typedef, enum, struct, and so on, all covered below. The following are therefore all legal container type declarations:

// map: simple key-value pairs; keys cannot repeat.
// `<keyType, valueType>` specifies the types of the key and the value
// Simple map
map<string, i8>
// Nested map
map<map<string, i32>, i64>
// More nesting might look like this
map<string, map<string, string>>
map<string, set<i32>>

// set: a collection of non-repeating elements
set<string>
set<map<string, bool>>

// list: similar to an array
list<i64>
list<set<string>>
list<map<string, set<i8>>>
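Because FieldType refers back to ContainerType, the grammar is recursive, which is what makes arbitrary nesting legal. The sketch below is a small recursive-descent parser in Python that follows the MapType, SetType, and ListType rules. It is only an illustration: the function names are made up, the optional CppType part is ignored, and no check is made that a name is actually a declared type.

```python
import re

# A token is either an identifier-like name or one of the symbols < > ,
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9._]*|[<>,])")


def tokenize(text):
    """Split a type expression such as 'map<string, i8>' into tokens."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def expect(tokens, index, symbol):
    """Check that tokens[index] is the given symbol; return the next index."""
    if index >= len(tokens) or tokens[index] != symbol:
        raise ValueError(f"expected {symbol!r} at token {index}")
    return index + 1


def parse_field_type(tokens, index=0):
    """Parse one FieldType starting at index; return (tree, next_index)."""
    if index >= len(tokens) or tokens[index] in {"<", ">", ","}:
        raise ValueError(f"expected a type name at token {index}")
    name = tokens[index]
    if name in ("set", "list"):
        index = expect(tokens, index + 1, "<")
        element, index = parse_field_type(tokens, index)
        index = expect(tokens, index, ">")
        return (name, element), index
    if name == "map":
        index = expect(tokens, index + 1, "<")
        key, index = parse_field_type(tokens, index)
        index = expect(tokens, index, ",")
        value, index = parse_field_type(tokens, index)
        index = expect(tokens, index, ">")
        return ("map", key, value), index
    # Anything else is treated as a base type or a user-defined identifier.
    return name, index + 1


def parse(text):
    """Parse a complete type expression and return it as a nested tuple."""
    tokens = tokenize(text)
    tree, index = parse_field_type(tokens)
    if index != len(tokens):
        raise ValueError("trailing tokens after the type")
    return tree


if __name__ == "__main__":
    print(parse("map<string, set<i32>>"))
    print(parse("list<map<string, set<i8>>>"))
```

Each container is returned as a tuple whose first element is the container keyword, so the nesting of the original declaration is preserved in the result.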

IDL definitions or declarations

IDL is used to describe interfaces, and description files usually use the .thrift extension. We have now seen the IDL keywords and base types, but how are they used to describe an interface? Let's look at a demo file first and then discuss it.

fb303.thrift

namespace java com.facebook.fb303
namespace cpp facebook.fb303
namespace perl Facebook.FB303
namespace netstd Facebook.FB303.Test

enum fb_status {
  DEAD = 0,
  STARTING = 1,
  ALIVE = 2,
  STOPPING = 3,
  STOPPED = 4,
  WARNING = 5,
}

service FacebookService {
  string getName(),
  string getVersion(),
  fb_status getStatus(),
  string getStatusDetails(),
  map<string, i64> getCounters(),
  i64 getCounter(1: string key),
  void setOption(1: string key, 2: string value),
  string getOption(1: string key),
  map<string, string> getOptions(),
  string getCpuProfile(1: i32 profileDurationInSec),
  i64 aliveSince(),
  oneway void reinitialize(),
  oneway void shutdown(),
}

The demo file above, provided by Facebook, is relatively simple and does not cover every feature. It already shows some of the keywords and base types introduced earlier. If you do not fully understand it yet, don't worry: the rest of IDL is introduced next.

Thrift file composition

Syntax definition:

Document   ::= Header* Definition*
Header     ::= Include | CppInclude | Namespace
Definition ::= Const | Typedef | Enum | Senum | Struct | Union | Exception | Service

These syntax definitions introduce many new names, but don't worry about them for now.

First, a thrift file is just a Document. A Document consists of zero or more Headers followed by zero or more Definitions, so a thrift file can even be empty (in which case the Thrift compiler does not generate any files).

Let's move on to the Header. In the example above:

namespace java com.facebook.fb303
namespace cpp facebook.fb303
namespace perl Facebook.FB303
namespace netstd Facebook.FB303.Test

This part consists of four Headers, which declare four namespaces.

Header

A Header can be an Include, a CppInclude, or a Namespace. The syntax of each is given below.

Include syntax definition

Include ::= 'include' Literal

In terms of syntax, an Include consists of the include keyword followed by the path of a thrift file.

In real business development, it is impractical to define all services in one file. Definitions are usually split by business module and then included into an entry file. When the service is finally released, the Thrift compiler only needs to compile the entry file: it can generate code for all included files, and the definitions in the included files become visible to the including file. For example:

base.thrift

namespace go base

struct Base {
  ...
}

example.thrift

include "base.thrift"

struct Example {
  1: base.Base ExampleBase
}

CppInclude syntax definition

CppInclude ::= 'cpp_include' Literal

CppInclude is mainly used to add a custom C++ include declaration to the code generated from the current thrift file.

There is no use case for it here at present, so it is not covered further.

Namespace syntax definition

Namespace      ::= ( 'namespace' ( NamespaceScope Identifier ) )
NamespaceScope ::= '*' | 'c_glib' | 'cpp' | 'csharp' | 'delphi' | 'go' | 'java' | 'js' | 'lua' | 'netcore' | 'perl' | 'php' | 'py' | 'py.twisted' | 'rb' | 'st' | 'xsd'

A Namespace declares which target language the definitions in the current thrift file are generated for. NamespaceScope identifies the language. It can also be the wildcard *, which means the declaration applies to all languages. A Namespace also helps avoid naming conflicts between Identifier definitions.

The example above has four Namespace declarations, indicating that the current thrift file applies to Java, C++, Perl, and netstd.

The Identifier immediately following the NamespaceScope behaves differently in different languages, for example:

namespace go tutorial.thrift.example
// The generated code is placed in the directory `tutorial/thrift/example/*.go`,
// and each Go file contains `package example`, meaning they all belong to the same package
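The mapping described in this example can be written down as a tiny helper. This is only a sketch in Python of the behavior the example describes for `namespace go`, not the Thrift compiler's actual logic; the function name `go_layout` is made up for illustration.

```python
def go_layout(namespace_identifier: str):
    """Return (directory, package) for a `namespace go` identifier.

    Mirrors the example in this article: dots become path separators,
    and the last segment is used as the Go package name.
    """
    parts = namespace_identifier.split(".")
    directory = "/".join(parts)
    package = parts[-1]
    return directory, package


if __name__ == "__main__":
    directory, package = go_layout("tutorial.thrift.example")
    print(directory)  # tutorial/thrift/example
    print(package)    # example
```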

Definition

The syntax of Definition is as follows:

Definition ::= Const | Typedef | Enum | Senum | Struct | Union | Exception | Service

Syntactically, a Definition can be a Const, Typedef, Enum, Senum, Struct, Union, Exception, or Service. These definitions are the core of a thrift file, and we'll go through them one by one.

Senum and slist are no longer recommended and are not covered below.

Const & ConstValue: constants and constant values

The syntax definition of a constant:

Const          ::= 'const' FieldType Identifier '=' ConstValue ListSeparator?
ConstValue     ::= IntConstant | DoubleConstant | Literal | Identifier | ConstList | ConstMap
IntConstant    ::= ('+' | '-')? Digit+
DoubleConstant ::= ('+' | '-')? Digit* ('.' Digit+)? ( ('E' | 'e') IntConstant )?
ConstList      ::= '[' (ConstValue ListSeparator?)* ']'
ConstMap       ::= '{' (ConstValue ':' ConstValue ListSeparator?)* '}'
ListSeparator  ::= ',' | ';'

ListSeparator is a separator, similar to the semicolon that ends a statement in Java. In IDL, the separator can be , or ; and in most cases it can be omitted.

In IDL, a constant is declared with the const keyword:

const string testConst = 'hello,thrift'; // the `;` can be replaced by `,` or left out

In a constant declaration, what follows = is the constant value. The valid constant values in IDL are as follows:

// Integer constant (IntConstant); a sign is allowed, e.g. -2
const i8 count = 100
// Floating-point constants (DoubleConstant)
const double money = 13.14
const double rate = 1.2e-5
// List constant (ConstList); the `,` separators can be replaced with `;`
const list<string> names = ['tom', 'joney', 'catiy']
// Map constant (ConstMap)
const map<string, string> user = {'name': 'Johnson', 'age': '20'}
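The IntConstant and DoubleConstant rules can be checked with two regular expressions. The following Python sketch transcribes the two rules; the function names are made up for illustration. Read literally, the DoubleConstant rule would also match an empty string, so the sketch adds its own guard that at least one digit is present; that guard is an assumption of this example, not part of the grammar.

```python
import re

# IntConstant ::= ('+' | '-')? Digit+
INT_CONSTANT = re.compile(r"[+-]?[0-9]+")

# DoubleConstant ::= ('+' | '-')? Digit* ('.' Digit+)? ( ('E' | 'e') IntConstant )?
DOUBLE_CONSTANT = re.compile(r"[+-]?[0-9]*(\.[0-9]+)?([Ee][+-]?[0-9]+)?")


def is_int_constant(text: str) -> bool:
    """Return True if the whole string matches the IntConstant rule."""
    return INT_CONSTANT.fullmatch(text) is not None


def is_double_constant(text: str) -> bool:
    """Return True if the whole string matches the DoubleConstant rule.

    The digit check is an extra guard added by this sketch, because every
    part of the rule is optional and it would otherwise accept "".
    """
    has_digit = any(ch in "0123456789" for ch in text)
    return has_digit and DOUBLE_CONSTANT.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["100", "-2", "13.14", "1.2e-5", "abc"]:
        print(sample, is_int_constant(sample), is_double_constant(sample))
```

Note that an integer such as `100` also satisfies the DoubleConstant rule, since the fractional and exponent parts are optional.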

Note: there is no ConstSet.

Typedef type definition

IDL also lets you define a new name for a type. The syntax is as follows:

Typedef ::= 'typedef' DefinitionType Identifier

This needs little explanation. For example:

typedef i8 int8
// int8 can now be used wherever i8 is allowed
const int8 count = 100

Enum enumerated values

The syntax is defined as follows:

Enum ::= 'enum' Identifier '{' (Identifier ('=' IntConstant)? ListSeparator?)* '}'

From the example above:

enum fb_status {
  DEAD = 0,
  STARTING = 1,
  ALIVE = 2,
  STOPPING = 3,
  STOPPED = 4,
  WARNING = 5,
}

From the syntax definition, ('=' IntConstant)? is optional, which means a value does not have to be specified. By default, values start at 0 and increase by 1. A value that is specified must be an integer constant. The enumeration in the example above could therefore also be written as follows:

enum fb_status {
  DEAD,
  STARTING,
  ALIVE,
  STOPPING,
  STOPPED,
  WARNING,
}
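The default numbering can be illustrated with a short Python sketch. The function name `assign_enum_values` is made up for illustration. For a member without an explicit value that follows one with an explicit value, the sketch assumes the numbering continues from the preceding value plus one; the article itself only states that defaults start at 0 and increase.

```python
def assign_enum_values(members):
    """Assign integer values to enum members.

    members is a list of (name, explicit_value_or_None) pairs.
    Members without an explicit value get 0 for the first member,
    otherwise the preceding value plus one (assumption, see lead-in).
    """
    values = {}
    next_value = 0
    for name, explicit in members:
        value = next_value if explicit is None else explicit
        values[name] = value
        next_value = value + 1
    return values


if __name__ == "__main__":
    fb_status = [
        ("DEAD", None),
        ("STARTING", None),
        ("ALIVE", None),
        ("STOPPING", None),
        ("STOPPED", None),
        ("WARNING", None),
    ]
    print(assign_enum_values(fb_status))
```

For the `fb_status` members written without values, this yields the same numbers 0 through 5 as the explicit version of the enum.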

Struct structure

Structs are probably the most commonly used type definition statements. The syntax is as follows:

Struct     ::= 'struct' Identifier 'xsd_all'? '{' Field* '}'
Field      ::=  FieldID? FieldReq? FieldType Identifier ('=' ConstValue)? XsdFieldOptions ListSeparator?
FieldID    ::=  IntConstant ':'
FieldReq   ::=  'required' | 'optional'

xsd_all is used only inside Facebook, so it can be ignored.

XsdFieldOptions in Field is also Facebook-internal and can be ignored.

From the syntax definition, the core of a Struct is its Fields, and the name of each Field within a Struct must be unique. A Struct cannot inherit from another Struct, but Structs can be nested: a Struct can be used as the type of a Field in another Struct.

A valid Field requires only a FieldType and an Identifier. In practice we usually also add a FieldID and a ListSeparator, while FieldReq depends on the situation.

A FieldID must be an integer constant followed by a colon (:).

FieldReq specifies whether the field is required or optional. If neither is given, the default behaves like a mix of the two: in theory the field is always serialized, and it is skipped only when it holds something Thrift cannot transmit.

Furthermore, we can specify default values for fields. A default value must be a ConstValue.

struct BaseExample {
  1: i8 sex = 1, // specify a default value
}

struct Example {
  1: i8 age,
  255: BaseExample base, // nested use of BaseExample
}

Union

The Union syntax is basically the same as Struct apart from the keyword, but the semantics are very different. A union describes a structure in which one of several possible fields carries the value: it can be transmitted by Thrift as long as a field in it has been assigned a valid value. Fields in a union are optional by default, and the required declaration cannot be used. The syntax is as follows:

Union ::= 'union' Identifier 'xsd_all'? '{' Field* '}'

Imagine a scenario where we collect a user's contact information and the user fills in either a mobile phone number or an email address. A union can express which one was provided:

union UserInfo {
  1: string phone,
  2: string email
}
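To make the semantics concrete, here is a small Python sketch that models a `UserInfo` value as a dictionary and checks how many of its fields are set. It assumes the usual reading of a union, namely that exactly one field carries a value at a time; the helper names are made up for illustration and are not generated Thrift code.

```python
def set_fields(union_value: dict):
    """Return the names of the fields that currently hold a value."""
    return [name for name, value in union_value.items() if value is not None]


def is_valid_union(union_value: dict) -> bool:
    """A union value is treated as valid when exactly one field is set
    (assumption of this sketch, see the lead-in)."""
    return len(set_fields(union_value)) == 1


if __name__ == "__main__":
    by_phone = {"phone": "13800000000", "email": None}
    by_email = {"phone": None, "email": "user@example.com"}
    neither = {"phone": None, "email": None}
    for value in (by_phone, by_email, neither):
        print(set_fields(value), is_valid_union(value))
```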

Exception Exception definition

The Exception syntax is similar to Struct, but this type is intended to work with the exception handling mechanism of the target language. The syntax is defined as follows:

Exception ::= 'exception' Identifier '{' Field* '}'

For example (throws is covered under Function below):

exception Error {
  1: required i8 Code,
  2: string Msg,
}

service ExampleService {
  string GetName() throws (1: Error err),
}

Service Service definition

Finally, we come to the last core concept, Service. Everything covered so far exists to support services. A Service defines the interface we want to expose, and services can be inherited: if Service A extends Service B, then in addition to the functions A defines itself, A also provides the functions inherited from B. The syntax is defined as follows:

Service      ::= 'service' Identifier ( 'extends' Identifier )? '{' Function* '}'
Function     ::= 'oneway'? FunctionType Identifier '(' Field* ')' Throws? ListSeparator?
FunctionType ::= FieldType | 'void'
Throws       ::= 'throws' '(' Field* ')'

The core of a Service is its Functions. For example:

service ExampleService {
  oneway void GetName(1: string UserId),
  void GetAge(1: string UserId) throws (1: Error err),
}

oneway is a keyword that, as the name suggests, means one-way. A function without oneway follows the request-response pattern (req-resp): the client sends a request and the server returns a response. A function marked oneway means the client only sends the request and does not wait for a result, and the server sends no response. This differs from a plain void function, because a void function can still return exceptions.

FunctionType is either any legal FieldType or the void keyword, which indicates that there is no return value.

Throws, as the name implies, declares the exceptions a function may throw. See the preceding example.

That covers the commonly used parts of Thrift IDL syntax. For more detail, see the references below.

References

  • Thrift interface description language
  • Thrift Tutorial
  • Apache Thrift – A scalable cross-language service development framework
  • kite
  • Thrift serialization resolution