Original address: www.howtographql.com/

1. Introduction

GraphQL is a new API standard that offers a more efficient, powerful, and flexible alternative to REST. It was developed and open-sourced by Facebook and is now maintained by a large community of companies and individuals around the world.

APIs have become ubiquitous components of software infrastructure. In short, an API defines how a client can load data from a server.

At its core, GraphQL enables declarative data fetching: a client specifies exactly what data it needs from the API. Instead of multiple endpoints that return fixed data structures, a GraphQL server exposes a single endpoint and responds with precisely the data the client asked for.

GraphQL – An API query language

Today, most applications need to fetch data from a server, where it is stored in a database. It is the API’s responsibility to provide an interface to that stored data that fits the application’s needs.

GraphQL is often mistaken for a database technology. GraphQL is an API query language, not a database query language. Therefore, it is database independent and can be used in any scenario where the API is used.

A more effective alternative to the REST pattern

REST has been a popular way to expose data from a server. When the concept of REST was developed, client applications were relatively simple and the pace of development was much slower than it is today. REST was therefore a good fit for many applications. Over the past few years, however, the API landscape has changed radically. In particular, three factors challenge the way APIs are designed:

1. Increased mobile usage creates a need for efficient data loading

Facebook originally developed GraphQL in response to increased mobile usage, low-powered devices, and unreliable networks. GraphQL minimizes the amount of data that has to be transferred over the network, which greatly improves applications operating under these conditions.

2. A variety of front-end frameworks and platforms

With so many different front-end frameworks and platforms running client applications, it is increasingly difficult to build and maintain one API that fits the requirements of all of them. With GraphQL, each client can access precisely the data it needs.

3. Fast development and rapid feature delivery

Continuous deployment has become the norm in many companies, and rapid iterations and frequent product updates are indispensable. With REST APIs, the way the server exposes data often has to be modified to accommodate changing requirements and designs on the client side. This hinders fast development and product iteration.

History, context and adoption

GraphQL isn’t just for React developers

Facebook started using GraphQL in their native mobile apps in 2012. Interestingly, however, GraphQL is currently being adopted primarily in web-related technology environments and has gained little traction in mobile applications.

Facebook first spoke publicly about GraphQL at React.js Conf 2015 and announced its plans to open-source it shortly after. Because Facebook always talked about GraphQL in the context of React, it took a while for non-React developers to realize that GraphQL is by no means limited to React.

Dan Schafer and Jing Chen introduced GraphQL at React.js Conf 2015.

A rapidly growing community

In fact, GraphQL can be used anywhere a client communicates with an API. Interestingly, other companies, including Netflix and Coursera, were working on comparable ideas to make API interactions more efficient. Coursera envisioned a similar technology that would let clients specify their data requirements, and Netflix even open-sourced its solution, Falcor. After GraphQL was open-sourced, Coursera canceled its own efforts and adopted GraphQL.

Today, GraphQL is used in production by many different companies, such as GitHub, Twitter, Yelp, and Shopify.

Despite being such a young technology, GraphQL has already been widely adopted.

The community includes conferences such as GraphQL Summit and GraphQL Europe, as well as resources such as the GraphQL Radio podcast and the GraphQL Weekly newsletter.

2. GraphQL is a better REST

Over the past decade, REST has become the standard for designing web APIs. It offers some great ideas, such as stateless servers and structured access to resources. However, REST APIs have proven too inflexible to keep up with the rapidly changing requirements of the clients that access them.

GraphQL was developed to meet the need for more flexibility and efficiency. It addresses many of the shortcomings and inefficiencies that developers experience when working with REST APIs.

To illustrate the major differences between REST and GraphQL when fetching data from an API, consider a simple example: in a blogging application, a page needs to display the titles of the posts of a specific user. The same page also displays the names of that user’s last three followers. How would REST and GraphQL handle this scenario?

Data fetching with REST vs. GraphQL

With a REST API, you typically gather the data by calling multiple endpoints. In the scenario above, the /users/<id> endpoint would be called first to fetch the user data. Next, a call to /users/<id>/posts returns all of that user’s posts. Finally, a third endpoint, /users/<id>/followers, returns a list of that user’s followers.

With REST, you have to make three requests to three different endpoints to get the data you need. You are also overfetching, because the endpoints return additional information that you don’t need.

In GraphQL, on the other hand, you simply send a single query that describes your concrete data requirements to the GraphQL server. The server then responds with a JSON object in which these requirements are fulfilled.

With GraphQL, the client can specify exactly what data is needed in the query. Notice that the structure of the server response matches exactly the structure defined in the query.
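The difference in round trips can be made concrete with a small simulation. This is only a sketch under stated assumptions: the in-memory `db`, the `endpoints` table, and both loader functions are hypothetical stand-ins, and `loadWithSingleQuery` is hand-written rather than a real GraphQL server.

```javascript
// Hypothetical in-memory data standing in for a blog backend.
const db = {
  user: { id: "u1", name: "Mary", address: "1 Main St", birthday: "1990-07-26" },
  posts: [{ id: "p1", title: "Learn GraphQL today", content: "..." }],
  followers: ["John", "Alice", "Sarah", "Bob"].map((name, i) => ({ id: `f${i}`, name })),
};

let roundTrips = 0;

// REST style: each endpoint returns a fixed structure.
const endpoints = {
  "/users/u1": () => db.user,
  "/users/u1/posts": () => db.posts,
  "/users/u1/followers": () => db.followers,
};

function get(path) {
  roundTrips += 1;
  return endpoints[path]();
}

function loadWithRest() {
  const user = get("/users/u1");
  const posts = get("/users/u1/posts");
  const followers = get("/users/u1/followers");
  // The client discards every field it did not need.
  return {
    name: user.name,
    posts: posts.map((p) => ({ title: p.title })),
    followers: followers.slice(-3).map((f) => ({ name: f.name })),
  };
}

// Single-endpoint style: one round trip, and the server assembles
// exactly the requested shape.
function loadWithSingleQuery() {
  roundTrips += 1;
  return {
    name: db.user.name,
    posts: db.posts.map((p) => ({ title: p.title })),
    followers: db.followers.slice(-3).map((f) => ({ name: f.name })),
  };
}

roundTrips = 0;
const viaRest = loadWithRest();
const restTrips = roundTrips;

roundTrips = 0;
const viaQuery = loadWithSingleQuery();
const queryTrips = roundTrips;

console.log(restTrips, queryTrips); // 3 1
```

Both loaders end up with the same page data; only the number of requests and the amount of discarded data differ.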

No more over- and underfetching

One of the most common problems with REST is over- and underfetching. This happens because the only way for a client to get data is to call endpoints that return fixed data structures. It is very difficult to design an API that gives every client exactly the data it needs.

“Think in graphs, not endpoints.” (Lessons From 4 Years of GraphQL, by GraphQL co-inventor Lee Byron)

Overfetching: downloading superfluous data

Overfetching means that the client downloads more data than it actually needs. Imagine a page that only needs to display a list of user names. With a REST API, the app would usually call the /users endpoint and receive a JSON array of user data. That response may contain further information about each user, such as birthday or address, which is useless to a client that only wants to show the names.

Underfetching and the n+1 problem

Underfetching means that a particular endpoint does not return enough of the required information, so the client has to make additional requests to get everything it needs. This can escalate to a situation where the client first fetches a list of elements and then has to make one additional request per element to fetch the related data.

For example, suppose the same app also needs to display the last three followers of each user, and the API provides the additional endpoint /users/<id>/followers. To display the required information, the app has to make one request to the /users endpoint and then call /users/<id>/followers once for each user.
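The request count in this scenario grows with the number of users. A minimal sketch, assuming a hypothetical list of three users that would normally come from the /users endpoint:

```javascript
// Hypothetical user list; in a real app this would come from GET /users.
const users = [{ id: "1" }, { id: "2" }, { id: "3" }];

// Record every path a REST client would have to request for this page.
function pathsForFollowerPage(userList) {
  const paths = ["/users"]; // one request for the list itself
  for (const user of userList) {
    paths.push(`/users/${user.id}/followers`); // plus one request per user
  }
  return paths;
}

const paths = pathsForFollowerPage(users);
console.log(paths.length); // 4, that is n + 1 for n = 3 users
```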

Rapid product iteration on the front end

A common pattern with REST APIs is to structure the endpoints according to the views of the application. This is convenient, because the client can get all the data a view needs by calling the corresponding endpoint.

The main drawback of this approach is that it does not allow for rapid iteration on the front end. With every change to the UI, there is a high risk that the view now needs more (or less) data than before. The back end then has to be adjusted to account for the new data requirements. This reduces productivity and notably slows down the ability to incorporate user feedback into the product.

With GraphQL, this problem is solved. Thanks to its flexibility, changes on the client side can be made without any extra work on the server. Because clients specify their exact data requirements, back-end engineers do not need to make adjustments when the design and data needs of the front end change.

Insightful analytics on the back end

GraphQL gives you fine-grained insight into the data that is requested on the back end. Because each client specifies exactly what information it is interested in, it is possible to gain a deep understanding of how the available data is being used. This can help, for example, to evolve the API and deprecate fields that are no longer requested by any client.

With GraphQL, you can also do low-level performance monitoring of the requests processed by your server. GraphQL uses the concept of resolver functions to collect the data requested by a client. Instrumenting and measuring the performance of these resolvers provides crucial insight into bottlenecks in your system.

Benefits of Schema and type systems

GraphQL uses a rich type system to define the functionality of the API. All types exposed in the API are recorded in the Schema using the GraphQL Schema Definition Language (SDL). A Schema is a contract between a client and a server that defines how a client accesses data.

Once the Schema is defined, teams working on both the front and back end can work without extra communication because they all know the definite structure of the data that will be transmitted over the network.

Front-end engineers can easily test their applications by mocking the required data structures. Once the server is ready, the client apps can be switched over to load their data from the actual API.

3. Core concepts

In this chapter, you’ll learn about some of the fundamental language constructs of GraphQL. This includes a first look at the syntax for defining types and for sending queries and mutations. A sandbox environment based on graphql-up is also available for experimenting with what you learn.

The Schema Definition Language (SDL)

GraphQL has its own type system for defining the API’s Schema. The syntax for writing schemas is called SDL.

Here is an example of using SDL to define the simple type Person:

type Person {
  name: String!
  age: Int!
}

This Person type has two fields, name and age, of type String and Int respectively. The ! following each type means that the field is required.

Relationships between types can also be expressed. In the blog application example, a Person can be associated with a Post:

type Post {
  title: String!
  author: Person!
}

Conversely, the other end of the relationship needs to be placed on the Person type:

type Person {
  name: String!
  age: Int!
  posts: [Post!]!
}

Note that we just created a one-to-many relationship between Person and Post, since the posts field on Person is actually an array of posts.

Retrieve data using a query statement

When working with REST APIs, data is loaded from specific endpoints. Each endpoint has a clearly defined structure for the information it returns. This means that the data requirements of a client are effectively encoded in the URL it connects to.

GraphQL takes a radically different approach. Instead of multiple endpoints that return fixed data structures, GraphQL APIs typically expose only a single endpoint. This works because the structure of the returned data is not fixed. It is completely flexible and lets the client decide what data it actually needs.

This means that the client has to send more information to the server to express its data needs. This information is called a query.

Basic queries

Let’s look at an example of a simple query that the client sends to the server:

{
  allPersons {
    name
  }
}

In this query, the allPersons field is called the root field of the query. Everything that follows the root field is called the payload of the query. The only field specified in this query’s payload is name.

This query returns a list of all persons currently stored in the database. Here is an example response:

{
  "allPersons": [
    { "name": "Johnny" },
    { "name": "Sarah" },
    { "name": "Alice" }
  ]
}

Note that each Person only has a name attribute in the response result, and the server does not return an age attribute. This is precisely because name is the only field specified in the query.

If the client also wants to get the age attribute, all it has to do is adjust the query statement and add a new field to the payload of the query:

{
  allPersons {
    name
    age
  }
}

One of the major advantages of GraphQL is that it naturally allows querying nested information. For example, if you want to load all the posts that a Person has written, you can simply follow the structure of your types to request this information:

{
  allPersons {
    name
    age
    posts {
      title
    }
  }
}
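To see why the response always mirrors the query, it can help to model the selection as data. The following is a simplified sketch, not how a real GraphQL server parses queries: the `selection` object, the sample `persons`, and the `project` helper are all assumptions made for illustration.

```javascript
// A parsed selection set modelled as a plain object (an assumption for this sketch):
// `true` selects a scalar field, a nested object selects sub-fields.
const selection = { name: true, posts: { title: true } };

// Hypothetical stored data, with more fields than the query asks for.
const persons = [
  { name: "Johnny", age: 23, posts: [{ title: "GraphQL is awesome", published: true }] },
  { name: "Sarah", age: 20, posts: [] },
];

// Copy only the selected fields, recursing into nested objects and lists.
function project(value, sel) {
  if (Array.isArray(value)) return value.map((item) => project(item, sel));
  const out = {};
  for (const [field, sub] of Object.entries(sel)) {
    out[field] = sub === true ? value[field] : project(value[field], sub);
  }
  return out;
}

const response = { allPersons: project(persons, selection) };
console.log(JSON.stringify(response));
```

Fields that were not selected, such as age and published, never appear in the response.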

Queries with arguments

In GraphQL, every field defined in the schema can take zero or more arguments. For example, the allPersons field could have a last argument to return only a specific number of persons. The corresponding query looks like this:

{
  allPersons(last: 2) {
    name
  }
}

Writing data with mutations

Besides requesting information from a server, most applications also need a way to change the data stored in the back end. With GraphQL, these changes are made using mutations. There are generally three kinds of mutations:

  • Creating new data
  • Updating existing data
  • Deleting existing data

Mutations follow the same syntactic structure as queries, but they always need to start with the mutation keyword. Here’s an example of how we might create a new Person:

mutation {
  createPerson(name: "Bob", age: 36) {
    name
    age
  }
}

Notice that, like the query we wrote earlier, the mutation has a root field, which in this case is createPerson. We have already covered arguments for fields. Here, the createPerson field takes two arguments, name and age.

Like with a query, we can also specify a payload for a mutation, in which we ask for properties of the newly created Person object. In our case, we ask for name and age, which is admittedly not very useful because we already know both values from the arguments. However, being able to query information while sending a mutation is a powerful tool: it lets you retrieve new data from the server in a single round trip.

The server response for the mutation above looks like this:

"createPerson": {
  "name": "Bob",
  "age": 36
}

A pattern you’ll often find is that GraphQL types have unique IDs that are generated by the server when new objects are created. Extending the Person type from before, we could add an id field like this:

type Person {
  id: ID!
  name: String!
  age: Int!
}

Now, when a new Person is created, you can ask for the id directly in the payload of the mutation, since that is information the client did not have beforehand:

mutation {
  createPerson(name: "Alice", age: 36) {
    id
  }
}
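On the server, a mutation like this ends up in a function that writes the data and assigns the ID. The following is a hedged sketch of such a function: the in-memory `persons` array, the counter-based IDs, and the function signature are assumptions, since GraphQL itself does not prescribe how IDs are generated or where data is stored.

```javascript
// Hypothetical in-memory store; a real server would write to a database.
const persons = [];
let nextId = 1;

// Sketch of what a createPerson resolver might do (names are illustrative).
function createPerson(args) {
  const person = { id: String(nextId++), name: args.name, age: args.age };
  persons.push(person);
  return person;
}

// Equivalent of: mutation { createPerson(name: "Alice", age: 36) { id } }
const created = createPerson({ name: "Alice", age: 36 });
const mutationResponse = { createPerson: { id: created.id } };
console.log(JSON.stringify(mutationResponse)); // {"createPerson":{"id":"1"}}
```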

Realtime updates with subscriptions

Another important requirement for many applications today is a realtime connection to the server so that they are informed immediately about important events. For this use case, GraphQL offers the concept of subscriptions.

When a client subscribes to an event, it opens and holds a steady connection to the server. Whenever that event occurs, the server pushes the corresponding data to the client. Unlike queries and mutations, which follow the classic request-response cycle, subscriptions represent a stream of data sent to the client.
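The push model can be sketched with plain callbacks. This is only an illustration under stated assumptions: a real server pushes events over a long-lived network connection, whereas here a callback stands in for that connection, and the `newPerson` event name, `subscribeNewPerson`, and `createPerson` are hypothetical.

```javascript
// Callbacks of all currently subscribed clients (stand-in for open connections).
const subscribers = new Set();

// Register a client; the returned function ends the subscription.
function subscribeNewPerson(onEvent) {
  subscribers.add(onEvent);
  return () => subscribers.delete(onEvent);
}

// Creating a person publishes an event to every subscriber.
function createPerson(name, age) {
  const person = { name, age };
  for (const notify of subscribers) notify({ newPerson: person });
  return person;
}

const received = [];
const unsubscribe = subscribeNewPerson((event) => received.push(event));

createPerson("Jane", 23); // pushed to the subscriber
unsubscribe();
createPerson("Bob", 36);  // no longer delivered

console.log(JSON.stringify(received)); // [{"newPerson":{"name":"Jane","age":23}}]
```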

Subscriptions are written using the same syntax as queries and mutations. Here is an example in which we subscribe to events on the Person type:

subscription {
  newPerson {
    name
    age
  }
}

After a client sends this subscription to the server, a connection is opened between them. From then on, whenever a mutation creates a new Person, the server sends the information about that person to the client:

{
  "newPerson": {
    "name": "Jane",
    "age": 23
  }
}

Define a Schema

Now that you have a basic understanding of queries, mutations, and subscriptions, let’s put it all together and learn how to write a schema that allows you to execute all the examples you’ve seen so far.

The schema is one of the most important concepts when working with a GraphQL API. It specifies the capabilities of the API and defines how clients can request data. It is often seen as a contract between the server and the client.

In general, a schema is simply a collection of GraphQL types. However, when writing the schema for an API, there are some special root types:

type Query { ... }
type Mutation { ... }
type Subscription { ... }

The Query, Mutation, and Subscription types are the entry points for the requests sent by the client. To enable the allPersons query we saw earlier, the Query type would have to be written as follows:

type Query {
  allPersons: [Person!]!
}

allPersons is called a root field of the API. To support the last argument we added to the allPersons example, Query would have to be written as follows:

type Query {
  allPersons(last: Int): [Person!]!
}

Similarly, for the createPerson mutation, we have to add a root field to the Mutation type:

type Mutation {
  createPerson(name: String!, age: Int!): Person!
}

Notice that this root field also takes two arguments, the name and the age of the new Person.

Finally, for subscriptions, we must add the newPerson root field:

type Subscription {
  newPerson: Person!
}

Putting it all together, this is the full schema for all the queries and mutations you have seen in this chapter:

type Query {
  allPersons(last: Int): [Person!]!
}

type Mutation {
  createPerson(name: String!, age: Int!): Person!
}

type Subscription {
  newPerson: Person!
}

type Person {
  name: String!
  age: Int!
  posts: [Post!]!
}

type Post {
  title: String!
  author: Person!
}

4. Big picture (architecture)

GraphQL has been released only as a specification. This means that GraphQL is, in fact, nothing more than a long document that describes in detail the behavior of a GraphQL server.

If you want to use GraphQL, you have to build the GraphQL server yourself. You can do so in any programming language of your choice (for example, based on one of the available reference implementations) or by using a service such as Graphcool, which provides a powerful GraphQL API out of the box.

Use cases

In this section, we’ll look at three different types of architectures that use GraphQL services:

1. GraphQL server with a connected database

2. GraphQL server as a thin layer in front of third-party or legacy systems, integrating them through a single GraphQL API

3. A hybrid approach with a connected database and third-party or legacy systems, all accessed through the same GraphQL API

These three architectures represent the main use cases of GraphQL and demonstrate flexibility in their use.

1. GraphQL server with a connected database

This architecture is the most common for greenfield projects. In this setup, a single (web) server implements the GraphQL specification. When a query arrives at the GraphQL server, the server reads the query’s payload and fetches the required information from the database. This is called resolving the query. It then constructs the response object as described in the official specification and returns it to the client.

Note that GraphQL is transport-layer agnostic. This means it can potentially be used with any available network protocol, so a GraphQL server could be based on TCP, WebSockets, or other protocols.

GraphQL also doesn’t care about the database or the format used to store the data. You could use an SQL database such as AWS Aurora or a NoSQL database such as MongoDB.

A standard greenfield architecture with one GraphQL server that connects to a single database.

2. GraphQL layer that integrates existing systems

Another major use case for GraphQL is integrating multiple existing systems behind a single, coherent GraphQL API. This is particularly compelling for companies with legacy infrastructure and many different APIs that have grown over the years and now impose a high maintenance burden. A major problem with these legacy systems is that they make it practically impossible to build innovative products that need access to several of them.

In this context, GraphQL can be used to unify these systems and hide their complexity behind a clean GraphQL API. New client applications can then be developed that simply talk to the GraphQL server to fetch the data they need. The GraphQL server is responsible for fetching the data from the existing systems and packaging it in the GraphQL response format.

Just as the GraphQL server in the previous architecture didn’t care about the type of database being used, this time it doesn’t care about the data sources it needs in order to resolve a query.

GraphQL allows you to hide the complexity of existing systems, such as microservices, legacy infrastructure, or third-party APIs, behind a single GraphQL interface.

3. Hybrid approach with a connected database and integration of existing systems

Finally, the two approaches can be combined to build a GraphQL server that has a connected database but still talks to legacy or third-party systems.

When the server receives a query, it resolves it and retrieves the required data either from the connected database or from some of the integrated APIs.

Both approaches can also be combined so that the GraphQL server fetches data from a single database as well as from existing systems, which allows for complete flexibility and pushes all data management complexity to the server.

Resolver functions

But how do we gain this flexibility with GraphQL? How does it fit these very different kinds of use cases?

As you learned in the previous chapter, the payload of a GraphQL query (or mutation) consists of a set of fields. In a GraphQL server implementation, each of these fields corresponds to exactly one function, called a resolver. The sole purpose of a resolver is to fetch the data for its field.

When the server receives a query, it calls the resolver functions for the fields specified in the query’s payload to retrieve the data for each field. Once all resolvers have returned, the server packages the data in the format described by the query and sends it back to the client.

Each field in the query corresponds to a resolver function. When the specified data needs to be queried, GraphQL calls all the required resolver methods.
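The relationship between fields and resolvers can be sketched in a few lines. This is a deliberately simplified model, not the algorithm of any particular GraphQL library: the `resolvers` map, the `returnTypes` table, the selection format, and the in-memory `data` are all assumptions made for the example.

```javascript
// Hypothetical in-memory data.
const data = {
  persons: [
    { name: "Johnny", age: 23 },
    { name: "Sarah", age: 20 },
    { name: "Alice", age: 20 },
  ],
};

// One resolver per field: (parent, args) => value.
const resolvers = {
  Query: {
    allPersons: (_parent, args) =>
      args.last === undefined ? data.persons : data.persons.slice(-args.last),
  },
  Person: {
    name: (person) => person.name,
    age: (person) => person.age,
  },
};

// Which type each object-valued field returns (a tiny stand-in for a schema).
const returnTypes = { "Query.allPersons": "Person" };

// Selection format: { fieldName: { args, fields } }, with `fields` omitted for scalars.
function execute(typeName, parent, selection) {
  const result = {};
  for (const [field, { args = {}, fields }] of Object.entries(selection)) {
    const value = resolvers[typeName][field](parent, args);
    if (!fields) {
      result[field] = value; // scalar: use the resolved value directly
      continue;
    }
    const childType = returnTypes[`${typeName}.${field}`];
    result[field] = Array.isArray(value)
      ? value.map((item) => execute(childType, item, fields))
      : execute(childType, value, fields);
  }
  return result;
}

// Equivalent of: { allPersons(last: 2) { name } }
const response = execute("Query", null, {
  allPersons: { args: { last: 2 }, fields: { name: {} } },
});
console.log(JSON.stringify(response));
// {"allPersons":[{"name":"Sarah"},{"name":"Alice"}]}
```

Because every field is resolved by its own function, the data behind each resolver could just as well come from a database, a legacy system, or a third-party API.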

GraphQL client libraries

GraphQL is particularly attractive for front-end developers because it eliminates many of the inconveniences and shortcomings of REST APIs, such as over- and underfetching. Complexity is pushed to the server side, where powerful machines can take care of the heavy computation. The client doesn’t have to know where the data it fetches actually comes from and can use a single, coherent, and flexible API.

Let’s consider the major change GraphQL introduces: going from a rather imperative data fetching approach to a purely declarative one. When fetching data from a REST API, most applications have to go through the following steps:

1. Construct and send an HTTP request (for example, with fetch in JavaScript)
2. Receive and parse the server response
3. Store the data locally (either simply in memory or persistently)
4. Display the data in the UI

With an ideal declarative data fetching approach, a client shouldn’t have to do more than the following two steps:

1. Describe the data requirements
2. Display the data in the UI

All the lower-level networking tasks, as well as storing the data, should be abstracted away, and the declaration of data dependencies should be the dominant part.

This is precisely what GraphQL client libraries such as Relay or Apollo enable you to do. They provide the abstraction you need so that you can focus on the important parts of your application rather than on repetitive infrastructure code.

Advanced – 1. Clients

Working with a GraphQL API on the front end is a great opportunity to develop new abstractions and implement common functionality on the client side. Let’s consider some “infrastructure” features you probably want in your app:

  • Sending queries and mutations directly without constructing HTTP requests
  • View-layer integration
  • Caching
  • Validating and optimizing queries based on the schema

Of course, nothing stops you from using plain HTTP to fetch your data and then processing it yourself until the right information ends up in your UI. But GraphQL makes it possible to abstract away a lot of that manual work and lets you focus on the core of your application. We discuss these tasks in more detail below.

There are currently two main GraphQL client libraries. The first is Apollo Client, a community-driven effort to build a powerful and flexible GraphQL client for all major development platforms. The second is Relay, Facebook’s own GraphQL client, which is heavily optimized for performance and is only available on the web.

Sending queries and mutations directly

A major benefit of GraphQL is that it lets you fetch and update data declaratively. In other words, we climb one step higher on the API abstraction ladder and no longer have to deal with low-level networking tasks ourselves.

Previously, you used plain HTTP (such as fetch in JavaScript or NSURLSession on iOS) to load data from an API. With GraphQL, all you do is write a query that declares your data requirements, and the system takes care of sending the request and handling the response. This is precisely what a GraphQL client does.
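For comparison, this is roughly the work a client library takes off your hands. The sketch below only builds the request and does not send it. It assumes the widely used convention of POSTing a JSON body with `query` and `variables` keys; the helper name `buildGraphQLRequest` is hypothetical, and a particular server may expect something different.

```javascript
// Build the options object that could be handed to an HTTP client such as fetch.
function buildGraphQLRequest(query, variables = {}) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, variables }),
  };
}

const query = `{
  allPersons(last: 2) {
    name
  }
}`;

const request = buildGraphQLRequest(query);
console.log(request.method, JSON.parse(request.body).query === query); // POST true

// Sending it would then look something like this (endpoint URL is hypothetical):
// fetch("https://example.com/graphql", request).then((res) => res.json());
```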

View layer integration and UI updates

After the response on the server side is received by the GraphQL client, the data needs to be finally presented in the UI. Depending on the platform and framework chosen, the UI will be updated in different ways.

Taking React as an example, GraphQL clients use the concept of higher-order components to fetch the needed data under the hood and make it available in the props of your components. In general, the declarative nature of GraphQL ties in particularly well with functional reactive programming (FRP). The two can form a powerful combination in which a view simply declares its data dependencies and the UI is wired up with an FRP layer of your choice.

Caching query results: Concepts and strategies

In most applications, you’ll want to maintain a cache of the data that was previously fetched from the server. Caching information locally is essential for a smooth user experience and for reducing the load on your users’ data plans.

Generally, when caching data, the intuition is to put remotely fetched information into a local store from which it can be retrieved later. With GraphQL, the naive approach would be to simply store the result of each GraphQL query and return it whenever the exact same query is sent again. It turns out that this approach is very inefficient for most applications.

A more beneficial approach is to normalize the data beforehand. This means that the (potentially nested) query result is flattened and the store contains only individual records that can be referenced by a globally unique ID. The Apollo blog has a great write-up on this topic.
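Normalization can be sketched as follows. This is a minimal illustration, not the cache format of Apollo or Relay: it assumes every cacheable object carries a globally unique `id` field, and the `{ ref: ... }` placeholder and the sample result are made up for the example.

```javascript
// Flatten a nested result into `store`, replacing each object that has an id
// with a reference to its record.
function normalize(value, store) {
  if (Array.isArray(value)) return value.map((item) => normalize(item, store));
  if (value === null || typeof value !== "object") return value;

  const record = {};
  for (const [key, child] of Object.entries(value)) {
    record[key] = normalize(child, store);
  }
  if (value.id === undefined) return record; // no ID: keep the object inline

  store[value.id] = { ...store[value.id], ...record }; // merge with what is already cached
  return { ref: value.id };
}

// Hypothetical query result.
const result = {
  allPersons: [
    { id: "p1", name: "Alice", posts: [{ id: "post1", title: "Hello" }] },
    { id: "p2", name: "Bob", posts: [] },
  ],
};

const store = {};
const root = normalize(result, store);

console.log(JSON.stringify(root));     // {"allPersons":[{"ref":"p1"},{"ref":"p2"}]}
console.log(Object.keys(store).length); // 3
```

Because each record is stored once under its ID, a later query that returns the same person or post updates a single entry instead of leaving stale copies in several cached results.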

Validation and optimization at build time

Because the schema contains all the information about the GraphQL API that the client can use, it is convenient for the client to validate the query at build time.

When the build environment has access to the schema, it can basically parse all the GraphQL code that resides in the project and compare it to the information in the Schema. This can catch typos and other errors, avoiding some serious consequences.
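A build-time check of this kind can be sketched as a walk over the query’s fields. The example below is a simplified illustration rather than a real validator: the `schema` object is a hand-written stand-in for the schema of this tutorial (list and non-null markers omitted), and the selection format is an assumption.

```javascript
// Simplified schema: for each type, the type returned by each field.
const schema = {
  Query: { allPersons: "Person" },
  Person: { name: "String", age: "Int", posts: "Post" },
  Post: { title: "String", author: "Person" },
};

// Selection format: `true` for a scalar field, a nested object for sub-fields.
function validate(typeName, selection) {
  const errors = [];
  for (const [field, sub] of Object.entries(selection)) {
    const fieldType = schema[typeName][field];
    if (fieldType === undefined) {
      errors.push(`Unknown field "${field}" on type "${typeName}"`);
    } else if (sub !== true) {
      if (schema[fieldType] === undefined) {
        errors.push(`Field "${field}" of type "${fieldType}" has no sub-fields`);
      } else {
        errors.push(...validate(fieldType, sub));
      }
    }
  }
  return errors;
}

// A query with two typos: `agee` and `titel`.
const errors = validate("Query", {
  allPersons: { name: true, agee: true, posts: { titel: true } },
});
console.log(errors);
// [ 'Unknown field "agee" on type "Person"', 'Unknown field "titel" on type "Post"' ]
```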

Colocating views and data dependencies

A powerful concept of GraphQL is that it lets you keep UI code and data requirements side by side. The tight coupling of views and their data dependencies greatly improves the developer experience: you no longer have to think about how the right data ends up in the right parts of the UI.

How well colocation works depends on the platform you’re developing for. In a JavaScript application, for example, data dependencies and UI code can actually live in the same file. In Xcode, the Assistant Editor can be used to work on a view controller and its GraphQL code at the same time.

Advanced – 2. The server

GraphQL is often considered a front-end API technology because it enables clients to fetch data in a better way. But since it’s an API, of course it’s implemented on the server side. Because GraphQL enables server developers to focus on describing data rather than implementing and optimizing specific interfaces, there are many benefits on the server as well.

GraphQL execution

GraphQL doesn’t just specify a way to describe schemas and a query language for retrieving data from them. It also specifies an actual execution algorithm for how queries are transformed into results. At its core the algorithm is simple: the query is traversed field by field, and a resolver is executed for each field. Let’s say we have the following schema:

type Query {
  author(id: ID!): Author
}

type Author {
  posts: [Post]
}

type Post {
  title: String
  content: String
}

Here is the query we sent to the server using this Schema:

query {
  author(id: "abc") {
    posts {
      title
      content
    }
  }
}

The first thing to notice is that every field in the query can be associated with a type:

query: Query {
  author(id: "abc"): Author {
    posts: [Post] {
      title: String
      content: String
    }
  }
}

Now we can easily find and run the resolver in our server for every field. Execution starts at the Query type and proceeds breadth-first. This means we run the resolver for Query.author first. Then we take the result of that resolver and pass it to its child, the resolver for Author.posts. At the next level, the result is a list, so the execution algorithm runs once for each item. The execution therefore works like this:

Query.author(root, { id: 'abc' }, context) -> author
Author.posts(author, null, context) -> posts
for each post in posts
  Post.title(post, null, context) -> title
  Post.content(post, null, context) -> content

Finally, the execution algorithm puts all the results together in the correct shape and returns them. Note that most GraphQL server implementations provide “default resolvers”, so you don’t have to specify a resolver function for every single field. In GraphQL.js, for example, you don’t need to specify a resolver when the parent object of the resolver contains a field with the correct name. Learn more about GraphQL execution in “GraphQL Explained” on the Apollo blog.
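The idea of a default resolver can be sketched in a few lines. This illustrates the concept only and is not the actual GraphQL.js implementation; the `resolvers` map, `resolverFor`, and the sample post are hypothetical.

```javascript
// Only fields that need special handling get an explicit resolver.
const resolvers = {
  Post: {
    // Custom resolver: the stored object keeps the title under another name.
    title: (post) => post.rawTitle.trim(),
  },
};

// Fall back to reading the property with the same name from the parent object.
function resolverFor(typeName, fieldName) {
  const custom = resolvers[typeName] && resolvers[typeName][fieldName];
  return custom || ((parent) => parent[fieldName]);
}

const post = { rawTitle: "  Hello GraphQL ", content: "First post" };

console.log(resolverFor("Post", "title")(post));   // Hello GraphQL
console.log(resolverFor("Post", "content")(post)); // First post
```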

Batched resolving

One thing you might notice about the execution strategy above is that it’s somewhat naive. For example, if a resolver fetches from a back-end API or a database, that back end might be called many times during the execution of a single query. Let’s imagine we want to get the authors of several posts, like so:

query {
  posts {
    title
    author {
      name
      avatar
    }
  }
}

If these are posts on a blog, it’s likely that many of them have the same author. So if we need to make an API call to get each author object, we might accidentally make multiple requests for the same one. For example:

fetch('/authors/1')
fetch('/authors/2')
fetch('/authors/1')
fetch('/authors/2')
fetch('/authors/1')
fetch('/authors/2')

How do we solve this? By making our fetching a bit smarter. We can wrap the fetching function in a utility that waits for all of the resolvers to run and then makes sure to fetch each item only once:

authorLoader = new AuthorLoader()

// Queue up a bunch of fetches
authorLoader.load(1);
authorLoader.load(2);
authorLoader.load(1);
authorLoader.load(2);

// Then, the loader only does the minimal amount of work
fetch('/authors/1');
fetch('/authors/2');

Can we do even better? Yes. If our API supports batched requests, we can perform just a single fetch to the back end, like so:

fetch('/authors?ids=1,2')

This can also be encapsulated in the loader above. In JavaScript, these strategies can be implemented with a utility called DataLoader, and similar utilities exist for other languages.
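A hand-rolled version of such a loader might look like the sketch below. Note that this is not the DataLoader API: DataLoader is promise-based and schedules its batches automatically, whereas this simplified `BatchLoader` class, its explicit `flush` method, and the fake batch function are assumptions chosen to keep the example synchronous.

```javascript
// Collects requested IDs, removes duplicates, and fetches them in one batch.
class BatchLoader {
  constructor(batchFetch) {
    this.batchFetch = batchFetch; // (ids) => results, in the same order as ids
    this.pending = new Set();
    this.cache = new Map();
  }

  // Queue an ID unless it has already been loaded.
  load(id) {
    if (!this.cache.has(id)) this.pending.add(id);
  }

  // Perform at most one batch call for everything queued so far.
  flush() {
    if (this.pending.size > 0) {
      const ids = [...this.pending];
      this.pending.clear();
      const results = this.batchFetch(ids);
      ids.forEach((id, index) => this.cache.set(id, results[index]));
    }
    return this.cache;
  }
}

// Fake back end that records which URLs would have been requested.
const calls = [];
const authorLoader = new BatchLoader((ids) => {
  calls.push(`/authors?ids=${ids.join(",")}`);
  return ids.map((id) => ({ id, name: `Author ${id}` }));
});

// Queue up a bunch of loads, as the resolvers for six posts would.
[1, 2, 1, 2, 1, 2].forEach((id) => authorLoader.load(id));
const authors = authorLoader.flush();

console.log(calls);               // [ '/authors?ids=1,2' ]
console.log(authors.get(1).name); // Author 1
```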

Advanced-3. Tools and ecosystem

As you’re probably aware, the GraphQL ecosystem is growing at a phenomenal rate. One reason for this is that GraphQL makes it easy to develop great tools. In this section, we’ll see why this is the case and some of the amazing tools that already exist in the ecosystem.

If you’re familiar with GraphQL basics, you probably know how GraphQL’s type system helps us quickly define the outermost layer of the API. It allows developers to clearly define the functionality of the API and validate incoming queries against the Schema.

The magic of GraphQL is that this information isn’t known only to the server. GraphQL allows clients to ask a server for information about its schema. GraphQL calls this introspection.

Introspection

The designer of a schema already knows what the schema looks like, but how can clients discover what is accessible through a GraphQL API? We can ask GraphQL for this information by querying the __schema meta-field, which, according to the specification, is always available on the root type of a query.

query {
  __schema {
    types {
      name
    }
  }
}

Take this Schema as an example:

type Query {
  author(id: ID!): Author
}

type Author {
  posts: [Post!]!
}

type Post {
  title: String!
}

If we send the introspection query above, we get the following result:

{
  "data": {
    "__schema": {
      "types": [
        {
          "name": "Query"
        },
        {
          "name": "Author"
        },
        {
          "name": "Post"
        },
        {
          "name": "ID"
        },
        {
          "name": "String"
        },
        {
          "name": "__Schema"
        },
        {
          "name": "__Type"
        },
        {
          "name": "__TypeKind"
        },
        {
          "name": "__Field"
        },
        {
          "name": "__InputValue"
        },
        {
          "name": "__EnumValue"
        },
        {
          "name": "__Directive"
        },
        {
          "name": "__DirectiveLocation"
        }
      ]
    }
  }
}

As you can see, we queried for all the types in the schema. We get both the object types we defined and the built-in scalar types. We can even introspect the introspection types themselves!
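A client could sort such a result into the schema's own types and the introspection types. The sketch below assumes a shortened copy of the response shown above and relies on the rule that names beginning with two underscores are reserved for the introspection system; the variable names are invented for this example.

```javascript
// Shortened copy of the introspection response shown above.
const response = {
  data: {
    __schema: {
      types: [
        { name: "Query" }, { name: "Author" }, { name: "Post" },
        { name: "ID" }, { name: "String" },
        { name: "__Schema" }, { name: "__Type" },
      ],
    },
  },
};

// Names starting with two underscores belong to the introspection system.
const isIntrospectionType = (type) => type.name.startsWith("__");

const allTypes = response.data.__schema.types;
const introspectionTypes = allTypes.filter(isIntrospectionType).map((type) => type.name);
const otherTypes = allTypes.filter((type) => !isIntrospectionType(type)).map((type) => type.name);

console.log(otherTypes);         // object types and built-in scalars
console.log(introspectionTypes); // types describing the schema itself
```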

Introspection offers much more than just the names of types. Look at the following example:

{
  __type(name: "Author") {
    name
    description
  }
}

In this example, we use the __type meta-field to query for a type, and we get its name and description. Result of this query:

{
  "data": {
    "__type": {
      "name": "Author",
      "description": "The author of a post."
    }
  }
}

As you can see, introspection is a very powerful feature of GraphQL, and we’ve only scratched the surface. The specification describes in detail which fields and types are available in the introspection schema.

Many of the tools in the GraphQL ecosystem rely on the introspection system to provide their features: documentation browsers, autocompletion, code generation, and much more. One of the most useful tools for building and using GraphQL APIs, and one that uses introspection heavily, is GraphiQL.

GraphiQL

GraphiQL is an in-browser IDE for writing, validating, and testing GraphQL queries. It has an editor for GraphQL queries, complete with autocompletion and validation, as well as a documentation explorer for quickly browsing the structure of the schema (powered by introspection).

This is a very powerful development tool. It lets you debug and try out queries against a GraphQL server without having to send them by hand, for example with curl.

Give it a try! Graphql.org/swapi-grap….

Advanced-4. Security

GraphQL gives clients enormous power. But with great power come great risks.

Because the client can use very complex queries, our server must be able to handle them properly. These queries may be abusive queries from malicious clients, or they may just be very large queries used by legitimate clients. In both cases, the client might crash your GraphQL server.

In this chapter, we present some strategies to mitigate these risks. We’ll explain them in order from simplest to most complex, and look at the pros and cons of each approach.

Timeout strategy

The first and simplest strategy is to use a timeout to defend against large queries. The server does not need to know anything about the incoming queries. All it needs to know is the maximum time allowed for a query.

For example, a server configured with a five-second timeout will stop the execution of any query that takes longer than five seconds.

Advantages of timeouts

  • Simple to implement
  • Most strategies still use a timeout as a final safeguard

Disadvantages of timeouts

  • Damage may already be done by the time the timeout kicks in
  • Sometimes difficult to implement. Cutting the connection after a certain period of time can lead to strange behavior.
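As a rough sketch of the timeout strategy, the wrapper below races a query's execution against a timer. The names withTimeout and fakeExecute are invented for this example, and the short durations only keep the demo fast. Note that losing the race does not cancel the underlying work, which is one way the damage mentioned above can still happen.

```javascript
// Hypothetical wrapper: rejects if `work` has not settled within `ms` milliseconds.
function withTimeout(work, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Query exceeded ${ms}ms`)), ms);
  });
  // Whichever promise settles first wins; the timer is cleared in both cases.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Stand-in for executing a query that takes `durationMs` to finish.
const fakeExecute = (durationMs) =>
  new Promise((resolve) => setTimeout(() => resolve({ data: {} }), durationMs));

withTimeout(fakeExecute(10), 50).then(() => console.log("fast query finished"));
withTimeout(fakeExecute(100), 50).catch((error) => console.log("slow query stopped:", error.message));
```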

Maximum query depth

As we mentioned earlier, clients using GraphQL can write queries as complex as they want. Since GraphQL schemas are often cyclic graphs, clients can write queries like this:

query IAmEvil {
  author(id: "abc") {
    posts {
      author {
        posts {
          author {
            posts {
              author {
                # that could go on as deep as the client wants!
              }
            }
          }
        }
      }
    }
  }
}

What if we could prevent clients from abusing query depth like this? Knowing your schema gives you an idea of how deep a legitimate query can go. Enforcing such a limit is possible and is usually called maximum query depth.

By analyzing the AST of the query document, the GraphQL server is able to reject or accept requests based on their depth.
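A minimal sketch of such a depth check follows. It works on a hand-built, simplified tree rather than a real parsed AST, and the names fieldDepth and checkDepth are invented for this example; the error text imitates the message format shown later in this section.

```javascript
// Hypothetical, simplified AST: each field is { name, selections: [...] }.
const query = {
  name: "author",
  selections: [
    { name: "posts", selections: [{ name: "title", selections: [] }] },
  ],
};

// Depth of a field = 1 + the depth of its deepest sub-selection.
function fieldDepth(field) {
  const childDepths = field.selections.map(fieldDepth);
  return 1 + Math.max(0, ...childDepths);
}

// Returns an error object when the query is too deep, or null when it is acceptable.
function checkDepth(rootField, maxDepth) {
  const depth = fieldDepth(rootField);
  if (depth > maxDepth) {
    return { message: `Query has depth of ${depth}, which exceeds max depth of ${maxDepth}` };
  }
  return null;
}

console.log(checkDepth(query, 3)); // within the limit
console.log(checkDepth(query, 2)); // rejected before any resolver runs
```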

For example, take a server configured with a maximum query depth of 3 and a query document like the one above. Everything nested below the third level (marked with a red box in the original figure) is considered too deep, and the query is invalid.

Using graphql-ruby with the maximum query depth setting, we get the following result:

{
  "errors": [
    {
      "message": "Query has depth of 6, which exceeds max depth of 3"
    }
  ]
}

Advantages of maximum query depth

  • Because the AST of the document is analyzed statically, the query is not even executed, so it puts no load on the GraphQL server.

Disadvantages of maximum query depth

  • Depth alone is often not enough to cover all abusive queries. For example, a query requesting an enormous number of nodes at the root would be very expensive, but is unlikely to be blocked by a query depth analyzer.

Query complexity

Sometimes the depth of a query is not enough to tell how large or expensive it will be. In many cases, some fields in our schema are known to be more expensive to compute than others.

Query complexity lets you define how complex these fields are and limit the maximum complexity of a query. The idea is to express the complexity of each field as a simple number. A common default is to give each field a complexity of 1. Take this query as an example:

query {
  author(id: "abc") { # complexity: 1
    posts {           # complexity: 1
      title           # complexity: 1
    }
  }
}

A simple addition tells us that the query complexity is 3. If we set the maximum complexity to 2 on our schema, this query will fail.

What if the posts field is actually much more expensive than the author field? We can assign a different complexity to that field. We can even make the complexity depend on the arguments! Let’s look at a similar query, where the complexity of posts depends on the arguments passed in:

query {
  author(id: "abc") {    # complexity: 1
    posts(first: 5) {    # complexity: 5
      title              # complexity: 1
    }
  }
}

Advantages of query complexity

  • More use cases can be covered.
  • Analyze the complexity statically and reject the query before execution.

Disadvantages of query complexity

  • Hard to implement perfectly
  • If complexity is estimated by developers, how do we keep the estimates up to date? How do we know the cost of a query in the first place?
  • Mutations are hard to estimate. What if they have a side effect that is hard to measure, such as queuing a background job?
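The complexity calculation described in this section can be sketched as follows. The tree format, the costRules map, and the function names are assumptions made for this example. Costs are simply added, as in the text; some implementations instead multiply the cost of child fields by the size of the list.

```javascript
// Hypothetical per-field cost rules; fields not listed cost 1 (the default from the text).
const costRules = new Map([
  ["posts", (args) => (args.first !== undefined ? args.first : 1)],
]);

// Simplified AST: each field is { name, args, selections }.
function complexity(field) {
  const rule = costRules.get(field.name);
  const ownCost = rule ? rule(field.args) : 1;
  return field.selections.reduce((sum, child) => sum + complexity(child), ownCost);
}

// A query is allowed only if its total complexity stays within the limit.
const isAllowed = (field, maxComplexity) => complexity(field) <= maxComplexity;

// Builds the author/posts/title query with the given arguments for posts.
const buildQuery = (postsArgs) => ({
  name: "author",
  args: { id: "abc" },
  selections: [
    { name: "posts", args: postsArgs, selections: [{ name: "title", args: {}, selections: [] }] },
  ],
});

console.log(complexity(buildQuery({})));           // 1 + 1 + 1
console.log(complexity(buildQuery({ first: 5 }))); // 1 + 5 + 1
console.log(isAllowed(buildQuery({}), 2));         // the limit of 2 from the text
```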

Throttling

The solutions we’ve seen so far are good at stopping individual abusive queries from taking down your server. The problem with using them alone is that they stop large queries, but they don’t stop clients that send a lot of medium-sized queries!

In most APIs, a simple throttle is used to stop clients from requesting resources too often. GraphQL is a bit special because throttling on the number of requests doesn’t really help us. Even a few requests can be too many if the queries they contain are very large.

In fact, we don’t know in advance how many requests are acceptable, because the queries are defined by the client. So what can we use to throttle clients?

Throttling based on server time

A good estimate of how expensive a query is is the server time it takes to execute it. We can use this heuristic to throttle queries. With good knowledge of your system, you can come up with a maximum amount of server time that a client may use within a given time frame.

We also decide how much server time is added back to a client over time. This is the classic leaky bucket algorithm. Note that there are other throttling algorithms, but they are beyond the scope of this chapter. We will use leaky bucket throttling in the following examples.

Let’s imagine that the maximum server time allowed (the bucket size) is 1000ms and that clients regain 100ms of server time per second (the leak rate). Consider this mutation:

mutation {
  createPost(input: { title: "GraphQL Security" }) {
    post {
      title
    }
  }
}

This mutation takes an average of 200ms to complete. In practice, the time may vary, but let’s assume that for this example, it always takes 200ms to complete.

This means that a client calling this operation more than 5 times within 1 second will be blocked until more server time is added back to the client.

After two seconds (100ms is added per second), our client can call createPost once more.
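The arithmetic above can be checked with a small sketch of a server-time budget. The class and method names are invented for this example, timestamps are passed in explicitly so the behaviour is easy to follow, and the budget-style bookkeeping is only one possible way to realise the leaky bucket idea.

```javascript
// Hypothetical server-time budget per client, using the numbers from the text.
class ServerTimeBucket {
  constructor(bucketSizeMs, refillMsPerSecond, nowMs = 0) {
    this.bucketSizeMs = bucketSizeMs;
    this.refillMsPerSecond = refillMsPerSecond;
    this.availableMs = bucketSizeMs; // clients start with a full budget
    this.lastRefillMs = nowMs;
  }

  // Add back the server time regained since the last call, capped at the bucket size.
  refill(nowMs) {
    const regained = ((nowMs - this.lastRefillMs) / 1000) * this.refillMsPerSecond;
    this.availableMs = Math.min(this.bucketSizeMs, this.availableMs + regained);
    this.lastRefillMs = nowMs;
  }

  // Returns true and charges the cost if the client has enough budget left.
  tryConsume(costMs, nowMs) {
    this.refill(nowMs);
    if (costMs > this.availableMs) return false;
    this.availableMs -= costMs;
    return true;
  }
}

const bucket = new ServerTimeBucket(1000, 100);

// Six createPost calls (200ms each) arriving at the same moment.
const burst = Array.from({ length: 6 }, () => bucket.tryConsume(200, 0));
console.log(burst);                        // five calls pass, the sixth is blocked
console.log(bucket.tryConsume(200, 1000)); // only 100ms regained after one second
console.log(bucket.tryConsume(200, 2000)); // 200ms regained after two seconds
```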

As you can see, throttling based on server time is a good way to limit GraphQL queries: complex queries consume more time, so they can be called less often, while smaller queries can be called more often because they are quick to compute.

If your GraphQL API is public, however, these limits also need to be communicated to clients. Server time is not easy to express to clients, and clients cannot accurately estimate how long their queries will take without trying them first.

Remember the maximum complexity we talked about earlier? What if we throttled based on that instead?

Throttling based on query complexity

Throttling based on query complexity is a good way to work with clients and help them respect the limits of your schema.

We use the same complexity example we used in the “Query complexity” section:

query {
  author(id: "abc") {    # complexity: 1
    posts {              # complexity: 1
      title              # complexity: 1
    }
  }
}

We know that this query has a cost of 3 based on its complexity. Just as with a time throttle, we can define a maximum cost (the bucket size) that a client may use over a period of time.

With a maximum cost of 9, our client can run this query only three times before being blocked until the leak rate frees up more capacity.

The principles are the same as for the time-based throttle, but these limits are much easier to communicate to clients. Clients can even calculate the cost of their queries themselves, without having to estimate server time!

The GitHub public API actually uses this approach to throttle its clients. See how they express these limits to users: https://developer.github.com/v4/guides/resource-limitations/.

Conclusion

GraphQL is great for clients because it gives them more power. However, the power comes with the risk that clients will abuse the GraphQL server with very expensive queries.

There are many ways to protect your GraphQL server from these queries, but none of them is foolproof. It is important to know which options are available and to understand their strengths and weaknesses, so that we can make the best decisions!

Advanced-5. Frequently asked questions

Is GraphQL a database technology?

No. GraphQL is often mistaken for a database technology. This is a misconception: GraphQL is a query language for APIs, not for databases. In that sense it is database independent and can be used with any kind of database, or even with no database at all.

Is GraphQL only for React/JavaScript developers?

No. GraphQL is an API technology, so it can be used in any context where an API is required.

On the back end, a GraphQL server can be implemented in any programming language that can be used to build a web server. Besides JavaScript, there are popular reference implementations for Ruby, Python, Scala, Java, Clojure, Go, and .NET.

Since the GraphQL API typically operates over HTTP, any client that can make an HTTP request can query data from the GraphQL server.

Note: GraphQL is actually transport layer agnostic, so you can choose a protocol other than HTTP to implement your server.
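For the common HTTP case, a request is usually a POST with a JSON body containing the query and its variables. The sketch below only builds the request options; the helper name buildGraphQLRequest is invented for this example and https://example.com/graphql is a placeholder, not a real endpoint.

```javascript
// Builds the options for a POST request carrying a GraphQL query as JSON.
function buildGraphQLRequest(query, variables = {}) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, variables }),
  };
}

const request = buildGraphQLRequest(
  "query ($id: ID!) { author(id: $id) { posts { title } } }",
  { id: "abc" }
);

// With a real server, the options could be passed to fetch, for example:
// fetch("https://example.com/graphql", request).then((response) => response.json());
console.log(request.body);
```

Any language with an HTTP client can send the same body, which is why the client side is not tied to JavaScript.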

How to do server-side caching?

A common concern with GraphQL, especially when comparing it to REST, is the difficulty of maintaining a server-side cache. With REST, it is easy to cache the data for each endpoint, because the structure of the data is guaranteed not to change.

With GraphQL, on the other hand, it is not clear what a client will request next, so putting a caching layer right behind the API doesn’t make much sense.

Server-side caching remains a challenge for GraphQL. More information about caching can be found at the GraphQL website.

How do I authenticate and authorize?

Authentication and authorization are often confused. Authentication describes the process of claiming an identity. That is what you do when you log in to a service with a username and password. Authorization, on the other hand, describes the permission rules that specify which parts of a system individual users and groups of users may access.

Authentication in GraphQL can be implemented with common patterns such as OAuth.

To implement authorization, it is recommended to delegate any data access logic to the business logic layer rather than handling it directly in the GraphQL implementation. For inspiration on how to implement authorization, check out Graphcool’s permission queries.
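A minimal sketch of that delegation is shown below. Everything in it is hypothetical: the postService object, the post query field, the visibility rule, and the viewer stored on the context are invented to illustrate the separation, not taken from a particular library.

```javascript
// Hypothetical business logic layer: the only place that knows the access rules.
const postService = {
  posts: [{ id: "1", authorId: "alice", title: "Draft", published: false }],
  getPost(viewer, id) {
    const post = this.posts.find((candidate) => candidate.id === id);
    if (!post) return null;
    // Rule assumed for illustration: unpublished posts are visible to their author only.
    const allowed = post.published || (viewer !== null && viewer.id === post.authorId);
    return allowed ? post : null;
  },
};

// The resolver stays thin: it forwards the authenticated viewer from the context.
const resolvers = {
  Query: {
    post: (parent, args, context) => postService.getPost(context.viewer, args.id),
  },
};

console.log(resolvers.Query.post(null, { id: "1" }, { viewer: { id: "alice" } })); // the author sees the draft
console.log(resolvers.Query.post(null, { id: "1" }, { viewer: { id: "bob" } }));   // other users get null
```

Keeping the rule in one place means every entry point that reaches the post, GraphQL or otherwise, applies the same check.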

How do you handle errors?

A successful GraphQL query should return a JSON object with the root field named “data”. If the request fails or partially fails (for example, because the user requesting the data does not have the correct access rights), a second root field called “errors” is added to the response:

{
  "data": { ... },
  "errors": [ ... ]
}

Refer to the GraphQL specification for more details.
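On the client side, a response of this shape might be handled as in the sketch below. The helper name readResponse and the sample response are invented for this example; it relies only on the "data" and "errors" root fields described above and on each error carrying a message.

```javascript
// Splits a GraphQL response into its data and a list of error messages.
function readResponse(response) {
  const errors = Array.isArray(response.errors) ? response.errors : [];
  return {
    data: response.data ?? null,
    messages: errors.map((error) => error.message),
    // "partial" means some data arrived even though errors were reported.
    partial: errors.length > 0 && response.data != null,
  };
}

// Sample response (invented): the query partially failed.
const sample = {
  data: { author: null },
  errors: [{ message: "Not authorized", path: ["author"] }],
};

console.log(readResponse(sample));
console.log(readResponse({ data: { author: { posts: [] } } }));
```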

Does GraphQL support offline use?

GraphQL is a query language for (web) APIs, and in that sense it only works online by definition. However, offline support on the client side is an interesting topic. The caching capabilities of Relay and Apollo may already be sufficient for some use cases, but there is not yet a popular solution for actually persisting the stored data. You can find more insight in the GitHub issues of Relay and Apollo, where offline support is discussed.

An interesting approach to offline use and persistence can be found here.
