• GraphQL Concepts Visualized
  • Originally written by Dhaivat Pandya
  • Translation from: The Gold Project
  • Permalink to this article: github.com/xitu/gold-m…
  • Translator: Jessica
  • Proofreader: Jiang Wuzhi, Baddyo

Visualize the GraphQL concept

We will use diagrams to visualize the GraphQL mental model

GraphQL is usually described as “a unified interface for accessing data from different sources.” While this description is accurate, it does not reveal the nature of GraphQL, the motivation behind it, or why it is called “GraphQL”, just as looking at the stars in a night sky is not the same as seeing “The Starry Night”.

I think the real core of GraphQL is the application data graph. In this article, I’ll introduce the application data graph, discuss how GraphQL queries operate on it, and show how the tree structure of GraphQL query results can be used to cache them.

Updated on 2/7/2017: Now you can see the content of this article in the video below.

GraphQL’s thinking model – Dhaivat Pandya

Application data graph

Much of the data in modern applications can be represented using graphs with nodes and edges, where nodes represent objects and edges represent relationships between objects.

For example, suppose we are building a catalog system for a library. To keep things simple, our catalog has a set of books and authors, and each book has at least one author. Authors also have co-authors: other authors with whom they have written one or more books.

If we visualize these relationships in graph form, they look like this:

The figure represents the relationships between the various pieces of data we have and the entities we are trying to represent, such as Book and Author. Almost all applications operate on this kind of graph: they read data from it and write data to it. This graph is where GraphQL comes in.
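As a rough illustration of the idea, the catalog graph could be written down as a list of nodes and a list of labeled edges. This is only a sketch: the node ids (b1, a1, a2), the edge labels, and the helper neighbors are assumptions made for the example, not part of GraphQL or of any library.

```typescript
// Nodes are objects; edges are labeled relationships between node ids.
interface GraphNode {
  id: string; // hypothetical internal id
  type: "Book" | "Author";
  fields: Record<string, string>; // scalar attributes of the object
}

interface GraphEdge {
  from: string;
  label: string; // name of the relationship
  to: string;
}

const catalogNodes: GraphNode[] = [
  {
    id: "b1",
    type: "Book",
    fields: { isbn: "9780674430006", title: "Capital in the Twenty-First Century" },
  },
  { id: "a1", type: "Author", fields: { name: "Thomas Piketty" } },
  { id: "a2", type: "Author", fields: { name: "Arthur Goldhammer" } },
];

const catalogEdges: GraphEdge[] = [
  { from: "b1", label: "authors", to: "a1" },
  { from: "b1", label: "authors", to: "a2" },
  // Co-author edges point both ways, so the graph contains a cycle.
  { from: "a1", label: "coauthors", to: "a2" },
  { from: "a2", label: "coauthors", to: "a1" },
];

// Follow every edge with the given label that leaves the given node.
function neighbors(nodeId: string, label: string): GraphNode[] {
  const targetIds = catalogEdges
    .filter((e) => e.from === nodeId && e.label === label)
    .map((e) => e.to);
  return catalogNodes.filter((n) => targetIds.indexOf(n.id) !== -1);
}

const authorNames = neighbors("b1", "authors").map((n) => n.fields["name"]);
// authorNames holds "Thomas Piketty" and "Arthur Goldhammer"
```

Because the co-author edges form a loop, this structure is a general graph and not a tree.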

In GraphQL, we can extract a tree from the application data graph.

This may sound confusing at first, so let’s explain what it means. Basically, a tree is a graph that has a starting point (the root) and the property that you cannot trace your finger from a node along the edges and arrive back at the same node. In other words, it has no cycles.

Use GraphQL to traverse the application data graph

Let’s look at an example GraphQL query to understand how it “extracts a tree” from the application data graph. Here is a GraphQL query against the application data graph of our book catalog system above:

query {
  book(isbn: "9780674430006") {
    title 
    authors {
      name
    }
  }
}

Once the server resolves the query, it returns this result:

{
  "book": {
    "title": "Capital in the Twenty-First Century",
    "authors": [
      { "name": "Thomas Piketty" },
      { "name": "Arthur Goldhammer" }
    ]
  }
}

This is how the query looks on the application data graph:

Query path

Let’s take a look at how this data is extracted from the graph using the GraphQL query.

In GraphQL, we define a root query type (which we will call RootQuery) that determines where a GraphQL query can start when traversing the application data graph. In our example, we start at a “Book” node, which we select by its ISBN through the query field “book(isbn: …)”. The GraphQL query then traverses the graph by following the edge labeled with each nested field. In our query, it moves from the “Book” node along the “title” field to the node that holds the book’s title string. It also reaches the “Author” nodes by following the edges of the “Book” node labeled “authors”, and gets the “name” of each author.

To see how this results in a tree, simply move the nodes around so that the structure looks more like a tree:
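The traversal described above can be sketched in a few lines. This is not how a real GraphQL server is implemented. It is a minimal illustration under stated assumptions: the in-memory dataGraph, its node ids, and the helpers bookByIsbn and extractTree are all made up for the example.

```typescript
// Fields requested from a node: `true` marks a leaf, a nested object marks an edge to follow.
type FieldSelection = { [field: string]: FieldSelection | true };
type ResultTree = { [field: string]: string | ResultTree[] };

interface DataNode {
  scalars: Record<string, string>; // fields that end in a value
  links: Record<string, string[]>; // fields that are edges to other node ids
}

// Hypothetical in-memory graph; the node ids are made up for this sketch.
const dataGraph: Record<string, DataNode> = {
  book1: {
    scalars: { isbn: "9780674430006", title: "Capital in the Twenty-First Century" },
    links: { authors: ["author1", "author2"] },
  },
  author1: { scalars: { name: "Thomas Piketty" }, links: { coauthors: ["author2"] } },
  author2: { scalars: { name: "Arthur Goldhammer" }, links: { coauthors: ["author1"] } },
};

// Root field: choose the starting node, in the spirit of book(isbn: …) on RootQuery.
function bookByIsbn(isbn: string): string {
  const ids = Object.keys(dataGraph).filter((id) => {
    const node = dataGraph[id];
    return node !== undefined && node.scalars["isbn"] === isbn;
  });
  const first = ids[0];
  if (first === undefined) throw new Error("no book with ISBN " + isbn);
  return first;
}

// Follow only the edges named in the selection. The selection has finite depth,
// so the result is a tree even though the graph itself contains cycles.
function extractTree(nodeId: string, selection: FieldSelection): ResultTree {
  const node = dataGraph[nodeId];
  if (node === undefined) throw new Error("unknown node " + nodeId);
  const result: ResultTree = {};
  Object.keys(selection).forEach((field) => {
    const sub = selection[field];
    const targets = node.links[field];
    if (sub !== undefined && sub !== true && targets !== undefined) {
      result[field] = targets.map((id) => extractTree(id, sub));
    } else {
      const value = node.scalars[field];
      if (value !== undefined) result[field] = value;
    }
  });
  return result;
}

const resultTree = extractTree(bookByIsbn("9780674430006"), {
  title: true,
  authors: { name: true },
});
```

With this selection, resultTree contains the book title and a list with the name of each author, which is the same shape as the query result shown earlier.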

For each piece of information returned by a query, there is a query path associated with it. The path consists of the fields in the GraphQL query that we followed to get the data. For example, the query path for the book title “Capital” is as follows:

RootQuery → book(isbn: "9780674430006") → title

The fields in our GraphQL query (i.e., book, authors, name) specify which edges in the application data graph should be selected to get the results we want. That’s where GraphQL gets its name: GraphQL is a query language that traverses a graph of data to generate a tree of query results.
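To make the notion of a query path concrete, here is a small sketch that lists the path of every leaf field in the book query. The PathField shape and the function names are assumptions for illustration. The sketch also ignores list positions, which a real client would need to track.

```typescript
interface PathField {
  name: string;
  args: Record<string, string>; // empty when the field takes no arguments
  children: PathField[]; // empty for leaf fields
}

// One step of a path: the field name plus its arguments, if any.
function step(f: PathField): string {
  const argText = Object.keys(f.args)
    .sort()
    .map((k) => k + ": " + JSON.stringify(f.args[k]))
    .join(", ");
  return argText === "" ? f.name : f.name + "(" + argText + ")";
}

// Collect the full path of every leaf field, starting from the given prefix.
function listQueryPaths(fields: PathField[], prefix: string): string[] {
  let paths: string[] = [];
  fields.forEach((f) => {
    const path = prefix + " → " + step(f);
    paths =
      f.children.length === 0
        ? paths.concat([path])
        : paths.concat(listQueryPaths(f.children, path));
  });
  return paths;
}

const bookQuery: PathField[] = [
  {
    name: "book",
    args: { isbn: "9780674430006" },
    children: [
      { name: "title", args: {}, children: [] },
      { name: "authors", args: {}, children: [{ name: "name", args: {}, children: [] }] },
    ],
  },
];

const queryPaths = listQueryPaths(bookQuery, "RootQuery");
// queryPaths[0] is: RootQuery → book(isbn: "9780674430006") → title
// queryPaths[1] is: RootQuery → book(isbn: "9780674430006") → authors → name
```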

Cache the GraphQL results

To build a really fast, smooth application where users don’t spend their time watching loading spinners, we want to use caching to reduce the number of round trips between the client and the server. The tree structure of GraphQL results turns out to be a good fit for client-side caching.

As a simple example, suppose some code on your page fetches the result of the following GraphQL query:

query {
  author(id: "8") {
    name
  }
}

Later, another part of the page requests the same query again. Unless we absolutely need the latest data, we can answer the second query with the data we already have. This means the cache must be able to resolve queries without sending them to the server, which makes our application faster. However, we can do much better than just caching the exact queries we have already fetched.

Let’s look at how Apollo Client caches GraphQL results. Basically, a GraphQL query result is a tree of information extracted from the server-side data graph. To avoid reloading these result trees every time we need them, we want to be able to cache them. To do this, we make a key assumption:

Apollo Client assumes that each path in the application data graph (as specified by a GraphQL query) points to a stable piece of information.
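The simplest form of this idea, caching whole queries by their text, might look like the sketch below. It is a baseline for comparison and not Apollo Client’s implementation. The function createExactQueryCache, the skipCache flag, the synchronous fake server, and the placeholder response are all assumptions made for the example.

```typescript
// Cache whole query results, keyed by the whitespace-normalized query text.
function createExactQueryCache(fetchFromServer: (query: string) => string) {
  const results: Record<string, string> = {};
  return function runQuery(queryText: string, skipCache: boolean = false): string {
    const key = queryText.replace(/\s+/g, " ").trim();
    const cached = results[key];
    if (!skipCache && cached !== undefined) return cached; // answered locally
    const fresh = fetchFromServer(queryText); // one round trip to the server
    results[key] = fresh;
    return fresh;
  };
}

// Fake server for the sketch: counts requests and returns a placeholder response.
let serverCalls = 0;
const runQuery = createExactQueryCache((_query: string) => {
  serverCalls += 1;
  return '{"author":{"name":"(name returned by the server)"}}';
});

const authorQuery = 'query { author(id: "8") { name } }';
runQuery(authorQuery); // goes to the server
runQuery(authorQuery); // served from the cache, so serverCalls is still 1
```

This only helps when exactly the same query is repeated. A query that overlaps with a cached one but is not identical still goes to the server in full.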

In cases where this is not true (for example, when the information pointed to by a particular query path changes very frequently), we can use the concept of object identifiers to prevent the Apollo Client from making such assumptions, which we will discuss later. But, in general, this is a reasonable assumption to make when it comes to caching.

Same path, same object

The “same path, same object” assumption is extremely useful. For example, suppose we have two queries that fire one after the other:

query particularAuthor {
  author(name: "Thomas Piketty") {
    name
    age
  }
}

query authorAndBook {
  book(isbn: "9780674430006") {
    title
  }

  author(name: "Thomas Piketty") {
    name
    age
  }
}

Just by looking at the queries, you can see that the second one does not need to go to the server for the author’s name. That information is already in the cache from the result of the first query.

Apollo Client uses this logic to strip out the parts of a query whose data is already in the cache. It can compare queries this way because of the path assumption: it assumes that the path RootQuery → author(name: "Thomas Piketty") → name retrieves the same information in both queries. Of course, if you don’t want to rely on this assumption, you can use the forceFetch option and the cache will be bypassed entirely.

This assumption works well because the query path also includes the arguments we used in the GraphQL query. For example:

RootQuery → author(id: 3) → name

is not the same as

RootQuery → author(id: 6) → name

So Apollo Client does not assume that they represent the same information, and it does not try to merge the results of one with the other.
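The following sketch shows how a cache keyed by query path could strip already-cached fields from a new query. It illustrates the “same path, same object” assumption only and does not reproduce Apollo Client’s actual code. The QueryField shape, the helper names, and the placeholder age value are assumptions made for the example.

```typescript
interface QueryField {
  name: string;
  args: Record<string, string>; // empty when the field takes no arguments
  children: QueryField[]; // empty for leaf fields
}

function field(
  name: string,
  args: Record<string, string> = {},
  children: QueryField[] = []
): QueryField {
  return { name: name, args: args, children: children };
}

// Leaf values keyed by their full query path, arguments included.
const cachedPaths: Record<string, string> = {};

function pathTo(prefix: string, f: QueryField): string {
  const argText = Object.keys(f.args)
    .sort()
    .map((k) => k + ": " + JSON.stringify(f.args[k]))
    .join(", ");
  return prefix + " → " + f.name + (argText === "" ? "" : "(" + argText + ")");
}

// Return only the parts of the query whose paths are not in the cache yet.
function stripCached(fields: QueryField[], prefix: string): QueryField[] {
  const missing: QueryField[] = [];
  fields.forEach((f) => {
    const path = pathTo(prefix, f);
    if (f.children.length === 0) {
      if (cachedPaths[path] === undefined) missing.push(f);
      return;
    }
    const missingChildren = stripCached(f.children, path);
    if (missingChildren.length > 0) missing.push(field(f.name, f.args, missingChildren));
  });
  return missing;
}

// Pretend the first query (particularAuthor) has already been answered.
const pikettyPath = pathTo("RootQuery", field("author", { name: "Thomas Piketty" }));
cachedPaths[pikettyPath + " → name"] = "Thomas Piketty";
cachedPaths[pikettyPath + " → age"] = "(age returned by the server)";

// The second query (authorAndBook) now only needs the book title from the server.
const authorAndBook = [
  field("book", { isbn: "9780674430006" }, [field("title")]),
  field("author", { name: "Thomas Piketty" }, [field("name"), field("age")]),
];
const stillNeeded = stripCached(authorAndBook, "RootQuery");
```

After stripping, stillNeeded contains only the book field with its title. Because arguments are part of the key, author(id: 3) and author(id: 6) map to different cache entries.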

Object identifiers are used when path assumptions are insufficient

As it turns out, we can do better than tracking the query path from the root. Sometimes, you may access the same object through two completely different queries, thus providing two different query paths for the object.

For example, since each of our authors has some co-authors, we can reach some “Author” objects through that field:

query {
  author(name: "Arthur Goldhammer") {
    coauthors {
      name
      id
    }
  } 
}

But we can also access an author directly from the root node:

query {
  author(id: "5") {
    name
    id
  }
}

Suppose the author named “Arthur Goldhammer” and the author with ID 5 are co-authors of a book. Then we end up storing the same information (about the author with ID 5, Thomas Piketty) twice in the cache.

So the tree structure in the cache looks like this:

The problem is that both queries refer to the same piece of information in the application data graph, but the Apollo Client doesn’t know that yet. To solve this problem, Apollo Client introduces a second concept: object identifiers. Basically, you can specify a unique identifier for any object you query. The Apollo Client then assumes that all objects with the same object identifier represent the same information.

Once the Apollo Client knows this, it can rearrange the cache in a better way:

This means that object identifiers must be unique across the whole application. You cannot use SQL IDs directly, because an author might have the SQL ID 5 and a book might also have the SQL ID 5. But this is easy to fix: to generate a unique object identifier, simply combine the __typename returned by GraphQL with the ID generated by the back end. Thus, an Author with SQL ID 5 could have the object identifier Author:5 or similar.
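A cache that uses such identifiers could be sketched as follows: objects are stored once under their identifier, and each query path merely points at an identifier. The store layout, the function names, and the book(id: "5") path are assumptions made for this example and do not describe Apollo Client’s internal format.

```typescript
type CachedObject = Record<string, string>;

const objectsById: Record<string, CachedObject> = {}; // one entry per object
const pathToId: Record<string, string> = {}; // query path -> object identifier

// Combine the type name and the back-end ID, for example "Author:5".
function objectIdentifier(obj: CachedObject): string {
  return obj["__typename"] + ":" + obj["id"];
}

// Store the object once and remember which path led to it.
function writeToCache(path: string, obj: CachedObject): void {
  const key = objectIdentifier(obj);
  objectsById[key] = { ...objectsById[key], ...obj }; // merge with fields already known
  pathToId[path] = key;
}

function readFromCache(path: string): CachedObject | undefined {
  const key = pathToId[path];
  return key === undefined ? undefined : objectsById[key];
}

const piketty: CachedObject = { __typename: "Author", id: "5", name: "Thomas Piketty" };

// The same author reached through two different query paths:
const viaCoauthors = 'RootQuery → author(name: "Arthur Goldhammer") → coauthors → 0';
const viaRoot = 'RootQuery → author(id: "5")';
writeToCache(viaCoauthors, piketty);
writeToCache(viaRoot, piketty);

// A book with the same SQL ID does not collide (book(id: …) is a hypothetical field).
writeToCache('RootQuery → book(id: "5")', { __typename: "Book", id: "5" });
```

Both author paths now resolve to the single Author:5 entry, while the book is stored separately under Book:5.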

Ensure the consistency of the query results

Continuing with the two queries we just looked at, let’s consider what happens when some data changes. For example, what if another query is fetched and shows that the author with ID 5 has changed their name? What happens to the parts of the UI that still show the old name of the author with ID 5?

Here is the great part: they update automatically. This brings us to another feature of Apollo Client: if the value of any node in a watched query tree changes, the query is updated with the new result.

So, in this case, we have two queries that depend on the author with the object identifier “Author:5”. Since both query trees refer to that author, any update to the author’s information is propagated to both queries:

If you use a view integration package such as React-Apollo or Angular2-Apollo with Apollo Client, you don’t need to set any of this up: your components simply receive the new data and re-render automatically. If you are not using a view integration package, the core method watchQuery does the same thing by giving you an observable that is updated whenever the store changes.
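The propagation of updates can be pictured with a tiny publish and subscribe sketch. This is not the watchQuery API and not how Apollo Client is implemented. The names records, watchObject, and updateObject, as well as the new author name, are assumptions for illustration only.

```typescript
type StoredObject = Record<string, string>;
type Listener = (obj: StoredObject) => void;

const records: Record<string, StoredObject> = {}; // objects by identifier
const listeners: Record<string, Listener[]> = {}; // watchers by identifier

// Register interest in an object; the returned function stops watching.
function watchObject(id: string, listener: Listener): () => void {
  listeners[id] = (listeners[id] || []).concat([listener]);
  return () => {
    listeners[id] = (listeners[id] || []).filter((l) => l !== listener);
  };
}

// Merge new field values into the stored object and notify every watcher.
function updateObject(id: string, changes: StoredObject): void {
  const updated = { ...records[id], ...changes };
  records[id] = updated;
  (listeners[id] || []).forEach((l) => l(updated));
}

records["Author:5"] = { name: "Thomas Piketty" };

// Two parts of the UI depend on the same author object.
let coauthorListText = "";
let profileText = "";
watchObject("Author:5", (a) => {
  coauthorListText = "Co-author: " + a["name"];
});
watchObject("Author:5", (a) => {
  profileText = "Profile of " + a["name"];
});

// A later query reports a changed name (a made-up value for the sketch).
updateObject("Author:5", { name: "T. Piketty" });
```

After the update, both coauthorListText and profileText reflect the new name, because both watchers were registered on the same object identifier.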

Sometimes it doesn’t make sense for your application to use object identifiers for everything, or you may not want to deal with them directly in your code, but you still need to update specific information in the cache. For these cases, Apollo Client provides convenient and powerful APIs, such as updateQueries or fetchMore, that let you incorporate new information into these query trees with very fine-grained control.

Conclusion

The application data graph is the backbone of any application. In the past, when we had to send HTTP requests to REST endpoints to get data into and out of the application data graph, client-side caching was difficult because data fetching was specific to each application. Now GraphQL gives us a lot of information that we can use to cache data automatically.

If you understand these five simple concepts, you understand how reactivity and caching (that is, all the magic that makes your application fast and smooth) work in Apollo Client. To recap:

  1. GraphQL queries are a way to extract trees from the application data graph. We call these trees query result trees.
  2. Apollo Client caches query result trees. To do this, it relies on two assumptions:
  3. Same path, same object: the same query path usually points to the same information.
  4. Object identifiers when the path assumption is not enough: if two query results have the same object identifier, they represent the same node, that is, the same information.
  5. If any cached node in a query result tree is updated, Apollo Client updates the query with the new result.

In general, this should be enough to make you an expert on Apollo Client and GraphQL caching. Too much information? Don’t worry: we’ll keep publishing more conceptual articles like this one whenever we can, so that everyone can understand the purpose behind GraphQL, where its name comes from, and how to reason clearly about any aspect of GraphQL result caching.
