This article originally appeared on the Apollo blog and was translated and shared with permission from InfoQ Chinese.

It has been two years since Facebook released GraphQL as an open source project. Since then, the community has grown exponentially, and thousands of companies now use GraphQL in production. I was honored to give a talk on the second day of GraphQL Summit in October 2017. You can watch the full video on YouTube, or read this article for an overview of the talk.

I’ll start with a brief overview of where GraphQL is today, then explain the benefits it can bring to developers as it evolves, with a particular focus on three examples of full-stack GraphQL integration: caching, performance tracing, and schema stitching.

What makes GraphQL different?

There are three main factors that make GraphQL stand out in comparison to other API technologies:

  1. GraphQL has a great query language, which is an excellent way to describe data requirements, and a well-defined schema that exposes API capabilities. GraphQL is the only major technology that specifies both sides of the equation, and all of its benefits come from the interaction of these two concepts.
  2. GraphQL helps us decouple the consumer and provider of the API. In an endpoint-based API like REST, the shape of the returned data is determined by the server. In GraphQL, the shape of the data is determined by the UI code that uses it, which is more natural and lets us separate concerns rather than technologies.
  3. A GraphQL query is tied to the code that uses it, so we can think of the query as a unit of data fetching. Because GraphQL knows all of a UI component’s data requirements up front, it enables new kinds of server functionality. For example, batching and caching the underlying API calls behind a query that represents a UI component’s data needs is easy to implement with GraphQL.



Separate concerns, not technologies: GraphQL puts the need for data on the client side

Next, let’s take a look at three oft-mentioned aspects of data fetching, and discuss how GraphQL can improve each of them.

It’s important to note that many of the features I’m talking about are available now, and some will be implemented in the future. If these features excite you, scroll to the bottom of the page to learn how to get involved.

1. Cross-request caching

One of the most frequently asked questions is how to implement cross-request caching for the GraphQL API. When applying regular HTTP caching to GraphQL, there are a few issues:

  • HTTP caches generally do not support POST requests or long cache keys;
  • The diversity of requests generally means a lower cache hit ratio;
  • GraphQL is transport-independent, so HTTP caching does not always apply.

However, GraphQL also presents a number of new opportunities:

  • Cache control information can be declared in the schema and in the resolvers that access the back end;
  • The schema enables automated, fine-grained cache control, independent of the per-request hit ratio.
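To make the second point concrete, here is a sketch of how a gateway might combine per-field cache hints into an overall policy for a response. The hint shape loosely follows the apollo-cache-control extension format, but the helper name `overallPolicy` and the sample data are ours:

```javascript
// Sketch (illustrative, not the actual Apollo Engine implementation):
// a response is only as cacheable as its least cacheable field, so the
// overall maxAge is the minimum across hints, and one PRIVATE field
// makes the whole response private.
function overallPolicy(hints) {
  let maxAge = Infinity;
  let scope = 'PUBLIC';
  for (const hint of hints) {
    if (hint.maxAge !== undefined) maxAge = Math.min(maxAge, hint.maxAge);
    if (hint.scope === 'PRIVATE') scope = 'PRIVATE';
  }
  return { maxAge: maxAge === Infinity ? 0 : maxAge, scope };
}

// Hints as they might appear under extensions.cacheControl.hints:
const hints = [
  { path: ['post'], maxAge: 240 },
  { path: ['post', 'votes'], maxAge: 30 },
  { path: ['me'], maxAge: 60, scope: 'PRIVATE' },
];
console.log(overallPolicy(hints)); // { maxAge: 30, scope: 'PRIVATE' }
```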

How can we make better use of caching in GraphQL, and how can we take advantage of these new opportunities?

Where should you put caching capabilities?

We first need to decide where the caching capability should live. The obvious first idea is to put the caching logic inside the GraphQL server. However, a simple tool like DataLoader doesn’t work well across GraphQL requests, and building caching into server-side code adds implementation complexity. So we should put it somewhere else.

Just like REST, it makes sense to cache on both sides of the API layer:

  1. The entire response is cached in an infrastructure layer in front of the GraphQL API;
  2. The results fetched from underlying databases and microservices are cached beneath the GraphQL server.

For the second, existing caching infrastructure still works. For the first, we need a new layer in front of the API that can implement features like caching in a GraphQL-aware way. Essentially, this architecture lets us move complexity out of the GraphQL server:



Move complexity to a new layer between client and server

I call this component the GraphQL gateway. On the Apollo team, we feel this new gateway layer is so important that everyone will need one as part of their GraphQL architecture.

That’s why, during this year’s GraphQL Summit, we launched Apollo Engine as the first GraphQL gateway.

GraphQL response extension for cache control

As I mentioned in the overview, one of GraphQL’s strengths is its huge ecosystem of tools that operate on GraphQL queries and schemas. I think caching should work the same way, so we introduced Apollo Cache Control, which uses a built-in feature of the GraphQL specification called extensions to include cache control information in the response.

With our JavaScript reference implementation, it is easy to add cache control information to the schema:



Define cache information in the schema via apollo-cache-control-js
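As a sketch of what such a schema might look like (type and field names here are illustrative, following the apollo-cache-control documentation’s directive syntax):

```graphql
# Cache hints declared directly in the schema: a default maxAge on the
# type, overridden per field, with user-specific data marked PRIVATE.
type Post @cacheControl(maxAge: 240) {
  id: ID!
  title: String
  votes: Int @cacheControl(maxAge: 30)
  readByCurrentUser: Boolean! @cacheControl(scope: PRIVATE)
}
```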

Here, we’ve built this new cache control specification on top of GraphQL’s core features, and I’m excited about the design. It can specify information about the data in a fine-grained manner, and it uses GraphQL’s extensions mechanism to send the relevant cache control information to consumers. It is completely language- and transport-independent.

After my talk at the GraphQL Summit, Oleg Ilyenko, who maintains the Scala GraphQL implementation, released a working cache control feature for Sangria.

Caching is implemented through the gateway

Now that the GraphQL server returns cache control information, we can implement caching cleanly with the help of the gateway. Each part of the stack plays its own role:



The cache coordinates each component of the technology stack
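The gateway’s role above can be sketched as a whole-response cache keyed by the query string and variables, expiring entries after the maxAge derived from the server’s cache hints. This is an illustrative sketch, not Apollo Engine’s actual implementation; all names are ours:

```javascript
// A minimal in-memory response cache with TTL expiry. The clock is
// injectable so the behavior is easy to test deterministically.
class ResponseCache {
  constructor(now = Date.now) {
    this.now = now;
    this.entries = new Map();
  }
  key(query, variables) {
    return query + '|' + JSON.stringify(variables || {});
  }
  get(query, variables) {
    const entry = this.entries.get(this.key(query, variables));
    if (!entry) return null;
    if (this.now() > entry.expiresAt) return null; // entry is stale
    return entry.response;
  }
  set(query, variables, response, maxAgeSeconds) {
    this.entries.set(this.key(query, variables), {
      response,
      expiresAt: this.now() + maxAgeSeconds * 1000,
    });
  }
}
```

Keying on the full query plus variables is what makes this GraphQL-aware: two clients sending the same query share a cache entry, while a different field selection misses the cache, regardless of HTTP method.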

One other cool thing worth mentioning is that most people already use caching somewhere in their GraphQL stack: for example, Apollo Client and Relay cache data on the front end. In future versions of Apollo Client, cache control information from the response will automatically expire old data in the client cache. So, just as everywhere else in GraphQL, the server describes its capabilities, the client specifies its data requirements, and all the components work well together.

Next, let’s take a look at another sample GraphQL functionality that spans the entire stack.

2. Tracking

GraphQL allows front-end developers to request data in a much more granular way than endpoint-based systems: they can ask for precisely the data they want and skip fields they won’t use. This creates the opportunity to collect detailed performance information and to track performance in ways that were not possible before.



Don’t settle for an opaque total query time: GraphQL lets us see detailed timing for each field

We can say that GraphQL is the first API technology that can capture performance information at this fine-grained level. This is more than just tooling: for the first time, front-end developers have a principled way to see the execution timing of each field, and can adjust their queries based on it to fix performance problems.

Trace across the stack

Tracing is similar to caching: coordinating across the entire stack is what makes it really useful.



Each component has a role to play in providing trace information and can participate

The server can include additional information in the result, just as it does for caching, and the gateway can extract and aggregate it. As with caching, complexity you don’t want in the server is handled by the gateway component.

Here, the client’s primary role is connecting queries to UI components. This is important because it lets us relate the performance of the API layer to its impact on the front end. For the first time, we can show how back-end data fetching performs and which UI components on the page it affects.

GraphQL trace extension

Much like caching, GraphQL’s response extensions can be implemented in a server-agnostic way. The Apollo Tracing specification, currently implemented for Node, Ruby, Scala, Java, and Elixir, defines a standard format in which a GraphQL server returns timing data for each resolver, so that other tools can consume the performance data.
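To illustrate the idea, here is a sketch of a server collecting per-resolver timings into an extension object shaped along the lines of the Apollo Tracing format (start time, end time, and per-field offsets and durations in nanoseconds). The helper names `makeTracer` and `wrap` are ours, and this simplified shape omits some fields the real specification defines:

```javascript
// Collects per-field timings by wrapping resolvers, then emits an
// Apollo-Tracing-style extension object.
function makeTracer() {
  const startTime = new Date();
  const startHr = process.hrtime.bigint();
  const resolvers = [];
  return {
    // Wrap a resolver so each call records its offset and duration.
    wrap(path, fieldName, resolve) {
      return (...args) => {
        const begin = process.hrtime.bigint();
        const result = resolve(...args);
        const end = process.hrtime.bigint();
        resolvers.push({
          path,
          fieldName,
          startOffset: Number(begin - startHr), // ns since query start
          duration: Number(end - begin),        // ns spent in resolver
        });
        return result;
      };
    },
    // The object a server would attach under extensions.tracing.
    extension() {
      return {
        version: 1,
        startTime: startTime.toISOString(),
        endTime: new Date().toISOString(),
        execution: { resolvers },
      };
    },
  };
}
```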

Imagine all GraphQL tools having access to this performance data:



Abstract sharing makes information like trace data available to all tools

With Apollo Tracing, we can use performance data in GraphiQL, editors, or anywhere else.

So far, we’ve looked at the interaction between a client and a server. As a final example, let’s take a look at how GraphQL can help us modularize our architecture.

3. Schema stitching

One of the biggest benefits of GraphQL is the ability to access all of your data in one place. Until recently, however, this approach had a cost: we needed to implement the entire GraphQL schema in a single code base so that all the data could be queried in a single request. What if your architecture is modular, and you still want the benefits of a unified GraphQL API?

Schema stitching is a simple idea: GraphQL makes it easy to merge multiple APIs into one, so we can implement parts of the schema as separate services. These services can be deployed independently, written in different languages, and even owned by different organizations.
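The intuition behind stitching can be sketched in plain JavaScript: a gateway answers one query by delegating each part to the service that owns it. Real stitching (e.g. in Apollo’s graphql-tools) works at the schema level rather than with hand-written delegation; the service objects and field names below are made up for illustration:

```javascript
// Two independent "services", each owning part of the data.
const ticketService = {
  attendee: (id) => ({ id, name: 'Ada', city: 'SF' }),
};
const weatherService = {
  forecast: (city) => ({ city, tempC: 18 }),
};

// The "stitched" gateway resolver combines both services to answer a
// single query, invisibly to the client: fetch the attendee, then use
// their city to fetch the matching forecast.
function resolveAttendeeWithWeather(id) {
  const attendee = ticketService.attendee(id);
  const weather = weatherService.forecast(attendee.city);
  return { ...attendee, weather };
}
```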

Here is an example:



Combining GraphQL Summit ticketing-system data and weather API data into one query: launchpad.graphql.com/130rr3r49

In the screenshot above, we can see how the stitched API joins separate queries against two different services, in a way that is completely invisible to the client. This lets us compose GraphQL schemas like Lego bricks.

An implementation of this feature that readers can try out today ships as part of the Apollo graphql-tools library; see the documentation for more information.

Stitching in the gateway

Schema stitching also works well across the stack. In the long run, we think it makes sense to stitch in the new gateway layer, so that you can build the underlying schemas with any technology you want, such as Node.js, Graphcool, or Neo4j.



Eventually, stitching involves every component in the stack

Clients can join in too. Just as a single query can load data from multiple back ends, you can also combine data sources on the client side. The recently released Apollo Client 2.0 adds state management, which lets us load data from client state and any number of back ends in a single query.
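A query mixing the two might look like the following sketch, using the `@client` directive from apollo-link-state to mark locally resolved fields (the operation and field names here are illustrative):

```graphql
query DashboardData {
  currentTab @client   # resolved from local client state
  posts {              # resolved against the remote API
    id
    title
  }
}
```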

Conclusion

By reading this article or watching the talk, I hope readers come away with a sense of how powerful GraphQL’s tooling is today and how much potential it has for the future. We’ve only scratched the surface of the abstractions and functionality GraphQL makes possible.

Finally, I’d like to share a to-do list based on the ideas above:



There is still a lot of work to be done to integrate these new features, especially in the area of development tools and editors

There is much more to be done to unlock GraphQL’s full potential. At Apollo, we’re doing our best to do this, but no one person, team or organization can do it all. We need to work collaboratively to build these solutions in order for future blueprints to materialize.

In any case, one thing is clear: GraphQL has already been adopted as a transformative technology by thousands of enterprises, and that’s just the beginning! I can’t wait to see how we’ll build apps over the next two, five, and ten years, because it’s going to be amazing!

How to get involved

If, like us at Apollo, you believe in GraphQL’s potential, get involved in the community. To help readers get started quickly, we created a help page.

Thanks to Xu Chuan for correcting this article.