Catching Breaking API Changes Early

Jan 24, 2017

In search of the source of truth

At Credit Karma, we’ve been designing and building REST APIs for several years. These APIs drive the applications that help 60 million members make financial progress. Over the years, we found it very important to design, document and review APIs as early in the development process as possible, preferably before any code is written.

We also found it critical that the API documentation be the source of truth. In projects with many cross-functional teams, the API documentation represents the agreement between the teams and the different layers of software. If you can’t count on this documentation, it can create communication issues across teams and introduce defects throughout the entire codebase.

We initially tried using collaborative pages and quickly found them hard to maintain and often out of sync with the actual implementation. We realized that any form of API documentation that isn’t used as the source for generating code is very likely to be wrong.

As a result, we tried our hand at the RESTful API Modeling Language (RAML), which aims to make it easy to manage the whole API lifecycle from design to sharing. Because RAML is based on YAML, its files are both machine and human readable, which is fundamental to acting as the source of truth for the API.

The challenges

RAML certainly helps address some issues around the source of truth by supporting the use of code generation. However, RAML, and REST more generally, still leave several problems unsolved:

Often it requires multiple round trips by the client to get the information needed to render a single page
Many designs result in a CRUD API that simply exposes the database
Managing versions and breaking changes can be difficult

As an API matures and changes over time, it can be difficult to identify and manage breaking changes, especially in a REST architecture. This is usually managed by diligent code reviews and design meetings – all requiring humans and process to catch a problem before it arises. We would typically improve this kind of process with tools that identify issues automatically. Again, REST makes this difficult. It is nearly impossible to introspect how clients are using an API and, more specifically, which data elements of the API they are accessing.

As we evaluated tools and solutions for managing these common REST problems, we embarked on an entirely new direction.

A new frontier

During our evaluation, we found GraphQL and Falcor, which were taking a different approach to APIs. These technologies used a query language as the API to retrieve and mutate data. We decided on GraphQL because it offers:

Client-centric API that returns only the data the client requests
Strongly typed schema that makes it easier to build high quality client tools
Queries are shaped just like the data they return, which is a natural way for product engineers to describe data

At Credit Karma we are actively building our GraphQL platform to incrementally replace our existing REST API. As part of this process we are designing and updating our Schema following similar procedures to building our REST APIs using RAML.

Instead of using RAML, we are now defining our API using the GraphQL Schema Language. This is a shorthand notation to succinctly express the basic shape of your GraphQL schema and its type system. By defining the schema using the shorthand stored in text files we can meet the following requirements:

Design and document the API before coding
The schema and type system are guaranteed to be the source of truth
Both machine and human readable

Below is an example of how a schema would be defined using GraphQL files:

schema.graphql
schema {
  query: RootQuery
}

rootQuery.graphql
type RootQuery {
  me: User
}

user.graphql
type User {
  id: ID!
  name: String
  age: Int
  is_active: Boolean
  friends: [User]!
}

These files are not directly usable by a typical GraphQL server, which is why we developed and open sourced a utility to make this possible:

creditkarma/graphql-loader instantiates a GraphQL Schema by loading GraphQL Schema Language files based on a glob pattern.

We package and publish our schema files as an NPM module and use graphql-loader to make the module usable by our GraphQL server with a single function call.
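
To make the idea concrete, here is a rough, dependency-free sketch of what loading schema files amounts to: the individual files are joined into a single Schema Language document. This is not the graphql-loader API. The `schemaFiles` map and the `mergeSchemaFiles` helper are hypothetical stand-ins for the files a glob pattern would match on disk.

```typescript
// Hypothetical in-memory stand-in for the files a glob pattern would match.
const schemaFiles: Record<string, string> = {
  "schema.graphql": "schema { query: RootQuery }",
  "rootQuery.graphql": "type RootQuery { me: User }",
  "user.graphql":
    "type User { id: ID! name: String age: Int is_active: Boolean friends: [User]! }",
};

// Join the file contents into one Schema Language document.
// Sorting by file name keeps the output deterministic between builds.
function mergeSchemaFiles(files: Record<string, string>): string {
  return Object.keys(files)
    .sort()
    .map((name) => files[name].trim())
    .join("\n\n");
}

const mergedSchema = mergeSchemaFiles(schemaFiles);
```

In a real setup, the merged text would then be handed to a schema builder so the server can execute queries against it. Type definitions in the Schema Language can appear in any order, which is why a simple sort is enough here.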

Client operations

With the Schema defined and the GraphQL server up and running, we can finally start to take advantage of the real value of GraphQL by building our client operations (queries).

All of our operations are written in the GraphQL query language and stored in text files, just as we do with our schema. By separating the client queries from the client source we gain:

The ability to easily validate queries against the schema
The ability to create a whitelist query store
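
As a minimal sketch of the whitelist idea, assuming queries are identified by their whitespace-normalized text, a store could look like the following. The `QueryWhitelist` class and `normalizeQuery` helper are hypothetical names for illustration, not part of any published tool.

```typescript
// Collapse whitespace so formatting differences do not change a query's identity.
function normalizeQuery(query: string): string {
  return query.replace(/\s+/g, " ").trim();
}

// Hypothetical whitelist: only operations registered at build time may run.
class QueryWhitelist {
  private readonly allowed: Record<string, boolean> = {};

  register(query: string): void {
    this.allowed[normalizeQuery(query)] = true;
  }

  isAllowed(query: string): boolean {
    return this.allowed[normalizeQuery(query)] === true;
  }
}

const whitelist = new QueryWhitelist();
whitelist.register("{ me { id name } }");
```

A server using such a store would call `isAllowed` on each incoming query and reject anything that was not registered from the published query files. A production store would more likely key on a hash or an operation ID than on the full query text.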

An example query based on the schema above could be:

{
  me {
    id
    name
    age
    friends {
      id
      name
      age
    }
  }
}

We also package and publish all of our client queries by client as NPM modules.

What about breaking changes?

One of the key advantages we have found in using GraphQL is the ability to easily build tooling that validates all known queries against any version of our schema, especially during the design phase while the schema is changing. By packaging our schema and client operations as NPM modules we can build continuous integration (CI) tasks that execute on each PR to the Schema. This allows us to immediately identify breaking changes to the API.

To help facilitate this process we built another GraphQL utility that combines the use of the graphql-loader and graphql-tools to easily validate queries against a GraphQL Schema. This tool can be found on our GitHub site at:

creditkarma/graphql-validator – A CLI tool to validate queries against a GraphQL Schema. The primary use case for this tool is to validate schema changes against an existing query store.
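
To illustrate the kind of check involved, here is a toy, dependency-free model of validating a stored query against a proposed schema. It is not the graphql-validator implementation: the `Schema` and `Selection` shapes, the `findMissingFields` helper, and the removal of `User.age` are all assumptions made for the example, and a real validator works on parsed GraphQL documents and applies the full set of validation rules.

```typescript
// Simplified model: each type maps field names to the name of the type they return.
type Schema = Record<string, Record<string, string>>;

// A selection maps each requested field to a nested selection, or null for a leaf.
interface Selection {
  [field: string]: Selection | null;
}

// Return the path of every selected field that the schema does not define.
function findMissingFields(
  schema: Schema,
  typeName: string,
  selection: Selection,
  path: string = typeName
): string[] {
  const fields = schema[typeName] || {};
  const missing: string[] = [];
  for (const field of Object.keys(selection)) {
    const fieldPath = `${path}.${field}`;
    if (!Object.prototype.hasOwnProperty.call(fields, field)) {
      missing.push(fieldPath);
      continue;
    }
    const nested = selection[field];
    if (nested !== null) {
      // Recurse into the type returned by this field.
      missing.push(...findMissingFields(schema, fields[field], nested, fieldPath));
    }
  }
  return missing;
}

const currentSchema: Schema = {
  RootQuery: { me: "User" },
  User: { id: "ID", name: "String", age: "Int", is_active: "Boolean", friends: "User" },
};

// Proposed change under review: the hypothetical removal of User.age.
const proposedSchema: Schema = {
  RootQuery: { me: "User" },
  User: { id: "ID", name: "String", is_active: "Boolean", friends: "User" },
};

// The example query from earlier in this post, expressed as a selection tree.
const friendsQuery: Selection = {
  me: { id: null, name: null, age: null, friends: { id: null, name: null, age: null } },
};

const breakages = findMissingFields(proposedSchema, "RootQuery", friendsQuery);
```

Here `breakages` lists both places where the stored query still asks for `age`, while the same call against `currentSchema` returns an empty list. A CI job built on this idea would fail the pull request whenever the list is non-empty.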

Unexpected values

By applying graphql-validator to our Schema pull request CI jobs, we have been able to substantially decrease the number of reviewers required to approve a change. The tooling can now immediately identify breaking changes and notify everyone involved. This helps us ensure we are aligned on the need to introduce breaking changes. We are also able to preprocess and prevalidate all schemas and operations at build time, eliminating some common GraphQL runtime errors that can result when using GraphQL files.

Conclusion

By using GraphQL, you’re able to replace your existing REST API. With the GraphQL Schema Language, you can design and document the API before coding, keep it both machine and human readable, and get a source of truth. You can better validate queries and create a query whitelist when you separate the client queries from the client source. By adding graphql-loader and graphql-validator to your build and CI pipeline, you can take advantage of the power GraphQL provides by ensuring your schema supports all your client requests.

We’re looking forward to hearing how the new tools work for you @CreditKarmaEng. If you’re interested in contributing to our team, check out our open roles.
