Secure web applications with GraphQL and Elixir

In a traditional application, the web application talks directly to the database. It has rights to do anything, and relies on its own application rules to control access. If an attacker compromises the application, they can do anything it can, e.g. grab all the data or create a funds transfer transaction.

When security is critical, e.g. in health care and financial services applications, there are benefits to separating the application that interacts with users from the back end data using a well defined API.

In health care applications, it's common to have users with different roles looking at the same information (patient, family member, nurse, doctor, admin). The data that a user can access depends on their roles and relationships. A family member can view a patient's medical information once they have been authorized. A doctor in a clinic can view today's appointments and active cases. A specialist can view the cases that have been referred to them. These rules can be very complex, and bugs may leak information.

A banking customer can view their own account and make transactions, and they may give third parties access to data. Inside the bank, we need to control staff access to data based on their roles, e.g. only staff handling KYC should have access to scans of IDs.

Using an API lets us clearly define operations and the permissions needed to execute them. We tie each operation to a user, and the API server ensures that the user has rights to access the data. This clean interface provides a central place for access control and an audit trail. It is easier to understand and to test. A single API provides access to the data, shared between the web front end, mobile apps and other services.
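As a minimal sketch of what "operations and the permissions needed to execute them" could look like in Elixir, the module below keeps the access rules in one place and denies by default. The module name, the `:view_record` operation and the map fields (`authorized_family`, `active_clinics`) are assumptions made up for illustration, not part of any real system.

```elixir
defmodule AccessPolicy do
  # Hypothetical policy: may this user perform this operation on this patient record?

  # Patients may read their own record (the same `id` must match in both maps).
  def allowed?(%{role: :patient, id: id}, :view_record, %{id: id}), do: true

  # Family members may read only after the patient has authorized them.
  def allowed?(%{role: :family, id: id}, :view_record, %{authorized_family: ids}),
    do: id in ids

  # Doctors may read records of patients with an active case at their clinic.
  def allowed?(%{role: :doctor, clinic: clinic}, :view_record, %{active_clinics: clinics}),
    do: clinic in clinics

  # Anything not explicitly allowed is denied.
  def allowed?(_user, _operation, _patient), do: false
end

patient = %{id: 1, authorized_family: [7], active_clinics: ["north"]}

IO.inspect(AccessPolicy.allowed?(%{role: :family, id: 7}, :view_record, patient)) # true
IO.inspect(AccessPolicy.allowed?(%{role: :family, id: 8}, :view_record, patient)) # false
```

Because every rule is a function clause, each one can be unit tested on its own, and an audit log call could be added at this single entry point.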

Sounds great, but doesn't it have a lot of overhead? Not with GraphQL and Elixir. We originally started using GraphQL for mobile APIs, then realized that it was also great for security.

API design using REST and GraphQL

It is popular to build APIs using REST, modeling our systems in terms of "resources," e.g. users, accounts, medical cases, transactions. We then express all actions in terms of create, read, update, and delete operations on those resources.

While conceptually simple, REST can be quite "chatty," requiring a lot of requests to build complex pages. We might make one request to get a list of patients, then one per patient to get their details. We make another request for open cases associated with each patient, then another to get the case details.

Things that would be joins in a relational database end up being multiple web requests. If the requests are all local, the performance is not too bad, but it certainly adds up. In a mobile context, round trips can be very slow, and complex pages can take seconds to load.

REST apps are supposed to use HTML-style hypermedia links in messages, allowing the application to navigate between resources. Many "REST" applications don't follow these principles, however. They are just ad-hoc blobs of poorly defined JSON delivered over HTTP.

In REST, there is no standard way to specify search parameters, filtering or subsets of a resource's fields. We might want to restrict sensitive fields based on user role. An app listing cases may end up getting the body of each case and throwing it away, only to fetch it again when the user opens the case. Developers have to write custom code to handle and validate parameters. We end up with multiple versions of APIs which differ only in the number of fields.

In order to avoid making multiple requests to the back end, mobile app developers may use "view APIs," e.g. a /home-page API which gets all the data for the home page in one shot. This results in an explosion of API functions and versions as we add pages and data fields to the front end.

GraphQL was invented at Facebook to solve this problem. The client sends a query identifying the objects it wants, as well as fields in associated objects. The GraphQL API server returns all the data in a single response. It can authenticate the user and check their access permissions, filtering the result set to ensure that they only see what they should.

For example, here is a query for an article summary list:

{
  article {
    title
    published_at
    author {
      name
    }
  }
}
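On the server side, the idea of returning only the requested fields, minus anything the user's role may not see, can be sketched in plain Elixir. This is not how a GraphQL library implements field selection internally. The `FieldFilter` module, the roles and the `draft_notes` field are hypothetical names chosen for illustration.

```elixir
defmodule FieldFilter do
  # Hypothetical allowlist of article fields per role.
  @visible %{
    reader: [:title, :published_at],
    editor: [:title, :published_at, :draft_notes]
  }

  # Returns only the requested fields that the given role is allowed to see.
  # Unknown roles see nothing.
  def select(record, requested, role) do
    allowed = Map.get(@visible, role, [])
    Map.take(record, Enum.filter(requested, &(&1 in allowed)))
  end
end

article = %{title: "Hello", published_at: "2020-01-01", draft_notes: "check sources"}

IO.inspect(FieldFilter.select(article, [:title, :draft_notes], :reader))
# %{title: "Hello"}
```

The reader asked for `draft_notes`, but the field is silently dropped because it is not on the reader's allowlist.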

GraphQL uses standard schemas to define a data model, allowing the framework to handle validation without hand coding. It handles filtering, selection and paging in a standard way. It's not necessary to force complex actions into the REST model, as GraphQL supports named operations with well defined parameters.

It also has a standard mechanism to publish real-time event messages between parts of the system, using the same schemas to define the structure. A client can select cases and display them, then subscribe to see new cases as they are created.
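The publish/subscribe idea behind this can be illustrated with Elixir's built-in `Registry`. A real GraphQL server would expose it as a subscription. This sketch only shows the underlying pattern, and the registry name `CaseEvents` and the topic `"cases:new"` are made-up names.

```elixir
# Start a registry that allows many subscribers per topic.
{:ok, _} = Registry.start_link(keys: :duplicate, name: CaseEvents)

# The current process subscribes to the "cases:new" topic.
{:ok, _} = Registry.register(CaseEvents, "cases:new", [])

# A publisher sends the new case to every process subscribed to the topic.
new_case = %{id: 42, status: "open"}

Registry.dispatch(CaseEvents, "cases:new", fn subscribers ->
  for {pid, _value} <- subscribers, do: send(pid, {:new_case, new_case})
end)

# The subscriber receives the event, or gives up after one second.
received =
  receive do
    {:new_case, c} -> c
  after
    1_000 -> :timeout
  end

IO.inspect(received)
```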

Access control

Every request has a user context associated with it, represented by an access token.

On the web, when a user logs into the system, they pass their username and password to the front end, which calls the API to authenticate the user. The back end verifies the credentials and returns the token. The front end stores the token in the user's session and sends it with subsequent requests.

Mobile applications work the same way, calling the same API and storing the token on the device while the session is active.

Rich front end apps running in the browser can talk directly to the GraphQL server, bypassing the front end web server entirely while sharing the login session.

If an attacker compromises the front end machine, then all they can do is execute operations as currently active users. They can only see a small subset of the data, and they lose access when the sessions expire.
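To make the token flow concrete, here is a rough sketch of a signed token with an expiry time, using only Erlang's `:crypto` module. It is an illustration under stated assumptions, not the scheme the article's system uses: the `SessionToken` module, the secret and the token layout are invented for this example, and a production system would use a vetted library and a constant-time signature comparison.

```elixir
defmodule SessionToken do
  # Hypothetical secret; a real deployment would load it from secure configuration.
  @secret "change-me"

  # Issues a token binding a user id to an expiry time (Unix seconds).
  # The user id must not contain "." because "." separates the token parts.
  def issue(user_id, ttl_seconds, now \\ System.system_time(:second)) do
    payload = "#{user_id}.#{now + ttl_seconds}"
    payload <> "." <> sign(payload)
  end

  # Returns {:ok, user_id} for a correctly signed, unexpired token.
  def verify(token, now \\ System.system_time(:second)) do
    with [user_id, expires, signature] <- String.split(token, "."),
         # Note: == is not a constant-time comparison; acceptable only in a sketch.
         true <- signature == sign("#{user_id}.#{expires}"),
         {expires_at, ""} <- Integer.parse(expires),
         true <- now < expires_at do
      {:ok, user_id}
    else
      _ -> {:error, :invalid_token}
    end
  end

  defp sign(payload) do
    :crypto.mac(:hmac, :sha256, @secret, payload) |> Base.url_encode64(padding: false)
  end
end

token = SessionToken.issue("user-17", 3600, 1_000)
IO.inspect(SessionToken.verify(token, 2_000)) # {:ok, "user-17"}
IO.inspect(SessionToken.verify(token, 5_000)) # {:error, :invalid_token}
```

The second call fails because the token expired at time 4600, which mirrors the point above: a stolen token stops working once the session ends.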

Elixir for the win

We use the Absinthe GraphQL server, written in the Elixir programming language. It handles GraphQL queries along with our own custom application logic, combining traditional web development and GraphQL services on the same platform (Phoenix).

Modern stateful web applications use WebSockets or HTTP/2, making the user interface more interactive and powerful. Phoenix Channels let us combine web, mobile and other data sources like IoT using the same system. The Erlang platform can easily handle the load, while staying manageable and secure.

Integration

The GraphQL server provides a common interface to multiple back end servers. We can even make a single query resolve each field to a different back end server, combining the results into one response.
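A simplified sketch of resolving each field from a different back end and merging the results might look like the following. The `PatientSummary` module and its `fetch/2` clauses are hypothetical stand-ins for calls to separate services, and a GraphQL library would normally do this through per-field resolvers rather than a hand-written function.

```elixir
defmodule PatientSummary do
  # Hypothetical back ends; each clause stands in for a call to a separate service.
  def fetch(:profile, id), do: %{id: id, name: "Pat Example"}
  def fetch(:appointments, _id), do: [%{date: "2020-01-15"}]
  def fetch(:open_cases, _id), do: [%{case_id: 9, status: "open"}]

  # Resolves each requested field concurrently and merges the results into one map.
  def resolve(id, fields) do
    fields
    |> Task.async_stream(fn field -> {field, fetch(field, id)} end, timeout: 5_000)
    |> Map.new(fn {:ok, {field, value}} -> {field, value} end)
  end
end

IO.inspect(PatientSummary.resolve(1, [:profile, :open_cases]))
```

Only the requested fields trigger a back end call, so a client that does not ask for appointments never causes that service to be queried.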

When interfacing with a REST back end, we can take advantage of the Repo pattern used by Elixir's Ecto database library, with an adapter that talks HTTP instead of SQL. That fits into the standard Phoenix structure, allowing easy filtering of queries via input parameters. For example:

q = from(i in GitHub.Issue,
         select: {i.title, i.comments},
         where: i.repo == "elixir-ecto/ecto" and
                i.state == "open" and
                "Kind:Feature" in i.labels,
         order_by: [desc: :comments])
Repo.all(q)

The query returns each matching issue's title and comment count:

[{"Introducing Ecto.Multi", 60},
{"Support map update syntax", 14},
{"Create test db from development schema", 9},
{"Provide integration tests with ownership with Hound", 0}]
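
As a rough illustration of what such an HTTP-backed adapter has to do, the filters in a query like the one above could be translated into request parameters for GitHub's issue list endpoint. The `GitHubIssues` module is a hypothetical helper, not part of Ecto or any adapter, and it only builds the URL without sending a request.

```elixir
defmodule GitHubIssues do
  @base "https://api.github.com"

  # Builds the URL for a filtered issue list; it does not perform the request.
  # `filters` is a keyword list such as [state: "open", sort: "comments"].
  def url(repo, filters) do
    "#{@base}/repos/#{repo}/issues?" <> URI.encode_query(filters)
  end
end

IO.puts(
  GitHubIssues.url("elixir-ecto/ecto",
    state: "open",
    labels: "Kind:Feature",
    sort: "comments",
    direction: "desc"
  )
)
```

A full adapter would also send the request, decode the JSON response and map each issue onto the fields named in `select`.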