Thursday, November 27, 2014

#ReadFielding

A Whiskey-Soaked Surprise

A couple of Fridays ago I was hanging out with Drewz, the PM for Cortex, EP's Hypermedia API engine, and we were discussing the topic of REST, as we often do over scotch. Roy Fielding's dissertation came up and we were both surprised to learn that I had never read it. I mean, I've read bits of Chapter 5, but never cover-to-cover. We both thought this was a wrong that must be righted, so I downloaded it to my tablet and started reading that weekend.

Book Review

Now, Ph.D. dissertations are not generally what I like to read, but Roy's writing engages the reader and doesn't dally much. Perhaps it was the Monty Python reference that started it off. I learned quite a bit, and I finally understand Mike Amundsen now when he says "You keep using that word 'REST'. I do not think it means what you think it means." I, along with many others, have conflated the term "REST" with web APIs.

REST is not an API style, but rather the Architectural style of the web. REST is a collection of Architectural constraints that apply to all content and interactions on the internet, not just APIs. These constraints are immediately familiar to web designers (or should be) but are rather foreign to API designers. Having cut my teeth on the internet in the 1990s with dynamic web pages via Perl and /cgi-bin, I always intuitively understood these constraints and strove to live within them in the APIs I develop. My never-ending surprise is how non-obvious these constraints are to server-side developers who haven't written a client.

Backfilling Reality

REST is Fielding's definition of how the internet works, and should continue to work. As an author of the HTTP/1.1 specification, he drove the standardization of what was a highly chaotic mix of competing interests that were trying to establish the future of the tremendously popular phenomenon of "logging on" and checking your email. At that time many people used AOL or CompuServe and thought that was the internet. You had to open a special app to actually "browse" the web itself. At that time the internet was still small enough that Yahoo could categorize every web page by hand.

Through the efforts of the W3C, IETF and various other organizations, the WWW became the internet, and dial-up services ended. One big reason for this was that the experience on the WWW was way better than what one experienced on dial-up services. This superior experience can be attributed to REST: self-contained hyperlinked resource representations that can be cached across the internet, and interactions between them that use a simple, uniform interface.

In defining REST, Fielding lays out four constraints:
  • Resource Identification. The unique identifier of a resource. Basically the URL, but with the notion that it should be semantically consistent over time. For instance, the URL http://rest-apithy.blogspot.ca points to the "current" blog post. The current post will change over time, and that's OK. However, the URL http://rest-apithy.blogspot.ca/2014/07/rest-just-wants-to-be-normal.html points to a specific blog post that should be the same over time. The author of the resource makes this choice and determines the stable URI for this resource. 
  • Resource Representations. A resource can have more than one type of representation. The client and server negotiate what kind of representation (media type) to use. Don't understand HTML? How about XML? How about French HTML? The negotiation of the representation format is governed by a set of rules that try to gracefully degrade to something consumable, but without any back-and-forth between the client and server.
  • Self-Describing Messages. A representation needs no further services once created. You may need special software to render it, like a PDF reader or an HTML renderer, but the content itself is complete.
  • HATEOAS. Representations contain links to related resources. These links are part of the content itself, not a secondary delivery from a link server somewhere. Back In The Day hypermedia systems had a separate link server that you called to find out what links existed for a document. REST representations have them inline, in-context, and self-defining. This means you can make up a link yourself without having to create the linked content first.
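
To make the Resource Representations constraint above a little more concrete, here is a minimal sketch of how a server might pick a media type from a client's Accept header in a single pass, with no back-and-forth. The function names (parse_accept, matches, choose_representation) are made up for illustration, and the matching is deliberately simplified: it honours q-values and wildcards but ignores the specificity rules and the language negotiation (the "French HTML" case) that real HTTP content negotiation also covers.

```python
from typing import List, Optional, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs.append((media_range, q))
    # sorted() is stable, so equal q-values keep the client's own order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def matches(media_range: str, media_type: str) -> bool:
    """True if a media range such as text/* covers a concrete media type."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def choose_representation(accept: str, available: List[str]) -> Optional[str]:
    """Return the first available media type the client will accept, else None."""
    for media_range, q in parse_accept(accept):
        if q <= 0:
            continue
        for media_type in available:
            if matches(media_range, media_type):
                return media_type
    return None  # a real server would typically answer 406 Not Acceptable
```

A client that prefers PDF but will settle for any text format degrades gracefully to HTML: choose_representation("application/pdf, text/*;q=0.5", ["text/html", "application/xml"]) picks "text/html".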


Statelessness


Hypermedia as the engine of application state is a one-liner that comes out in the context of describing the stateless nature of REST. Statelessness is an aspect of self-describing messages, where everything needed to understand a request is in the message itself.
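
As a rough illustration of hypermedia driving application state, the sketch below has a client that knows only an entry URL and a few link relation names, and discovers everything else from links embedded in the representations it receives. The in-memory RESOURCES table, the URLs, the "links" key and the relation names are all hypothetical; they stand in for a real server and a real hypermedia format.

```python
# Hypothetical in-memory "server": URL -> representation with inline links.
RESOURCES = {
    "/": {
        "title": "Blog home",
        "links": {"latest": "/posts/42"},
    },
    "/posts/42": {
        "title": "An example post",
        "links": {"author": "/authors/7", "home": "/"},
    },
    "/authors/7": {
        "name": "Example Author",
        "links": {"home": "/"},
    },
}


def get(url):
    """Stand-in for an HTTP GET; returns the representation at url."""
    return RESOURCES[url]


def follow(start_url, rels):
    """Start at one URL and follow the named link relations in order.

    The client never builds a URL itself; each next URL comes out of the
    representation it is currently holding.
    """
    representation = get(start_url)
    for rel in rels:
        representation = get(representation["links"][rel])
    return representation
```

Calling follow("/", ["latest", "author"]) reaches the author resource without the client ever knowing the "/authors/7" URL in advance, so the server remains free to change its URL layout as long as the link relations stay stable.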

Not surprisingly, Roy clearly despises Cookies:
6.3.4.2 Cookies
An example of where an inappropriate extension has been made to the protocol to support features that contradict the desired properties of the generic interface is the introduction of site-wide state information in the form of HTTP cookies [73]. Cookie interaction fails to match REST’s model of application state, often resulting in confusion for the typical browser application.


My interpretation of Statelessness is that it really means stateful in one place, the system of record. Changes to state are requested by the client via messages paired with state-changing HTTP verbs, and the server responds with the status result of the change. Current state is stored on the client, but with a set of caching rules to ensure that state does not drift into staleness beyond what the system of record will tolerate.
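
One way to picture this interpretation is a conditional GET: the client keeps its own copy of the state plus a validator, and the system of record only says whether that copy is still current. The sketch below is an in-memory simulation under stated assumptions, not real HTTP; the SystemOfRecord and Client classes, the "/cart" URL and the hash-based ETag scheme are invented for illustration, although the status codes and the If-None-Match idea mirror standard HTTP behaviour.

```python
import hashlib


class SystemOfRecord:
    """Hypothetical server: the only place where authoritative state lives."""

    def __init__(self):
        self._state = {}

    def put(self, url, body):
        """Apply a state change and report the result as a status code."""
        created = url not in self._state
        self._state[url] = body
        return 201 if created else 204

    def get(self, url, if_none_match=None):
        """Return (status, etag, body); body is None when the copy is current."""
        body = self._state[url]
        etag = '"' + hashlib.sha256(body.encode("utf-8")).hexdigest()[:16] + '"'
        if if_none_match == etag:
            return 304, etag, None  # Not Modified: the client's copy is fresh
        return 200, etag, body


class Client:
    """Holds current state locally and revalidates it against the server."""

    def __init__(self, server):
        self.server = server
        self.cache = {}  # url -> (etag, body)

    def fetch(self, url):
        cached = self.cache.get(url)
        etag = cached[0] if cached else None
        status, new_etag, body = self.server.get(url, if_none_match=etag)
        if status == 304:
            return cached[1], "revalidated"
        self.cache[url] = (new_etag, body)
        return body, "fetched"
```

Every request carries everything the server needs (the URL and the validator), so the server keeps no per-client session, yet the client's copy cannot silently go stale past the point of the next revalidation.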

Cacheability

I haven't counted, but I suspect that Roy uses words like "efficient", "performance", and "user-perceived" on every page of his dissertation. The key to REST performance is Cacheability. One must think about how representations can be cached and refreshed over time. I cannot think of any REST API frameworks that make this at all easy. I also think this is one of the biggest missing pieces of REST APIs today.

Last year, I would have said that links in representations were the biggest missing piece, but I have been happy to see the emergence of Hypermedia (née HATEOAS) APIs. Now, cacheability is the biggest missing piece. API developers tend not to think about cacheability because API requests are considered RPC calls; they are not.

An API request is really a representation vending event. The representation should have thoughtfully-considered caching semantics that can yield remarkable performance.

Cacheability is hard because one has to consider every resource and make several decisions:
  • Is this resource shared?
  • Is this resource static?
  • If dynamic, how often does it change?
  • How quickly can a state change be determined?
I am working on this topic right now at my current job and I will blog about it in the new year.
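
As a hedged sketch of how the four questions above might translate into something concrete, the function below maps answers to a Cache-Control header value. The ResourceProfile class and the specific policy choices (a one-year lifetime for static resources, no-cache when changes are cheap to detect, no-store otherwise) are assumptions made for illustration; the directives themselves (public, private, max-age, immutable, no-cache, no-store) are standard HTTP.

```python
from dataclasses import dataclass


@dataclass
class ResourceProfile:
    """Hypothetical answers to the cacheability questions for one resource."""
    shared: bool                      # same representation for every user?
    static: bool                      # never changes once published?
    change_interval_seconds: int = 0  # typical time between changes, 0 if unknown
    cheap_to_validate: bool = True    # can a change be detected quickly?


def cache_control(profile: ResourceProfile) -> str:
    """Suggest a Cache-Control value for a resource (illustrative policy only)."""
    scope = "public" if profile.shared else "private"
    if profile.static:
        # Static content can live in caches for a long time.
        return f"{scope}, max-age=31536000, immutable"
    if profile.change_interval_seconds > 0:
        # Dynamic with a known rhythm: cache for about one change interval.
        return f"{scope}, max-age={profile.change_interval_seconds}"
    if profile.cheap_to_validate:
        # Unpredictable but cheap to check: store it, revalidate on every use.
        return f"{scope}, no-cache"
    # Unpredictable and expensive to check: do not cache at all.
    return "no-store"
```

For example, a per-user resource that changes roughly once a minute would get "private, max-age=60", while a shared, never-changing asset would get "public, max-age=31536000, immutable".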

All This and More!

Roy's dissertation covers a lot of ground, far more than I can sufficiently review in a blog post. Some interesting concepts that I am still thinking about are the evaporation of rationale in a realized architecture, how cacheability can lead to Shared Repositories, and how some information, like user identifiers and locale, should never be in the URL.

#ReadFielding For Yourself

I'd like to encourage the reader to take up Fielding's dissertation and read through it with enough persistence to allow your view of REST APIs to shift towards the happy place of the real REST. Share with others and tweet/post/whatever with the #ReadFielding hashtag. I am really interested to see what you learn that I missed.
