albertofem's blog

Personal blog about programming

RESTful webservices under HTTP: Introduction and theoretical foundations

In this series of posts I will introduce the concept of RESTful webservices and explore it in some depth: how the architecture works and how we can use it to make our applications more reliable and efficient.

This series is divided into three parts. This first part introduces the basic theoretical concepts that define the architecture; these concepts are covered widely across the internet, but I will try to put them in context for the forthcoming parts. The second part discusses implementation considerations to take into account, without tying them to a specific programming language, and analyzes real APIs with common use cases. Finally, the last part walks through a practical example, from analysis to implementation, using the PHP programming language and the Symfony framework.

Representational State Transfer

Using a purely theoretical definition, REST is an “architectural style that abstracts the architectural elements within a distributed hypermedia system” [1]. That does not make the concept very clear, does it? So let’s try again in plainer words. REST is a set of principles, ways of doing things, that define how several different and independent components interact and the rules that govern those interactions. One of the most widely used protocols built around these principles is HTTP.

An immediate consequence is that every web application served over HTTP already builds on REST principles. However, this does not mean that every web application is in itself a RESTful webservice. The latter must comply with a specific, well-defined set of requirements in order to be called so. There are also other technologies for implementing web services that you have probably heard of, such as RPC or SOAP (usually described with WSDL). Still, a RESTful approach is often preferred over these technologies because it tends to be easier to understand and implement.

REST constraints

REST is defined by the following constraints, which any application based on this architecture must follow. They are largely met already if we base our applications on a protocol like HTTP (for instance behind a web server such as Apache). However, it is worth taking a closer look at them to get a deeper understanding of the architecture, which will certainly be useful when implementing our applications:

  • Client-server architecture: this basically consists of a clear separation of concerns between the server and the client. These two agents must be independent, which guarantees a high degree of flexibility in their back-and-forth interactions.
  • Stateless: the server must treat each request independently of all others, so every request has to carry all the information needed to process it. This limitation has always been controversial in the industry, and we are starting to see protocols that drop this constraint, like WebSockets.
  • Cacheable: servers should have some way of telling clients whether a response can be cached, in order to improve performance and efficiency. As we shall see, this is used extensively and is a crucial part of serving applications quickly and reliably.
  • Layered system: in other words, clients should not care whether a request passes through multiple layers of a networked system, as long as the interface is consistent.
  • Consistent interfaces: as stated before, every REST webservice must follow a consistent (uniform) interface. This guarantees that it does not matter which machine makes the request and which one sends the response, as long as both agree on a previously defined interface.
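
To make the stateless and cacheable constraints a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Request` and `Response` types, the `HOUSES` store and the `handle` function are hypothetical and not part of any real framework. The point is only that the response depends on nothing but the request and the stored resource, and that the server tells the client how long the response may be reused.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Request:
    method: str
    path: str
    headers: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Response:
    status: int
    headers: dict
    body: str


# Hypothetical in-memory store standing in for a real backend.
HOUSES = {"1": "small house", "2": "big house"}


def handle(request: Request) -> Response:
    """Stateless handler: nothing is remembered between calls."""
    house_id = request.path.rsplit("/", 1)[-1]
    if house_id not in HOUSES:
        # Tell clients not to cache the error response.
        return Response(404, {"Cache-Control": "no-store"}, "not found")
    # Cacheable: the client may reuse this response for 60 seconds.
    return Response(200, {"Cache-Control": "max-age=60"}, HOUSES[house_id])


first = handle(Request("GET", "/resource/house/1"))
second = handle(Request("GET", "/resource/house/1"))
print(first == second)  # identical requests yield identical responses
```

Because `handle` keeps no session data, any server instance behind a load balancer could answer either request, which is also what makes the layered system constraint workable.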

REST and HTTP

After this short introduction to the basic concepts and constraints of REST, we could just dive in and implement our own protocol on top of the REST architecture. However, we don’t want to reinvent the wheel, do we? So we can make use of protocols that already implement REST, like HTTP (see also SPDY [2]). There are four basic principles, or practices, required of any REST webservice:

  • Resource identification: any REST application should identify its resources in a uniform way. HTTP implements this using URIs (Uniform Resource Identifiers). This is what is commonly called a URL, and although there is a subtle difference between a URI and a URL [3], we can safely say that every URL is a URI.
  • Resource representation: REST also specifies the ways in which we can interact with a given resource and its representation, whether to edit it or remove it from the server. These representations are handled by the client program, but HTTP defines some headers that help in processing the response content.
  • Self-descriptive messages: when we make a request to the server, the latter should send a response that lets us understand unambiguously its status, whether it is cacheable or not, any errors that may have happened while processing the request, etc. HTTP implements this using status codes and response headers. How these are used is entirely up to the client. Nothing in REST itself enforces their correct use, so it is very common to find server responses that do not correspond to the real status of the operation, for instance when we receive what is clearly an error response (a blank page or the like) and the status is still 200 OK.
  • HATEOAS (Hypermedia as the Engine of Application State): finally, we need a way to link one resource to other resources and to its actions, so we can delegate the application’s flow to the client. A good HATEOAS webservice allows us to work with it effectively knowing only a fixed entry point. However, few webservices out there put this into practice, so we need to rely on documentation in order to know which actions or resources are related to the one we are making requests to.
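
The HATEOAS idea of working from a single fixed entry point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SERVICE` dictionary is a made-up, in-memory stand-in for a webservice, the URIs and link names are hypothetical, and `get` replaces what would be a real HTTP GET.

```python
# Hypothetical service: each URI maps to a representation plus named links.
SERVICE = {
    "/": {"data": {}, "links": {"houses": "/houses"}},
    "/houses": {"data": {"count": 1}, "links": {"first": "/houses/1"}},
    "/houses/1": {"data": {"id": 1, "rooms": 3}, "links": {"collection": "/houses"}},
}


def get(uri):
    """Stand-in for an HTTP GET; a real client would go over the network."""
    return SERVICE[uri]


def follow(entry_point, relations):
    """Start at the entry point and follow the named links in order."""
    document = get(entry_point)
    for relation in relations:
        # The client never builds URIs itself; it only uses links it was given.
        document = get(document["links"][relation])
    return document["data"]


house = follow("/", ["houses", "first"])
print(house)
```

The client only hard-codes the entry point `/` and the names of the links it wants to follow, so the server could change its URI layout without breaking it.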

RESTful webservices

Knowing these rules, it’s time to look at the details of the HTTP protocol, how it realizes the underlying REST principles and how we can apply them in our RESTful applications:

  • Resource URI: for example, http://api.service.com/resource/house/1 (this would give us the house resource with id 1).
  • Resource type: we can use the HTTP header Content-Type to specify the content type of the response. Clients will then know what to do with the content; for example, if the response carries the header Content-Type: application/json, we know that we need to parse the body as JSON in order to handle it. An easy-to-process format is preferred, but this is entirely up to the server. Common formats are JSON, XML, plain text, or even HTML.
  • Methods: HTTP specifies ways of interacting with a resource, commonly called methods. Some of them are GET, POST, PUT and DELETE, plus non-standard ones such as PURGE that some caching proxies accept. It’s important to understand the differences between them, because the server expects the client to use them correctly, and can even define constraints to allow or disallow any given method (see the 405 Method Not Allowed status code [4]).
  • Hyperlinks: responses can also define hyperlinks to actions or to resources related to the one we are requesting. HTTP defines a specific header, Link [5], that serves this purpose.

Here we have an example of a request made to the GitHub API webservice. We can see how it includes everything we have talked about.
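
As a rough illustration of how a client might read those pieces of a response, the Python sketch below works on hand-written values. The status, headers and body are assumptions invented for this sketch (they were not captured from GitHub or any other real service), and the helpers `media_type`, `parse_links` and `decode` are hypothetical names. The Link parsing is deliberately simplified and only understands the `rel` parameter.

```python
import json
import re

# Hypothetical response, written by hand for this sketch.
STATUS = 200
HEADERS = {
    "Content-Type": "application/json; charset=utf-8",
    "Link": '<http://api.example.com/resource/house?page=2>; rel="next", '
            '<http://api.example.com/resource/house?page=5>; rel="last"',
}
BODY = '[{"id": 1}, {"id": 2}]'


def media_type(content_type):
    """Return the media type without parameters such as charset."""
    return content_type.split(";", 1)[0].strip().lower()


def parse_links(link_header):
    """Map each rel value to its target URI (simplified parsing)."""
    pattern = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')
    return {rel: uri for uri, rel in pattern.findall(link_header)}


def decode(status, headers, body):
    """Use the status code and Content-Type to decide how to read the body."""
    if status == 405:
        allowed = headers.get("Allow", "unknown")
        raise ValueError("method not allowed; allowed methods: " + allowed)
    if media_type(headers["Content-Type"]) == "application/json":
        return json.loads(body)
    return body  # fall back to the raw text for other formats


houses = decode(STATUS, HEADERS, BODY)
links = parse_links(HEADERS["Link"])
print(houses)
print(links["next"])
```

A client written this way needs no out-of-band knowledge about pagination: it follows `links["next"]` until the header no longer offers one.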

Conclusions

In this first part we have introduced the theoretical foundations of the REST architecture and one of its implementations: HTTP. I will dedicate the next part to analyzing a couple of REST APIs in order to get an idea of what we should and shouldn’t implement in ours.

References

[1] http://en.wikipedia.org/wiki/Representational_state_transfer
[2] http://en.wikipedia.org/wiki/SPDY
[3] http://www.ietf.org/rfc/rfc3986.txt (section 1.1.3)
[4] https://tools.ietf.org/html/rfc2616#page-66
[5] http://tools.ietf.org/html/rfc5988#page-6