A dive into HTTP 1.1 message formatting

It’s time to take a moment and think about Hypertext Transfer Protocol (HTTP) message formatting, specifically HTTP 1.1.  To quote Wikipedia’s article on HTTP, “HTTP functions as a request-response protocol in the client-server computing model”. The article also provides an example I find helpful:

In HTTP, a web browser, for example, acts as a client, while an application running on a computer hosting a web site functions as a server. The client submits an HTTP request message to the server. The server, which stores content, or provides resources, such as HTML files, or performs other functions on behalf of the client, returns a response message to the client. A response contains completion status information about the request and may contain any content requested by the client in its message body.

Requests

In general, what does an HTTP request look like? We can see one by requesting yahoo.com via cURL on the command line:

$ curl -v yahoo.com

> GET / HTTP/1.1
> User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
> Host: yahoo.com
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Date: Sat, 22 Jan 2011 20:36:32 GMT
< Location: http://www.yahoo.com/
< Vary: Accept-Encoding
< Content-Type: text/html; charset=utf-8
< Cache-Control: private
< Age: 0
< Transfer-Encoding: chunked
< Connection: keep-alive
< Server: YTS/1.18.5

In the output above, the request follows the request message format defined in the HTTP 1.1 specification (hereafter referred to as “the spec”):

Request = Request-Line
          *(( general-header
            | request-header
            | entity-header ) CRLF)
            CRLF
            [ message-body ]

 

As an aside, I have to draw attention to the usage of carriage return line feed (CRLF) in there. Douglas Crockford puts this in perspective in volume 1 of his lecture series Crockford on JavaScript:


One thing that is odd about ASCII is that it has a carriage return character and a line feed character. This was to model the way that Teletypes actually worked, where the carriage return character would take the print element and push it over to the left. The line feed character would take the platen and spin it one line. So most lines are going to end with going back and rolling the paper, and it took two separate codes to do that. Most timesharing systems didn’t require people to type in both codes — generally they would allow people to hit the return key, and then they would echo the line space key, just because there’s no reason to make people type both characters. Also, other devices don’t work that way. Most other printers of the time would just take a line of text and print it and advance; there was no way to separate the carriage return from the line feed function. So this was a pretty device specific thing.

Most systems who adopted ASCII as their character set chose one or the other. The systems that tended to be more hardware focused in their orientation tended to pick line feed, and the systems that tended to be more human focused tended to pick carriage return, and that was fine until they needed to interoperate. Then you’d have a committee of people, some using line feed, some using carriage return — how do you resolve that? You could just pick one. You could even flip a coin, because it really doesn’t matter. But these committees could not decide. Nobody wanted to be the guy who got it wrong, and nobody wanted to be the guy who had to change, so they came up with a mutually disagreeable compromise, which is: We will always require both. So that’s the way the internet protocols work. We haven’t been using Teletype machines in I don’t know how many years — they’re decades obsolete — but we’re still forcing both sets of control codes to be transmitted in HTTP because of this Teletype heritage.

Returning to the yahoo.com example, the Request-Line is “GET / HTTP/1.1”. The spec defines the components of this line as:

Request-Line = Method SP Request-URI SP HTTP-Version CRLF

The Method is “GET”, the Request-URI is “/” (relative to the domain being called, i.e., we’re requesting the root of yahoo.com), and the HTTP-Version is “HTTP/1.1”.
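To make the grammar concrete, here is a minimal Python sketch that splits a Request-Line into its three components. The names RequestLine and parse_request_line are my own, not from the spec, and the parser is deliberately strict: it assumes exactly one space between components and a trailing CRLF.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    """The three components of a Request-Line (hypothetical helper type)."""
    method: str
    request_uri: str
    http_version: str


def parse_request_line(line: str) -> RequestLine:
    """Split 'Method SP Request-URI SP HTTP-Version CRLF' into its parts."""
    if not line.endswith("\r\n"):
        raise ValueError("Request-Line must end with CRLF")
    # The grammar puts exactly one space (SP) between the three components.
    parts = line[:-2].split(" ")
    if len(parts) != 3:
        raise ValueError(f"expected 3 space-separated parts, got {len(parts)}")
    return RequestLine(*parts)


print(parse_request_line("GET / HTTP/1.1\r\n"))
```

Real servers tend to be more forgiving than this about stray whitespace; the sketch only mirrors the grammar as written.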

We also have a few headers in there: User-Agent, Host, and Accept. The spec describes request-header fields as follows:

The request-header fields allow the client to pass additional information about the request, and about the client itself, to the server. These fields act as request modifiers, with semantics equivalent to the parameters on a programming language method invocation.

All of the headers in our request are request-headers, as opposed to general- or entity-headers. The User-Agent header describes the client making the request. It is optional but helpful for the service receiving the request, hence “User agents SHOULD include this field with requests.” The Host header is required (a “client MUST include a Host header field in all HTTP/1.1 request messages”). The Accept header tells the server which media types are acceptable for the response. In the yahoo.com request, we’re saying all media types are acceptable, i.e., we’ll accept PDFs, HTML, RSS, etc.
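As a rough illustration of how the Request-Line, the headers, and the CRLFs fit together on the wire, here is a small Python sketch that serializes a request. The function name build_request and the User-Agent value are made up for the example; the Host check reflects the MUST quoted above.

```python
def build_request(method: str, uri: str, headers: dict, body: bytes = b"") -> bytes:
    """Serialize a request: Request-Line, header lines, empty line, optional body."""
    # Header names are case-insensitive, so compare in lowercase.
    if "host" not in {name.lower() for name in headers}:
        raise ValueError("HTTP/1.1 requests MUST include a Host header")
    lines = [f"{method} {uri} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends in CRLF; one extra CRLF separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


raw = build_request("GET", "/", {
    "User-Agent": "example-client/0.1",  # hypothetical client name
    "Host": "yahoo.com",
    "Accept": "*/*",
})
print(raw.decode("ascii"))
```

Sending these bytes over a TCP connection to port 80 should be roughly what curl does for us, minus all the error handling.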

We don’t have a message-body, so there’s not much more to say about the request message.

Responses

The format for responses is very similar to that for requests:

Response = Status-Line
           *(( general-header
            | response-header
            | entity-header ) CRLF)
            CRLF
            [ message-body ]

 

In Yahoo!’s response, the Status-Line is “HTTP/1.1 301 Moved Permanently”, which, when broken into its constituents, tells us that the HTTP-Version of the message format is “HTTP/1.1”, the Status-Code is “301”, and the Reason-Phrase is “Moved Permanently”. The spec says “The Status-Code is intended for use by automata and the Reason-Phrase is intended for the human user. The client is not required to examine or display the Reason-Phrase.”
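Breaking a Status-Line into its constituents can be sketched in a couple of lines of Python. The function name parse_status_line is my own; the only subtlety is that the Reason-Phrase may itself contain spaces, so we split at most twice.

```python
def parse_status_line(line: str) -> tuple:
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase CRLF' into its parts."""
    # maxsplit=2 keeps a multi-word Reason-Phrase such as "Moved Permanently" intact.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


print(parse_status_line("HTTP/1.1 301 Moved Permanently\r\n"))
```

This sketch raises ValueError on a line with fewer than three parts, which is probably stricter than most real clients.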

The first digit of the Status-Code communicates the general type of the response:

– 1xx: Informational – Request received, continuing process

– 2xx: Success – The action was successfully received, understood, and accepted

– 3xx: Redirection – Further action must be taken in order to complete the request

– 4xx: Client Error – The request contains bad syntax or cannot be fulfilled

– 5xx: Server Error – The server failed to fulfill an apparently valid request

Specific, pre-defined Status-Codes are described in detail by the spec, but the set of codes is extensible, so services can define their own. For example, Yahoo! and Twitter will return 999 and 420, respectively, for requests exceeding rate limits.
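Because the first digit carries the class, a client can still make sense of a code it has never seen. Here is a small Python sketch of that idea; the names STATUS_CLASSES and status_class are mine, and returning "Unknown" for a first digit outside 1 to 5 is my own choice, since the spec does not define such classes.

```python
# Class names as listed in the spec, keyed by the first digit of the Status-Code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Return the general class of a Status-Code based on its first digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")


for code in (301, 420, 999):
    print(code, status_class(code))
```

Twitter’s 420 falls into the Client Error class, so a client can treat it much like a generic 400, whereas Yahoo!’s 999 sits outside every class the spec defines.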

When a service returns a custom Status-Code unknown to the client, the Reason-Phrase can help a user determine the status of the response. The spec treats its own reason phrases as recommendations that “MAY be replaced by local equivalents without affecting the protocol”, so the Reason-Phrase is effectively arbitrary. Twitter’s Reason-Phrase for 420 made me laugh out loud: Enhance Your Calm. I love web services with a sense of humor.

Yahoo!’s response contained several headers: Date, Location, Vary, Content-Type, Cache-Control, Age, Transfer-Encoding, Connection, and Server.

The Date general-header communicates the time at which the response was generated. I can see how this would be helpful for debugging clock-skew issues in signed requests.
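As a sketch of the clock-skew idea, Python’s standard library can parse the Date header directly, since HTTP dates use the same format as email dates. The function name clock_skew_seconds is hypothetical, and I pass the local time in explicitly so the example is repeatable.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(date_header: str, local_now: datetime) -> float:
    """Return how far the local clock is ahead of the server's Date header."""
    server_time = parsedate_to_datetime(date_header)  # timezone-aware for "GMT"
    return (local_now - server_time).total_seconds()


# Pretend our local clock reads ten seconds later than Yahoo!'s Date header.
now = datetime(2011, 1, 22, 20, 36, 42, tzinfo=timezone.utc)
print(clock_skew_seconds("Sat, 22 Jan 2011 20:36:32 GMT", now))
```

In practice you would pass datetime.now(timezone.utc) and keep in mind that network latency is included in the result.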

The Location response-header “is used to redirect the recipient to a location other than the Request-URI for completion of the request or identification of a new resource”. I most often see Location used with 3xx responses, i.e., redirect to this location, but I recently learned of another use, one that’s actually called out in the spec:

For 201 (Created) responses, the Location is that of the new resource which was created by the request. For 3xx responses, the location SHOULD indicate the server’s preferred URI for automatic redirection to the resource.

This seems intuitive to me. Suppose we make a request to create a new object, e.g., curl -X POST http://example.com/new/resource. It then makes sense that example.com would return 201 with a Location header pointing to the new resource.

The spec states that the Vary response-header “indicates the set of request-header fields that fully determines, while the response is fresh, whether a cache is permitted to use the response to reply to a subsequent request without revalidation”, but its use is still a bit unclear to me. Fortunately, Subbu Allamaraju, author of O’Reilly’s RESTful Web Services Cookbook, posted an informative analysis of the Vary header on his blog. According to his write-up a “server can use this response header to indicate the client of the list of request headers it uses to resolve a given URI to a representation”.

In the yahoo.com example, Yahoo! is telling us that it uses the Accept-Encoding request-header to determine which representation of the resource to return. In other words, requesting yahoo.com gzipped via Accept-Encoding: gzip will result in a different representation of the resource than requesting yahoo.com uncompressed. If a user agent knows this, it can cache the returned resource accordingly.
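One way to picture “cache the returned resource accordingly” is a cache key that includes the values of whichever request headers Vary names. The Python sketch below is my own simplification (the name cache_key is hypothetical), not how any particular cache implements it, and it ignores the special value “*”.

```python
def cache_key(url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key from the URL plus the request headers listed in Vary."""
    # Header names are case-insensitive, so normalize them to lowercase.
    lowered = {name.lower(): value for name, value in request_headers.items()}
    varied_names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    varied = tuple((name, lowered.get(name, "")) for name in varied_names)
    return (url, varied)


gzip_key = cache_key("http://yahoo.com/", {"Accept-Encoding": "gzip"}, "Accept-Encoding")
plain_key = cache_key("http://yahoo.com/", {}, "Accept-Encoding")
print(gzip_key != plain_key)  # the two representations get separate cache entries
```

Headers not listed in Vary, such as User-Agent here, do not affect the key, so requests that differ only in those headers can share a cached response.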

The Content-Type entity-header “indicates the media type of the entity-body sent to the recipient or, in the case of the HEAD method, the media type that would have been sent had the request been a GET.” In our case, Yahoo! sent us back text/html, using the utf-8 charset.
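To pull the media type and charset apart, one option in Python is to lean on the standard library’s email package, since HTTP borrows its header syntax from MIME. This is just a sketch, and the function name parse_content_type is my own.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_type(value: str) -> Tuple[str, Optional[str]]:
    """Return (media type, charset) for a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = value
    # Both accessors return lowercase strings; charset is None when absent.
    return msg.get_content_type(), msg.get_content_charset()


print(parse_content_type("text/html; charset=utf-8"))
```

A client would use the charset to decode the entity-body bytes into text.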

The Cache-Control general-header “is used to specify directives that MUST be obeyed by all caching mechanisms along the request/response chain.” In the yahoo.com response, Cache-Control is set to “private” meaning “all or part of the response message is intended for a single user and MUST NOT be cached by a shared cache”. This makes sense because Yahoo! displays private data on its home page for logged-in users, and we wouldn’t want this content cached and displayed to other users.
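Here is a rough Python sketch of how a shared cache (a proxy, say) might read these directives. Both function names are hypothetical, and the parsing is naive: it splits on commas, so it would mishandle quoted arguments that themselves contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Parse 'name' and 'name=arg' directives into a dict (naive comma split)."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def shared_cache_may_store(value: str) -> bool:
    """A shared cache must not store responses marked private or no-store."""
    directives = parse_cache_control(value)
    return "private" not in directives and "no-store" not in directives


print(shared_cache_may_store("private"))
print(shared_cache_may_store("public, max-age=60"))
```

A browser’s own cache is not a shared cache, so “private” does not stop it from keeping a copy.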

The Age response-header “conveys the sender’s estimate of the amount of time since the response (or its revalidation) was generated at the origin server”, that is, how old the response is. The value is given in seconds, so our response was 0 seconds old.

The Transfer-Encoding general-header “indicates what (if any) type of transformation has been applied to the message body in order to safely transfer it between the sender and the recipient”. Transfer-Encoding differs from Content-Encoding in that the former refers to the transmission while the latter refers to the entity being transmitted. Yahoo!’s response was “chunked”, meaning the message body is transmitted in a series of pieces, as defined by the spec.
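In the chunked coding, each piece is preceded by its size in hexadecimal on its own line, and a chunk of size zero marks the end. The Python sketch below decodes such a body; the name decode_chunked is mine, and it assumes the whole body is already in memory and ignores chunk extensions and trailers.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble a chunked message body into the original bytes."""
    decoded = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        # The chunk-size is hexadecimal; drop any chunk extension after ';'.
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break  # last-chunk; any trailers after it are ignored here
        decoded += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and the CRLF that follows it
    return bytes(decoded)


print(decode_chunked(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"))
```

The sender does not need to know the total length up front, which is why chunking suits dynamically generated pages.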

Connection is a general-header that defines the TCP connection behavior for communication between the client and server. In Yahoo!’s response, the Connection header is set to keep-alive. Maintaining a persistent connection is more efficient than opening and closing connections for each request/response, and HTTP 1.1 made persistence the default behavior, but for backwards compatibility, servers can also send Connection: keep-alive to maintain a connection that would otherwise be closed.
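The decision a client makes after reading a response can be sketched like this in Python. The function name is_persistent is hypothetical, and the HTTP/1.0 branch reflects the common keep-alive convention rather than a hard rule, so treat it as a simplification.

```python
from typing import Optional


def is_persistent(http_version: str, connection_header: Optional[str]) -> bool:
    """Guess whether the connection stays open after this message."""
    tokens = {t.strip().lower() for t in (connection_header or "").split(",")}
    if "close" in tokens:
        return False  # either side can opt out explicitly
    if http_version == "HTTP/1.1":
        return True  # persistent by default in HTTP 1.1
    # Older peers have to opt in with Connection: keep-alive.
    return "keep-alive" in tokens


print(is_persistent("HTTP/1.1", "keep-alive"))
```

For Yahoo!’s response both conditions point the same way: the version is HTTP/1.1 and the header says keep-alive.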

The Server response-header simply identifies the software that handled the request. In Yahoo!’s case, it’s Yahoo! Traffic Server (YTS), a.k.a. Apache Traffic Server.

Conclusion

In short, curl -v and the spec are our friends.  HTTP is a standard for transmitting hypertext and defines things such as request methods and response codes.  HTTP interactions consist of requests and responses.  Requests look something like this:

GET / HTTP/1.1
headers \r\n
\r\n
message-body

 

and responses look something like this:

HTTP/1.1 200 OK
headers \r\n
\r\n
message-body
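To tie the two shapes together, here is a small Python sketch that splits any raw message, request or response, into its start line, headers, and body. The name split_message is my own, and the sketch ignores header lines folded across multiple lines.

```python
def split_message(raw: bytes) -> tuple:
    """Split a raw HTTP message into (start line, header list, body)."""
    # The first empty line (CRLF CRLF) separates the head from the message-body.
    head, _, body = raw.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = []
    for line in header_lines:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return start_line, headers, body


message = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(split_message(message))
```

The same function handles the request shown earlier, because requests and responses differ only in their first line.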