kafsemo.org

Talking HTTP/2

2015-01-08

While talking HTTP/1.1 is possible with nothing but an invocation of telnet (see Talking HTTP/0.9, 1.0, 1.1), HTTP/2 is a binary protocol. Draft 16 has been put forward for Last Call.

Let’s talk it!

HTTP/2 is very different from the established, textual HTTP/1.1, and may not even be supported by the server we’re talking to. If we started with an HTTP/1.1 connection, switching protocols is exactly what the Upgrade header has been waiting around for since it was introduced in 1997. As per HTTP/2 Version Identification, we want to upgrade to h2c-<draft>, or h2-<draft> over TLS. Use of these, rather than the expected (and planned) ‘HTTP/2’ is controversial but part of the standard.

There’s one extra mandatory header (HTTP2-Settings) but, from the description in draft 14, an empty header is a valid way to use defaults.
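Why does empty work? As far as I read the draft, the header value is just the payload of a SETTINGS frame, base64url-encoded with any trailing padding dropped, so no settings means an empty string. Here is a minimal sketch of that; the helper names (http2_settings_value, upgrade_request) are mine, for illustration only.

```python
import base64
import struct

def http2_settings_value(settings):
    """Encode {identifier: value} as an HTTP2-Settings header value."""
    # Each setting is a 16-bit identifier followed by a 32-bit value.
    payload = b''.join(struct.pack('>HI', i, v)
                       for i, v in sorted(settings.items()))
    # base64url, with any trailing '=' padding omitted.
    return base64.urlsafe_b64encode(payload).rstrip(b'=').decode('ascii')

def upgrade_request(host, token, settings=None):
    """Build the bytes of an HTTP/1.1 request asking to upgrade to `token`."""
    lines = [
        'GET / HTTP/1.1',
        'Host: ' + host,
        'Connection: Upgrade, HTTP2-Settings',
        'Upgrade: ' + token,
        # rstrip() so that an empty value leaves a bare "HTTP2-Settings:"
        ('HTTP2-Settings: ' + http2_settings_value(settings or {})).rstrip(),
        '', '',
    ]
    return '\r\n'.join(lines).encode('ascii')

print(upgrade_request('nghttp2.org', 'h2c-14').decode('ascii'))
```

Passing something like {3: 100} instead of nothing would ask for a limit of 100 concurrent streams up front.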

The nghttp2 project has made a server available that understands draft 14 and an Upgrade from HTTP/1.1, on port 80 of nghttp2.org.

Request

GET / HTTP/1.1
Host: nghttp2.org
Connection: Upgrade, HTTP2-Settings
Upgrade: h2c-14
HTTP2-Settings:

Response

HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Upgrade: h2c-14

<binary>
<plaintext response>

Excellent! We’ve just received an HTTP/2 (draft 14) response! The headers are unreadable, because they’re binary, but the plaintext response is clearly visible.
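To pick that apart programmatically, the first job is separating the textual 101 response from the binary frames that follow it. This is a rough sketch with a made-up sample rather than a captured response; split_upgrade_response is a name I have invented for it.

```python
def split_upgrade_response(raw):
    """Split raw bytes into (status_code, headers, remainder)."""
    head, sep, remainder = raw.partition(b'\r\n\r\n')
    if not sep:
        raise ValueError('incomplete HTTP/1.1 response head')
    lines = head.decode('iso-8859-1').split('\r\n')
    # The status line looks like: HTTP/1.1 101 Switching Protocols
    status_code = int(lines[0].split(' ', 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, remainder

# Made-up sample: a 101 head followed by one empty SETTINGS frame.
sample = (b'HTTP/1.1 101 Switching Protocols\r\n'
          b'Connection: Upgrade\r\n'
          b'Upgrade: h2c-14\r\n\r\n'
          b'\x00\x00\x00\x04\x00\x00\x00\x00\x00')
status, headers, frames = split_upgrade_response(sample)
print(status, headers['upgrade'], len(frames))
```

Everything in the remainder is HTTP/2 framing, which is why it looks like noise in a terminal.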

Let’s try the same thing against google.com:80:

Response

HTTP/1.1 400 Bad Request
Content-Type: text/html; charset=UTF-8
Content-Length: 1419
Date: Fri, 02 Jan 2015 09:46:59 GMT
Server: GFE/2.0
Alternate-Protocol: 80:quic,p=0.02

<!DOCTYPE html>
<html lang=en>

Hmm. 400’s not great. Let’s try twitter.com:80:

Response

HTTP/1.1 200 OK

Unlike Google, they’re not rejecting a valid request, but they’re not upgrading to HTTP/2 either.

NPN, ALPN

Although the Upgrade: header can be used to upgrade an HTTP/1.1 connection to HTTP/2, that’s not universally supported: some browsers and servers have decided not to support HTTP/2 over non-TLS connections.

For HTTP/2 over https://, two upgrade mechanisms are implemented at the TLS layer: NPN and ALPN. NPN is already deprecated, ahead of widespread support for ALPN, and the current HTTP/2 draft specifies ALPN (RFC 7301).

As the list of HTTP/2 implementations shows, there’s a mix of support for Upgrade, NPN, ALPN and direct HTTP/2 connections.

For SSL we can’t use telnet anymore; OpenSSL’s s_client is one alternative:

Request

openssl s_client -connect twitter.com:443 -nextprotoneg ''

Response

CONNECTED(00000003)
Protocols advertised by server: h2-15, spdy/3.1, http/1.1

So Twitter supports draft 15 of HTTP/2. We can open a connection with:

openssl s_client -connect twitter.com:443 -nextprotoneg 'h2-15'

But we’re making a direct HTTP/2 request now, rather than making a regular HTTP/1.1 request and asking for an upgrade, so we need to fashion a real binary request.

An HTTP/2 request

First up, a magic fingerprint just to confirm that, even after all the negotiation already, I’m definitely talking HTTP/2:

#!/usr/bin/python3

import struct

prelim = bytes.fromhex('505249202a20485454502f322e300d0a0d0a534d0d0a0d0a')
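Decoding those 24 bytes shows what the fingerprint is: something shaped like an HTTP/1.1 request, using a method (PRI) and a version (HTTP/2.0) that an HTTP/1.1 server shouldn’t accept. A quick check:

```python
prelim = bytes.fromhex('505249202a20485454502f322e300d0a0d0a534d0d0a0d0a')

# repr() keeps the CRLFs visible instead of printing blank lines.
print(repr(prelim.decode('ascii')))

assert prelim == b'PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n'
assert len(prelim) == 24
```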

The rest is a series of frames, payloads with types, lengths and a few flags set. First up, a SETTINGS frame. As before, an empty one is fine:

settings = struct.pack('>IBBI', 0, 0x04, 0, 0)[1:]

(The first length field is 24-bit; here I pack it as 32-bit and then skip the high byte.)

I should also ACK the SETTINGS that the server is going to send:

settingsAck = struct.pack('>IBBI', 0, 0x04, 0x01, 0)[1:]

Here, the 0x01 flag indicates that this is an ACK.
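Both of those follow the same nine-byte layout: 24-bit length, 8-bit type, 8-bit flags, then a reserved bit and a 31-bit stream ID. If the slicing trick feels too cute, here is a sketch of a small helper that spells it out. pack_frame is my own name, not something from a library.

```python
import struct

def pack_frame(frame_type, flags, stream_id, payload=b''):
    """Return a 9-byte HTTP/2 frame header followed by the payload."""
    if len(payload) >= 1 << 24:
        raise ValueError('payload too large for a 24-bit length')
    header = len(payload).to_bytes(3, 'big')      # 24-bit length
    header += struct.pack('>BBI', frame_type, flags,
                          stream_id & 0x7FFFFFFF)  # top bit is reserved
    return header + payload

settings = pack_frame(0x04, 0x00, 0)      # empty SETTINGS
settingsAck = pack_frame(0x04, 0x01, 0)   # SETTINGS with the ACK flag

# Same bytes as the pack-and-slice version.
assert settings == struct.pack('>IBBI', 0, 0x04, 0, 0)[1:]
assert settingsAck == struct.pack('>IBBI', 0, 0x04, 0x01, 0)[1:]
```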

Now, the GET. Since this is a request with no body I simply need to provide the headers, including pseudo-headers for values that would previously have appeared as the first line of an HTTP/1.1 request:

host='twitter.com'

headerDict = {
    ':authority': host,
    ':method': 'GET',
    ':scheme': 'https',
    ':path': '/'
}
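To make the mapping from the old request line explicit, here is a small sketch that derives the same pseudo-headers from an HTTP/1.1 request line plus the Host value. The function name and the default scheme are my own choices for illustration.

```python
def pseudo_headers(request_line, host, scheme='https'):
    """Map an HTTP/1.1 request line and Host value to HTTP/2 pseudo-headers."""
    # The HTTP version has no pseudo-header: it is implied by the protocol.
    method, path, _version = request_line.split(' ')
    return {
        ':method': method,
        ':scheme': scheme,
        ':authority': host,
        ':path': path,
    }

print(pseudo_headers('GET / HTTP/1.1', 'twitter.com'))
```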

The flags here indicate that this is the only header frame and also the entirety of the request:

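# type=0x01 HEADERS
# flags = END_STREAM (0x01) | END_HEADERS (0x04)
# headerPayload is the encoded header block, built by gen_headers() below
flags = 0x01 | 0x04
frame = struct.pack('>IBBI', len(headerPayload), 0x01, flags, 1)[1:] + headerPayload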

The headers are encoded using HPACK, a parallel specification to HTTP/2. It describes an elaborate system of default header values and stateful compression that make it extremely efficient to send very common headers, and repeated headers, along with Huffman encoding for the values.

Luckily, we can ignore all that and send literal headers with no indexing:

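def lengthed_string(s):
  # One length byte with the Huffman bit clear; only valid for strings shorter than 127 bytes
  b = s.encode('us-ascii')
  return struct.pack('B', len(b)) + b

def gen_headers(m):
  h = b''
  for k in m:
    v = m[k]
    # 0x00 = literal header field without indexing, new name
    h = h + struct.pack('B', 0) + lengthed_string(k) + lengthed_string(v)
  return h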

headerPayload = gen_headers(headerDict)
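To see what that produces on the wire, here is a sketch that encodes one header the same way and decodes it again. The decoder only understands this one shortcut (a 0x00 prefix, then plain strings shorter than 127 bytes), so it is nowhere near a real HPACK decoder; decode_literal_headers is a made-up name.

```python
import struct

def lengthed_string(s):
    b = s.encode('us-ascii')
    # A single length byte with the Huffman bit clear only works below 127.
    if len(b) >= 127:
        raise ValueError('string too long for this shortcut')
    return struct.pack('B', len(b)) + b

def decode_literal_headers(block):
    """Decode a block made only of 0x00-prefixed literal headers."""
    headers = []
    pos = 0
    while pos < len(block):
        if block[pos] != 0x00:
            raise ValueError('not a literal-without-indexing, new-name field')
        pos += 1
        pair = []
        for _ in range(2):               # the name, then the value
            n = block[pos]
            if n >= 127:
                raise ValueError('Huffman or multi-byte length not handled')
            pair.append(block[pos + 1:pos + 1 + n].decode('us-ascii'))
            pos += 1 + n
        headers.append(tuple(pair))
    return headers

encoded = b'\x00' + lengthed_string(':method') + lengthed_string('GET')
print(encoded)
print(decode_literal_headers(encoded))
```

Each header costs three bytes of overhead plus the raw text, which is exactly the cost that HPACK’s tables and Huffman coding are designed to avoid.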

Now, put those together into our first HTTP/2 request:

from sys import stdout

stdout.buffer.write(prelim)
stdout.buffer.write(settings)
stdout.buffer.write(settingsAck)
stdout.buffer.write(frame)

Invoke that, and send it over NPN’d SSL, keeping openssl’s stdin open to keep it from exiting:

{ ./send-request.py; cat; } | openssl s_client -connect twitter.com:443 -nextprotoneg 'h2-15' | less

Amidst the SSL debugging output and plaintext payload we see what we were after: a full HTTP/2 response to our HTTP/2 request.
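The same connection could be opened without the shell pipeline using Python’s ssl module. This sketch makes two assumptions that differ from the command above: it negotiates with ALPN rather than NPN, and it defaults to the final h2 token rather than a draft identifier like h2-15, so it only suits servers that speak those. The function names are mine, and nothing here connects until open_connection is called.

```python
import socket
import ssl

def make_context(token='h2'):
    """TLS context that offers a single protocol token via ALPN."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols([token])
    return ctx

def open_connection(host, token='h2', port=443):
    """Connect and return (tls_socket, protocol the server selected or None)."""
    sock = socket.create_connection((host, port))
    tls = make_context(token).wrap_socket(sock, server_hostname=host)
    return tls, tls.selected_alpn_protocol()

# Usage, not run here because it needs a network and an HTTP/2 server:
#   tls, chosen = open_connection('example.org')   # hypothetical host
#   if chosen == 'h2':
#       tls.sendall(prelim + settings + settingsAck + frame)
#       print(tls.recv(65536))
```

If the server doesn’t pick the offered token, selected_alpn_protocol() returns None and there is no point sending the binary request.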

An HTTP/2 response

But sending requests is only half of the web. We want to make sense of what’s being sent back.

It’s not quite a reference implementation, but here’s enough Python to decode frame boundaries. We’ll also go a bit further and show the connection settings that the server wants to use (see Defined SETTINGS parameters for meanings).

#!/usr/bin/python3

import struct
import sys

def decode_SETTINGS(b):
  print(' Settings:')
  while b:
    (i, v) = struct.unpack('>HI', b[:6])
    print('  %d = %d' % (i, v))
    b = b[6:]

b = sys.stdin.buffer.read()

while b:
  (b1, b2, b3) = struct.unpack('BBB', b[:3])
  l = (b1 << 16) | (b2 << 8) | b3
  print('Length: %d' % l)
  (t, f, s) = struct.unpack('>BBI', b[3:9])
  print('Type: %d, flags: %d, stream ID: %d' % (t, f, s))
  payload = b[9:9+l]
  print(payload)

  if t == 0x04:
    decode_SETTINGS(payload)

  b = b[9 + l:]

Response

Length: 6
Type: 4, flags: 0, stream ID: 0
b'\x00\x04\x00\x01\x00\x00'
 Settings:
  4 = 65536
Length: 0
Type: 4, flags: 1, stream ID: 0
b''
 Settings:
Length: 1002
Type: 1, flags: 4, stream ID: 1
b'\x88@\x86\xb9\xdc\xb6 \xc7\xab\x87\xc7\xbf~\xb6\x02\xb8\x7fX\xad\xa8\xeb\x10d
...
c\xc9\x82\x02\xc91~\x89=\x87\xa4\xb0\x07@\x8c\xf2\xb7\x94!j\xec:JD\x98\xf5\x7f\x8a\x0f\xda\x94\x9eB\xc1\x1d\x07\'_'
Length: 2998
Type: 0, flags: 0, stream ID: 1
b'<!DOCTYPE html>\n<!--[if IE 8]><html class="lt-ie10 ie8" lang="en data-scribe-...
atePropagation=function(){};if(i){f.push(a);r("captured",a)}else r("ig'
Length: 7240
Type: 0, flags: 0, stream ID: 1
b'nored",a\n);return!1}function n($){p();for(var a=0,b;b=f[a];a++){var d=$(b.tar
...
ift/en/init.9041729dc08dc4f68fda011758b48149cb878712.js" async></script>\n\n'
Length: 0
Type: 0, flags: 1, stream ID: 1

There are a few things to notice here. SETTINGS_INITIAL_WINDOW_SIZE is being set to 65536, one more than the default of 65535. The empty SETTINGS frame with flag 0x01 is the server’s ACK of our own settings. Then, the headers (Type: 1). Twitter aren’t using the same lazy hack I did, so you’d need a proper HPACK decoder to make sense of them. Then, a number of DATA frames ending with one with END_STREAM set.
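Reading type numbers gets old quickly, so here is a sketch of the same frame walk as a generator that yields type names, masks the reserved bit of the stream ID and stops at a truncated final frame. The type table is from the registry in the published HTTP/2 spec; whether every draft matches it exactly is an assumption on my part.

```python
import struct

FRAME_TYPES = {
    0x0: 'DATA', 0x1: 'HEADERS', 0x2: 'PRIORITY', 0x3: 'RST_STREAM',
    0x4: 'SETTINGS', 0x5: 'PUSH_PROMISE', 0x6: 'PING', 0x7: 'GOAWAY',
    0x8: 'WINDOW_UPDATE', 0x9: 'CONTINUATION',
}

def iter_frames(b):
    """Yield (type_name, flags, stream_id, payload) for each complete frame."""
    pos = 0
    while pos + 9 <= len(b):
        length = int.from_bytes(b[pos:pos + 3], 'big')
        t, f, s = struct.unpack('>BBI', b[pos + 3:pos + 9])
        if pos + 9 + length > len(b):
            break                       # truncated final frame; stop quietly
        name = FRAME_TYPES.get(t, 'UNKNOWN(%d)' % t)
        # The top bit of the stream ID field is reserved, so mask it off.
        yield name, f, s & 0x7FFFFFFF, b[pos + 9:pos + 9 + length]
        pos += 9 + length

# The first two frames of Twitter's response, re-encoded by hand.
sample = (b'\x00\x00\x06\x04\x00\x00\x00\x00\x00' + b'\x00\x04\x00\x01\x00\x00'
          + b'\x00\x00\x00\x04\x01\x00\x00\x00\x00')
for frame in iter_frames(sample):
    print(frame)
```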

How about nghttp2.org? They also support h2-16 over NPN:

openssl s_client -connect nghttp2.org:443 -nextprotoneg h2-16

Response

Length: 12
Type: 4, flags: 0, stream ID: 0
b'\x00\x03\x00\x00\x00d\x00\x04\x00\x00\xff\xff'
 Settings:
  3 = 100
  4 = 65535

They’re also setting SETTINGS_MAX_CONCURRENT_STREAMS to 100; it’s otherwise unlimited, and this is the recommended minimum.

Microsoft’s implementation is already using ALPN, requiring a build of OpenSSL from source:

~/source/openssl/apps/openssl s_client -connect h2duo.cloudapp.net:443 -alpn 'h2-14'

Response

Length: 18
Type: 4, flags: 0, stream ID: 0
b'\x00\x03\x00\x00\x00d\x00\x04\x00\x00\xff\xff\x00\x07\x00\x00\x00\x02'
 Settings:
  3 = 100
  4 = 65535
  7 = 2

Settings type 7 isn’t defined by the spec, but we’ll try to make sure we only use two of them.
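For reference, here is a sketch that puts names to those numbers, using the six identifiers from the published HTTP/2 spec (an assumption as far as any individual draft goes). Anything else, like that 7, is reported as unknown. named_settings is my own helper name.

```python
import struct

SETTING_NAMES = {
    1: 'SETTINGS_HEADER_TABLE_SIZE',
    2: 'SETTINGS_ENABLE_PUSH',
    3: 'SETTINGS_MAX_CONCURRENT_STREAMS',
    4: 'SETTINGS_INITIAL_WINDOW_SIZE',
    5: 'SETTINGS_MAX_FRAME_SIZE',
    6: 'SETTINGS_MAX_HEADER_LIST_SIZE',
}

def named_settings(payload):
    """Return [(name, value), ...] for a SETTINGS frame payload."""
    if len(payload) % 6:
        raise ValueError('SETTINGS payload must be a multiple of 6 bytes')
    out = []
    # Each entry is a 16-bit identifier followed by a 32-bit value.
    for (i, v) in struct.iter_unpack('>HI', payload):
        out.append((SETTING_NAMES.get(i, 'unknown(%d)' % i), v))
    return out

# The payload from the Microsoft server's response.
payload = b'\x00\x03\x00\x00\x00d\x00\x04\x00\x00\xff\xff\x00\x07\x00\x00\x00\x02'
for name, value in named_settings(payload):
    print('%s = %d' % (name, value))
```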

In summary

You can’t talk HTTP/2 by typing and, despite a relatively simple spec, it’s not a weekend hack anymore. Just as you wouldn’t write your own SSL implementation, rolling your own HPACK and HTTP/2 implementations is not really feasible.

In a sense that’s good — widely-used libraries tend to be higher quality. On the downside, anything that increases the barrier to entry can easily reduce diversity.

Far more than HTTP/1.1, HTTP/2 feels specialised. If you’re a large company operating modern web applications for customers on up-to-date browsers, with latency being worth complexity and engineer effort, it’s a win. If you’re after a generic, extensible model with all optimisations left to the appropriate layers in the stack, maybe less so. It’s one of the most notable layering violations since ZFS.

As an engineering effort, it’s ingenious and opinionated. Many people vocally and articulately object to it, on both technical and political grounds. Debate and merits aside, I expect it to improve the browsing experience for the majority of users out there: that’s a good thing, even if it’s not another twenty-year protocol.

(Music: Queens of the Stone Age, “Smooth Sailing”)