Host header#

Every HTTP/1.1 request must include a Host header. This is mandatory.

  • The Host header is the domain name or IP address of the web server, plus the port when it is not the scheme’s default. It is “extracted” from the URL.
  • For example, when accessing https://example.com:8080, the request carries the header Host: example.com:8080.
  • A web server can use the Host header to determine which website to serve. One example is a Content Delivery Network (CDN).
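To make the “extracted from URL” idea concrete, here is a minimal Python sketch of how a client might derive the Host value and how a server might use it to pick a site. The function names, the SITES table, and the directory paths are hypothetical and only illustrate the idea; real clients and servers handle more edge cases.

```python
from urllib.parse import urlsplit

# Default ports are normally left out of the Host header.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_header(url: str) -> str:
    """Build the Host header value a client would send for this URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:
        # urlsplit strips the brackets from IPv6 literals; put them back.
        host = f"[{host}]"
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(parts.scheme):
        return host
    return f"{host}:{port}"


# Hypothetical virtual-host table: Host value -> document root.
SITES = {
    "example.com:8080": "/var/www/example",
    "blog.example.com": "/var/www/blog",
}


def pick_site(host_value: str) -> str:
    """Choose which website to serve based on the Host header."""
    return SITES.get(host_value.lower(), "/var/www/default")


print(host_header("https://example.com:8080"))
print(pick_site(host_header("https://example.com:8080")))
```

The lookup falls back to a default site when the Host value is unknown, which is one common design choice but not the only one.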

CDNs#

CDNs are essentially reverse proxies that cache web pages.

  • When a user requests a page, like /blog/1, the CDN fetches it from the real web server (the origin), then stores the response for a while.
  • Other users who go to /blog/1 are served the stored response from the CDN, so the real web server doesn’t have to work as hard
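The fetch-then-store behaviour can be sketched as a small in-memory cache with an expiry time. This is only an illustration: the class name TinyCdnCache, the fetch_origin callback, and the 60-second default lifetime are assumptions, and a real CDN also honours cache-control headers, cache keys, and invalidation.

```python
import time
from typing import Callable, Dict, Tuple


class TinyCdnCache:
    """Toy caching reverse proxy: serve stored responses until they expire."""

    def __init__(
        self,
        fetch_origin: Callable[[str], str],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_origin      # hypothetical call to the real web server
        self._ttl = ttl_seconds         # how long a stored response stays fresh
        self._clock = clock             # injectable clock, handy for testing
        self._store: Dict[str, Tuple[float, str]] = {}

    def get(self, path: str) -> str:
        now = self._clock()
        cached = self._store.get(path)
        if cached is not None and cached[0] > now:
            return cached[1]            # cache hit: the origin is not contacted
        body = self._fetch(path)        # cache miss or expired: ask the origin
        self._store[path] = (now + self._ttl, body)
        return body
```

With this sketch, two consecutive calls to get("/blog/1") within the lifetime result in only one call to the origin.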

HTTP is Stateless#

HTTP is a stateless protocol

  • This means each request is handled in isolation, with no connection to the previous request
  • To work around that, web servers use cookies to “remember” context, making the protocol pseudo-stateful
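As a rough sketch of how a cookie lets a server “remember” a client across otherwise unrelated requests, the Python below keeps a per-session visit counter. The handle_request function, the session_id cookie name, and the in-memory SESSIONS dict are hypothetical choices for illustration, not how any particular server does it.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Dict, Optional, Tuple

# Hypothetical server-side session store: session id -> remembered context.
SESSIONS: Dict[str, dict] = {}


def handle_request(cookie_header: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (response body, Set-Cookie value or None) for one request."""
    jar = SimpleCookie(cookie_header or "")
    morsel = jar.get("session_id")
    sid = morsel.value if morsel else None

    set_cookie = None
    if sid not in SESSIONS:
        # Unknown client: create a session and ask the browser to store its id.
        sid = secrets.token_hex(16)
        SESSIONS[sid] = {"visits": 0}
        set_cookie = f"session_id={sid}; HttpOnly; Path=/"

    SESSIONS[sid]["visits"] += 1
    return f"visit #{SESSIONS[sid]['visits']}", set_cookie
```

Each request is still processed on its own; the only link between requests is the cookie value the client sends back.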

TCP Socket Reuse#

In addition, HTTP/1.1 can reuse a TCP socket to send multiple requests (a persistent connection).

  • When a TCP socket is reused, multiple requests are sent over the same connection, which means fewer handshakes and less overhead
  • However, TCP is stream-oriented, so the packets are combined into one continuous byte stream
  • HTTP at the application layer only receives the raw data from the TCP stream, so the web server needs a way to tell where each HTTP request ends
  • Web servers use the Content-Length or Transfer-Encoding header to determine the length of each request’s body
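The separation step can be sketched as follows: given the raw bytes read from one reused connection, find the end of each header block, read Content-Length, and cut the body at exactly that many bytes. This is a simplified illustration that assumes Content-Length only (no Transfer-Encoding: chunked) and well-formed input; split_requests is a hypothetical helper name.

```python
from typing import List, Tuple


def split_requests(stream: bytes) -> List[Tuple[bytes, bytes]]:
    """Split a raw byte stream into (head, body) pairs using Content-Length."""
    requests: List[Tuple[bytes, bytes]] = []
    pos = 0
    while pos < len(stream):
        # The header block ends at the first blank line.
        head_end = stream.find(b"\r\n\r\n", pos)
        if head_end == -1:
            break  # incomplete head; a real server would wait for more data
        head = stream[pos:head_end]

        # Look for Content-Length among the header lines (skip the request line).
        length = 0
        for line in head.split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())

        body_start = head_end + 4
        body_end = body_start + length
        if body_end > len(stream):
            break  # incomplete body; wait for more data
        requests.append((head, stream[body_start:body_end]))
        pos = body_end  # the next request starts right after this body
    return requests


raw = (
    b"POST /a HTTP/1.1\r\nHost: example.com\r\nContent-Length: 5\r\n\r\nhello"
    b"GET /b HTTP/1.1\r\nHost: example.com\r\n\r\n"
)
for head, body in split_requests(raw):
    print(head.split(b"\r\n")[0], body)
```

If two systems on the path disagree about where a body ends, they will also disagree about where the next request begins, which is why this length calculation matters so much.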

HTTP 1.1 vs HTTP 2#

HTTP/1.1 is a text-based protocol, while HTTP/2 is binary. HTTP/2 is more efficient

  • HTTP/2 uses a built-in mechanism to specify the length of the request’s body: every frame declares its own length.
  • Because that length is unambiguous, HTTP/2 used end to end removes the length ambiguity that request smuggling attacks rely on
  • In some deployments, HTTP/2 requests are rewritten to HTTP/1.1 by an intermediary system before being forwarded to the web server, which can reintroduce the risk
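To illustrate the built-in length mechanism, here is a small Python sketch that parses the fixed 9-byte HTTP/2 frame header (24-bit payload length, 8-bit type, 8-bit flags, 1 reserved bit plus a 31-bit stream identifier). The helper names are hypothetical, and the sketch ignores everything else a real HTTP/2 implementation does, such as the connection preface, HPACK, and flow control.

```python
from typing import Iterator, Tuple

FRAME_HEADER_SIZE = 9


def parse_frame_header(header: bytes) -> Tuple[int, int, int, int]:
    """Return (payload length, frame type, flags, stream id) from 9 header bytes."""
    if len(header) != FRAME_HEADER_SIZE:
        raise ValueError("an HTTP/2 frame header is exactly 9 bytes")
    length = int.from_bytes(header[0:3], "big")        # 24-bit payload length
    frame_type = header[3]                             # e.g. 0x0 DATA, 0x1 HEADERS
    flags = header[4]
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF  # drop reserved bit
    return length, frame_type, flags, stream_id


def iter_frames(data: bytes) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (frame type, stream id, payload) for each complete frame in data."""
    pos = 0
    while pos + FRAME_HEADER_SIZE <= len(data):
        length, frame_type, _flags, stream_id = parse_frame_header(
            data[pos:pos + FRAME_HEADER_SIZE]
        )
        start = pos + FRAME_HEADER_SIZE
        end = start + length
        if end > len(data):
            break  # incomplete frame; wait for more bytes
        yield frame_type, stream_id, data[start:end]
        pos = end
```

The frame boundary comes from the binary header itself rather than from a text header such as Content-Length, so there is only one place to read the length from.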