
1.2.1. The Request line



Line 1 is the “request line”. It is always composed of 3 fields :

  • a METHOD : GET
  • a URI : /serv/login.php?lang=en&profile=2
  • a version tag : HTTP/1.1

All of them are delimited by what the standard calls LWS (linear white spaces), which are commonly spaces, but can also be tabs or line feeds/carriage returns followed by spaces/tabs. The method itself cannot contain any colon (‘:’) and is limited to alphabetic letters. All those various combinations make it desirable that HAProxy performs the splitting itself rather than leaving it to the user to write a complex or inaccurate regular expression.
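To illustrate the kind of splitting described above, here is a minimal sketch in Python. It is not HAProxy's parser: the helper name `split_request_line` is hypothetical, and the sketch only treats runs of spaces and tabs as delimiters, leaving aside the rarer case of a line break followed by spaces or tabs.

```python
import re

# Any run of spaces or tabs counts as a single delimiter (simplified LWS).
_LWS = re.compile(r"[ \t]+")


def split_request_line(line):
    """Split a request line into (method, uri, version).

    Hypothetical helper for illustration only.
    """
    fields = _LWS.split(line.strip(" \t\r\n"))
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    method, uri, version = fields
    # The text states that the method is limited to alphabetic letters.
    if not (method.isascii() and method.isalpha()):
        raise ValueError(f"invalid method: {method!r}")
    return method, uri, version


print(split_request_line("GET /serv/login.php?lang=en&profile=2 HTTP/1.1"))
# ('GET', '/serv/login.php?lang=en&profile=2', 'HTTP/1.1')
```

A single regular expression written by hand would have to cover the same delimiter variants, which is why having the proxy do the splitting is the safer option.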

The URI itself can have several forms :

  • A “relative URI” :

      /serv/login.php?lang=en&profile=2

    It is a complete URL without the host part. This is generally what is received by servers, reverse proxies and transparent proxies.

  • An “absolute URI”, also called a “URL” :

      http://192.168.0.12:8080/serv/login.php?lang=en&profile=2

    It is composed of a “scheme” (the protocol name followed by ‘://’), a host name or address, optionally a colon (‘:’) followed by a port number, then a relative URI beginning at the first slash (‘/’) after the address part.
    This is generally what proxies receive, but a server supporting HTTP/1.1 must accept this form too.

  • A star (‘*’) :

    This form is only accepted in association with the OPTIONS method and is not reliable. It is used to inquire about a next hop’s capabilities.

  • An address:port combination :

      192.168.0.12:80

    This is used with the CONNECT method, which is used to establish TCP tunnels through HTTP proxies, generally for HTTPS, but sometimes for other protocols too.
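The four forms listed above can be told apart with a few string checks. The following is a rough sketch under simplifying assumptions: the function name `classify_request_uri` and the returned labels are made up for this example, and no validation of the host, scheme or port range is attempted.

```python
def classify_request_uri(uri):
    """Return a label for the form a request URI takes.

    Hypothetical helper; the labels are illustrative, not standard terms.
    """
    if uri == "*":
        return "star"
    if uri.startswith("/"):
        # Checked before "://" so that a query string containing a URL
        # does not make a relative URI look absolute.
        return "relative"
    if "://" in uri:
        return "absolute"
    # Remaining candidate: address:port, as used with CONNECT.
    host, sep, port = uri.rpartition(":")
    if sep and host and port.isascii() and port.isdigit():
        return "address:port"
    return "unknown"


for sample in (
    "/serv/login.php?lang=en&profile=2",
    "http://192.168.0.12:8080/serv/login.php?lang=en&profile=2",
    "*",
    "192.168.0.12:80",
):
    print(sample, "->", classify_request_uri(sample))
```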

In a relative URI, two sub-parts are identified. The part before the question mark is called the “path”. It is typically the relative path to static objects on the server. The part after the question mark is called the “query string”. It is mostly used with GET requests sent to dynamic scripts and is very specific to the language, framework or application in use.
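As a small illustration of the path and query string split, the sketch below cuts a relative URI at the first question mark. The helper name `split_path_query` is hypothetical. The `key=value&key=value` decoding shown with `urllib.parse.parse_qs` is only one common convention; as noted above, the actual format depends on the application.

```python
from urllib.parse import parse_qs


def split_path_query(relative_uri):
    """Split a relative URI into (path, query_string).

    Only the first "?" separates the two; any later "?" stays in the query.
    """
    path, _, query = relative_uri.partition("?")
    return path, query


path, query = split_path_query("/serv/login.php?lang=en&profile=2")
print(path)             # /serv/login.php
print(query)            # lang=en&profile=2
print(parse_qs(query))  # {'lang': ['en'], 'profile': ['2']}
```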


1.2.2. The request headers



The headers start at the second line. They are composed of a name at the beginning of the line, immediately followed by a colon (‘:’). Traditionally, an LWS is added after the colon but that’s not required. Then come the values.

Multiple identical headers may be folded into a single line, delimiting the values with commas, provided that their order is respected. This is commonly encountered in the “Cookie:” field. A header may span multiple lines if the subsequent lines begin with an LWS. In the example in 1.2, lines 4 and 5 define a total of 3 values for the “Accept:” header.
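The equivalence between repeated headers and comma-delimited values can be sketched as follows. This is a simplified illustration, not HAProxy's implementation: `collect_header_values` is a hypothetical name, the input lines are assumed to have their continuation lines already joined, and the naive split on commas ignores quoted strings and values such as dates that legitimately contain commas. The two sample lines are made up to mirror the "Accept:" case mentioned above.

```python
def collect_header_values(lines):
    """Map each header name to the ordered list of all its values.

    Hypothetical helper: repeated headers and comma-separated values on one
    line end up in the same list, in the order they appeared.
    """
    values = {}
    for line in lines:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a header line: {line!r}")
        parts = [p.strip(" \t") for p in value.split(",")]
        # Names are stored lowercased so that repeated headers written with
        # different casing are grouped together.
        values.setdefault(name.lower(), []).extend(p for p in parts if p)
    return values


sample = ["Accept: image/jpeg, image/gif", "Accept: image/png"]
print(collect_header_values(sample))
# {'accept': ['image/jpeg', 'image/gif', 'image/png']}
```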

Contrary to a common misconception, header names are not case-sensitive, and neither are their values when they refer to other header names (such as the “Connection:” header).
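In practice this means that both the header name and, for a header such as “Connection:”, its tokens should be compared without regard to case. A minimal sketch, assuming headers are available as (name, value) pairs; the function name `connection_has_token` is hypothetical:

```python
def connection_has_token(headers, token):
    """Return True if any Connection header carries the given token.

    headers: iterable of (name, value) pairs exactly as received.
    Both the header name and the tokens are compared case-insensitively.
    """
    wanted = token.lower()
    for name, value in headers:
        if name.lower() != "connection":
            continue
        if any(t.strip(" \t").lower() == wanted for t in value.split(",")):
            return True
    return False


received = [("Host", "www.example.com"), ("CONNECTION", "Keep-Alive, Upgrade")]
print(connection_has_token(received, "upgrade"))  # True
print(connection_has_token(received, "close"))    # False
```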

The end of the headers is indicated by the first empty line. People often say that it’s a double line feed, which is not accurate, even if a double line feed is one valid form of empty line.
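The point about the empty line can be made concrete with a short sketch. It assumes a tolerant parser in which a line ends with either CR LF or a bare LF, so an empty line is any line terminator immediately followed by another one. The name `split_head_body` is hypothetical and the sample hosts are made up.

```python
import re

# An empty line: a line terminator immediately followed by another one.
# Each terminator may be CR LF or a bare LF (assumption: tolerant parsing).
_EMPTY_LINE = re.compile(rb"\r?\n\r?\n")


def split_head_body(raw):
    """Split raw request bytes at the first empty line.

    Returns (head, rest), or None if the headers are not complete yet.
    """
    match = _EMPTY_LINE.search(raw)
    if match is None:
        return None
    return raw[:match.start()], raw[match.end():]


print(split_head_body(b"GET / HTTP/1.1\r\nHost: www.example.com\r\n\r\n"))
print(split_head_body(b"GET / HTTP/1.1\nHost: www.example.com\n\n"))
```

Both calls find the end of the headers, although only the second input contains a literal double line feed.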

Fortunately, HAProxy takes care of all these complex combinations when indexing headers, checking values and counting them, so there is no reason to worry about the way they could be written. It is important, however, not to accuse an application of being buggy if it does unusual but valid things.

Important note:
As suggested by RFC 2616, HAProxy normalizes headers by replacing line breaks in the middle of headers with LWS in order to join multi-line headers. This is necessary for proper analysis and helps less capable HTTP parsers to work correctly and not to be fooled by such complex constructs.
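The effect of this normalization can be approximated with one substitution over the header block. The sketch below is only an illustration of the idea and makes no claim about how HAProxy performs it internally: `unfold_headers` is a hypothetical name, and each fold (a line break plus the leading spaces or tabs of the next line) is collapsed into a single space.

```python
import re

# A line break followed by at least one space or tab marks a continuation.
_FOLD = re.compile(r"\r?\n[ \t]+")


def unfold_headers(head):
    """Join multi-line headers by turning each fold into a single space."""
    return _FOLD.sub(" ", head)


folded = "Accept: image/jpeg,\r\n\timage/gif\r\nHost: www.example.com"
print(unfold_headers(folded).split("\r\n"))
# ['Accept: image/jpeg, image/gif', 'Host: www.example.com']
```

Line breaks that are not followed by a space or tab are left alone, so each header still ends up on its own line.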
