
1. Quick reminder about HTTP



When HAProxy is running in HTTP mode, both the request and the response are fully analyzed and indexed, so it becomes possible to build matching criteria on almost anything found in the contents.

However, it is important to understand how HTTP requests and responses are formed, and how HAProxy decomposes them. It will then become easier to write correct rules and to debug existing configurations.


1.1. The HTTP transaction model



The HTTP protocol is transaction-driven. This means that each request will lead to one and only one response. Traditionally, a TCP connection is established from the client to the server, a request is sent by the client on the connection, the server responds and the connection is closed. A new request will involve a new connection :

[CON1] [REQ1] … [RESP1] [CLO1] [CON2] [REQ2] … [RESP2] [CLO2] …

In this mode, called the “HTTP close” mode, there are as many connection establishments as there are HTTP transactions. Since the connection is closed by the server after the response, the client does not need to know the content length.

Due to the transactional nature of the protocol, it was possible to improve it to avoid closing a connection between two subsequent transactions. In this mode however, it is mandatory that the server indicates the content length for each response so that the client does not wait indefinitely. For this, a special header is used: “Content-length”. This mode is called the “keep-alive” mode :

[CON] [REQ1] … [RESP1] [REQ2] … [RESP2] [CLO] …
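
To illustrate why the content length matters in keep-alive mode, here is a minimal Python sketch that separates responses sent back to back on one connection. The function name `split_responses` is hypothetical, and the sketch assumes CRLF line endings, no chunked encoding, and an empty body when the header is absent. It is not how HAProxy itself is implemented.

```python
def split_responses(stream):
    """Split back-to-back responses using each one's Content-length."""
    responses = []
    while stream:
        # The header block ends at the first empty line.
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            raise ValueError("incomplete header block")
        length = 0  # assumption: no header means an empty body
        for line in head.split(b"\r\n")[1:]:  # skip the response line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        if len(rest) < length:
            raise ValueError("incomplete body")
        responses.append((head, rest[:length]))
        # Whatever follows the body belongs to the next response.
        stream = rest[length:]
    return responses
```

Without the length, the receiver could not tell where the first body stops and the second response line starts.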

Its advantages are a reduced latency between transactions, and less processing power required on the server side. It is generally better than the close mode, but not always because the clients often limit their concurrent connections to a smaller value.

A last improvement in the communications is the pipelining mode. It still uses keep-alive, but the client does not wait for the first response to send the second request. This is useful for fetching a large number of images composing a page :

[CON] [REQ1] [REQ2] … [RESP1] [RESP2] [CLO] …

This can obviously have a tremendous benefit on performance because the network latency is eliminated between subsequent requests. Many HTTP agents do not correctly support pipelining since there is no way to associate a response with the corresponding request in HTTP. For this reason, it is mandatory for the server to reply in the exact same order as the requests were received.

By default HAProxy operates in keep-alive mode with regard to persistent connections: for each connection it processes each request and response, and leaves the connection idle on both sides between the end of a response and the start of a new request.

HAProxy supports 5 connection modes :

  • keep alive : all requests and responses are processed (default)
  • tunnel : only the first request and response are processed, everything else is forwarded with no analysis.
  • passive close : tunnel with “Connection: close” added in both directions.
  • server close : the server-facing connection is closed after the response.
  • forced close : the connection is actively closed after end of response.

1.2.1. The Request line



Line 1 of the example request in section 1.2 is the “request line”. It is always composed of 3 fields :

  • a METHOD : GET
  • a URI : /serv/login.php?lang=en&profile=2
  • a version tag : HTTP/1.1

All of them are delimited by what the standard calls LWS (linear white spaces), which are commonly spaces, but can also be tabs or line feeds/carriage returns followed by spaces/tabs. The method itself cannot contain any colon (‘:’) and is limited to alphabetic letters. All those various combinations make it desirable that HAProxy performs the splitting itself rather than leaving it to the user to write a complex or inaccurate regular expression.
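
As an illustration of the splitting described above, here is a minimal Python sketch. The helper name `split_request_line` and the regular expression are assumptions made for this example. It only handles spaces and tabs as delimiters and does not reflect HAProxy's actual parser.

```python
import re

# Hypothetical pattern: alphabetic method, URI without whitespace, version tag.
REQUEST_LINE = re.compile(
    r"^([A-Za-z]+)[ \t]+(\S+)[ \t]+(HTTP/\d+\.\d+)[ \t]*$"
)

def split_request_line(line):
    """Return (method, uri, version), or None if the line is malformed."""
    match = REQUEST_LINE.match(line.rstrip("\r\n"))
    return match.groups() if match else None
```

Even this restricted form shows how easily a hand-written expression misses valid delimiters, which is why leaving the split to HAProxy is preferable.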

The URI itself can have several forms :

  • A “relative URI” : /serv/login.php?lang=en&profile=2

    It is a complete URL without the host part. This is generally what is received by servers, reverse proxies and transparent proxies.

  • An “absolute URI”, also called a “URL” : http://192.168.0.12:8080/serv/login.php?lang=en&profile=2

    It is composed of a “scheme” (the protocol name followed by ‘://’), a host name or address, optionally a colon (‘:’) followed by a port number, then a relative URI beginning at the first slash (‘/’) after the address part.
    This is generally what proxies receive, but a server supporting HTTP/1.1 must accept this form too.

  • A star (‘*’) :

    This form is only accepted in association with the OPTIONS method and is not reliable. It is used to inquire about a next hop’s capabilities.

  • An address:port combination : 192.168.0.12:80

    This is used with the CONNECT method, which is used to establish TCP tunnels through HTTP proxies, generally for HTTPS, but sometimes for other protocols too.

In a relative URI, two sub-parts are identified. The part before the question mark is called the “path”. It is typically the relative path to static objects on the server. The part after the question mark is called the “query string”. It is mostly used with GET requests sent to dynamic scripts and is very specific to the language, framework or application in use.
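
The URI forms and the path/query split can be sketched in Python as follows. The function names and the returned labels are hypothetical, chosen for this example only. The classification is a simplification that trusts the first characters of the URI.

```python
def uri_form(uri):
    """Classify a request URI into one of the four forms listed above."""
    if uri == "*":
        return "star"
    if uri.startswith("/"):
        return "relative"
    if "://" in uri:
        return "absolute"
    # Anything else is assumed to be address:port, as used with CONNECT.
    return "address:port"

def split_relative_uri(uri):
    """Return (path, query_string); the query is empty if there is no '?'."""
    path, _, query = uri.partition("?")
    return path, query
```

For the relative URI used in this section, `split_relative_uri` yields the path `/serv/login.php` and the query string `lang=en&profile=2`.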


1.2.2. The request headers



The headers start at the second line. They are composed of a name at the beginning of the line, immediately followed by a colon (‘:’). Traditionally, an LWS is added after the colon but that’s not required. Then come the values.

Multiple identical headers may be folded into one single line, delimiting the values with commas, provided that their order is respected. This is commonly encountered in the “Cookie:” field. A header may span over multiple lines if the subsequent lines begin with an LWS. In the example in 1.2, lines 4 and 5 define a total of 3 values for the “Accept:” header.

Contrary to a common misconception, header names are not case-sensitive, nor are their values when they refer to other header names (such as the “Connection:” header).

The end of the headers is indicated by the first empty line. People often say that it’s a double line feed, which is not exact, even if a double line feed is one valid form of empty line.

Fortunately, HAProxy takes care of all these complex combinations when indexing headers, checking values and counting them, so there is no reason to worry about the way they could be written, but it is important not to accuse an application of being buggy if it does unusual, valid things.

Important note:
As suggested by RFC2616, HAProxy normalizes headers by replacing line breaks in the middle of headers by LWS in order to join multi-line headers. This is necessary for proper analysis and helps less capable HTTP parsers to work correctly and not to be fooled by such complex constructs.
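
The following Python sketch shows one way to join multi-line headers and index values under case-insensitive names. The function name `parse_headers` is hypothetical. Splitting values on every comma is a simplifying assumption that breaks for headers whose values legitimately contain commas, such as dates. It does not reproduce HAProxy's own indexing.

```python
def parse_headers(lines):
    """Index header values by lower-cased name, in order of appearance."""
    unfolded = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            break  # the first empty line ends the headers
        if line[0] in " \t" and unfolded:
            # Continuation line: join it to the previous header with a space.
            unfolded[-1] += " " + line.strip()
        else:
            unfolded.append(line)
    headers = {}
    for line in unfolded:
        name, _, value = line.partition(":")
        # Naive comma split; see the assumption stated in the lead-in.
        values = [v.strip() for v in value.split(",") if v.strip()]
        headers.setdefault(name.strip().lower(), []).extend(values)
    return headers
```

Applied to the request in section 1.2, the two “Accept:” lines end up as a single entry holding three values in their original order.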


1.2. HTTP request



First, let’s consider this HTTP request :

Line number   Contents
1             GET /serv/login.php?lang=en&profile=2 HTTP/1.1
2             Host: www.mydomain.com
3             User-agent: my small browser
4             Accept: image/jpeg, image/gif
5             Accept: image/png


1.3.1. The Response line



Line 1 of the example response in section 1.3 is the “response line”. It is always composed of 3 fields :

  • a version tag : HTTP/1.1
  • a status code : 200
  • a reason : OK

The status code is always 3-digit. The first digit indicates a general status :

  • 1xx = informational message to be skipped (eg: 100, 101)
  • 2xx = OK, content is following (eg: 200, 206)
  • 3xx = OK, no content following (eg: 302, 304)
  • 4xx = error caused by the client (eg: 401, 403, 404)
  • 5xx = error caused by the server (eg: 500, 502, 503)

Please refer to RFC2616 for the detailed meaning of all such codes. The “reason” field is just a hint, but is not parsed by clients. Anything can be found there, but it’s a common practice to respect the well-established messages. It can be composed of one or multiple words, such as “OK”, “Found”, or “Authentication Required”.
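
As a small illustration, the Python sketch below splits a response line into its three fields and derives the general status from the first digit. The function name `split_status_line` and the class labels are hypothetical shorthand for the list above, not terms defined by HAProxy.

```python
# Shorthand labels for the first digit of the status code (assumed names).
STATUS_CLASSES = {
    "1": "informational",
    "2": "success",
    "3": "redirection",
    "4": "client error",
    "5": "server error",
}

def split_status_line(line):
    """Return (version, code, reason, status_class) for a response line."""
    # The reason may contain spaces, so split at most twice.
    version, code, *rest = line.rstrip("\r\n").split(" ", 2)
    reason = rest[0] if rest else ""
    if len(code) != 3 or not code.isdigit():
        raise ValueError("status code must be exactly 3 digits")
    return version, int(code), reason, STATUS_CLASSES.get(code[0], "unknown")
```

The reason is returned untouched because, as noted above, it is only a hint and clients do not parse it.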

HAProxy may emit the following status codes by itself :

Code   When / reason
200    access to stats page, and when replying to monitoring requests
301    when performing a redirection, depending on the configured code
302    when performing a redirection, depending on the configured code
303    when performing a redirection, depending on the configured code
307    when performing a redirection, depending on the configured code
308    when performing a redirection, depending on the configured code
400    for an invalid or too large request
401    when an authentication is required to perform the action (when accessing the stats page)
403    when a request is forbidden by a "block" ACL or "reqdeny" filter
408    when the request timeout strikes before the request is complete
500    when HAProxy encounters an unrecoverable internal error, such as a memory allocation failure, which should never happen
502    when the server returns an empty, invalid or incomplete response, or when an "rspdeny" filter blocks the response
503    when no server was available to handle the request, or in response to monitoring requests which match the "monitor fail" condition
504    when the response timeout strikes before the server responds

The 4xx and 5xx error codes above may be customized (see “errorloc” in section 4.2).


1.3.2. The response headers



Response headers work exactly like request headers, and as such, HAProxy uses the same parsing function for both. Please refer to paragraph 1.2.2 for more details.


1.3. HTTP response



An HTTP response looks very much like an HTTP request. Both are called HTTP messages. Let’s consider this HTTP response :

Line number   Contents
1             HTTP/1.1 200 OK
2             Content-length: 350
3             Content-Type: text/html

As a special case, HTTP supports so-called “Informational responses” as status codes 1xx. These messages are special in that they don’t convey any part of the response; they are just used as a sort of signaling message, for instance to ask a client to continue to post its request.

In the case of a status 100 response the requested information will be carried by the next non-100 response message following the informational one. This implies that multiple responses may be sent to a single request, and that this only works when keep-alive is enabled (1xx messages are HTTP/1.1 only). HAProxy handles these messages and is able to correctly forward and skip them, and only process the next non-100 response. As such, these messages are neither logged nor transformed, unless explicitly stated otherwise.

Status 101 messages indicate that the protocol is changing over the same connection and that HAProxy must switch to tunnel mode, just as if a CONNECT had occurred. The Upgrade header then contains additional information about the type of protocol the connection is switching to.


2.1. Configuration file format



HAProxy’s configuration process involves 3 major sources of parameters :

  • the arguments from the command-line, which always take precedence
  • the “global” section, which sets process-wide parameters
  • the proxies sections, which can take the form of “defaults”, “listen”, “frontend” and “backend”

The configuration file syntax consists of lines beginning with a keyword referenced in this manual, optionally followed by one or several parameters delimited by spaces.

If spaces have to be entered in strings, then they must be preceded by a backslash (‘\’) to be escaped. Backslashes also have to be escaped by doubling them.
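
The escaping rule can be illustrated with a short Python sketch that splits a line into a keyword and its parameters. The function name `split_config_line` and the sample tokens in the usage note are made up for this example. It covers only the two rules stated above (escaped spaces and doubled backslashes) and is not HAProxy's real configuration parser.

```python
def split_config_line(line):
    """Split a line on unescaped spaces or tabs, honoring backslash escapes."""
    args, current = [], []
    escaped = in_word = False
    for ch in line:
        if escaped:
            # The character after a backslash is taken literally.
            current.append(ch)
            escaped = False
        elif ch == "\\":
            escaped = True
            in_word = True
        elif ch in " \t":
            if in_word:
                args.append("".join(current))
                current, in_word = [], False
        else:
            current.append(ch)
            in_word = True
    if in_word:
        args.append("".join(current))
    return args
```

With this sketch, the made-up line `keyword one\ value C:\\dir` splits into three words: `keyword`, `one value` and `C:\dir`.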


2.2. Time format



Some parameters involve values representing time, such as timeouts. These values are generally expressed in milliseconds (unless explicitly stated otherwise) but may be expressed in any other unit by suffixing the unit to the numeric value.

It is important to consider this because it will not be repeated for every keyword. Supported units are :

  • us : microseconds. 1 microsecond = 1/1000000 second
  • ms : milliseconds. 1 millisecond = 1/1000 second. This is the default.
  • s : seconds. 1s = 1000ms
  • m : minutes. 1m = 60s = 60000ms
  • h : hours. 1h = 60m = 3600s = 3600000ms
  • d : days. 1d = 24h = 1440m = 86400s = 86400000ms
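
The unit table above can be expressed as a small Python sketch. The function name `parse_time_us` is hypothetical. The result is returned in microseconds so that the “us” unit stays an integer, which is a choice made for this example rather than a statement about HAProxy's internal representation.

```python
import re

# Microseconds per unit, derived from the list above.
UNIT_TO_US = {
    "us": 1,
    "ms": 1_000,
    "s": 1_000_000,
    "m": 60_000_000,
    "h": 3_600_000_000,
    "d": 86_400_000_000,
}

def parse_time_us(value, default_unit="ms"):
    """Convert a value such as '10s' or '500' to microseconds."""
    match = re.fullmatch(r"(\d+)(us|ms|s|m|h|d)?", value)
    if match is None:
        raise ValueError("invalid time value: %r" % value)
    number, unit = match.groups()
    # A bare number uses the default unit, milliseconds unless overridden.
    return int(number) * UNIT_TO_US[unit or default_unit]
```

Passing a different `default_unit` models the keywords whose values are not expressed in milliseconds by default.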