7.3.6. Fetching HTTP samples (Layer 7)



It is possible to fetch samples from HTTP contents, requests and responses.
This application layer is also called layer 7. It is only possible to fetch the data in this section when a full HTTP request or response has been parsed from its respective request or response buffer. This is always the case with all HTTP specific rules and for sections running with “mode http”. When using TCP content inspection, it may be necessary to support an inspection delay in order to let the request or response come in first. These fetches may require a bit more CPU resources than the layer 4 ones, but not much since the request and response are indexed.
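
As a minimal sketch of the TCP content inspection case mentioned above, the following fragment waits for a complete request before a layer 7 rule is evaluated. The proxy names, the port and the 5 second delay are assumptions chosen for illustration:

      # hypothetical TCP frontend that only accepts HTTP requests for one host
      frontend ft_inspect
          mode tcp
          bind :8080
          # leave the client some time to send a full request
          tcp-request inspect-delay 5s
          acl is_host_com hdr(Host) -i example.com
          tcp-request content accept if is_host_com
          tcp-request content reject
          default_backend bk_servers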

base : string
This returns the concatenation of the first Host header and the path part of the request, which starts at the first slash and ends before the question mark. It can be useful in virtual hosted environments to detect URL abuses as well as to improve shared cache efficiency. Using this with a limited size stick table also allows one to collect statistics about the most commonly requested objects by host/path. With ACLs it allows simple content switching rules involving the host and the path at the same time, such as “www.example.com/favicon.ico”. See also “path” and “url”.

ACL derivatives :

    base     : exact string match
    base_beg : prefix match
    base_dir : subdir match
    base_dom : domain match
    base_end : suffix match
    base_len : length match
    base_reg : regex match
    base_sub : substring match
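
The content switching use described above could look like the following sketch. The backend name is hypothetical, and the host is matched exactly as sent by the client, so a Host header carrying a port would not match:

      # send one exact host/path pair and one host/path prefix to a dedicated backend
      acl favicon     base     -i www.example.com/favicon.ico
      acl site_images base_beg -i www.example.com/images/
      use_backend bk_static if favicon or site_images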

base32 : integer
This returns a 32-bit hash of the value returned by the “base” fetch method above. This is useful to track per-URL activity on high traffic sites without having to store all URLs. Instead a shorter hash is stored, saving a lot of memory. The output type is an unsigned integer.

base32+src : binary
This returns the concatenation of the base32 fetch above and the src fetch below. The result is of type binary, with a size of 8 or 20 bytes depending on the source address family. This can be used to track per-IP, per-URL counters.
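
A possible way to keep such per-IP, per-URL counters is sketched below. The table size, expiration and threshold are arbitrary assumptions; the key length of 20 bytes is chosen so that both IPv4 and IPv6 sources fit:

      # count requests per source address and per URL over a 10 second window
      stick-table type binary len 20 size 100k expire 1m store http_req_rate(10s)
      tcp-request content track-sc1 base32+src
      acl url_abuse sc1_http_req_rate gt 20
      http-request deny if url_abuse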

capture.req.hdr(<idx>) : string
This extracts the content of the header captured by “capture request header”. <idx> is the position of the capture keyword in the configuration, and the first entry has index 0. See also: “capture request header”.

capture.req.method : string
This extracts the METHOD of an HTTP request. Unlike “method”, it can be used in both request and response because it’s allocated.

capture.req.uri : string
This extracts the request’s URI, which starts at the first slash and ends before the first space in the request (without the host part). Unlike “path” and “url”, it can be used in both request and response because it’s allocated.

capture.req.ver : string
This extracts the request’s HTTP version and returns either “HTTP/1.0” or “HTTP/1.1”. Unlike “req.ver”, it can be used in requests, responses, and logs because it relies on a persistent flag.

capture.res.hdr(<idx>) : string
This extracts the content of the header captured by “capture response header”. <idx> is the position of the capture keyword in the configuration, and the first entry has index 0. See also: “capture response header”.

capture.res.ver : string
This extracts the response’s HTTP version and returns either “HTTP/1.0” or “HTTP/1.1”. Unlike “res.ver”, it can be used in logs because it relies on a persistent flag.
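
Because the capture.* fetches remain valid after the request buffer is gone, they can be used while processing the response. The sketch below assumes a single capture line in the frontend; the response header names are hypothetical:

      # slot 0 is the first "capture request header" line of this frontend
      capture request header Host len 64
      # copy request-side data into the response
      http-response set-header X-Orig-Host   %[capture.req.hdr(0)]
      http-response set-header X-Orig-Method %[capture.req.method]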

req.cook([<name>]) : string
cook([<name>]) : string (deprecated)
This extracts the last occurrence of the cookie name <name> on a “Cookie” header line from the request, and returns its value as a string. If no name is specified, the first cookie value is returned. When used with ACLs, all matching cookies are evaluated. Spaces around the name and the value are ignored as requested by the Cookie header specification (RFC6265). The cookie name is case-sensitive. Empty cookies are valid, so an empty cookie may very well return an empty value if it is present. Use the “found” match to detect presence. Use the res.cook() variant for response cookies sent by the server.

ACL derivatives :

    cook([<name>])     : exact string match
    cook_beg([<name>]) : prefix match
    cook_dir([<name>]) : subdir match
    cook_dom([<name>]) : domain match
    cook_end([<name>]) : suffix match
    cook_len([<name>]) : length match
    cook_reg([<name>]) : regex match
    cook_sub([<name>]) : substring match
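
The presence test and the prefix match could be combined as in this sketch, where the cookie name, its “guest-” prefix and the backend names are assumptions:

      # "found" only checks that the cookie exists, even with an empty value
      acl has_session   req.cook(SESSIONID) -m found
      acl guest_session req.cook(SESSIONID) -m beg guest-
      use_backend bk_login  unless has_session
      use_backend bk_guests if guest_session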

req.cook_cnt([<name>]) : integer
cook_cnt([<name>]) : integer (deprecated)
Returns an integer value representing the number of occurrences of the cookie <name> in the request, or all cookies if <name> is not specified.
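
For instance, a request carrying the same cookie several times could be rejected as follows (the cookie name is hypothetical):

      acl dup_session req.cook_cnt(SESSIONID) gt 1
      http-request deny if dup_session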

req.cook_val([<name>]) : integer
cook_val([<name>]) : integer (deprecated)
This extracts the last occurrence of the cookie name <name> on a “Cookie” header line from the request, and converts its value to an integer which is returned. If no name is specified, the first cookie value is returned. When used in ACLs, all matching names are iterated over until a value matches.

cookie([<name>]) : string (deprecated)
This extracts the last occurrence of the cookie name <name> on a “Cookie” header line from the request, or a “Set-Cookie” header from the response, and returns its value as a string. A typical use is to make multiple clients sharing the same profile use the same server. This can be similar to what “appsession” does with the “request-learn” statement, but with support for multi-peer synchronization and state keeping across restarts. If no name is specified, the first cookie value is returned. This fetch should not be used anymore and should be replaced by req.cook() or res.cook(), because its direction depends ambiguously on the context in which it is used. See also : “appsession”.

hdr([<name>[,<occ>]]) : string
This is equivalent to req.hdr() when used on requests, and to res.hdr() when used on responses. Please refer to these respective fetches for more details. In case of doubt about the fetch direction, please use the explicit ones. Note that contrary to the hdr() sample fetch method, the hdr_* ACL keywords unambiguously apply to the request headers.

req.fhdr(<name>[,<occ>]) : string
This extracts the last occurrence of header <name> in an HTTP request. When used from an ACL, all occurrences are iterated over until a match is found. Optionally, a specific occurrence might be specified as a position number. Positive values indicate a position from the first occurrence, with 1 being the first one. Negative values indicate positions relative to the last one, with -1 being the last one. It differs from req.hdr() in that any commas present in the value are returned and are not used as delimiters. This is sometimes useful with headers such as User-Agent.
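
A short sketch of the User-Agent case, where the pattern “badbot” is purely illustrative:

      # the whole header line is matched as one string, commas included
      acl bad_agent req.fhdr(User-Agent) -m sub -i badbot
      http-request deny if bad_agent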

req.fhdr_cnt([<name>]) : integer
Returns an integer value representing the number of occurrences of request header field name <name>, or the total number of header fields if <name> is not specified. Contrary to its req.hdr_cnt() cousin, this function returns the number of full line headers and does not stop on commas.

req.hdr([<name>[,<occ>]]) : string
This extracts the last occurrence of header <name> in an HTTP request. When used from an ACL, all occurrences are iterated over until a match is found. Optionally, a specific occurrence might be specified as a position number. Positive values indicate a position from the first occurrence, with 1 being the first one. Negative values indicate positions relative to the last one, with -1 being the last one. A typical use is with the X-Forwarded-For header once converted to IP, associated with an IP stick-table. The function considers any comma as a delimiter for distinct values. If full-line headers are desired instead, use req.fhdr(). Please carefully check RFC2616 to know how certain headers are supposed to be parsed. Also, some of them are case insensitive (eg: Connection).

ACL derivatives :

    hdr([<name>[,<occ>]])     : exact string match
    hdr_beg([<name>[,<occ>]]) : prefix match
    hdr_dir([<name>[,<occ>]]) : subdir match
    hdr_dom([<name>[,<occ>]]) : domain match
    hdr_end([<name>[,<occ>]]) : suffix match
    hdr_len([<name>[,<occ>]]) : length match
    hdr_reg([<name>[,<occ>]]) : regex match
    hdr_sub([<name>[,<occ>]]) : substring match
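
These derivatives are commonly used for host-based switching, as in the sketch below. The host names and backend names are assumptions:

      acl host_www    hdr(host)     -i www.example.com
      acl host_static hdr_beg(host) -i static. img.
      use_backend bk_static if host_static
      use_backend bk_www    if host_www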

req.hdr_cnt([<name>]) : integer
hdr_cnt([<name>]) : integer (deprecated)
Returns an integer value representing the number of occurrences of request header field name <name>, or the total number of header field values if <name> is not specified. It is important to remember that one header line may count as several headers if it has several values. The function considers any comma as a delimiter for distinct values. If full-line headers are desired, req.fhdr_cnt() should be used instead. With ACLs, it can be used to detect presence, absence or abuse of a specific header, as well as to block request smuggling attacks by rejecting requests which contain more than one of certain headers. See “req.hdr” for more information on header matching.
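
The absence and abuse checks mentioned above might be written as follows. This is only a sketch; which headers are worth counting depends on the application:

      acl no_host req.hdr_cnt(Host) eq 0
      acl many_cl req.hdr_cnt(Content-Length) gt 1
      http-request deny if no_host or many_cl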

req.hdr_ip([<name>[,<occ>]]) : ip
hdr_ip([<name>[,<occ>]]) : ip (deprecated)
This extracts the last occurrence of header <name> in an HTTP request, converts it to an IPv4 or IPv6 address and returns this address. When used with ACLs, all occurrences are checked, and if <name> is omitted, every value of every header is checked. Optionally, a specific occurrence might be specified as a position number. Positive values indicate a position from the first occurrence, with 1 being the first one. Negative values indicate positions relative to the last one, with -1 being the last one. A typical use is with the X-Forwarded-For and X-Client-IP headers.
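
As an illustration, the last X-Forwarded-For value can be matched against networks. The networks and backend name are assumptions, and the header is only meaningful when it was set by a trusted proxy:

      # -1 selects the last value, i.e. the one appended by the nearest proxy
      acl via_lan req.hdr_ip(X-Forwarded-For,-1) 10.0.0.0/8 192.168.0.0/16
      use_backend bk_internal if via_lan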

req.hdr_val([<name>[,<occ>]]) : integer
hdr_val([<name>[,<occ>]]) : integer (deprecated)
This extracts the last occurrence of header <name> in an HTTP request, and converts it to an integer value. When used with ACLs, all occurrences are checked, and if <name> is omitted, every value of every header is checked. Optionally, a specific occurrence might be specified as a position number. Positive values indicate a position from the first occurrence, with 1 being the first one. Negative values indicate positions relative to the last one, with -1 being the last one. A typical use is with the X-Forwarded-For header.

http_auth(<userlist>) : boolean
Returns a boolean indicating whether the authentication data received from the client matches a username and password stored in the specified userlist. This fetch function is not really useful outside of ACLs. Currently only HTTP basic auth is supported.

http_auth_group(<userlist>) : string
Returns a string corresponding to the user name found in the authentication data received from the client if both the user name and password are valid according to the specified userlist. The main purpose is to use it in ACLs where it is then checked whether the user belongs to any group within a list. This fetch function is not really useful outside of ACLs. Currently only HTTP basic auth is supported.

ACL derivatives :

    http_auth_group(<userlist>) : group …

Returns true when the user extracted from the request, and whose password is valid according to the specified userlist, belongs to at least one of the groups.
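
Both authentication fetches can be combined as in the following sketch. The userlist, user names, passwords, group and backend are all hypothetical, and clear-text passwords are used only to keep the example short:

      userlist staff
          group admins users alice
          user alice insecure-password changeme
          user bob   insecure-password changeme2

      backend bk_admin
          acl auth_ok  http_auth(staff)
          acl is_admin http_auth_group(staff) admins
          # ask for credentials first, then refuse valid users outside the group
          http-request auth realm Restricted unless auth_ok
          http-request deny unless is_admin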

http_first_req : boolean
Returns true when the request being processed is the first one of the connection. This can be used to add or remove headers that may be missing from some requests when a request is not the first one, or to help grouping requests in the logs.
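
One possible use is to tag the first request of each connection so that the server or the logs can group requests. The header name below is an assumption:

      acl first_req http_first_req
      http-request set-header X-First-Request yes if first_req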

method : integer + string
Returns an integer value corresponding to the method in the HTTP request. For example, “GET” equals 1 (check sources to establish the matching). Value 9 means “other method” and may be converted to a string extracted from the stream. This should not be used directly as a sample; it is only meant to be used from ACLs, which transparently convert methods from patterns to these integer + string values. Some predefined ACLs already check for the most common methods.

ACL derivatives :

    method : case insensitive method match

Example :

      # only accept GET and HEAD requests
      acl valid_method method GET HEAD
      http-request deny if ! valid_method

path : string
This extracts the request’s URL path, which starts at the first slash and ends before the question mark (without the host part). A typical use is with prefetch-capable caches, and with portals which need to aggregate multiple information from databases and keep them in caches. Note that with outgoing caches, it would be wiser to use “url” instead. With ACLs, it’s typically used to match exact file names (eg: “/login.php”), or directory parts using the derivative forms. See also the “url” and “base” fetch methods.

ACL derivatives :

    path     : exact string match
    path_beg : prefix match
    path_dir : subdir match
    path_dom : domain match
    path_end : suffix match
    path_len : length match
    path_reg : regex match
    path_sub : substring match
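
A typical combination of the exact, prefix and suffix forms is sketched below. The paths and backend names are assumptions; several “acl” lines sharing a name are ORed together:

      acl is_login  path     /login.php
      acl is_static path_beg /static/ /images/
      acl is_static path_end .css .js .png
      use_backend bk_static if is_static
      use_backend bk_auth   if is_login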

req.ver : string
req_ver : string (deprecated)
Returns the version string from the HTTP request, for example “1.1”. This can be useful for logs, but is mostly there for ACLs. Some predefined ACLs already check for versions 1.0 and 1.1.

ACL derivatives :

    req_ver : exact string match

res.comp : boolean
Returns the boolean “true” value if the response has been compressed by HAProxy, otherwise returns boolean “false”. This may be used to add information in the logs.

res.comp_algo : string
Returns a string containing the name of the algorithm used if the response was compressed by HAProxy, for example : “deflate”. This may be used to add some information in the logs.

res.cook([<name>]) : string
scook([<name>]) : string (deprecated)
This extracts the last occurrence of the cookie name <name> on a “Set-Cookie” header line from the response, and returns its value as a string. If no name is specified, the first cookie value is returned.

ACL derivatives :

    scook([<name>]) : exact string match
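
The response and request cookie fetches are often paired to implement stickiness on an application cookie. In this sketch the cookie name, table sizing, server names and addresses are assumptions:

      backend bk_app
          stick-table type string len 64 size 100k expire 30m
          # learn the cookie value when the server sets it
          stick store-response res.cook(JSESSIONID)
          # send later requests carrying the same value to the same server
          stick match req.cook(JSESSIONID)
          server app1 192.0.2.11:8080 check
          server app2 192.0.2.12:8080 check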

res.cook_cnt([<name>]) : integer
scook_cnt([<name>]) : integer (deprecated)
Returns an integer value representing the number of occurrences of the cookie <name> in the response, or all cookies if <name> is not specified. This is mostly useful when combined with ACLs to detect suspicious responses.

res.cook_val([<name>]) : integer
scook_val([<name>]) : integer (deprecated)
This extracts the last occurrence of the cookie name <name> on a “Set-Cookie” header line from the response, and converts its value to an integer which is returned. If no name is specified, the first cookie value is returned.

res.fhdr([<name>[,<occ>]]) : string
This extracts the last occurrence of header <name> in an HTTP response, or of the last header if no <name> is specified. Optionally, a specific occurrence might be specified as a position number. Positive values indicate a position from the first occurrence, with 1 being the first one. Negative values indicate positions relative to the last one, with -1 being the last one. It differs from res.hdr() in that any commas present in the value are returned and are not used as delimiters. If this is not desired, the res.hdr() fetch should be used instead. This is sometimes useful with headers such as Date or Expires.

res.fhdr_cnt([<name>]) : integer
Returns an integer value representing the number of occurrences of response header field name <name>, or the total number of header fields if <name> is not specified. Contrary to its res.hdr_cnt() cousin, this function returns the number of full line headers and does not stop on commas. If this is not desired, the res.hdr_cnt() fetch should be used instead.

res.hdr([<name>[,<occ>]]) : string
shdr([<name>[,<occ>]]) : string (deprecated)
This extracts the last occurrence of header <name> in an HTTP response, or of the last header if no <name> is specified. Optionally, a specific occurrence might be specified as a position number. Positive values indicate a position from the first occurrence, with 1 being the first one. Negative values indicate positions relative to the last one, with -1 being the last one. This can be useful to learn some data into a stick-table. The function considers any comma as a delimiter for distinct values. If this is not desired, the res.fhdr() fetch should be used instead.

ACL derivatives :

    shdr([<name>[,<occ>]])     : exact string match
    shdr_beg([<name>[,<occ>]]) : prefix match
    shdr_dir([<name>[,<occ>]]) : subdir match
    shdr_dom([<name>[,<occ>]]) : domain match
    shdr_end([<name>[,<occ>]]) : suffix match
    shdr_len([<name>[,<occ>]]) : length match
    shdr_reg([<name>[,<occ>]]) : regex match
    shdr_sub([<name>[,<occ>]]) : substring match

res.hdr_cnt([<name>]) : integer
shdr_cnt([<name>]) : integer (deprecated)
Returns an integer value representing the number of occurrences of response header field name <name>, or the total number of header fields if <name> is not specified. The function considers any comma as a delimiter for distinct values. If this is not desired, the res.fhdr_cnt() fetch should be used instead.

res.hdr_ip([<name>[,<occ>]]) : ip
shdr_ip([<name>[,<occ>]]) : ip (deprecated)
This extracts the last occurrence of header <name> in an HTTP response, converts it to an IPv4 or IPv6 address and returns this address. Optionally, a specific occurrence might be specified as a position number. Positive values indicate a position from the first occurrence, with 1 being the first one. Negative values indicate positions relative to the last one, with -1 being the last one. This can be useful to learn some data into a stick table.

res.hdr_val([<name>[,<occ>]]) : integer
shdr_val([<name>[,<occ>]]) : integer (deprecated)
This extracts the last occurrence of header <name> in an HTTP response, and converts it to an integer value. Optionally, a specific occurrence might be specified as a position number. Positive values indicate a position from the first occurrence, with 1 being the first one. Negative values indicate positions relative to the last one, with -1 being the last one. This can be useful to learn some data into a stick table.

res.ver : string
resp_ver : string (deprecated)
Returns the version string from the HTTP response, for example “1.1”. This can be useful for logs, but is mostly there for ACL.

ACL derivatives :

    resp_ver : exact string match

set-cookie([<name>]) : string (deprecated)
This extracts the last occurrence of the cookie name <name> on a “Set-Cookie” header line from the response and uses the corresponding value to match. This can be comparable to what “appsession” does with default options, but with support for multi-peer synchronization and state keeping across restarts.

This fetch function is deprecated and has been superseded by the “res.cook” fetch. This keyword will disappear soon.

See also : “appsession”

status : integer
Returns an integer containing the HTTP status code in the HTTP response, for example, 302. It is mostly used within ACLs and integer ranges, for example, to remove any Location header if the response is not a 3xx.
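
The Location example mentioned above can be sketched with an integer range:

      acl is_redirect status 300:399
      http-response del-header Location unless is_redirect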

url : string
This extracts the request’s URL as presented in the request. A typical use is with prefetch-capable caches, and with portals which need to aggregate multiple information from databases and keep them in caches. With ACLs, using “path” is preferred over using “url”, because clients may send a full URL as is normally done with proxies. The only real use is to match “*” which does not match in “path”, and for which there is already a predefined ACL. See also “path” and “base”.

ACL derivatives :

    url     : exact string match
    url_beg : prefix match
    url_dir : subdir match
    url_dom : domain match
    url_end : suffix match
    url_len : length match
    url_reg : regex match
    url_sub : substring match

url_ip : ip
This extracts the IP address from the request’s URL when the host part is presented as an IP address. Its use is very limited. For instance, a monitoring system might use this field as an alternative for the source IP in order to test what path a given source address would follow, or to force an entry in a table for a given source address. With ACLs it can be used to restrict access to certain systems through a proxy, for example when combined with option “http_proxy”.

url_port : integer
This extracts the port part from the request’s URL. Note that if the port is not specified in the request, port 80 is assumed. With ACLs it can be used to restrict access to certain systems through a proxy, for example when combined with option “http_proxy”.
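
The access restriction described for url_ip and url_port might look like this sketch, where the network and the allowed ports are assumptions:

      option http_proxy
      acl dst_internal url_ip   10.0.0.0/8
      acl dst_web      url_port 80 8080
      # refuse internal destinations and any port outside the allowed set
      http-request deny if dst_internal
      http-request deny unless dst_web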

urlp(<name>[,<delim>]) : string
url_param(<name>[,<delim>]) : string
This extracts the first occurrence of the parameter <name> in the query string, which begins after either ‘?’ or <delim>, and which ends before ‘&’, ‘;’ or <delim>. The parameter name is case-sensitive. The result is a string corresponding to the value of the parameter <name> as presented in the request (no URL decoding is performed). This can be used for session stickiness based on a client ID, to extract an application cookie passed as a URL parameter, or in ACLs to apply some checks. Note that the ACL version of this fetch does not iterate over multiple parameters and stops at the first one as well.

ACL derivatives :

    urlp(<name>[,<delim>])     : exact string match
    urlp_beg(<name>[,<delim>]) : prefix match
    urlp_dir(<name>[,<delim>]) : subdir match
    urlp_dom(<name>[,<delim>]) : domain match
    urlp_end(<name>[,<delim>]) : suffix match
    urlp_len(<name>[,<delim>]) : length match
    urlp_reg(<name>[,<delim>]) : regex match
    urlp_sub(<name>[,<delim>]) : substring match

Example :

      # match http://example.com/foo?PHPSESSIONID=some_id
      stick on urlp(PHPSESSIONID)
      # match http://example.com/foo;JSESSIONID=some_id
      stick on urlp(JSESSIONID,;)

urlp_val(<name>[,<delim>]) : integer
See “urlp” above. This one extracts the URL parameter <name> in the request and converts it to an integer value. This can be used for session stickiness based on a user ID for example, or with ACLs to match a page number or price.
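
For example, a page number could be bounded as follows. The parameter name and the limit are assumptions:

      # refuse requests such as /list?page=5000
      acl deep_page urlp_val(page) gt 1000
      http-request deny if deep_page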
