6. HTTP header manipulation

In HTTP mode, it is possible to rewrite, add or delete some of the request and response headers based on regular expressions. It is also possible to block a request or a response if a particular header matches a regular expression, which is enough to stop most elementary protocol attacks and to protect against information leaks from the internal network.
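
As a rough sketch of what such rules look like, the two lines below refuse requests aimed at an internal host name and hide the server software from clients. The “.local” suffix is only an assumed naming convention for internal hosts, used here for illustration.

```
# refuse any request whose Host header contains ".local", ignoring case
reqideny ^Host:\ .*\.local

# delete the Server header from every response
rspidel  ^Server:.*
```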

If HAProxy encounters an “Informational Response” (status code 1xx), it still processes all rsp* rules, which can allow, deny, rewrite or delete a header, but it refuses to add a header to any such message because this is not HTTP-compliant. Headers in these responses are still processed in order to stop and/or fix any possible information leak, for instance when a downstream device unconditionally adds a header, or when a server name appears there. When such messages are seen, normal processing still occurs on the next non-informational messages.

This section covers common usage of the following keywords, described in detail in section 4.2 :

  • reqadd <string>
  • reqallow <search>
  • reqiallow <search>
  • reqdel <search>
  • reqidel <search>
  • reqdeny <search>
  • reqideny <search>
  • reqpass <search>
  • reqipass <search>
  • reqrep <search> <replace>
  • reqirep <search> <replace>
  • reqtarpit <search>
  • reqitarpit <search>
  • rspadd <string>
  • rspdel <search>
  • rspidel <search>
  • rspdeny <search>
  • rspideny <search>
  • rsprep <search> <replace>
  • rspirep <search> <replace>

With all these keywords, the same conventions are used. The <search> parameter is a POSIX extended regular expression (regex) which supports grouping through parentheses (without the backslash). Spaces and other delimiters must be prefixed with a backslash (‘\’) to avoid confusion with a field delimiter. Other characters may be prefixed with a backslash to change their meaning :

  Character   Description
  \t          for a tab
  \r          for a carriage return (CR)
  \n          for a new line (LF)
  \           to mark a space and differentiate it from a delimiter
  \#          to mark a sharp and differentiate it from a comment
  \\          to use a backslash in a regex
  \\\\        to use a backslash in the text (*2 for regex, *2 for haproxy)
  \xXX        to write the ASCII hex code XX as in the C language
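
The lines below are a small sketch of how some of these escapes appear in practice. The header names (Cookie, X-Debug) and the SERVER cookie are assumptions chosen for the example, not names required by HAProxy.

```
# "\ " is a literal space: without the backslash, the regex would end
# right after the colon
reqidel ^Cookie:\ .*SERVER=

# "\#" is a literal sharp: without the backslash, the rest of the line
# would be read as a comment
reqidel ^X-Debug:\ \#.*

# "\x3D" is the character with hex code 3D, that is "=", written as in C
reqidel ^Cookie:.*SERVER\x3D
```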

The <replace> parameter contains the string to be used to replace the largest portion of text matching the regex. It can make use of the special characters above, and can reference a substring which is delimited by parentheses in the regex, by writing a backslash (‘\’) immediately followed by one digit from 0 to 9 indicating the group position (0 designating the entire line). This practice is familiar to users of the “sed” program.
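
A minimal sketch of group references in a replacement string follows. The “/static/” path prefix and the “www.mydomain.com” host name are placeholders for illustration only.

```
# replace "/static/" with "/" at the beginning of any request path:
# \1 is the method captured by the first group, \2 the rest of the line
reqrep  ^([^\ :]*)\ /static/(.*)     \1\ /\2

# replace "www.mydomain.com" with "www" in the Host header, ignoring case
reqirep ^Host:\ www.mydomain.com     Host:\ www
```

With the first rule, a request line such as “GET /static/logo.png HTTP/1.1” would be expected to become “GET /logo.png HTTP/1.1”.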

The <string> parameter represents the string which will systematically be added after the last header line. It can also use the special character sequences above.
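
For instance, the following sketch appends one header to each request and one to each response. The header names X-Proto and X-Via are illustrative choices, not names that HAProxy itself defines.

```
# append "X-Proto: SSL" after the last header of each request
reqadd X-Proto:\ SSL

# append "X-Via: haproxy" after the last header of each response
rspadd X-Via:\ haproxy
```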

Notes related to these keywords :

  • these keywords are not always convenient to allow/deny based on header contents. It is strongly recommended to use ACLs with the “block” keyword instead, resulting in far more flexible and manageable rules.
  • lines are always considered as a whole. It is not possible to reference a header name only or a value only. This is important because of the way headers are written (notably the number of spaces after the colon).
  • the first line is always considered as a header, which makes it possible to rewrite or filter HTTP request URIs or response codes, but in turn makes it harder to distinguish between headers and the request line. The regex prefix ^[^\ \t]*[\ \t] matches any HTTP method followed by a space, and the prefix ^[^ \t:]*: matches any header name followed by a colon.
  • for performance reasons, the number of characters added to a request or to a response is limited at build time to values between 1 and 4 kB. This should normally be far more than enough for most usages. If it is occasionally too short, it is possible to gain some space by removing some useless headers before adding new ones.
  • keywords beginning with “reqi” and “rspi” are the same as their counterparts without the ‘i’ letter except that they ignore case when matching patterns.
  • when a request passes through a frontend then a backend, all req* rules from the frontend will be evaluated, then all req* rules from the backend will be evaluated. The reverse path is applied to responses.
  • req* statements are applied after “block” statements, so that “block” is always the first one, but before “use_backend” in order to permit rewriting before switching.
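
The notes above can be put together in a short sketch. It is not a complete or recommended configuration: the section names, ACL names, paths and server addresses are all hypothetical, and timeouts and other settings are left out for brevity.

```
frontend www
    bind :80
    mode http

    # ACL-based filtering with "block", evaluated before any req* rule
    acl local_host hdr_end(host) -i .local
    block if local_host

    # frontend req* rule: the pattern starts with the method and the
    # space that follows it, so only the request line can match
    reqrep ^([^\ \t]*)[\ \t]/static/(.*)    \1\ /\2

    # responses: frontend rsp* rules run after those of the backend
    rspadd X-Via:\ haproxy

    # "use_backend" is evaluated after the req* rules above
    acl is_api path_beg /api
    use_backend api if is_api
    default_backend web

backend api
    mode http
    # backend req* rules run after those of the frontend
    reqidel ^X-Forwarded-For:.*
    # responses: backend rsp* rules run first
    rspidel ^Server:.*
    server api1 192.0.2.10:8080

backend web
    mode http
    server web1 192.0.2.11:8080
```

Using an ACL with “block” for the host check, rather than reqideny, follows the recommendation in the first note, while the regex-based rules are kept for the rewriting and header removal they are suited to.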