
7.1. ACL basics

The use of Access Control Lists (ACL) provides a flexible solution to perform content switching and, more generally, to make decisions based on content extracted from the request, the response or any environmental status. The principle is simple :

  • extract a data sample from a stream, table or the environment
  • optionally apply some format conversion to the extracted sample
  • apply one or multiple pattern matching methods on this sample
  • perform actions only when a pattern matches the sample

The actions generally consist of blocking a request, selecting a backend, or adding a header.
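
As a minimal sketch of these steps and of the three typical actions, the fragment below assumes an HTTP frontend. The names fe_main, be_api and be_web, the 10.0.0.0/8 network and the X-Internal header are hypothetical and only serve as illustration; the backend sections are omitted :

    frontend fe_main
        mode http
        bind :80
        # extract the request path and match it against a prefix
        acl is_api        path_beg /api/
        # extract the client source address and match it against a network
        acl from_internal src 10.0.0.0/8
        # block a request
        http-request deny if is_api !from_internal
        # add a header
        http-request set-header X-Internal 1 if from_internal
        # select a backend
        use_backend be_api if is_api
        default_backend be_web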

In order to define a test, the “acl” keyword is used. The syntax is :

    acl <aclname> <criterion> [flags] [operator] [<value>] ...

This creates a new ACL or completes an existing one with new tests.
Those tests apply to the portion of request/response specified in <criterion> and may be adjusted with optional flags [flags]. Some criteria also support an operator which may be specified before the set of values. Optionally, some conversion operators may be applied to the sample; they are specified as a comma-delimited list of keywords just after the first keyword. The values are of the type supported by the criterion, and are separated by spaces.
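
The following declarations, given only as an illustrative sketch, show where each part of the syntax goes. The ACL names and the values are hypothetical :

    # <criterion> followed by several space-separated values
    acl is_static   path_end .css .js .png
    # [flags] : "-i" makes the patterns that follow case-insensitive
    acl is_image    path_end -i .jpg .gif
    # [operator] : integer criteria accept operators such as "gt"
    acl high_port   dst_port gt 1024
    # converter : "lower" is applied to the sample before it is matched
    acl is_example  hdr(host),lower -m str example.com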

ACL names must be formed from upper and lower case letters, digits, ‘-’ (dash), ‘_’ (underscore), ‘.’ (dot) and ‘:’ (colon). ACL names are case-sensitive, which means that “my_acl” and “My_Acl” are two different ACLs.

There is no enforced limit to the number of ACLs. The unused ones do not affect performance; they just consume a small amount of memory.
The criterion generally is the name of a sample fetch method, or one of its ACL-specific variants. The default test method is implied by the output type of this sample fetch method. The ACL variants can describe alternate matching methods of the same sample fetch method. The sample fetch methods are the only ones supporting a conversion.

Sample fetch methods return data which can be of the following types :

  • boolean
  • integer (signed or unsigned)
  • IPv4 or IPv6 address
  • string
  • data block

Converters can transform a sample of any of these types into any of these types. For example, some converters might convert a string to a lower-case string while other ones would turn a string into an IPv4 address, or apply a netmask to an IP address.
The resulting sample is of the type returned by the last converter in the list, which defaults to the type of the sample fetch method when no converter is used.
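
As an illustrative sketch, each declaration below applies a single converter, so the type of the resulting sample is the output type of that converter. The ACL names and the patterns are hypothetical, and the availability of the "lower" and "ipmask" converters is assumed for the HAProxy version in use :

    # string -> string : the Host header is lowercased before the prefix match
    acl is_www    hdr(host),lower -m beg www.
    # ip -> ip : a /24 netmask is applied to the source address before the match
    acl same_net  src,ipmask(24) 192.168.1.0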

Each sample or converter returns data of a specific type, specified with its keyword in this documentation. When an ACL is declared using a standard sample fetch method, certain types automatically imply a default matching method. These are summarized in the table below :

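+---------------------------------+-------------------------+
| Sample or converter output type | Default matching method |
+---------------------------------+-------------------------+
| boolean                         | bool                    |
| integer                         | int                     |
| ip                              | ip                      |
| string                          | str                     |
| binary                          | none, use "-m"          |
+---------------------------------+-------------------------+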

Note that in order to match a binary sample, it is mandatory to specify a matching method, see below.
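
A short sketch of the difference, using hypothetical ACL names : the first two declarations rely on the default matching method implied by the sample type, while the binary sample has to name its method explicitly.

    # integer sample : "int" matching is implied
    acl is_tls_port  dst_port 443
    # ip sample : "ip" matching is implied
    acl is_local     src 127.0.0.1
    # binary sample : no default, so "-m bin" states how to match it.
    # 16 is the hexadecimal value of the first byte of a TLS handshake record.
    acl tls_record   payload(0,1) -m bin 16

Matching on the payload assumes that the data is already present in the buffer, which in practice usually means using the ACL from content inspection rules.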

The ACL engine can match these types against patterns of the following types :

  • boolean
  • integer or integer range
  • IP address / network
  • string (exact, substring, suffix, prefix, subdir, domain)
  • regular expression
  • hex block

The following ACL flags are currently supported :

-i : ignore case during matching of all subsequent patterns.
-f : load patterns from a file.
-m : use a specific pattern matching method.
-n : forbid DNS resolution.
-M : load the file pointed to by -f as a map file.
-u : force the unique id of the ACL.
-- : force end of flags. Useful when a string looks like one of the flags.

The “-f” flag is followed by the name of a file from which all lines will be read as individual values. It is even possible to pass multiple “-f” arguments if the patterns are to be loaded from multiple files.
Empty lines as well as lines beginning with a sharp (‘#’) will be ignored. All leading spaces and tabs will be stripped. If it is absolutely necessary to insert a valid pattern beginning with a sharp, just prefix it with a space so that it is not taken for a comment.
Depending on the data type and match method, HAProxy may load the lines into a binary tree, allowing very fast lookups. This is true for IPv4 addresses and exact string matching. In this case, duplicates will automatically be removed.
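
As a sketch, a pattern file could look like the following. The path /etc/haproxy/blocked.lst and its contents are hypothetical :

    # blocked client networks, one pattern per line
    192.168.10.0/24

    10.1.2.3

It could then be loaded into an ACL on the source address, the comment line and the empty line being ignored :

    acl blocked src -f /etc/haproxy/blocked.lst
    http-request deny if blocked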

The “-M” flag allows an ACL to use a map file. If this flag is set, the file is parsed as a two-column file. The first column contains the patterns used by the ACL, and the second column contains the samples. The samples can be used later by a map. This can be useful in some rare cases where an ACL would just be used to check for the existence of a pattern in a map before a mapping is applied.
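
The following is a sketch of that use case under stated assumptions : the map file /etc/haproxy/hosts.map and the backend names are hypothetical, and “-M” is placed before “-f” on the assumption that flags apply to what follows them. The map file holds one pattern and one sample per line :

    www.example.com  be_www
    api.example.com  be_api

The ACL only checks that the host is present in the first column; the mapping itself is done by the "map" converter :

    acl host_in_map hdr(host) -M -f /etc/haproxy/hosts.map
    use_backend %[hdr(host),map(/etc/haproxy/hosts.map)] if host_in_map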

The “-u” flag forces the unique id of the ACL. This unique id is used with the socket interface to identify the ACL and dynamically change its values. Note that a file is always identified by its name even if an id is set.

Also, note that the “-i” flag applies to subsequent entries and not to entries loaded from files preceding it. For instance :

    acl valid-ua hdr(user-agent) -f exact-ua.lst -i -f generic-ua.lst test

In this example, each line of “exact-ua.lst” will be exactly matched against the “user-agent” header of the request. Then each line of “generic-ua.lst” will be case-insensitively matched. Then the word “test” will be case-insensitively matched as well.

The “-m” flag is used to select a specific pattern matching method on the input sample. All ACL-specific criteria imply a pattern matching method and generally do not need this flag. However, this flag is useful with generic sample fetch methods to describe how they are going to be matched against the patterns. This is required for sample fetches which return a data type for which there is no obvious matching method (e.g. string or binary). When “-m” is specified and followed by a pattern matching method name, this method is used instead of the default one for the criterion. This makes it possible to match contents in ways that were not initially planned, or with sample fetch methods which return a string. The matching method also affects the way the patterns are parsed.

The “-n” flag forbids DNS resolution. It is used when loading files of IP addresses. By default, if the parser cannot parse a pattern as an IP address, it assumes that the string may be a domain name and tries to resolve it through DNS. The “-n” flag disables this resolution, which is useful for detecting malformed IP lists. Note that if the DNS server is not reachable, parsing the HAProxy configuration may take many minutes while waiting for the timeout, and no error message is displayed during this time. The “-n” flag avoids this behavior. Note also that DNS resolution is always disabled for ACL modifications made dynamically at runtime.
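
A minimal sketch, assuming a hypothetical file /etc/haproxy/trusted-ips.lst that is meant to contain only IP addresses and networks. “-n” is placed before “-f” on the assumption that it applies to the patterns loaded after it, so that a malformed entry is not handed to the DNS resolver :

    acl trusted src -n -f /etc/haproxy/trusted-ips.lst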

There are some restrictions on the “-m” flag, however. Not all matching methods can be used with all sample fetch methods. Also, if “-m” is used in conjunction with “-f”, it must be placed first. The pattern matching method must be one of the following :

  • “found” : only check if the requested sample could be found in the stream, but do not compare it against any pattern. It is recommended not to pass any pattern to avoid confusion. This matching method is particularly useful to detect the presence of certain contents such as headers, cookies, etc., even if they are empty, without comparing them to anything or counting them.
  • “bool” : check the value as a boolean. It can only be applied to fetches which return a boolean or integer value, and takes no pattern. Value zero or false does not match, all other values do match.
  • “int” : match the value as an integer. It can be used with integer and boolean samples. Boolean false is integer 0, true is integer 1.
  • “ip” : match the value as an IPv4 or IPv6 address. It is compatible with IP address samples only, so it is implied and never needed.
  • “bin” : match the contents against a hexadecimal string representing a binary sequence. This may be used with binary or string samples.
  • “len” : match the sample’s length as an integer. This may be used with binary or string samples.
  • “str” : exact match : match the contents against a string. This may be used with binary or string samples.
  • “sub” : substring match : check that the contents contain at least one of the provided string patterns. This may be used with binary or string samples.
  • “reg” : regex match : match the contents against a list of regular expressions. This may be used with binary or string samples.
  • “beg” : prefix match : check that the contents begin like the provided string patterns. This may be used with binary or string samples.
  • “end” : suffix match : check that the contents end like the provided string patterns. This may be used with binary or string samples.
  • “dir” : subdir match : check that a slash-delimited portion of the contents exactly matches one of the provided string patterns. This may be used with binary or string samples.
  • “dom” : domain match : check that a dot-delimited portion of the contents exactly matches one of the provided string patterns. This may be used with binary or string samples.
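
To illustrate a few of these methods side by side, here is a sketch built on the generic “hdr” and “path” sample fetch methods. The ACL names and the patterns are hypothetical :

    # found : the header exists, whatever its value
    acl has_auth     hdr(authorization) -m found
    # len : the header is present but its value is empty
    acl empty_ua     hdr(user-agent)    -m len 0
    # sub : the value contains "bot" or "crawler", ignoring case
    acl bot_ua       hdr(user-agent)    -m sub -i bot crawler
    # end : the path ends with ".php"
    acl php_script   path               -m end .php
    # dir : one slash-delimited component of the path is exactly "images"
    acl in_images    path               -m dir images
    # dom : a dot-delimited portion of the host is exactly "example.com"
    acl example_dom  hdr(host)          -m dom example.com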

For example, to quickly detect the presence of cookie “JSESSIONID” in an HTTP request, it is possible to do :

    acl jsess_present cook(JSESSIONID) -m found

In order to apply a regular expression on the 500 first bytes of data in the buffer, one would use the following acl :

    acl script_tag payload(0,500) -m reg -i <script>

On systems where the regex library is much slower when using “-i”, it is possible to convert the sample to lowercase before matching, like this :

    acl script_tag payload(0,500),lower -m reg <script>

All ACL-specific criteria imply a default matching method. Most often, these criteria are composed by concatenating the name of the original sample fetch method and the matching method. For example, “hdr_beg” applies the “beg” match to samples retrieved using the “hdr” fetch method. Since all ACL-specific criteria rely on a sample fetch method, it is always possible instead to use the original sample fetch method and the explicit matching method using “-m”.

If an alternate match is specified using “-m” on an ACL-specific criterion, the matching method is simply applied to the underlying sample fetch method.
For example, all ACLs below are exactly equivalent :

    acl short_form  hdr_beg(host)        www.
    acl alternate1  hdr_beg(host) -m beg www.
    acl alternate2  hdr_dom(host) -m beg www.
    acl alternate3  hdr(host)     -m beg www.

The table below summarizes the compatibility matrix between sample or converter types and the pattern types to match against. It indicates for each compatible combination the name of the matching method to be used, surrounded with angle brackets “>” and “<” when the method is the default one and will work by default without “-m”.

+----------------------+-------------------------------------------------+
|                      |                Input sample type                |
| pattern type         +---------+---------+---------+---------+---------+
|                      | boolean | integer |   ip    | string  | binary  |
+----------------------+---------+---------+---------+---------+---------+
| none (presence only) |  found  |  found  |  found  |  found  |  found  |
| none (boolean value) |> bool  <|  bool   |         |  bool   |         |
| integer (value)      |   int   |> int   <|   int   |   int   |         |
| integer (length)     |   len   |   len   |   len   |   len   |   len   |
| IP address           |         |         |>  ip   <|   ip    |   ip    |
| exact string         |   str   |   str   |   str   |> str   <|   str   |
| prefix               |   beg   |   beg   |   beg   |   beg   |   beg   |
| suffix               |   end   |   end   |   end   |   end   |   end   |
| substring            |   sub   |   sub   |   sub   |   sub   |   sub   |
| subdir               |   dir   |   dir   |   dir   |   dir   |   dir   |
| domain               |   dom   |   dom   |   dom   |   dom   |   dom   |
| regex                |   reg   |   reg   |   reg   |   reg   |   reg   |
| hex block            |         |         |         |   bin   |   bin   |
+----------------------+---------+---------+---------+---------+---------+
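
As a sketch of how the matrix can be read, the first two declarations below apply a non-default method to a string sample, which therefore requires “-m”, while the last one relies on the default method of an integer sample. The ACL names, header choices and values are hypothetical :

    # string sample matched as an integer value
    acl big_body  hdr(content-length)  -m int gt 1048576
    # string sample matched as an IP address / network
    acl via_lan   hdr(x-forwarded-for) -m ip 10.0.0.0/8
    # integer sample matched with its default method, no "-m" needed
    acl is_https  dst_port 443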
