Reference Manual (v2.x) Transformation Functions
Copyright © 2004-2022 Trustwave Holdings, Inc.
Transformation functions are used to alter input data before it is used in matching (i.e., operator execution). The input data itself is never modified: whenever you request a transformation function, ModSecurity creates a copy of the data, transforms it, and then runs the operator against the result.
- Note : There are no default transformation functions, as there were in the first generation of ModSecurity (1.x).
In the following example, the request parameter values are converted to lowercase before matching:
SecRule ARGS "xp_cmdshell" "t:lowercase,id:91"
Multiple transformation actions can be used in the same rule, forming a transformation pipeline. The transformations will be performed in the order in which they appear in the rule.
In most cases, the order in which transformations are performed is very important. In the following example, a series of transformation functions is performed to counter evasion. Performing the transformations in any other order would allow a skillful attacker to evade detection:
SecRule ARGS "(asfunction|javascript|vbscript|data|mocha|livescript):" "id:92,t:none,t:htmlEntityDecode,t:lowercase,t:removeNulls,t:removeWhitespace"
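To make the ordering point concrete, here is a rough Python model of the pipeline in the rule above. It is only a sketch under stated assumptions: the helper names are hypothetical, `html.unescape` stands in for htmlEntityDecode, and none of this is ModSecurity's own code.

```python
import html

# Hypothetical stand-ins for the four transformations named in the rule.
def html_entity_decode(s: str) -> str:
    return html.unescape(s)

def lowercase(s: str) -> str:
    return s.lower()

def remove_nulls(s: str) -> str:
    return s.replace("\x00", "")

def remove_whitespace(s: str) -> str:
    return "".join(ch for ch in s if not ch.isspace())

def apply_pipeline(value: str, steps) -> str:
    """Apply each step in order to a working copy; the input is left untouched."""
    result = value
    for step in steps:
        result = step(result)
    return result

# Same order as the rule: htmlEntityDecode, lowercase, removeNulls, removeWhitespace.
PIPELINE = [html_entity_decode, lowercase, remove_nulls, remove_whitespace]

# A reordered variant that strips whitespace before entities are decoded.
REORDERED = [remove_whitespace, lowercase, remove_nulls, html_entity_decode]

payload = "Java&#115;cript&#32;:"
in_order = apply_pipeline(payload, PIPELINE)
out_of_order = apply_pipeline(payload, REORDERED)
```

With the documented order, the entity-encoded space is decoded first and then stripped, so the pattern `javascript:` can match. In the reordered variant the space is decoded only after whitespace removal has already run, so it survives and the pattern no longer matches.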
- Warning : It is currently possible to use SecDefaultAction to specify a default list of transformation functions, which will be applied to all rules that follow the SecDefaultAction directive. However, this practice is not recommended, because it means that mistakes are very easy to make. It is recommended that you always specify the transformation functions that are needed by a particular rule, starting the list with t:none (which clears the possibly inherited transformation functions).
The remainder of this section documents the transformation functions currently available in ModSecurity.
Decodes a Base64-encoded string.
SecRule REQUEST_HEADERS:Authorization "^Basic ([a-zA-Z0-9]+=*)$" "phase:1,id:93,capture,chain,logdata:%{TX.1}"
    SecRule TX:1 ^(\w+): t:base64Decode,capture,chain
    SecRule TX:1 ^(admin|root|backup)$
- Note : Be careful when applying base64Decode with other transformations. The order of the transformations matters here, because certain transformations (e.g., t:lowercase) may change or invalidate the Base64-encoded string before it is decoded. This also makes it very difficult to write a single rule that checks for a Base64-decoded value OR an unencoded value with transformations; it is best to write two rules in this situation.
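The effect described in the note can be reproduced with Python's standard `base64` module. This is an illustration of the general problem, not of ModSecurity internals, and the sample credentials are made up.

```python
import base64

original = b"admin:secret"  # made-up sample value
encoded = base64.b64encode(original).decode("ascii")

# Decoding the untouched string recovers the original bytes.
decoded_ok = base64.b64decode(encoded)

# Lowercasing first still yields valid Base64 characters,
# but they now decode to different bytes.
decoded_after_lowercase = base64.b64decode(encoded.lower())
```

Because Base64 is case-sensitive, any case-changing transformation has to run after decoding, never before it.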
Decodes SQL hex data. For example, 0x414243 is decoded to ABC. Available as of 2.6.3.
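A minimal Python sketch of this kind of decoding follows. The function name and the handling of odd digit counts are assumptions made for illustration; the actual implementation may treat edge cases differently.

```python
import re

# "0x" followed by one or more complete hex digit pairs.
_SQL_HEX = re.compile(r"0[xX]((?:[0-9a-fA-F]{2})+)")

def sql_hex_decode(value: str) -> str:
    """Replace each 0xHH... literal with the bytes it encodes (hypothetical helper)."""
    def _decode(match: re.Match) -> str:
        return bytes.fromhex(match.group(1)).decode("latin-1")
    return _SQL_HEX.sub(_decode, value)
```

For instance, `sql_hex_decode("0x414243")` returns `"ABC"`, matching the example in the text.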
Decodes a Base64-encoded string. Unlike base64Decode, this version uses a forgiving implementation, which ignores invalid characters. Available as of 2.5.13.
See blog post on Base64Decoding evasion issues on PHP sites - http://blog.spiderlabs.com/2010/04/impedance-mismatch-and-base64.html
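The idea of a forgiving decoder can be sketched in Python as follows. This is an assumption-laden model: it simply drops every character outside the standard Base64 alphabet and repairs the padding, which may not match the real function byte for byte.

```python
import base64
import re

def base64_decode_forgiving(value: str) -> bytes:
    """Decode Base64 while ignoring characters outside the alphabet (hypothetical helper)."""
    cleaned = re.sub(r"[^A-Za-z0-9+/]", "", value)
    # A single leftover character cannot encode a full byte, so drop it.
    if len(cleaned) % 4 == 1:
        cleaned = cleaned[:-1]
    # Restore the padding that the cleanup step removed.
    cleaned += "=" * (-len(cleaned) % 4)
    return base64.b64decode(cleaned)
```

An input such as `"YWRt.aW4="` contains a character that a strict decoder would reject, yet a lenient application-side decoder may still read it as `admin`. A forgiving transformation lets the rule see the same value the application sees.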
Encodes input string using Base64 encoding.
- Note : This is a community contribution developed by Marc Stern http://www.linkedin.com/in/marcstern
In Windows and Unix, commands may be escaped by different means, such as:
- c^ommand /c ...
- "command" /c ...
- command,/c ...
- backslash in the middle of a Unix command
The cmdLine transformation counters these escapes by manipulating the variable content in the following ways:
- deleting all backslashes [\]
- deleting all double quotes ["]
- deleting all single quotes [']
- deleting all carets [^]
- deleting spaces before a slash [/]
- deleting spaces before an open parenthesis [(]
- replacing all commas [,] and semicolons [;] with a space
- replacing multiple spaces (including tab, newline, etc.) with one space
- transforming all characters to lowercase
Example Usage:
SecRule ARGS "(?:command(?:\.com)?|cmd(?:\.exe)?)(?:/.*)?/[ck]" "phase:2,id:94,t:none,t:cmdLine"
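The normalization steps listed above can be modelled in a few lines of Python. This is a sketch, not the module's implementation: the function name is hypothetical, and the steps are applied as separate passes in an order chosen so that a comma directly before a slash is also removed, which is an assumption.

```python
import re

def cmd_line(value: str) -> str:
    """Approximate the documented cmdLine clean-up steps (hypothetical helper)."""
    # Delete backslashes, double quotes, single quotes and carets.
    value = value.translate({ord(ch): None for ch in "\\\"'^"})
    # Replace commas and semicolons with a space.
    value = re.sub(r"[,;]", " ", value)
    # Collapse any run of whitespace into one space.
    value = re.sub(r"\s+", " ", value)
    # Delete a space that sits directly before a slash or an open parenthesis.
    value = re.sub(r" (?=[/(])", "", value)
    return value.lower()
```

Under this model, all three escaped forms from the list (`c^ommand /c`, `"command" /c`, `command,/c`) collapse to `command/c`, which is the shape the example rule's pattern looks for.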
Converts any of the whitespace characters (0x20, \f, \t, \n, \r, \v, 0xa0) to spaces (ASCII 0x20), compressing multiple consecutive space characters into one.
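As an illustration, the behaviour described above corresponds roughly to a single regular-expression substitution over bytes. The helper name is hypothetical.

```python
import re

# The whitespace bytes named in the description: 0x20, \f, \t, \n, \r, \v, 0xa0.
_WHITESPACE_RUN = re.compile(rb"[ \f\t\n\r\v\xa0]+")

def compress_whitespace(data: bytes) -> bytes:
    """Turn every run of whitespace bytes into one ASCII space (hypothetical helper)."""
    return _WHITESPACE_RUN.sub(b" ", data)
```

Working on bytes rather than text keeps 0xa0 a single byte, as in the description.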
Decodes characters encoded using the CSS 2.x escape rules (syndata.html#characters). This function uses at most two bytes in the decoding process, so it is useful for uncovering ASCII characters encoded using CSS encoding (which would not normally be encoded) and for countering evasion that combines a backslash with non-hexadecimal characters (e.g., ja\vascript is equivalent to javascript).
Decodes ANSI C escape sequences: \a, \b, \f, \n, \r, \t, \v, \\, \?, \', \", \xHH (hexadecimal), \0OOO (octal). Invalid encodings are left in the output.
Decodes a string that has been encoded using the same algorithm as the one used in hexEncode (see following entry).
Encodes string (possibly containing binary characters) by replacing each input byte with two hexadecimal characters. For example, xyz is encoded as 78797a.
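The encode and decode pair maps directly onto Python's built-in byte methods, shown here as a small sketch with hypothetical wrapper names. One difference to keep in mind: `bytes.fromhex` raises on malformed input, which is not necessarily how the transformation behaves.

```python
def hex_encode(data: bytes) -> str:
    """Two lowercase hexadecimal characters per input byte."""
    return data.hex()

def hex_decode(text: str) -> bytes:
    """Inverse of hex_encode for well-formed input."""
    return bytes.fromhex(text)
```

Encoding `b"xyz"` this way gives `78797a`, as in the example above.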
Decodes the characters encoded as HTML entities. The following variants are supported:
- &#xHH and &#xHH; (where H is any hexadecimal number)
- &#DDD and &#DDD; (where D is any decimal number)
- &quot and &quot;
- &nbsp and &nbsp;
- &lt and &lt;
- &gt and &gt;
This function always converts one HTML entity into one byte, possibly resulting in a loss of information (if the entity refers to a character that cannot be represented in a single byte). It is thus useful to uncover bytes that would otherwise not need to be encoded, but it cannot do anything meaningful with characters from the range above 0xff.
Decodes JavaScript escape sequences. If a \uHHHH code is in the range of FF01-FF5E (the full width ASCII codes), then the higher byte is used to detect and adjust the lower byte. Otherwise, only the lower byte will be used and the higher byte zeroed (leading to possible loss of information).
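The full-width rule can be sketched for a single \uHHHH code as below. The helper name is hypothetical, and the 0x20 adjustment is an assumption derived from the Unicode layout, in which U+FF01 corresponds to U+0021.

```python
def decode_u_escape(hex4: str) -> int:
    """Reduce one four-digit \\u code to a single byte value (hypothetical helper)."""
    code = int(hex4, 16)
    low = code & 0xFF
    if 0xFF01 <= code <= 0xFF5E:
        # Full-width ASCII variant: shift the low byte onto its ASCII counterpart.
        return low + 0x20
    # Any other code: keep the low byte and discard the high byte.
    return low
```

For example, the full-width less-than sign FF1C maps to 0x3C (`<`), while 0141 collapses to 0x41 (`A`), which illustrates the information loss mentioned above.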
Looks up the length of the input string in bytes, placing it (as string) in output. For example, if it gets ABCDE on input, this transformation function will return 5 on output.
Converts all characters to lowercase using the current C locale.
Calculates an MD5 hash from the data in input. The computed hash is in a raw binary form and may need to be encoded into text to be printed (or logged). Hash functions are commonly used in combination with hexEncode (for example: t:md5,t:hexEncode).
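The difference between the raw digest and its printable form can be seen with Python's `hashlib`. This is a sketch of what chaining a hash with hex encoding amounts to, and the function name is hypothetical.

```python
import hashlib

def md5_then_hex(data: bytes) -> str:
    """Model of t:md5 followed by t:hexEncode (hypothetical helper)."""
    raw = hashlib.md5(data).digest()  # 16 raw bytes, not printable as-is
    return raw.hex()                  # 32 hexadecimal characters, safe to log
```

The same pattern applies to any other hash whose output is raw binary.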
Not an actual transformation function, but an instruction to ModSecurity to remove all transformation functions associated with the current rule.
normalisePath is the older spelling of this function; see normalizePath.
Removes multiple slashes, directory self-references, and directory back-references (except when at the beginning of the input) from input string.
- Note : As of 2010, normalisePath has been renamed to normalizePath (MODSEC-103). The old name normalisePath is kept for backwards compatibility in current versions, but should not be used.
normalisePathWin is the older spelling of this function; see normalizePathWin.
Same as normalizePath, but first converts backslash characters to forward slashes.
- Note : As of 2010, normalisePathWin has been renamed to normalizePathWin (MODSEC-103). The old name normalisePathWin is kept for backwards compatibility in current versions, but should not be used.
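The path clean-up described for normalizePath and normalizePathWin can be approximated in Python as follows. This is a simplified model with hypothetical function names. How a trailing slash and a leading back-reference are treated here is an assumption, so edge cases may differ from the real functions.

```python
def normalize_path(path: str) -> str:
    """Collapse repeated slashes, './' and resolvable '../' segments (hypothetical helper)."""
    leading = "/" if path.startswith("/") else ""
    trailing = "/" if path.endswith("/") else ""
    kept = []
    for segment in path.split("/"):
        if segment in ("", "."):
            continue  # empty segments come from '//'; '.' is a self-reference
        if segment == "..":
            if kept and kept[-1] != "..":
                kept.pop()          # back-reference cancels the previous segment
            else:
                kept.append("..")   # nothing to cancel: keep it at the beginning
            continue
        kept.append(segment)
    body = "/".join(kept)
    if not body:
        return leading
    return leading + body + trailing

def normalize_path_win(path: str) -> str:
    """Same as normalize_path, after converting backslashes to forward slashes."""
    return normalize_path(path.replace("\\", "/"))
```

A request path such as `/a//b/./c/../d` becomes `/a/b/d`, so a rule can match the path that would actually be resolved.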
Calculates even parity of 7-bit data, replacing the 8th bit of each target byte with the calculated parity bit.
Calculates odd parity of 7-bit data, replacing the 8th bit of each target byte with the calculated parity bit.
Calculates zero parity of 7-bit data, replacing the 8th bit of each target byte with a zero-parity bit, which allows inspection of even/odd parity 7-bit data as ASCII7 data.
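A short Python sketch of the three parity variants follows. The function names are hypothetical, and the code only illustrates the bit arithmetic described above.

```python
def _ones_in_low_7_bits(byte: int) -> int:
    return bin(byte & 0x7F).count("1")

def parity_even_7bit(data: bytes) -> bytes:
    """Set bit 8 so that each byte has an even number of 1 bits."""
    return bytes((b & 0x7F) | (0x80 if _ones_in_low_7_bits(b) % 2 else 0x00) for b in data)

def parity_odd_7bit(data: bytes) -> bytes:
    """Set bit 8 so that each byte has an odd number of 1 bits."""
    return bytes((b & 0x7F) | (0x00 if _ones_in_low_7_bits(b) % 2 else 0x80) for b in data)

def parity_zero_7bit(data: bytes) -> bytes:
    """Clear bit 8, leaving plain 7-bit ASCII."""
    return bytes(b & 0x7F for b in data)
```

For example, `A` (0x41) has two 1 bits and `C` (0x43) has three, so even parity leaves `A` unchanged and sets the top bit of `C`, while odd parity does the opposite. Zero parity strips the top bit again, which is why it allows parity-encoded data to be inspected as ordinary ASCII.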
Removes all NUL bytes from input.
Removes all whitespace characters from input.
Replaces each occurrence of a C-style comment (/* ... */) with a single space (multiple consecutive occurrences of which will not be compressed). Unterminated comments will also be replaced with a space (ASCII 0x20). However, a standalone termination of a comment (*/) will not be acted upon.
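The three cases described above (terminated comment, unterminated comment, standalone terminator) can be modelled with one regular expression. The helper name is hypothetical and this is a sketch rather than the actual implementation.

```python
import re

# "/*" up to the nearest "*/", or up to the end of input if never terminated.
_C_COMMENT = re.compile(r"/\*.*?(?:\*/|\Z)", re.DOTALL)

def replace_comments(value: str) -> str:
    """Replace each C-style comment with one space (hypothetical helper)."""
    return _C_COMMENT.sub(" ", value)
```

Two adjacent comments produce two spaces, since consecutive replacements are not compressed, and a lone `*/` is left in place because the pattern only starts at `/*`.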
Removes common comment characters (/*, */, --, #).
Version: 2.x-3.x
Supported on libModSecurity: Yes
Removes each occurrence of a comment (/* ... */, --, #). Multiple consecutive occurrences are not compressed.
- Note : This transformation is known to be unreliable, might cause unexpected behaviour, and could be deprecated in a future release. Refer to issue #1207 for further information.
Replaces NUL bytes in input with space characters (ASCII 0x20).
Decodes a URL-encoded input string. Invalid encodings (i.e., the ones that use non-hexadecimal characters, or the ones that are at the end of string and have one or two bytes missing) are not converted, but no error is raised. To detect invalid encodings, use the @validateUrlEncoding operator on the input data first. The transformation function should not be used against variables that have already been URL-decoded (such as request parameters) unless it is your intention to perform URL decoding twice!
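The handling of invalid encodings and the double-decoding pitfall can both be seen in a small Python model. The helper name is hypothetical. The sketch covers only %HH sequences and deliberately leaves out any other conversions the real function may perform.

```python
import re

# Only a percent sign followed by exactly two hexadecimal digits is decoded.
_PERCENT_HH = re.compile(r"%([0-9a-fA-F]{2})")

def url_decode(value: str) -> str:
    """Decode valid %HH sequences and leave everything else as it is (hypothetical helper)."""
    return _PERCENT_HH.sub(lambda m: chr(int(m.group(1), 16)), value)
```

Invalid sequences such as `%zz` or a truncated `%4` at the end of the string pass through unchanged and no error is raised. Applying the function twice turns `%2541` first into `%41` and then into `A`, which is the double-decoding effect the text warns about.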
Converts all characters to uppercase using the current C locale.
Version: 3.x
Supported on libModSecurity: Yes
Like urlDecode, but with support for the Microsoft-specific %u encoding. If the code is in the range of FF01-FF5E (the full-width ASCII codes), then the higher byte is used to detect and adjust the lower byte. Otherwise, only the lower byte will be used and the higher byte zeroed.
Encodes input string using URL encoding.
Converts all UTF-8 character sequences to Unicode (using the '%uHHHH' format). This helps input normalization, especially for non-English languages, minimizing false positives and false negatives.
Calculates a SHA1 hash from the input string. The computed hash is in a raw binary form and may need to be encoded into text to be printed (or logged). Hash functions are commonly used in combination with hexEncode (for example, t:sha1,t:hexEncode).
Removes whitespace from the left side of the input string.
Removes whitespace from the right side of the input string.
Removes whitespace from both the left and right sides of the input string.