Posted on 2006-5-30 23:58:57
L7-filter Pattern Writing HOWTO
It is fairly easy to add support for more protocols to l7-filter. All you need to do is add a new pattern file to /etc/l7-protocols (or the directory you specify with --l7dir). iptables searches this directory and its subdirectories (non-recursively) for pattern files. (Thus, it will find /etc/l7-protocols/http.pat and /etc/l7-protocols/protocols/http.pat, but not /etc/l7-protocols/foo/bar/http.pat.) Please consider submitting any patterns you write for inclusion into the official distribution.
File Format
Basic format
The name of the file must match the name of the protocol. (If the protocol is "ftp", the file must be "ftp.pat".) The file format is:
1. The name of the protocol on one line
2. An l7-filter regular expression defining that protocol on the next line (see below)
Lines starting with '#', blank lines and any lines after the regular expression line are ignored. That's it.
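The rules above can be sketched as a small reader. This is only an illustration of the file format as described here, not l7-filter's actual loader; the function names and the sample file contents are made up for the example.

```python
def parse_pattern_file(text):
    """Return (protocol_name, regex) from the contents of a .pat file."""
    meaningful = []
    for line in text.splitlines():
        # Comment lines and blank lines carry no pattern data.
        if line.startswith("#") or not line.strip():
            continue
        meaningful.append(line)
        if len(meaningful) == 2:
            break  # anything after the regular expression line is ignored
    if len(meaningful) < 2:
        raise ValueError("expected a protocol name line and a regex line")
    return meaningful[0].strip(), meaningful[1]


def expected_filename(protocol_name):
    """The file name must match the protocol name, e.g. 'ftp' -> 'ftp.pat'."""
    return protocol_name + ".pat"


# Hypothetical file contents, for illustration only.
example = """# comment lines are skipped
ftp

220.*user.*331
this trailing line is ignored
"""

name, regex = parse_pattern_file(example)
print(name, regex, expected_filename(name))
```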
File meta-data
Pattern files that are part of the official distribution need some metadata at the top for display on the webpage and for the use of frontends. The top four lines should look like this:
#
# Pattern attributes: [attribute word]*
# Protocol groups: [group name]*
# Wiki: [link]*
"Pattern attributes" is supposed to give some information about how good the pattern is on various scales. "[attribute word]" can be any of "undermatch", "overmatch", "superset", "subset", "great", "good", "ok", "marginal", "poor", "veryfast", "fast", "notsofast", or "slow". Any number of attribute words may be used. They are defined on the protocols page.
"Protocol groups" are supposed to give frontends a way to group similar protocols. A "group name" can be whatever you like, but should match existing names if possible. Any number of group names may be used. More relevant groups should be listed first for sorting purposes. Group names in use as of 2005-12-16 are:
* chat
* document_retrieval
* file
* game
* ietf_draft_standard
* ietf_internet_standard
* ietf_proposed_standard
* ietf_rfc_documented
* mail
* monitoring
* networking
* obsolete
* p2p
* proprietary
* remote_access
* secure
* streaming_audio
* streaming_video
* time_synchronization
* version_control
* voip
* x_consortium_standard
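A frontend could read the metadata header roughly like this. This is a sketch under assumptions: the function name is invented, the header values are sample data, and the Wiki URL is a placeholder rather than a real page.

```python
def parse_metadata(header_lines):
    """Collect the whitespace-separated values of the known metadata fields."""
    fields = {"Pattern attributes": [], "Protocol groups": [], "Wiki": []}
    for line in header_lines:
        if not line.startswith("#"):
            continue
        # Split at the first colon only, so URLs in the value stay intact.
        key, sep, value = line[1:].partition(":")
        key = key.strip()
        if sep and key in fields:
            fields[key] = value.split()
    return fields


# Sample header; the values and the URL are placeholders for illustration.
header = [
    "#",
    "# Pattern attributes: great veryfast",
    "# Protocol groups: document_retrieval ietf_internet_standard",
    "# Wiki: http://protocolinfo.org/wiki/FTP",
]

meta = parse_metadata(header)
print(meta["Protocol groups"][0])  # the most relevant group is listed first
```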
"Wiki" gives a link to a page documenting the pattern and other methods of identifying the protocol on protocolinfo.org.
Regular Expressions
l7-filter uses Version 8 regular expressions ("V8 regexps") with a few modifications, noted here. V8 regexps are likely more limited than the regexps you are used to. Notably, you cannot use bounds ("foo{3}"), character classes ("[[:punct:]]") or backreferences.
As an extension to V8 regexps, l7-filter adds perl-style hex matching using \xHH notation (so to match a tab, use "\x09"). Note that regexp control characters are still control characters even when written in hex:
\x24 == $ (only matters if it's the last character)
\x28 == (
\x29 == )
\x2a == *
\x2b == +
\x2e == .
\x3f == ?
\x5b == [
\x5c == \
\x5e == ^ (only matters if it's the first character)
\x7c == |
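To check a pattern for hex escapes that silently turn into control characters, something like the following may help. It is a hypothetical helper, not part of l7-filter; the set of control characters is simply the list above.

```python
import re

# The regexp control characters listed above.
REGEX_CONTROL_CHARS = set("$()*+.?[\\^|")

HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def expand_hex(pattern):
    """Replace each \\xHH escape with the character it denotes."""
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), pattern)


def hex_control_chars(pattern):
    """List (escape, character) pairs whose decoded character is a control character."""
    found = []
    for m in HEX_ESCAPE.finditer(pattern):
        ch = chr(int(m.group(1), 16))
        if ch in REGEX_CONTROL_CHARS:
            found.append((m.group(0), ch))
    return found


print(expand_hex(r"a\x2eb"))                # the escape decodes to a "." wildcard
print(hex_control_chars(r"\x09\x2a\x41"))   # only \x2a decodes to a control character
```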
l7-filter is always case insensitive. Upper case in patterns is identical to lower case. (This is true even if you write an uppercase letter in hex.)
l7-filter strips out the nulls (\x00 bytes) from network data so that it can treat it as normal C strings. (See the FAQ for why.) So (1) you can't match on nulls and (2) fields may appear shorter than expected. For example, if a protocol has a 4 byte field and any of those bytes can be null, it can appear to be any length from 0 to 4.
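The effect of these two rules on the data can be pictured with a short sketch. The function below is an assumption about the visible behaviour only (nulls removed, case ignored); it does not claim to mirror how l7-filter implements it internally.

```python
def normalize(data: bytes) -> bytes:
    """Drop null bytes and fold ASCII letters to lower case."""
    return data.replace(b"\x00", b"").lower()


# A 4-byte big-endian length field shrinks depending on how many bytes are null.
for value in (0, 5, 0x0100, 0x01020304):
    field = value.to_bytes(4, "big")
    print(field, "->", normalize(field), "length", len(normalize(field)))

print(normalize(b"USER\x00Bob"))
```

So a pattern that expects exactly four bytes for such a field would miss most real traffic; allow for a variable length instead.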
Useful things:
[\x09-\x0d -~] == printable characters, including whitespace
[\x09-\x0d ] == any whitespace
[!-~] == non-whitespace printable characters
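These three bracket expressions happen to be valid in Python's re module as well, so they can be tried out quickly there. Python's engine is not a V8 regexp engine; it is used here only as a convenient stand-in, and the variable names are invented for the example.

```python
import re

PRINTABLE = re.compile(r"[\x09-\x0d -~]+")   # printable characters, including whitespace
WHITESPACE = re.compile(r"[\x09-\x0d ]+")    # tab, LF, VT, FF, CR and space
NON_WS = re.compile(r"[!-~]+")               # printable characters except whitespace

print(bool(PRINTABLE.fullmatch("Hello, world!\t\r\n")))
print(bool(PRINTABLE.fullmatch("\x01")))
print(bool(WHITESPACE.fullmatch("\t\n\x0b\x0c\r ")))
print(bool(NON_WS.search(" \t")))
```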
What The Classifier Sees
If you have set up your computer as recommended in the documentation, the data to be matched is that of both the client and the server, in the order that it passes through the computer. For instance, in FTP, the first thing the filter sees is "220 server ready", then "USER bob", then "331 send password", then "PASS frogbeard", and so on.
l7-filter can match across packets. For instance, you could match FTP with "220.*user.*331".
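Matching across packets can be simulated by joining the payloads in arrival order and searching the result. The sketch below uses Python's re module as a stand-in for the V8 regexp engine, with the assumption that "." should also match line breaks (hence re.DOTALL); the classify function and the sample session are illustrative only.

```python
import re


def classify(packets, pattern):
    """Return True if pattern matches the concatenated, null-stripped payloads."""
    stream = b"".join(packets).replace(b"\x00", b"")
    return re.search(pattern, stream, re.IGNORECASE | re.DOTALL) is not None


# Client and server payloads interleaved in the order they cross the machine.
session = [
    b"220 server ready\r\n",
    b"USER bob\r\n",
    b"331 send password\r\n",
    b"PASS frogbeard\r\n",
]

FTP = rb"220.*user.*331"

print(classify(session[:1], FTP))  # only the greeting has been seen
print(classify(session[:3], FTP))  # greeting, USER and the 331 reply have been seen
```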
What Makes A Good Pattern
There are two general guidelines:
1) A pattern must be neither too specific nor too general.
Example 1: The pattern "bear" for Bearshare is not specific enough. This pattern could match a wide variety of non-Bearshare connections. For instance, an HTTP request for http://bear.com would be matched.
Example 2: "220 .*ftp.*(\[.*\]|\(.*\))" for FTP is too specific. Not all servers send ()s or []s after their 220. In fact, servers are not even required to send the string "ftp" at any time, but the vast majority do. Good judgement and testing are necessary for instances such as this.
2) It should use a minimum of processing power. Thus, if it is possible to reduce the number of *'s, +'s and |'s in your pattern, you should do so. Use the performance testing program included in the patterns package to determine the speed of your pattern.
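Both guidelines can be checked informally before reaching for the real tools. The sketch below runs the example patterns against sample data and counts the *, + and | characters as a very rough cost signal. It uses Python's re module as a stand-in for V8 regexps, the sample payloads are invented, and the count is no substitute for the performance testing program mentioned above.

```python
import re


def matches(pattern, data):
    """Case-insensitive search, with '.' also matching line breaks."""
    return re.search(pattern, data, re.IGNORECASE | re.DOTALL) is not None


def operator_count(pattern):
    """Count the *, + and | characters in a pattern (a crude cost hint)."""
    return sum(pattern.count(c) for c in "*+|")


# Invented sample payloads, for illustration only.
http_request = "GET / HTTP/1.1\r\nHost: bear.com\r\n\r\n"
banner_plain = "220 ftp.example.org ready\r\n"
banner_paren = "220 ftp.example.org FTP server (version 1.0) ready\r\n"

too_loose = "bear"
too_strict = r"220 .*ftp.*(\[.*\]|\(.*\))"

print(matches(too_loose, http_request))    # a false positive on plain HTTP
print(matches(too_strict, banner_plain))   # a miss on a banner without brackets
print(matches(too_strict, banner_paren))
print(operator_count(too_strict), operator_count("220.*user.*331"))
```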
The recommended procedure for writing patterns is this:
1. Find and read the spec for the protocol you wish to match. If it's an Internet standard, RFCs are a good place to start, although not all standards are RFCs. If it is a proprietary protocol, it is likely that someone has written a reverse-engineered spec for it. Do a general web search to find it. Skipping this step is a good way to write patterns that are overly specific!
2. Use something like Ethereal to watch packets of this protocol go by in a typical session of its use. (If you failed to find a spec for your protocol, but Ethereal can parse it, reading the Ethereal source code may also be worth your time.)
3. Write a pattern that will reliably match one of the first few packets that are sent in your protocol. Test it. Test its performance.
4. Send your pattern to l7-filter-developers AT lists.sf.net for it to be incorporated into the official pattern definitions.