clfdomainsplit(1) - split Common-Log Format web logs based on domain name


clfdomainsplit [--help] [-i input] [-d defaultfile] [-c cfg-file] [-o directory]


The clfdomainsplit program splits large CLF (Common Log Format) web logs into separate files based on domain name. This lets you run a separate log analysis pass for each domain hosted on your server.


The input parameter specifies the file to read (default is standard input).

The defaultfile parameter specifies where entries go if they have no domain: either the server is identified by an IP address, or the server name is missing because the URL is relative to the root of the web server. By default such entries are printed on standard error.
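The distinction between entries with and without a domain can be sketched in Python. This is only an illustration under stated assumptions, not the program's actual code: it assumes the request field of each CLF line may carry an absolute URL such as http://host/path, and the function name request_host is hypothetical.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit


def request_host(clf_line: str) -> Optional[str]:
    """Return the domain named in the request URL of a CLF line.

    Returns None when the entry has no usable domain (relative URL,
    IP address, or unparseable line); such entries would go to the
    default file. Assumption: the request is the first quoted field.
    """
    try:
        request = clf_line.split('"')[1]  # e.g. 'GET http://host/path HTTP/1.0'
        url = request.split()[1]          # second token is the URL
        host = urlsplit(url).hostname
    except (IndexError, ValueError):
        return None
    if not host:
        return None  # URL was relative to the server root
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return host  # not an IP address, so treat it as a domain name
    return None      # server identified by IP address only
```

A caller would write the line to the per-domain file when request_host returns a name, and to the default file (or standard error) when it returns None.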

The cfg-file parameter specifies the rules for deciding which host names belong to the same domain. For example, host names under .au that share their last three components belong in the same file, because domain names ending in .au have three major components. Host names under .nl that share their last two components belong in the same file, because domain names ending in .nl have two major components (as do .com and .gov), whereas anything ending in .va belongs to the same organization. The rules are of the form number:pattern which lists the number of domain parts which are significant (2 for .com and for a simple string comparison, the default will be:


If no config file is specified, the program looks for /etc/clfdomainsplit.cfg. In the config file, comments start with #. Note that the first matching rule is used, so the order of the rules matters.
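The number:pattern rules and the first-match behaviour can be illustrated with a short Python sketch. It is not the program's implementation. It assumes that a pattern matches when the host name ends with it (the real matching semantics may differ), the rule list shown is a hypothetical example rather than the built-in default, and the names parse_rules and group_key are invented for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical rule set for illustration only, NOT the program's defaults.
EXAMPLE_CFG = """\
# comments start with '#'
3:.au
1:.va
2:.nl
2:.com
2:.gov
"""


def parse_rules(text: str) -> List[Tuple[int, str]]:
    """Parse number:pattern lines, skipping blank lines and comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        count, pattern = line.split(":", 1)
        rules.append((int(count), pattern))
    return rules


def group_key(host: str, rules: List[Tuple[int, str]]) -> Optional[str]:
    """Return the significant trailing parts of host, or None if no rule matches."""
    host = host.lower().rstrip(".")
    for count, pattern in rules:
        if host.endswith(pattern):  # assumption: pattern is a plain suffix
            # The first matching rule wins; later rules are never consulted.
            return ".".join(host.split(".")[-count:])
    return None
```

With these example rules, www.example.com.au and example.com.au both map to the key example.com.au, so their log entries would land in the same file. Placing a more general rule before a more specific one changes the result, which is why rule order matters.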

The directory parameter specifies where the output files are created (default is the current directory). I recommend that you use a directory dedicated to this purpose, as you never know how many files may be created!


The program exits with one of the following codes:

0 No errors

1 Bad parameters


This program, its manual page, and the Debian package were written by Russell Coker.