logrep(1) A handy tool for sophisticated, ad-hoc analysis of webserver logs.


wtop is like "top" for your webserver. How many searches or signups are happening per second? What is the response time histogram for your static files? wtop shows you at a glance.


logrep [--mode MODE] [--include | --exclude CLASSES] [-H | -R]
       [--output FIELDS] [--filter FILTERS] [--last LAST_N]
       [--sort LIM:FIELDS:DIRECTION] [--config CFG_FILE] [--quiet] [LOG_FILE]

-m, --mode MODE
There are three modes:

      - "grep" parses an entire log file (default).
      - "tail" reads from the end of the file.
      - "top"  shows running performance stats.
-i, -e CLASSES
Include or exclude the given URL "classes". You can configure logrep to classify URLs with a set of regular expressions. See the installation docs and /etc/wtop.cfg for how to configure your own classes. --include and --exclude are mutually exclusive.
--include "home,search,wiki"
--exclude "img,xml,js"
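The idea of classifying URLs by regular expression and then including or excluding classes can be sketched in a few lines of Python. This is only an illustration: the class names, the patterns, and the helper names classify() and keep() are hypothetical, and nothing here reflects the actual format of wtop.cfg or logrep's internals.

```python
import re

# Hypothetical class table. Names and patterns are illustrative only and
# are NOT taken from wtop.cfg.
CLASSES = [
    ("home", re.compile(r"^/(home)?$")),
    ("search", re.compile(r"^/search")),
    ("img", re.compile(r"\.(png|jpe?g|gif)$")),
]


def classify(url):
    """Return the name of the first class whose pattern matches url, or None."""
    for name, pattern in CLASSES:
        if pattern.search(url):
            return name
    return None


def keep(url, include=None, exclude=None):
    """Apply an include list or an exclude list (never both) to one URL."""
    if include and exclude:
        raise ValueError("include and exclude are mutually exclusive")
    cls = classify(url)
    if include:
        return cls in include
    if exclude:
        return cls not in exclude
    return True
```

With this table, keep("/static/logo.png", exclude={"img"}) rejects the request, while keep("/search?q=x", include={"home", "search"}) accepts it.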
-f, --filter FILTERS
Filters act on named fields. Strings and numbers are supported, with the operators greater than (>), less than (<), equals (=), not-equals (!=), and regular expression match (~) or non-match (!~). Separate multiple filters with commas.
For example, to select successful requests larger than 10kB whose Referer domain does not contain 'example.com':
-f "status=200,bytes>10000,refdom!~example.com"


     msec       millisecond response time
                The IP address of the client
     url        The path of the request, eg '/home'
     ref        'Referer' header
     refdom     domain part of the 'Referer' header
     bytes      Bytes sent
     ua         User-agent header
                First 30 characters of ua
     class      URL class, configurable in wtop.cfg
     status     HTTP status code, eg 200, 301, 404
                Protocol version, eg 'HTTP/1.1'
                HTTP method, eg 'GET', 'POST'
     bot        Is a robot? 1 or 0. Only a guess.
     botname    eg 'Googlebot', 'Nutch', 'Slurp', etc
                Unix timestamp of the request
     country    country name (see Geocoding, below)
     cc         ISO 3166 country code (see Geocoding, below)
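To make the filter syntax concrete, here is a minimal Python sketch of how a comma-separated filter string could be parsed and applied to one record. It is not logrep's own parser: the function names parse_filters() and matches() are hypothetical, all clauses are assumed to be ANDed together, and a regular expression containing a comma would not survive the simple split used here.

```python
import re

# One clause is FIELD, then an operator, then a value. The two-character
# operators are listed first so "!=" and "!~" are not read as "=" or "~".
CLAUSE = re.compile(r"^(\w+)(!=|!~|=|~|>|<)(.*)$")


def parse_filters(spec):
    """Split 'status=200,bytes>10000' into (field, op, value) triples."""
    clauses = []
    for clause in spec.split(","):
        m = CLAUSE.match(clause.strip())
        if m is None:
            raise ValueError("bad filter clause: %r" % clause)
        clauses.append(m.groups())
    return clauses


def matches(record, clauses):
    """Return True if the record passes every clause (assumed AND semantics)."""
    for field, op, value in clauses:
        actual = record[field]
        if op in ("~", "!~"):
            found = re.search(value, str(actual)) is not None
            ok = found if op == "~" else not found
        elif op == ">":
            ok = float(actual) > float(value)
        elif op == "<":
            ok = float(actual) < float(value)
        else:
            # "=" and "!=" are compared as strings in this sketch.
            same = str(actual) == value
            ok = same if op == "=" else not same
        if not ok:
            return False
    return True
```

For the example filter above, a record with status 200, 20480 bytes and a refdom of 'other.org' passes, while the same record with a refdom of 'www.example.com' is rejected.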
-H, -R
Shorthand for a useful but incomplete filter on robot user-agents. -H is equivalent to --filter 'bot=0' and -R to --filter 'bot=1'.
-o, --output FIELDS
Output only the given fields, tab-delimited. All of the fields listed for --filter are available.
AGGREGATE FUNCTIONS: In -m grep mode you can use aggregate functions on numeric fields such as bytes and msec. Any non-aggregate fields in the list are used to group records together.
     count(*)      number of records
     avg(FIELD)    mean average
     min(FIELD)    lowest seen value
     max(FIELD)    highest seen value
     sum(FIELD)    summation of all values
     var(FIELD)    population variance
     dev(FIELD)    standard deviation (square root of variance)
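The grouping behaviour and the statistical definitions can be illustrated with a short Python sketch. It groups records by one field and computes each aggregate over another. The function name aggregate() and the dictionary layout are assumptions made for the example, not logrep's implementation; var is the population variance (divide by N) and dev is its square root, as described above.

```python
import math
from collections import defaultdict


def aggregate(records, group_field, value_field):
    """Group records by group_field and summarise value_field per group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_field]].append(rec[value_field])

    summary = {}
    for key, values in groups.items():
        n = len(values)
        mean = sum(values) / n
        # Population variance: mean squared distance from the mean.
        var = sum((v - mean) ** 2 for v in values) / n
        summary[key] = {
            "count": n,
            "avg": mean,
            "min": min(values),
            "max": max(values),
            "sum": sum(values),
            "var": var,
            "dev": math.sqrt(var),
        }
    return summary
```

This corresponds roughly to an output list such as 'status,count(*),avg(msec)', where status is the grouping field and msec is the aggregated one.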
-s, --sort LIMIT:FIELDS:DIRECTION
Sort and limit aggregate records. LIMIT is the number of records to return, FIELDS is a comma-delimited list of column positions starting with 1, and DIRECTION is either 'descending' (default) or 'ascending'.
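A sort specification of the form LIMIT:FIELDS:DIRECTION can be read as "sort by these 1-based columns, then keep the first LIMIT rows". The Python sketch below shows one way to interpret it. The names parse_sort() and sort_rows() are hypothetical, and treating any DIRECTION that starts with 'a' as ascending is an assumption made for this example.

```python
def parse_sort(spec):
    """Parse 'LIMIT:FIELDS[:DIRECTION]' into (limit, columns, ascending)."""
    parts = spec.split(":")
    limit = int(parts[0])
    columns = [int(c) for c in parts[1].split(",")]
    # Direction is optional; descending is the documented default.
    direction = parts[2] if len(parts) > 2 else "descending"
    # Assumption: anything starting with 'a' means ascending.
    ascending = direction.lower().startswith("a")
    return limit, columns, ascending


def sort_rows(rows, spec):
    """Sort rows by the given 1-based columns and keep the first LIMIT rows."""
    limit, columns, ascending = parse_sort(spec)

    def key(row):
        # Column positions in the spec start at 1, Python indexes at 0.
        return tuple(row[c - 1] for c in columns)

    return sorted(rows, key=key, reverse=not ascending)[:limit]
```

Given rows of (url, count) pairs, sort_rows(rows, "10:2") keeps the ten rows with the highest counts.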
--last LAST_N
(grep mode) Only read the last N log lines.
--config CFG_FILE
Feed logrep a custom config file. By default it will use:
     /etc/wtop.cfg (Linux, BSD, OSX, etc)
     Python sys.prefix (Windows)
-q, --quiet
Quiet mode. Does not print warnings to stderr.
LOG_FILE
The path to a log file. By default logrep reads from the file path specified in wtop.cfg. If you specify '-', logrep reads from STDIN.


Configuring Apache
Please note: Apache's default LogFormats do not include the %D (microsecond response time) directive. You must have at least %s, %r, %t and %D in your LogFormat in order to use wtop. You can use logrep without %D, but the msec field will not be available.
Example: before

     LogFormat "%h %l %u %t \"%r\" %>s %b" common
     CustomLog logs/access_log common
Example: after

     LogFormat "%h %l %u %t \"%r\" %>s %b %D" common
     CustomLog logs/access_log common
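Because %D reports the response time in microseconds while the msec field is in milliseconds, a conversion is needed when reading the log. The Python sketch below parses one line, assuming the Apache Common Log Format with %D appended as the last field. The regular expression, the function name parse_line() and the dictionary keys other than msec are illustrative assumptions, not logrep's actual parser.

```python
import re

# Assumes: %h %l %u %t "%r" %>s %b %D  (Common Log Format plus %D at the end).
LINE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) (?P<usec>\d+)$'
)


def parse_line(line):
    """Return a dict for one log line, or None if it does not match."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])
    # Apache writes "-" for %b when no bytes were sent.
    rec["bytes"] = 0 if rec["bytes"] == "-" else int(rec["bytes"])
    # %D is in microseconds; convert to milliseconds.
    rec["msec"] = int(rec.pop("usec")) / 1000.0
    return rec
```

A line without the trailing %D value does not match this pattern, which mirrors the note above that the response time is unavailable unless %D is logged.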


Geocoding
logrep will use the MaxMind GeoIP library if it is installed. This enables two extra fields for filtering and output: country (eg "United Kingdom") and cc (ISO 3166 country code, eg "GB"). These are a *guess* at the country the HTTP client is from.


Some installations of Apache have HostnameLookups defaulted to On. This means that the %h field will contain the fully-qualified domain name of the client (xdsl456.foo.example.com) instead of the IP address. Geocoding will still work, but it requires a DNS lookup to resolve each hostname back to an IP address. Using the 'cc' or 'country' field in this case will generate a *LOT* of DNS traffic and can hang the program. It is recommended to explicitly set HostnameLookups Off in your Apache configuration.


"wtop" for all human traffic:
$ logrep -m top -f 'bot=0' access.log
Status code & response times for all Googlebot homepage hits:
$ logrep -f 'botname=Googlebot' -i home -o status,msec
Tail requests for pages about Angelina Jolie or Brad Pitt that were referred by example.com:
$ logrep -m tail -f 'url~jolie|pitt,ref~example.com' access.log
Get maximum response size and average response time for requests grouped by URL class:
$ logrep -o 'class,max(bytes),avg(msec)' access.log
Request count and average response time, grouped by status:
$ logrep -o 'status,count(*),avg(msec)'
Country code, response time and URL of each request:
$ logrep -o 'cc,msec,url'
Total bytes sent, by hour & minute:
$ logrep -o 'hour,minute,sum(bytes)' -s'3600:1,2:a'
The 10 most popular URLs:
$ logrep -o 'url,count(*)' -s '10:2'