[netflow-tools] Summarisation and charting of flowd logs

Sat Mar 4 15:57:33 EST 2006

Hi,

Attached is something I cooked up this morning: a basic tool to 
summarise flowd logs and store the results in an RRD[1] database, 
and a shell script to generate charts of the results.

An example of the output is at:

http://www.mindrot.org/~djm/tmp/flows.html

(note that this URL will change once I hook a better example up to 
the flowd project page, and tidy this mail into a HOWTO).

Right now it is pretty basic: just a protocol breakdown across TCP,
UDP, ICMP, GRE, IPsec ESP and "other". However, the tools can
be easily extended to summarise pretty much anything you can express
in Python. Another mode of summarisation I am considering adding is
clustering flows by the "tag" that can be set in a flowd.conf filter -
so setting up summary classes is just a matter of writing a filter
specification. Please let me know what would work best for you.
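
To give a feel for what such a summary involves, here is a minimal
sketch of a per-class breakdown. It is not the code in flow_rrd.py: the
flow records are plain dictionaries with hypothetical field names
("protocol", "packets", "octets", "tag"), and the classifier functions
are my own illustration. Only the IP protocol numbers are standard.

```python
from collections import defaultdict

# IANA IP protocol numbers for the classes named above.
PROTO_CLASSES = {1: "icmp", 6: "tcp", 17: "udp", 47: "gre", 50: "esp"}


def by_protocol(flow):
    """Classify a flow by IP protocol; anything unlisted is "other"."""
    return PROTO_CLASSES.get(flow["protocol"], "other")


def by_tag(flow):
    """Classify a flow by its (hypothetical) "tag" field."""
    return "tag-%d" % flow.get("tag", 0)


def summarise(flows, classify=by_protocol):
    """Total flows, packets and octets for each class."""
    totals = defaultdict(lambda: {"flows": 0, "packets": 0, "octets": 0})
    for flow in flows:
        entry = totals[classify(flow)]
        entry["flows"] += 1
        entry["packets"] += flow["packets"]
        entry["octets"] += flow["octets"]
    return dict(totals)


# Hypothetical sample records, standing in for entries read from a log.
sample = [
    {"protocol": 6, "packets": 10, "octets": 4000, "tag": 1},
    {"protocol": 6, "packets": 2, "octets": 120, "tag": 2},
    {"protocol": 17, "packets": 1, "octets": 76, "tag": 1},
    {"protocol": 89, "packets": 3, "octets": 192},  # OSPF, counted as "other"
]
protocol_summary = summarise(sample)
tag_summary = summarise(sample, classify=by_tag)
```

Swapping the classifier function is all it takes to change the
breakdown, which is roughly the sense in which a summary can be
"anything you can express in Python".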

Note that the main script ("flow_rrd.py") requires py-rrdtool[2]. It 
is able to read either historic flowd logs or listen for live updates 
using flowd-0.9's experimental logging socket support. 

The first mode (historical logs) is likely to be more stable; it
works like this:

./flow_rrd.py /path/to/flowd_log /path/to/database.rrd

If the RRD database doesn't already exist, it will be created
automatically with some reasonable defaults. The database will be 
populated with summaries of whatever flows were in the log file.

Once the RRD database has been created and filled in, it can be 
graphed using the "plot.sh" script, like this:

./plot.sh /path/to/database.rrd /output/directory/for/images/

This script will create daily, weekly, monthly and yearly images
showing flows-per-second, bytes-per-second and packets-per-second.
The HTML files in the archive/CVS display these all on one page.

In real use, the summarisation should probably be hooked up to 
occur when the log is rotated and the plotting should be run 
afterwards or out of cron.
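
As a rough illustration of that hookup, the sketch below runs the two
steps in order for a freshly rotated log. It only wraps the two command
lines shown above; the function names, the "tool_dir" parameter and the
idea of calling it from a rotation script or cron job are assumptions
for the example, not part of the archive.

```python
import subprocess


def build_commands(rotated_log, rrd_path, image_dir, tool_dir="."):
    """Return the summarise and plot command lines, in run order."""
    return [
        [tool_dir + "/flow_rrd.py", rotated_log, rrd_path],
        [tool_dir + "/plot.sh", rrd_path, image_dir],
    ]


def update_charts(rotated_log, rrd_path, image_dir, tool_dir=".",
                  run=subprocess.run):
    """Summarise a rotated log into the RRD, then regenerate the charts.

    Each command must succeed before the next one starts; with the
    default runner a non-zero exit status raises CalledProcessError.
    """
    for command in build_commands(rotated_log, rrd_path, image_dir, tool_dir):
        run(command, check=True)
```

A rotation script would then call something like
update_charts("/path/to/flowd_log.0", "/path/to/database.rrd",
"/output/directory/for/images/"), where the rotated log name is again
just a placeholder.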

Note that this accounts flows to the period when they were
*received*, so all the traffic of a flow lasting longer than five
minutes lands in a single interval and causes a spike in the chart.
To work around this, you can instruct your flow exporter to expire
flows at the five minute mark. For softflowd[1], add the command-line
argument "-tmaxlife=300". On a recent-ish Cisco device, use the
command:

ip flow-cache timeout active 5
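
A small worked example may make the spike effect clearer. It assumes
five-minute buckets aligned to multiples of 300 seconds and a
20-minute transfer at a steady 1000 octets per second; the numbers and
the account() helper are made up for illustration and are not taken
from flow_rrd.py.

```python
STEP = 300  # bucket width in seconds (five minutes)


def account(records, step=STEP):
    """Sum octets into buckets keyed by when each record was received."""
    buckets = {}
    for received_at, octets in records:
        start = received_at - received_at % step
        buckets[start] = buckets.get(start, 0) + octets
    return buckets


def peak_rate(buckets, step=STEP):
    """Highest apparent octets-per-second over any one bucket."""
    return max(buckets.values()) / step


# No active timeout: one record arrives when the transfer ends at t=1200.
single = account([(1200, 1200000)])

# 300 second maximum lifetime: one record arrives every five minutes.
split = account([(t, 300000) for t in (300, 600, 900, 1200)])

spike = peak_rate(single)   # whole transfer charged to one bucket
smooth = peak_rate(split)   # matches the real transfer rate
```

In the first case the chart shows nothing for fifteen minutes and then
an apparent rate four times the real one; in the second, each bucket
carries its own share.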

These tools have just been committed to CVS, so they will also
show up in the snapshot releases. They likely still have a few bugs,
and I would appreciate any testing list lurkers are able to offer.

Cheers,
Damien Miller
-------------- next part --------------
A non-text attachment was scrubbed...
Name: flowrrd.tgz
Type: application/octet-stream
Size: 3933 bytes
Desc: 
Url : http://lists.mindrot.org/pipermail/netflow-tools/attachments/20060304/0b2ef291/attachment.obj