[netflow-tools] Dealing with duplicates in softflowd

Andrew McGill list2009 at lunch.za.net
Tue Sep 8 06:28:16 EST 2009

I've written some code for softflowd to deal (somewhat) with duplicate 
packets, in order to act more appropriately for bandwidth accounting.  (See 
attached patch, if the list server doesn't eat it). 

Softflowd receives duplicates when:

 0) Something transmits duplicates on the network (duh).

 1) IPVS or something similar forwards a packet to another server for 
handling.  The retransmitted packet is identical to the original packet, 
except for a modified set of MAC addresses.  

 2.0) Softflowd is attached to a mirror/span port, and sees a forwarded packet 
multiple times -- e.g. it is received externally, and received and transmitted 

 2.1) A packet crosses multiple vlans in the course of its travels, all of 
which send copies to the mirror/span port.

 2.2) The switch does some unicast flooding for reasons of its own, in which 
case, it may or may not grace the span/mirror port with multiple copies of its 
flood.  (This stuff must be coming from somewhere - we see so much of it!)

The attached patch calculates a CRC over the top and tail of the IP or IPv6 
part of a packet, and discards packets with matching CRCs as duplicates.  For 
my purposes, I don't want to consider the MAC addresses as part of the 
calculation.
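
To illustrate the idea (this is a rough sketch, not the code from the patch): 
checksum the first and last few bytes of the IP datagram, starting past the 
link-layer header so the MAC addresses never enter the calculation, and keep 
a small ring of recent checksums to compare against.  The names 
crc32_update, packet_sig, dup_ring and dup_check, and the sizes DUP_SAMPLE 
and DUP_HISTORY, are all made up for the example.  The CRC is a plain 
bit-at-a-time CRC-32 written from the standard algorithm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DUP_SAMPLE  32  /* bytes hashed from each end (assumed value) */
#define DUP_HISTORY 8   /* checksums remembered per flow (assumed value) */

/* Bitwise CRC-32, reflected polynomial 0xEDB88320.
 * Pass 0 as crc to start; pass a previous result to continue a checksum. */
static uint32_t crc32_update(uint32_t crc, const uint8_t *buf, size_t len)
{
    crc ^= 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Checksum the top and tail of the IP datagram.  ip points past the
 * link-layer header, so a copy that differs only in MACs hashes the same. */
static uint32_t packet_sig(const uint8_t *ip, size_t ip_len)
{
    assert(ip != NULL);
    if (ip_len <= 2 * DUP_SAMPLE)           /* short packet: hash all of it */
        return crc32_update(0, ip, ip_len);
    uint32_t crc = crc32_update(0, ip, DUP_SAMPLE);
    return crc32_update(crc, ip + ip_len - DUP_SAMPLE, DUP_SAMPLE);
}

/* Ring of the most recent checksums seen for one flow. */
struct dup_ring {
    uint32_t sig[DUP_HISTORY];
    unsigned next;  /* slot to overwrite next */
    unsigned used;  /* number of valid entries */
};

/* Returns 1 if sig was seen recently (treat the packet as a duplicate);
 * otherwise records sig and returns 0. */
static int dup_check(struct dup_ring *r, uint32_t sig)
{
    for (unsigned i = 0; i < r->used; i++)
        if (r->sig[i] == sig)
            return 1;
    r->sig[r->next] = sig;
    r->next = (r->next + 1) % DUP_HISTORY;
    if (r->used < DUP_HISTORY)
        r->used++;
    return 0;
}
```

Note that in this sketch the sampled bytes include the IP header, so a copy 
that has been routed (TTL decremented, header checksum rewritten) would not 
match; only copies that are byte-identical at the IP layer are caught.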

It's not quite right yet ...

 * Is there a better crc32 function I can use?  The random one I've hacked the 
code with is GPL licensed, which doesn't quite fit with the current licence 
of softflowd.

 * An alternative to top and tail-ing the packet through crc32 for a quick and 
cheap hash would be worth considering.

 * I've added 66 bytes per flow (give or take who-knows-what for alignment) to 
struct FLOW for tracking the last few checksums.  It would be nicer to 
allocate space for checksums dynamically, since it should be optional - 
although I dislike variable-sized arrays almost as much as extra pointers ...  
what difference is 33% extra memory between friends when an important topic 
like the elimination of duplicates is considered?  Is there a neat way to do 
it in a single dynamically allocated array?
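
On the last point, one possible answer is a C99 flexible array member: the 
checksum ring lives at the end of the flow structure and is sized when the 
flow is allocated, so there is one allocation, no extra pointer, and zero 
bytes of overhead when duplicate detection is switched off.  The sketch below 
uses a cut-down stand-in, struct flow_sketch, since the real struct FLOW has 
many more fields; flow_alloc and flow_remember are likewise made-up names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for softflowd's struct FLOW. */
struct flow_sketch {
    uint64_t octets;
    uint32_t packets;
    uint16_t n_sigs;    /* ring size; 0 when duplicate detection is off */
    uint16_t next_sig;  /* slot to overwrite next */
    uint32_t sigs[];    /* C99 flexible array member, must be last */
};

/* Allocate a zeroed flow with room for n_sigs checksums in one block. */
static struct flow_sketch *flow_alloc(uint16_t n_sigs)
{
    struct flow_sketch *f =
        calloc(1, sizeof(*f) + (size_t)n_sigs * sizeof(f->sigs[0]));
    if (f != NULL)
        f->n_sigs = n_sigs;
    return f;
}

/* Record a checksum in the ring; only valid when detection is enabled. */
static void flow_remember(struct flow_sketch *f, uint32_t sig)
{
    assert(f->n_sigs > 0);
    f->sigs[f->next_sig] = sig;
    f->next_sig = (uint16_t)((f->next_sig + 1) % f->n_sigs);
}
```

The cost is that flows are no longer all the same size, which matters if 
they come from a fixed-size pool rather than the general allocator.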

-------------- next part --------------
A non-text attachment was scrubbed...
Name: softflow-duplicates.patch
Type: text/x-patch
Size: 11199 bytes
Desc: not available
URL: <http://lists.mindrot.org/pipermail/netflow-tools/attachments/20090907/01badc1b/attachment-0001.bin>
