[netflow-tools] [softflowd] about softflowd TODO

Guanqun Lu guanqun.lu at gmail.com
Thu Mar 29 21:18:18 EST 2007


On 3/29/07, Damien Miller <djm at mindrot.org> wrote:
> On Wed, 28 Mar 2007, Guanqun Lu wrote:
>
> > Hi,
> >
> > 1. What do you mean by "Use strtonum()"? You want to replace "atoi()" with it?
>
> Yes, it is a simpler API and can make the code more readable.
>
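
To illustrate the point about strtonum() versus atoi(): strtonum(3) comes
from OpenBSD's libc and is not available everywhere, so the sketch below
uses a hypothetical stand-in named parse_num() built on strtoll(). The
name and the exact error strings are assumptions for illustration, not
code from softflowd.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * parse_num: hypothetical helper with strtonum(3)-like semantics.
 * On success returns the value and sets *errstr to NULL.
 * On failure returns 0 and sets *errstr to a short reason.
 */
long long
parse_num(const char *s, long long minval, long long maxval,
    const char **errstr)
{
	const char *why = NULL;
	char *end = NULL;
	long long v = 0;

	if (minval > maxval) {
		why = "invalid";
	} else {
		errno = 0;
		v = strtoll(s, &end, 10);
		if (end == s || *end != '\0')
			why = "invalid";	/* empty or trailing junk */
		else if ((v == LLONG_MIN && errno == ERANGE) || v < minval)
			why = "too small";
		else if ((v == LLONG_MAX && errno == ERANGE) || v > maxval)
			why = "too large";
	}
	if (errstr != NULL)
		*errstr = why;
	return why != NULL ? 0 : v;
}
```

Unlike atoi(), which silently turns "12abc" into 12, the caller gets an
explicit error and a range check in a single call.
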
> > 2. I'm currently doing some research that is mainly based on the
> > performance of softflowd. But it seems that softflowd can't keep
> > up with heavy traffic. My colleague hacked on the code a little.
> > The diff of file softflowd.c is attached. With this change, the
> > performance does improve a little. But CPU usage still climbs
> > to 100% as soon as the rate reaches about 13,000 pps.
>
> The diff trades off the time spent in packet processing against the
> time spent in managing expiry events, etc.  You might be able to play
> with the "expint" timeout to achieve the same effect (without a code
> change).

Yes, the program with the diff patch spends more time on packet
processing.  But it has several drawbacks:
1. If there is no traffic, `softflowctl' can't exit. pcap_dispatch
waits for incoming packets and blocks the program.

2. The performance gain is limited and does not scale linearly with the
loopnum in the patch. When loopnum is bigger than 5, no obvious further
increase is seen.

Thanks for mentioning the `expint' timeout; I'll have a look at this option.

>
> > I'm glad to see that there's a performance part in TODO.
> > Performance
> >  - Profile and see where the hot spots are
> > It seems that it's a CPU intensive task.
> >  - Fast "new flow" test using a bloom filter
> > You named 'bloom filter', maybe we can have a try.
>
> It is an idea, it may improve things or it may add overhead.
>
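
For reference, a minimal sketch of what a Bloom filter "new flow" test
could look like. The sizes, the FNV-1a hash and the function names are
assumptions chosen for illustration; nothing here is taken from
softflowd. A "no" answer is definite, a "yes" answer may be a false
positive, so the real flow lookup is still needed in the "yes" case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOOM_BITS	(1u << 16)	/* assumed size; must be a power of two */
#define BLOOM_HASHES	4		/* assumed number of probes per key */

struct bloom {
	uint8_t bits[BLOOM_BITS / 8];
};

/* 64-bit FNV-1a over the key bytes, perturbed by a seed. */
static uint64_t
fnv1a(const void *data, size_t len, uint64_t seed)
{
	const uint8_t *p = data;
	uint64_t h = 14695981039346656037ULL ^ seed;

	while (len-- > 0) {
		h ^= *p++;
		h *= 1099511628211ULL;
	}
	return h;
}

void
bloom_init(struct bloom *b)
{
	memset(b->bits, 0, sizeof(b->bits));
}

/*
 * Adds the key to the filter. Returns 0 if the key was definitely not
 * present before (a new flow), 1 if it may have been present.
 */
int
bloom_test_and_add(struct bloom *b, const void *key, size_t len)
{
	uint64_t h1 = fnv1a(key, len, 0);
	uint64_t h2 = fnv1a(key, len, 0x9e3779b97f4a7c15ULL) | 1;
	int i, seen = 1;

	for (i = 0; i < BLOOM_HASHES; i++) {
		uint32_t bit = (uint32_t)((h1 + (uint64_t)i * h2) &
		    (BLOOM_BITS - 1));
		uint8_t mask = (uint8_t)(1u << (bit & 7));

		if ((b->bits[bit >> 3] & mask) == 0) {
			seen = 0;
			b->bits[bit >> 3] |= mask;
		}
	}
	return seen;
}
```

One caveat with this approach: a plain Bloom filter cannot delete keys,
so expired flows would stay in it until the filter is reset or rebuilt.
That bookkeeping is part of the overhead mentioned above.
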
> >  - See if we can reduce per-packet overhead more
> >    - Cost of expiry remove and re-add per packet
> >  - Stop run-time malloc (maybe)
> > Why is it necessary? I'm wondering. Does run-time malloc hurt
> > performance?
>
> Malloc is designed to be a good general purpose allocator for objects of
> various sizes. For softflowd, we need fast allocations of fixed size
> objects and we generally know how many (maximum) we need ahead of time.
> It should be possible to avoid some of malloc's cost by preallocating the
> struct FLOW and struct EXPIRY.
>
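
The preallocation idea can be sketched as a fixed-size object pool: one
malloc() at startup, then allocation and release are a pointer pop and
push on a free list. The struct and function names below are
hypothetical and not taken from softflowd.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Slot sizes are rounded up to this union so every slot is aligned. */
union slot_align {
	long long ll;
	long double ld;
	void *ptr;
};

struct pool {
	void *slab;		/* the single up-front allocation */
	void *free_head;	/* free list threaded through unused slots */
};

/* Preallocate nobjs slots of at least objsize bytes. 0 on success. */
int
pool_init(struct pool *p, size_t objsize, size_t nobjs)
{
	const size_t unit = sizeof(union slot_align);
	char *base;
	size_t i;

	if (objsize == 0 || nobjs == 0 || objsize > SIZE_MAX - (unit - 1))
		return -1;
	objsize = (objsize + unit - 1) / unit * unit;
	if (objsize > SIZE_MAX / nobjs)
		return -1;
	if ((base = malloc(objsize * nobjs)) == NULL)
		return -1;
	p->slab = base;
	p->free_head = NULL;
	/* Push slots in reverse so the first slot is handed out first. */
	for (i = nobjs; i-- > 0; ) {
		void *slot = base + i * objsize;

		*(void **)slot = p->free_head;
		p->free_head = slot;
	}
	return 0;
}

/* Pop one slot; returns NULL when the pool is exhausted. */
void *
pool_get(struct pool *p)
{
	void *slot = p->free_head;

	if (slot != NULL)
		p->free_head = *(void **)slot;
	return slot;
}

/* Return a slot previously obtained from pool_get(). */
void
pool_put(struct pool *p, void *slot)
{
	*(void **)slot = p->free_head;
	p->free_head = slot;
}

void
pool_destroy(struct pool *p)
{
	free(p->slab);
	p->slab = p->free_head = NULL;
}
```

With something like this, one pool sized for the maximum number of
flows could serve struct FLOW and a second one struct EXPIRY, e.g.
pool_init(&flow_pool, sizeof(struct FLOW), max_flows), where flow_pool
and max_flows are again assumed names.
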
> One thing that has a good likelihood of improving performance is to
> replace or modify the data structure used to store flows. At present
> it is a splay tree, which is fast when matching existing flows that
> receive a lot of traffic but slower otherwise (new flows or lots of
> quiet flows). Coming up with a good flow hash and replacing the splay
> tree with a hash table, or putting a hash table in front of splay trees
> is likely to help a lot.

As seen in our experiment, the memory usage of softflowd is very low.
Therefore, I think we could trade memory for performance by keeping
some auxiliary information to speed things up.
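
As a rough illustration of the hash table idea, here is a minimal
sketch of a chained hash table keyed on a flow 5-tuple. The struct
layout, the bucket count and the FNV-style mixing are all assumptions
for illustration; softflowd's real struct FLOW looks different.

```c
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 4096	/* assumed size; must be a power of two */

struct flow_key {	/* hypothetical 5-tuple */
	uint32_t src, dst;
	uint16_t sport, dport;
	uint8_t proto;
};

struct flow_entry {
	struct flow_key key;
	uint64_t packets;
	struct flow_entry *next;	/* collision chain */
};

struct flow_table {	/* must be zero-initialised before use */
	struct flow_entry *buckets[NBUCKETS];
};

/* FNV-style mixing, field by field so struct padding is never hashed. */
static uint32_t
flow_hash(const struct flow_key *k)
{
	uint32_t h = 2166136261u;

	h = (h ^ k->src) * 16777619u;
	h = (h ^ k->dst) * 16777619u;
	h = (h ^ (((uint32_t)k->sport << 16) | k->dport)) * 16777619u;
	h = (h ^ k->proto) * 16777619u;
	return h ^ (h >> 15);
}

static int
flow_key_eq(const struct flow_key *a, const struct flow_key *b)
{
	return a->src == b->src && a->dst == b->dst &&
	    a->sport == b->sport && a->dport == b->dport &&
	    a->proto == b->proto;
}

/*
 * Find the entry for key, creating it if needed. *created is set to 1
 * for a new flow and 0 for an existing one. Returns NULL if the
 * allocation of a new entry fails.
 */
struct flow_entry *
flow_lookup_or_add(struct flow_table *t, const struct flow_key *key,
    int *created)
{
	uint32_t idx = flow_hash(key) & (NBUCKETS - 1);
	struct flow_entry *e;

	*created = 0;
	for (e = t->buckets[idx]; e != NULL; e = e->next)
		if (flow_key_eq(&e->key, key))
			return e;
	if ((e = calloc(1, sizeof(*e))) == NULL)
		return NULL;
	e->key = *key;
	e->next = t->buckets[idx];
	t->buckets[idx] = e;
	*created = 1;
	return e;
}
```

Here each bucket is a plain linked list. For the "hash table in front of
splay trees" variant, each bucket would instead hold the root of a small
splay tree, which costs more code but keeps busy flows near the top.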

>
> I have changed jobs (several times) since I first wrote softflowd and
> no longer have easy access to large quantities of real-world traffic
> to test it against. Because of this, I will have to depend more on the
> user community to improve softflowd's performance.

It would be my pleasure if I could do something useful to improve the
performance.

>
> Thanks,
> Damien
>


-- 
Guanqun

