[netflow-tools] flowd benchmark

Wed Jul 13 22:53:53 EST 2005

Damien Miller wrote:
> flowd is always going to have to do a little more work, because the set
> of fields that it stores is variable. That being said, it should be
> possible to speed up the reader function by moving more of it from the pure
> Python part of the module to the C implementation.

OK, I moved all of the flow reading into the C part of the Python module
and it didn't help much.

So the problem is a little deeper. I probably need to break out gprof to
analyse it properly, but I think the problem is that the C part of the
Python module always converts all of the flow fields to Python objects
when the flow is loaded. This is a waste of time if not all of those
fields are subsequently used.

It is probably better to make the deserialiser return a first-class
object with tp_dict or tp_members hooked to do the C struct -> Python
object conversion either on demand or lazily.
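
As a rough illustration of the lazy approach, here is a sketch in pure
Python rather than with the C type hooks mentioned above. The record
layout, the field names and the LazyFlow class are all invented for the
example; this is not flowd's real on-disk format or API. The object keeps
the packed record and converts a field only the first time it is read.

```python
import struct

# Hypothetical fixed layout (NOT flowd's real format), big-endian:
# name -> (byte offset, struct format)
_FIELDS = {
    "src_addr": (0, ">4s"),
    "dst_addr": (4, ">4s"),
    "src_port": (8, ">H"),
    "dst_port": (10, ">H"),
    "packets": (12, ">I"),
    "octets": (16, ">I"),
}


def _dotted(raw):
    # Render a 4-byte address as dotted-quad text.
    return ".".join(str(b) for b in raw)


# Fields that need more than plain integer unpacking.
_CONVERT = {"src_addr": _dotted, "dst_addr": _dotted}


class LazyFlow:
    """Holds the packed record; converts a field only when it is read."""

    def __init__(self, record):
        self._record = record
        self.conversions = 0  # number of fields actually converted

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. on first access.
        try:
            offset, fmt = _FIELDS[name]
        except KeyError:
            raise AttributeError(name) from None
        (value,) = struct.unpack_from(fmt, self._record, offset)
        if name in _CONVERT:
            value = _CONVERT[name](value)
        self.conversions += 1
        setattr(self, name, value)  # cache so later reads skip this method
        return value


# Example record: 10.0.0.1:1234 -> 10.0.0.2:80, 7 packets, 4200 octets.
record = struct.pack(">4s4sHHII", bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
                     1234, 80, 7, 4200)
flow = LazyFlow(record)
total_octets = flow.octets  # only this one field has been converted so far
```

A reader that only sums octets would pay for one conversion per flow
instead of six. The same trade-off should carry over to a C
implementation, although the mechanics there would differ.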

Unfortunately, it is quite a bit more work, but it does fall into the
Python API renovation that is already in the TODO. I'll try to have a
look at it on the weekend but it will likely take a while longer. If
there are any Python hackers on the list, now would be a good time to
delurk and help out :)

In the meantime, you can get a direct speed increase by only storing the
fields that you are interested in.

-d
-------------- next part --------------
Name: flowd-python-slightly-faster.diff
Url: http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050713/53797700/attachment.ksh