From sbat at ugtel.ru Fri Jul 1 22:08:41 2005
From: sbat at ugtel.ru (Sergei Batakov)
Date: Fri, 1 Jul 2005 16:08:41 +0400
Subject: [netflow-tools] pfflowd-0.6 and OpenBSD 3.7
Message-ID: <3310730934.20050701160841@ugtel.ru>

Hello All,

I am trying to run pfflowd-0.6 on OpenBSD 3.7:

./pfflowd -i em1 -n 127.0.0.1:1999 -d
ZZZZ 4
pfflowd[15196]: pfflowd listening on em1
Jul 1 11:53:59 box pfflowd[15196]: pfflowd listening on em1
pfflowd[15196]: Unsupported pfsync version 1, exiting
Press any key to continue...Jul 1 11:54:01 box pfflowd[15196]: Unsupported pfsync version 1, exiting

From gijs at looze.net Wed Jul 6 22:16:09 2005
From: gijs at looze.net (Gijs Molenaar)
Date: Wed, 06 Jul 2005 14:16:09 +0200
Subject: [netflow-tools] flowd + avici v9 flows problem
Message-ID: <42CBCB89.5020103@looze.net>

Hello,

We are trying to implement flowd 0.8.5 in our new network. I like flowd
because it is very simple and it has a well-functioning Python module
ready to use. But when we let flowd analyse the v9 flows exported by an
Avici router, we encounter the following error (output of flowd):

netflow v.9 tempate flowset
NetFlow v.9 template with 18 records:
Invalid field length in netflow v.9 flowset template 256 from type 10 len 4

At first we thought this was a bug in the Avici router software, so we
contacted Avici. They responded as follows:

> From the logs you provided, we determined that the flowd netflow
> collector s/w you are using at SurfNet is not compliant with the
> draft-claise-netflow-9-07.txt :
> it does not support field type 10 (ifindex) with length of 4,
> which is a requirement w/ draft-claise-netflow-9-07.txt.
> draft-claise-netflow-9-07.txt introduces INPUT_SNMP(type 10) and
> OUTPUT_SNMP(type 14) fields as variable length fields.
> The default length being 2. But higher values can be used.
> It looks like flowD might support an earlier draft.
> Unfortunately we cannot easily change Avici ifIndexes to be 16
> bit only.
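
The length mismatch Avici describes can be illustrated with a small
Python sketch. It is not flowd's code: the function name
decode_uint_field and the set of accepted lengths are assumptions made
for illustration. The idea is that a v9 template announces the length
of every field, so a collector can decode a 2-octet or 4-octet
interface index with the same routine by honouring that length instead
of hard-coding 2. NetFlow fields are in network byte order (big-endian).

```python
def decode_uint_field(data: bytes, offset: int, length: int) -> int:
    """Decode a big-endian unsigned integer whose length comes from the template."""
    # Assumed set of acceptable lengths for this sketch; a real collector
    # would consult a per-field-type table.
    if length not in (1, 2, 4, 8):
        raise ValueError(f"unsupported field length {length}")
    if offset + length > len(data):
        raise ValueError("field runs past the end of the record")
    return int.from_bytes(data[offset:offset + length], "big")


# Hypothetical record: a 4-octet input ifIndex followed by a 2-octet output ifIndex.
record = bytes([0x00, 0x01, 0x00, 0x02, 0x00, 0x03])
input_if = decode_uint_field(record, 0, 4)
output_if = decode_uint_field(record, 4, 2)
```

The same routine would cover the 2- or 4-octet AS number fields, since
only the length argument changes.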
The draft can be found here:
http://www.ietf.org/internet-drafts/draft-claise-netflow-version9-07.txt

I hope this is of some help for flowd, and if you need any contact with
Avici people, let me know.

Greetings,

--
Gijs Molenaar
gijs at looze.net
http://gijs.looze.net

From gijs at looze.net Thu Jul 7 18:12:59 2005
From: gijs at looze.net (Gijs Molenaar)
Date: Thu, 07 Jul 2005 10:12:59 +0200
Subject: [netflow-tools] flowd + avici v9 flows problem
In-Reply-To: <42CBCEA4.4090409@mindrot.org>
References: <42CBCB89.5020103@looze.net> <42CBCEA4.4090409@mindrot.org>
Message-ID: <42CCE40B.4090905@looze.net>

Damien Miller wrote:

> Please try the attached diff. It won't drop the packets any more, but

It works! But now I'm getting the same error for field type 17 :). This
is NF9_DST_AS. The draft specifies that the length of these fields
should be 2 or 4. The same goes for NF9_SRC_AS. When I look at your
code, it only supports length 2.

> the flowd log format only has a 2 octet space to store the SNMP
> interface indices, so the most significant two octets will be ignored.

Will the flowd log format support variable length for these fields in
the future?

> (yes, I am guilty of only testing the NetFlow v.9 code against a Cisco
> and softflowd)

I will do this for you :)

From gijs at looze.net Thu Jul 7 18:20:09 2005
From: gijs at looze.net (Gijs Molenaar)
Date: Thu, 07 Jul 2005 10:20:09 +0200
Subject: [netflow-tools] flowd PID
Message-ID: <42CCE5B9.4020305@looze.net>

Hello,

I have another question.

Flowd doesn't create a PID file. I need to know the PID to send a SIGUSR
for log rotation. It is possible to discover the PID with a command
like:

ps -aef | grep "flowd: monitor" | grep -v grep | awk '{print $2}'

or by grepping for "flowd: net", but then you get multiple PIDs when you
run multiple flowds (and I'm running multiple flowds). Wouldn't life be
much easier with the option of a PID file? Or does this go against the
minimalistic philosophy, and should it be handled by scripts?
Greetings,

--
Gijs Molenaar
gijs at looze.net
http://gijs.looze.net

From djm at mindrot.org Thu Jul 7 20:33:14 2005
From: djm at mindrot.org (Damien Miller)
Date: Thu, 07 Jul 2005 20:33:14 +1000
Subject: [netflow-tools] flowd PID
In-Reply-To: <42CCE5B9.4020305@looze.net>
References: <42CCE5B9.4020305@looze.net>
Message-ID: <42CD04EA.5020704@mindrot.org>

Gijs Molenaar wrote:
> Hello,
>
> I have another question.
>
> Flowd doesn't create a PID file.

It does. You can even choose where to put it using the config file
("pidfile /path/to/it").

-d

From djm at mindrot.org Fri Jul 8 10:39:43 2005
From: djm at mindrot.org (Damien Miller)
Date: Fri, 08 Jul 2005 10:39:43 +1000
Subject: [netflow-tools] flowd + avici v9 flows problem
In-Reply-To: <42CCE40B.4090905@looze.net>
References: <42CBCB89.5020103@looze.net> <42CBCEA4.4090409@mindrot.org> <42CCE40B.4090905@looze.net>
Message-ID: <42CDCB4F.4030201@mindrot.org>

Gijs Molenaar wrote:
>
> Damien Miller wrote:
>
>> Please try the attached diff. It won't drop the packets any more, but
>
> It works! But now I'm getting the same error for field type 17 :). This
> is NF9_DST_AS. The draft specifies that the length of these fields
> should be 2 or 4. The same goes for NF9_SRC_AS. When I look at your
> code, it only supports length 2.

Actually more work is needed - my patch will clobber adjacent fields.
I'm working on a better one now.

>> the flowd log format only has a 2 octet space to store the SNMP
>> interface indices, so the most significant two octets will be ignored.
>
> Will the flowd log format support variable length for these fields in
> the future?

That's what I'm working on now :)

BTW, if anyone wants fields added to the flowd log format, now is the
time to speak up. So far I'm:

- Extending src/dst AS to 32 bits each
- Extending in/out SNMP indices to 32 bits each
- Adding NetFlow v.9 source_id to FLOW_ENGINE_INFO
- Probably adding NetFlow v.9 min/max packet length.

Any more fields that you want?

(I'm trying to make the changes backwards-compatible, so a new flowd
will be able to read an old flowd's logs, but probably not write or
append to them.)

-d

From djm at mindrot.org Sat Jul 9 09:37:27 2005
From: djm at mindrot.org (Damien Miller)
Date: Sat, 09 Jul 2005 09:37:27 +1000
Subject: [netflow-tools] pfflowd-0.6 and OpenBSD 3.7
In-Reply-To: <3310730934.20050701160841@ugtel.ru>
References: <3310730934.20050701160841@ugtel.ru>
Message-ID: <42CF0E37.5000308@mindrot.org>

Sergei Batakov wrote:
> Hello All,
>
> I am trying to run pfflowd-0.6 on OpenBSD 3.7:
>
> ./pfflowd -i em1 -n 127.0.0.1:1999 -d

Remove the "-i em1". pfflowd listens to pfsync interfaces, not general
purpose ones. (sorry for the slow reply).

> ZZZZ 4

hm, that is debug code that should be killed.

-d

From djm at mindrot.org Sat Jul 9 11:59:17 2005
From: djm at mindrot.org (Damien Miller)
Date: Sat, 09 Jul 2005 11:59:17 +1000
Subject: [netflow-tools] Flowd unable to use FIFO
In-Reply-To:
References:
Message-ID: <42CF2F75.6030405@mindrot.org>

Jason Dixon wrote:
> Forwarding to netflow-tools where it belongs...
>
> I'm trying to get flowd to write to a FIFO so I can read it in with a
> perl script and output to my choice of storage (file, db, etc). The
> patch below from djm allows flowd to write to a FIFO, but Flowd.pm
> fails on init() with a "bad magic" error.

Hi,

I have thought about this some more - rather than skipping log headers
based on what type of file you are reading from / writing to, I think
it is better that users have a choice.

This patch does this - it adds a "skip header" option to flowd,
flowd-reader and the perl/python APIs. So you should be able to put:

logfile "/path/to/fifo" noheader

in your flowd.conf and directly listen to it with "flowd-reader -S
/path/to/fifo". You can also use flowd-reader to write to a fifo.

Read the diff to see the equivalent options for perl and python.

Does this work for you? If so, I'll tidy it up, document and commit it.

-d

From djm at mindrot.org Sat Jul 9 12:18:15 2005
From: djm at mindrot.org (Damien Miller)
Date: Sat, 09 Jul 2005 12:18:15 +1000
Subject: [netflow-tools] Flowd unable to use FIFO
In-Reply-To: <42CF2F75.6030405@mindrot.org>
References: <42CF2F75.6030405@mindrot.org>
Message-ID: <42CF33E7.2070801@mindrot.org>

It would help if I actually attached the patch.

-d

Damien Miller wrote:
> Jason Dixon wrote:
>
>> Forwarding to netflow-tools where it belongs...
>>
>> I'm trying to get flowd to write to a FIFO so I can read it in with a
>> perl script and output to my choice of storage (file, db, etc). The
>> patch below from djm allows flowd to write to a FIFO, but Flowd.pm
>> fails on init() with a "bad magic" error.
>
>
> Hi,
>
> I have thought about this some more - rather than skipping log headers
> based on what type of file you are reading from / writing to, I think
> it is better that users have a choice.
>
> This patch does this - it adds a "skip header" option to flowd,
> flowd-reader and the perl/python APIs. So you should be able to put:
>
> logfile "/path/to/fifo" noheader
>
> in your flowd.conf and directly listen to it with "flowd-reader -S
> /path/to/fifo". You can also use flowd-reader to write to a fifo.
>
> Read the diff to see the equivalent options for perl and python.
>
> Does this work for you? If so, I'll tidy it up, document and commit it.
>
> -d
>
> _______________________________________________
> netflow-tools mailing list
> netflow-tools at mindrot.org
> http://www.mindrot.org/mailman/listinfo/netflow-tools

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: flowd-noheader.diff
Url: http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050709/1cdfb5e3/attachment.ksh

From gijs at looze.net Wed Jul 13 18:12:14 2005
From: gijs at looze.net (Gijs Molenaar)
Date: Wed, 13 Jul 2005 10:12:14 +0200
Subject: [netflow-tools] flowd benchmark
Message-ID: <42D4CCDE.5030308@looze.net>

Hello people,

I'm doing some research into which flow analysis tool is best for us at
the moment. We have routers generating around 1,000,000 flows every
5 minutes, and this is already sampled at a rate of 100. So speed is
very important for us. The two tools I like the most are flowd and
flow-tools. Flowd supports v9 (and with that IPv6), so I prefer flowd.

The first thing I looked at was the load of the capture daemon. There
isn't a big difference between the two. I use quite a slow computer
(Pentium III 450, 1 GB RAM), and both daemons use about 10% CPU time.
When the PC is very busy, flow-tools (flow-capture) starts to drop
packets and logs this. My question is: what happens with flowd when the
CPU load is too high to process a high rate of flows? The fact that
flows are dropped isn't important for us, but how many are dropped can
be interesting.

The next thing I did was flow analysis. I tried both Python libraries
for this job. I captured 5 minutes with each daemon. Flowd writes all
the info it has to the file; flow-tools does this as well. The results
were stunning. These are the results (scripts are attached):

$ python flowtools.py
finished in 20 seconds
flowcount: 931711
45769 flows/s

$ python flowd.py
finished in 256 seconds
flowcount: 944281
3688 flows/s

The flowd Python library is about 12x slower! I was really not happy
when I saw this output. The thing is, I can't use flowd now. I need to
do a _lot_ more computation than calculating in and out AS traffic.
Running the flowtools Python program on a (currently) fast machine can
speed it up by about a factor of 5, but then flowd would still be much
too slow.

Maybe it has something to do with the fact that with flow-tools I do a
readlines() to load the whole file into memory. With flowd it 'walks'
through the file, which can be much slower. But I'm not sure. The
flowtools Python library is also completely written in C.

I would like to use flowd, so I want to try to change the flowtools
Python source to be able to read the flowd binary format. I'm not
really a good C programmer, but I can give it a try :).

Greetings,

--
Gijs Molenaar
gijs at looze.net
http://gijs.looze.net

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: flowtools.py
Url: http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050713/164659e0/attachment.ksh
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: flowd.py
Url: http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050713/164659e0/attachment-0001.ksh

From gijs at looze.net Wed Jul 13 18:16:06 2005
From: gijs at looze.net (Gijs Molenaar)
Date: Wed, 13 Jul 2005 10:16:06 +0200
Subject: [netflow-tools] flowd benchmark
In-Reply-To: <42D4CCDE.5030308@looze.net>
References: <42D4CCDE.5030308@looze.net>
Message-ID: <42D4CDC6.3070906@looze.net>

Whoops, the last line of flowd.py should be:

print "%i flows/s" % (flowcount/(time.time()-starttime))

From gijs at looze.net Wed Jul 13 20:27:03 2005
From: gijs at looze.net (Gijs Molenaar)
Date: Wed, 13 Jul 2005 12:27:03 +0200
Subject: [netflow-tools] flowd benchmark
In-Reply-To: <42D4CCDE.5030308@looze.net>
References: <42D4CCDE.5030308@looze.net>
Message-ID: <42D4EC77.7020502@looze.net>

Gijs Molenaar wrote:
> Maybe it has something to do with the fact that with flow-tools I do a
> readlines() to load the whole file into memory. With flowd it 'walks'
> through the file, which can be much slower. But I'm not sure. The
> flowtools Python library is also completely written in C.

Again I have to correct myself.
It doesn't do a readlines(); it loops through a FlowSet object that
looks like a list in Python. I still think I/O is the bottleneck.

From djm at mindrot.org Wed Jul 13 20:52:50 2005
From: djm at mindrot.org (Damien Miller)
Date: Wed, 13 Jul 2005 20:52:50 +1000
Subject: [netflow-tools] flowd benchmark
In-Reply-To: <42D4CCDE.5030308@looze.net>
References: <42D4CCDE.5030308@looze.net>
Message-ID: <42D4F282.9000504@mindrot.org>

Gijs Molenaar wrote:
> Hello people,
>
> I'm doing some research into which flow analysis tool is best for us
> at the moment. We have routers generating around 1,000,000 flows every
> 5 minutes, and this is already sampled at a rate of 100. So speed is
> very important for us. The two tools I like the most are flowd and
> flow-tools. Flowd supports v9 (and with that IPv6), so I prefer flowd.

(I assume that you are using flowd-0.8.5)

> The first thing I looked at was the load of the capture daemon. There
> isn't a big difference between the two. I use quite a slow computer
> (Pentium III 450, 1 GB RAM), and both daemons use about 10% CPU time.
> When the PC is very busy, flow-tools (flow-capture) starts to drop
> packets and logs this. My question is: what happens with flowd when
> the CPU load is too high to process a high rate of flows? The fact
> that flows are dropped isn't important for us, but how many are
> dropped can be interesting.

flowd doesn't detect if packets are dropped by the kernel before they
reach the daemon. It should check the netflow v5+ sequence numbers, and
this is already on the todo list.

> The next thing I did was flow analysis. I tried both Python libraries
> for this job. I captured 5 minutes with each daemon. Flowd writes all
> the info it has to the file; flow-tools does this as well. The results
> were stunning.
> These are the results (scripts are attached):
>
> $ python flowtools.py
> finished in 20 seconds
> flowcount: 931711
> 45769 flows/s
>
> $ python flowd.py
> finished in 256 seconds
> flowcount: 944281
> 3688 flows/s

Does turning off storing the CRC32 in flowd.conf speed this up?

flowd is always going to have to do a little more work, because the set
of fields that it stores is variable. That being said, it should be
possible to speed up the reader function by moving more of it from the
pure python part of the module to the C implementation.

If I get time, I'll look at it on the weekend.

-d

From gijs at looze.net Wed Jul 13 21:05:12 2005
From: gijs at looze.net (Gijs Molenaar)
Date: Wed, 13 Jul 2005 13:05:12 +0200
Subject: [netflow-tools] flowd benchmark
In-Reply-To: <42D4F282.9000504@mindrot.org>
References: <42D4CCDE.5030308@looze.net> <42D4F282.9000504@mindrot.org>
Message-ID: <42D4F568.2060905@looze.net>

Damien Miller wrote:
> (I assume that you are using flowd-0.8.5)

yes

> flowd doesn't detect if packets are dropped by the kernel before they
> reach the daemon. It should check the netflow v5+ sequence numbers, and
> this is already on the todo list.

ah ok, good :)

> Does turning off storing the CRC32 in flowd.conf speed this up?

I just did another test with flowd logging only AS info and octets,
with the following results:

finished in 145 seconds
flowcount: 961501
6631 flows/s

Twice as fast, but still much slower than flowtools.

> If I get time, I'll look at it on the weekend.

Great. I really prefer flowd, but I need speed. If I can be of any help,
let me know.

Thanks for the fast reply!
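
The sequence-number check mentioned above can be sketched as follows.
This is only an illustration, not flowd code: count_lost_flows and its
input format are hypothetical. It assumes the NetFlow v5 convention
that each packet header carries flow_sequence (the running count of
flows exported before this packet) and count (the number of flows in
this packet), so the next packet should start at flow_sequence + count,
modulo 2^32. It also assumes packets from a single exporter arriving in
order. NetFlow v9 numbers export packets rather than flows, so the
arithmetic would differ there.

```python
SEQ_MODULUS = 2 ** 32  # flow_sequence is a 32-bit counter and wraps around


def count_lost_flows(headers):
    """Estimate lost flows from (flow_sequence, count) pairs in arrival order."""
    lost = 0
    expected = None  # sequence number the next packet should carry
    for flow_sequence, count in headers:
        if expected is not None:
            gap = (flow_sequence - expected) % SEQ_MODULUS
            # A forward jump means flows were exported but never seen.
            # A backwards jump (reordering or exporter restart) shows up as
            # a huge gap and is not counted as loss in this sketch.
            if gap < SEQ_MODULUS // 2:
                lost += gap
        expected = (flow_sequence + count) % SEQ_MODULUS
    return lost


# Hypothetical example: the packet carrying flows 60..89 never arrived.
lost = count_lost_flows([(0, 30), (30, 30), (90, 30)])
```

A collector could log this number per exporter, which would answer the
"how many were dropped" question without needing kernel statistics.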

From djm at mindrot.org Wed Jul 13 22:53:53 2005
From: djm at mindrot.org (Damien Miller)
Date: Wed, 13 Jul 2005 22:53:53 +1000
Subject: [netflow-tools] flowd benchmark
In-Reply-To: <42D4F282.9000504@mindrot.org>
References: <42D4CCDE.5030308@looze.net> <42D4F282.9000504@mindrot.org>
Message-ID: <42D50EE1.6050201@mindrot.org>

Damien Miller wrote:
> flowd is always going to have to do a little more work, because the set
> of fields that it stores is variable. That being said, it should be
> possible to speed up the reader function by moving more of it from the
> pure python part of the module to the C implementation.

OK, I moved all of the flow reading into the C part of the Python module
and it didn't help much. So the problem is a little deeper.

I probably need to break out gprof to analyse it properly, but I think
the problem is that the C part of the python module always converts all
of the flow fields to python objects when the flow is loaded. This is a
waste of time if not all of those fields are subsequently used.

It is probably better to make the deserialiser return a first-class
object with tp_dict or tp_members hooked to do the C struct -> python
object conversion either on demand or lazily. Unfortunately, it is quite
a bit more work, but it does fall into the Python API renovation that is
already in the TODO.

I'll try to have a look at it on the weekend but it will likely take a
while longer.

If there are any Python hackers on the list, now would be a good time to
delurk and help out :)

In the meantime, you can get a direct speed increase by only storing the
fields that you are interested in.

-d

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: flowd-python-slightly-faster.diff
Url: http://lists.mindrot.org/pipermail/netflow-tools/attachments/20050713/53797700/attachment.ksh

From gijs at looze.net Wed Jul 13 23:39:14 2005
From: gijs at looze.net (Gijs Molenaar)
Date: Wed, 13 Jul 2005 15:39:14 +0200
Subject: [netflow-tools] flowd benchmark
In-Reply-To: <42D50EE1.6050201@mindrot.org>
References: <42D4CCDE.5030308@looze.net> <42D4F282.9000504@mindrot.org> <42D50EE1.6050201@mindrot.org>
Message-ID: <42D51982.4070009@looze.net>

Damien Miller wrote:
> Damien Miller wrote:
>
>> flowd is always going to have to do a little more work, because the set
>> of fields that it stores is variable. That being said, it should be
>> possible to speed up the reader function by moving more of it from the
>> pure python part of the module to the C implementation.
>
>
> OK, I moved all of the flow reading into the C part of the Python module
> and it didn't help much.

That's fast! :)

I did a little more research. Because I thought the Python API of flowd
was slow, I wanted to write a flowd-reader parser in Python. I tried
three flow capture programs and their readers. The test is the most
basic operation: just read the flow log file and print the fields.

$ time flowd-reader ./flowd-log | wc -l
944282

real 1m55.109s
user 1m23.378s
sys 0m42.296s

$ time flow-export -f2 < ./flowtools-log | wc -l
flow-export: Exported 931711 records
931712

real 0m24.225s
user 0m23.384s
sys 0m2.037s

Much faster, but no variable field length.

$ time ./ipflow grep ./netflow.log | wc -l
1280538

real 1m45.336s
user 1m44.407s
sys 0m3.188s

This is a new one I tried, supporting v9. It isn't that much faster than
flowd. So it really is the variable field length thing that makes it
slow.

All tests were done with v5 Cisco flows, on a 2-processor system.
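
As a rough cross-check of the timings above, the per-second rate of each
reader can be computed from its "wc -l" count and its "real" time. The
helper name parse_real_time and the runs table are made up for this
sketch, and the line counts include whatever header lines each tool
prints, so the rates are approximate.

```python
import re


def parse_real_time(text: str) -> float:
    """Convert a `time`-style value such as '1m55.109s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# (lines reported by wc -l, wall-clock 'real' time) as posted above
runs = {
    "flowd-reader": (944282, "1m55.109s"),
    "flow-export": (931712, "0m24.225s"),
    "ipflow": (1280538, "1m45.336s"),
}

rates = {name: lines / parse_real_time(real) for name, (lines, real) in runs.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.0f} lines/s")
```

On these figures flow-export comes out roughly three to five times
faster than the other two readers, while flowd-reader and ipflow are
within a factor of two of each other.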

From jason at dixongroup.net Thu Jul 14 00:14:48 2005
From: jason at dixongroup.net (Jason Dixon)
Date: Wed, 13 Jul 2005 10:14:48 -0400
Subject: [netflow-tools] flowd benchmark
In-Reply-To: <42D51982.4070009@looze.net>
References: <42D4CCDE.5030308@looze.net> <42D4F282.9000504@mindrot.org> <42D50EE1.6050201@mindrot.org> <42D51982.4070009@looze.net>
Message-ID: <091BE381-AF0F-4058-ABCC-57BD4705EFA1@dixongroup.net>

On Jul 13, 2005, at 9:39 AM, Gijs Molenaar wrote:

> All tests were done with v5 Cisco flows, on a 2-processor
> system.

I hear SGI Altix are on sale these days. ;-)

--
Jason Dixon
DixonGroup Consulting
http://www.dixongroup.net

From gijs at looze.net Thu Jul 14 00:58:17 2005
From: gijs at looze.net (Gijs Molenaar)
Date: Wed, 13 Jul 2005 16:58:17 +0200
Subject: [netflow-tools] flowd benchmark
In-Reply-To: <091BE381-AF0F-4058-ABCC-57BD4705EFA1@dixongroup.net>
References: <42D4CCDE.5030308@looze.net> <42D4F282.9000504@mindrot.org> <42D50EE1.6050201@mindrot.org> <42D51982.4070009@looze.net> <091BE381-AF0F-4058-ABCC-57BD4705EFA1@dixongroup.net>
Message-ID: <42D52C09.9000606@looze.net>

Jason Dixon wrote:

> I hear SGI Altix are on sale these days. ;-)
>

I had one, but it doesn't run Half-life 2...

From pete at midworld.co.uk Tue Jul 19 20:41:30 2005
From: pete at midworld.co.uk (Pete Bristow)
Date: Tue, 19 Jul 2005 10:41:30 +0000
Subject: [netflow-tools] Filtering by IP
Message-ID: <4623642f12fca50b207a6493280da478@midworld.co.uk>

Hi

The filtering in flowd is very reminiscent of pf. I was wondering if
it's possible to have something along the lines of

internal_traffic = "{ 192.168.0.0/24 192.168.2.0/24 }"
discard src $internal_traffic dst $internal_traffic

If not, what's the suggested way of doing this? Once you have more than
a few subnets the rule set grows quite large and, I'd imagine, quite
inefficient to run.

Thanks
Pete

From djm at mindrot.org Tue Jul 19 21:56:01 2005
From: djm at mindrot.org (Damien Miller)
Date: Tue, 19 Jul 2005 21:56:01 +1000
Subject: [netflow-tools] Filtering by IP
In-Reply-To: <4623642f12fca50b207a6493280da478@midworld.co.uk>
References: <4623642f12fca50b207a6493280da478@midworld.co.uk>
Message-ID: <42DCEA51.2050203@mindrot.org>

Pete Bristow wrote:
> Hi
> The filtering in flowd is very reminiscent of pf. I was wondering if it's
> possible to have something along the lines of

heh, that is because the flowd rule parser is based on pf's :)

> internal_traffic = "{ 192.168.0.0/24 192.168.2.0/24 }"
> discard src $internal_traffic dst $internal_traffic

No, that isn't presently supported.

> If not what's the suggested way of doing this is as once you have more
> than a few subnets the rule set grows quite large and I'd imagine quite
> inefficient to run.

It shouldn't matter much - the rules are very fast to run and, compared
to a packet filter, aren't executed nearly as often.

Also, remember that pf internally expands a rule like:

pass in from { 192.20.0.1, 192.20.0.2 } to any

into two separate rules:

pass in from 192.20.0.1 to any
pass in from 192.20.0.2 to any

(though the skip step optimisation speeds things up quite a bit)

-d
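
The expansion Damien describes can be sketched in a few lines of Python.
This is only an illustration of the idea, not pf's or flowd's parser:
expand_rule is a hypothetical helper that finds brace lists with a
simple regular expression (no nesting, no macros) and emits one rule
per combination when a rule contains more than one list.

```python
import itertools
import re

# Matches one '{ ... }' list that contains no nested braces.
LIST_RE = re.compile(r"\{([^{}]*)\}")


def expand_rule(rule: str) -> list[str]:
    """Expand every '{ a, b }' list in a rule into one rule per combination."""
    # Items inside a list may be separated by commas, whitespace, or both.
    lists = [
        [item for item in re.split(r"[,\s]+", body.strip()) if item]
        for body in LIST_RE.findall(rule)
    ]
    if not lists:
        return [rule]
    # Replace each list with a '{}' placeholder, then fill in every combination.
    template = LIST_RE.sub("{}", rule)
    return [template.format(*combo) for combo in itertools.product(*lists)]


for expanded in expand_rule("pass in from { 192.20.0.1, 192.20.0.2 } to any"):
    print(expanded)
```

Under the same scheme, a rule with a two-subnet list on both the src
and the dst side, as in Pete's example, would expand into four rules,
which is why the size of a hand-written rule set is comparable to what
a list syntax would generate internally.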