[netflow-tools] softflowd keeps crashing

alex k xela at mailinglist.at
Sat Apr 25 02:13:50 EST 2009


> On Mon, 13 Apr 2009, alex k wrote:
>
>> > Are all the flows incorrectly dated, or just the ones from around the
>> > time softflowd exited?
>> >
>>
>> It seems to me that the first one or two flows after the crash (softflowd
>> gets started automatically) are the wrongly dated ones.
>> It crashed at 00:04 and was started a few seconds after that (I found a
>> very fast way to do that).
>
> Just to be clear: it is the first flows out of softflowd after a restart
> and not the last couple before a crash that have invalid times? Are both
> the start time and the end time incorrect?
>
> Could you try to find the details of this flow in the softflowd debug log
> and see if the times are incorrect there too? The flow start time comes
> from libpcap, so it is possible that it is giving us bad data.
>
>> >> What happened? Network error? Corrupted file? Socket problem?
>> >
>> > Is anything restarting (bringing down and back up) the network
>> > interface on which softflowd is listening? That can cause this sort
>> > of problem.
>> > This line:
>> >
>> >> Shutting down after pcap EOF
>> >
>> > Indicates that libpcap has closed itself.
>>
>> As far as I can see, the network interface had no problem at that time.
>> The host is monitored and was never unreachable.
>> It could have been a problem with VMware. (The IP with the wrongly dated
>> entries is a virtual machine.)
>>
>> How can I find out if it's a libpcap problem? It all happens in memory,
>> right?
>
> Are you running softflowd with a pcap filter on the commandline?
>
> You might also want to try this diff:
>
> Index: softflowd.c
> ===================================================================
> RCS file: /var/cvs/softflowd/softflowd.c,v
> retrieving revision 1.98
> diff -u -p -r1.98 softflowd.c
> --- softflowd.c	3 Sep 2007 10:50:05 -0000	1.98
> +++ softflowd.c	13 Apr 2009 11:04:10 -0000
> @@ -1916,7 +1916,7 @@ main(int argc, char **argv)
>  				logit(LOG_ERR, "Exiting on pcap_dispatch: %s",
>  				    pcap_geterr(pcap));
>  				break;
> -			} else if (r == 0) {
> +			} else if (r == 0 && capfile != NULL) {
>  				logit(LOG_NOTICE, "Shutting down after "
>  				    "pcap EOF");
>  				graceful_shutdown_request = 1;
>

Hi Damien,

Something _really_ ugly happened. I was very busy this week, so I relied
on my monitoring. Bad idea... softflowd didn't crash: the process stayed
alive, but last Sunday it stopped capturing packets and expiring flows.
Absolutely nothing in the nohup.out file (no "Shutting down..." message).

What happened? "capfile" was NULL?
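
As far as I understand the patch, the decision in the main loop now works
roughly as in the sketch below. This is not the real softflowd code: the
enum, the function name and the "r < 0" error test are my own assumptions
for illustration. Only "r", "capfile" and the "r == 0 && capfile != NULL"
condition come from the diff.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only (not from softflowd.c). */
enum loop_action {
    LOOP_CONTINUE,      /* keep calling pcap_dispatch() */
    LOOP_EXIT_ERROR,    /* "Exiting on pcap_dispatch: ..." */
    LOOP_SHUTDOWN_EOF   /* "Shutting down after pcap EOF" */
};

/*
 * r is the value returned by pcap_dispatch(); capfile is the name of the
 * capture file being read, or NULL when capturing from a live interface.
 */
enum loop_action
after_dispatch(int r, const char *capfile)
{
    if (r < 0)                      /* assumption: negative means error */
        return LOOP_EXIT_ERROR;
    if (r == 0 && capfile != NULL)  /* patched condition from the diff */
        return LOOP_SHUTDOWN_EOF;
    return LOOP_CONTINUE;           /* includes r == 0 on a live interface */
}
```

If that reading is right, then with capfile == NULL (my case, a live
interface) a return value of 0 just keeps the loop running, which would
explain why there is no "Shutting down..." message this time.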

Before I finally killed the leftover (and useless) softflowd process, I
tried "softflowctl statistics". It gave me:

softflowd[26288]: Accumulated statistics:
Number of active flows: 0
Packets processed: 53254480
Fragments: 280
Ignored packets: 198188 (198188 non-IP, 0 too short)
Flows expired: 434564 (0 forced)
Flows exported: 841806 in 132789 packets (0 failures)
Packets received by libpcap: 109063926
Packets dropped by libpcap: 55601898
Packets dropped by interface: 0

Expired flow statistics:  minimum       average       maximum
  Flow bytes:                  40         64771     104883460
  Flow packets:                 1           123        105328
  Duration:                  0.00s        47.66s        77.08s

Expired flow reasons:
       tcp =    145544   tcp.rst =         0   tcp.fin =         0
       udp =      4239      icmp =     36829   general =         0
   maxlife =    247952
  over 2Gb =         0
  maxflows =         0
   flushed =         0

Per-protocol statistics:     Octets      Packets   Avg Life    Max Life
           icmp (1):      263824161      2779572      57.46s      74.46s
            tcp (6):    27819369585     49610926      39.47s      77.08s
           udp (17):       64032137       863982      44.32s      72.45s

It seems that everything was fine until something happened with "capfile"
and softflowd could no longer proceed. There are no active flows any more.
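
One more thing I noticed in the statistics: libpcap dropped a large share
of the packets it received. Here is a small sketch of the arithmetic, with
the two counters copied from the output above (the function names are my
own, nothing from softflowd):

```c
#include <stdio.h>

/* Percentage of packets that libpcap dropped out of those it received. */
double
drop_percent(unsigned long long received, unsigned long long dropped)
{
    if (received == 0)
        return 0.0;     /* avoid dividing by zero */
    return 100.0 * (double)dropped / (double)received;
}

/* Print the drop rate for the counters in the statistics output above. */
void
print_drop_rate(void)
{
    unsigned long long received = 109063926ULL; /* received by libpcap */
    unsigned long long dropped = 55601898ULL;   /* dropped by libpcap */

    printf("libpcap drop rate: %.2f%%\n", drop_percent(received, dropped));
}
```

That is roughly 51% of the received packets. I don't know whether this is
related to the hang, but it might be worth keeping in mind.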

What could we try next?

xela



