[Bug 1831] New: Repeatable crash of softflowd on high PPS collector?

bugzilla-daemon at bugzilla.mindrot.org bugzilla-daemon at bugzilla.mindrot.org
Wed Nov 3 05:34:01 EST 2010


https://bugzilla.mindrot.org/show_bug.cgi?id=1831

           Summary: Repeatable crash of softflowd on high PPS collector?
           Product: softflowd
           Version: -current
          Platform: amd64
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: softflowd
        AssignedTo: djm at mindrot.org
        ReportedBy: p.wood at lancaster.ac.uk


Good Evening,

First of all, thanks for your efforts. We've been using softflowd here
at Lancaster University for some time and we love it. Until now it has
been running happily on a FreeBSD 7.X machine.

We've obtained a monitoring server and are setting up our installation
again on it. Due to hardware and NIC limitations we now have to use
Ubuntu Linux, in this case 10.04 on AMD64. The issue occurs with
softflowd 0.9.8 both from packages and built from source. The server is
currently receiving around 40kpps, full payload.

I can start softflowd, and after a short (fairly random) period the
following happens:

Nov  2 17:49:09 packet softflowd[2533]: softflowd v0.9.8 starting data collection
Nov  2 17:49:09 packet softflowd[2533]: Exporting flows to [127.0.0.1]:12001
Nov  2 17:49:18 packet softflowd[2533]: Shutting down after pcap EOF
Nov  2 17:49:18 packet softflowd[2533]: Shutting down on user request

I've traced this through the softflowd code, and the branch at
softflowd.c:1870 appears to be at "fault":

                        } else if (r == 0) {
                                logit(LOG_NOTICE, "Shutting down after pcap EOF");
                                graceful_shutdown_request = 1;
                                break;
                        }

r is the return value from pcap_dispatch. According to the
pcap_dispatch man page, during a live capture a return of 0 can simply
mean that no packets were available for the pcap consumer to process.
Commenting out this section results in a completely usable version of
softflowd, which is what we are currently running.
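
As a rough sketch of a narrower alternative to commenting the branch
out, the check could treat r == 0 as EOF only when packets are being
read from a savefile. This is an illustration, not a patch against
softflowd: the helper classify_dispatch_result, the enum, and the
capfile parameter (assumed to be non-NULL only when reading from a
capture file) are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical helper, not part of softflowd. */
enum dispatch_action {
	KEEP_RUNNING,	/* stay in the main loop */
	STOP_EOF,	/* savefile exhausted, shut down gracefully */
	STOP_ERROR	/* pcap_dispatch() failed or was interrupted */
};

/*
 * Decide what to do with r, the return value of pcap_dispatch().
 * capfile is assumed to be NULL for a live capture and non-NULL
 * when reading from a savefile.
 */
static enum dispatch_action
classify_dispatch_result(int r, const char *capfile)
{
	/* Negative values: -1 is an error, -2 means pcap_breakloop(). */
	if (r < 0)
		return STOP_ERROR;
	/* 0 from a savefile means there are no more packets to read. */
	if (r == 0 && capfile != NULL)
		return STOP_EOF;
	/* 0 on a live capture only means no packets were available. */
	return KEEP_RUNNING;
}
```

With this shape, a return of 0 on a live interface keeps the collector
running, while reading a capture file to its end still shuts down.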

I've seen comments around the code base suggesting there are issues
with timeouts. Perhaps for some reason execution is reaching this
branch when there's no data for it to deal with? I apologise that
there's no patch here to fix this. I'll look at what I can do, but
right now I have to complete the rest of the setup.

Kind regards,

Peter.

