Timing glitch during startup of forwarded connection?

Tom Lane tgl at sss.pgh.pa.us
Mon Dec 17 04:17:32 EST 2001


I have been chasing a reliability problem that seems to boil down to
this: sshd needs a certain amount of time delay between opening a
forwarded connection and receiving the first data packet for the
connection.  Otherwise it never forwards the data.  Anyone heard of
such a thing, or have an idea where to look for the problem?

I see a possibly-related bug report in the archives:
http://marc.theaimsgroup.com/?l=openssh-unix-dev&m=100147052203072&w=2
but little followup.

Details: I'm using openssh-2.9.9p2 here ("here" being an HPUX 10.20 box)
to tunnel to a machine inside my company's firewall (that machine is
running openssh-2.9p2 on Red Hat Linux 7.0).  The tunnel connection carries
several forwarded ports, including one that forwards POP connections
to my company's mail server, elsewhere inside the firewall.  What I was
seeing was that fetchmail runs would sometimes work and sometimes hang
up at the start of the connection.  I first assumed it was fetchmail's
problem and set to work debugging fetchmail, but could find no problem;
what's more, inserting any breakpoint between opening the connection and
sending the first POP command caused the problem to vanish.  I then
tried debugging ssh instead, with the same results: AFAICT, ssh does
send the first data packet onward, but nothing comes back if there is
too little delay before the packet is sent.  Building with PACKET_DEBUG
enabled suppresses the problem, evidently because executing buffer_dump
a couple of times provides the necessary delay.  (Commenting out the
calls to buffer_dump in packet.c allows the problem to come back, even
with -v -v -v.  The debug trace shows no indication of trouble.)
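
The symptom described above could be probed from the client side without
a debugger.  The following is only a minimal sketch, not something taken
from fetchmail or openssh: the function name probe(), its parameters,
and the QUIT payload are all assumptions made for illustration.  It
connects to a forwarded port, waits a configurable delay, sends one POP
command, and reports whether anything comes back before a timeout.

```python
import socket
import time
from typing import Optional


def probe(host: str, port: int, delay: float,
          payload: bytes = b"QUIT\r\n",
          timeout: float = 10.0) -> Optional[bytes]:
    """Connect, sleep `delay` seconds, send `payload`, return the first reply.

    Returns None if no data arrives within `timeout` seconds, which is
    the "hang" case.  Returns b"" if the peer closes without sending.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        if delay > 0:
            # Stand-in for the breakpoint or buffer_dump delay.
            time.sleep(delay)
        sock.sendall(payload)
        try:
            return sock.recv(4096)
        except socket.timeout:
            return None
```

Assuming the POP forward listens on a local port such as 1110 (a
made-up number), repeatedly calling probe("localhost", 1110, 0.0) and
then probe("localhost", 1110, 0.5) and counting the None results should
show whether the hang really correlates with the delay.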

The sshd connection itself is not frozen, as data will transfer just
fine over other new or existing tunneled connections.  Also, if I give
up and kill fetchmail, the stuck tunneled connection is closed down
properly.

I don't have the privileges to try to debug sshd at my company's server,
so I'm kinda stuck at this point.

According to reports in my company's intranet, other people have for
quite some time seen similar problems with fetchmail hanging, but
apparently no one has tried hard to chase it down.  I take this to mean
that the problem is not specific to my platform or the particular
openssh versions in use.

Any suggestions would be much appreciated.

			regards, tom lane



More information about the openssh-unix-dev mailing list