GSSAPI vs load-balanced servers - anything we can do?
Jan Iven
jan.iven at cern.ch
Sat Sep 15 02:53:38 EST 2007
Dear all,
(apologies - this has nothing to do with 4.7 being out, but is rather a
long-standing issue that regularly bites us).
Is there anything I could do to further the case of this bug?
https://bugzilla.mindrot.org/show_bug.cgi?id=1008
As a summary, GSSAPI auth against a machine in a DNS load-balanced server
farm fails. SSH-1 Kerberos works.
DNS load-balanced farm:
Individual machines in the farm have separate IP addresses (ipA, ipB),
separate hostnames (nameA, nameB, ..) and separate Kerberos identities
(host/nameA.domain@REALM). A common DNS name (clustername) resolves to
one or several IPs. A reverse lookup on an IP gives the individual
machine name. (This seems to be a common and cheap way to spread load.)
The problem is that GSSAPI insists on doing its own DNS lookup
(forward+reverse) to determine the (Kerberos) identity of the server,
and has a fair chance of getting a different reply. So a typical session
looks like
client: gethostbyname(clustername) -> ipA
        connect(ipA)
        (KEX and other wonderful SSH stuff)
        do GSSAPI auth
            gethostbyname(clustername) -> ipB
            gethostbyaddr(ipB) -> nameB
            get service ticket for host/nameB.domain@REALM
        send ticket to connected machine (nameA)
server: huh? Enotmynameinticketgoaway.
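The failure above can be reproduced without any real DNS or Kerberos. The following is only a toy sketch: the RoundRobinDNS class, the simulate_session function and the FORWARD/REVERSE tables are made up for illustration, and the strict rotation is an assumption (a real load-balanced zone may order its answers differently).

```python
from itertools import cycle

# Hypothetical farm data: forward DNS for the cluster alias rotates
# through the members; reverse DNS maps each IP to its own host name.
FORWARD = {"clustername": ["ipA", "ipB"]}
REVERSE = {"ipA": "nameA.domain", "ipB": "nameB.domain"}


class RoundRobinDNS:
    """Toy resolver that hands out the next address on every lookup."""

    def __init__(self, forward, reverse):
        self._rotors = {name: cycle(ips) for name, ips in forward.items()}
        self._reverse = reverse

    def gethostbyname(self, name):
        return next(self._rotors[name])

    def gethostbyaddr(self, ip):
        return self._reverse[ip]


def simulate_session(dns, user_host, realm="REALM"):
    """Return (server principal, principal in ticket, whether they match)."""
    # 1. ssh resolves the user-supplied name and connects to that address.
    connected_ip = dns.gethostbyname(user_host)
    server_is = f"host/{dns.gethostbyaddr(connected_ip)}@{realm}"
    # 2. GSSAPI canonicalizes the same user-supplied name again, on its own.
    gss_ip = dns.gethostbyname(user_host)
    ticket_for = f"host/{dns.gethostbyaddr(gss_ip)}@{realm}"
    return server_is, ticket_for, server_is == ticket_for
```

With two machines behind the alias, the second lookup lands on the other machine, so the ticket names a principal that the connected server does not hold.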
The GSSAPI behaviour is apparently mandated by RFC1964 (2.1.3):
> When a reference to a name of this type is resolved, the "hostname"
> is canonicalized by attempting a DNS lookup and using the fully-
> qualified domain name which is returned, or by using the "hostname"
> as provided if the DNS lookup fails. The canonicalization operation
> also maps the host's name into lower-case characters.
so is unlikely to change.
The only workaround seems to be feeding the canonical hostname (or IP)
of the currently-connected server machine into GSSAPI, instead of the
hostname the user provided (this is what SSH-1 Kerberos did, by the way).
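As a rough illustration of that workaround (not the actual patch, which is C code in the ssh client), the target name can be derived from the address of the already-connected socket. The function name gss_target_from_peer and its reverse_lookup parameter are hypothetical; the "service@host" string is the usual host-based service name form handed to GSSAPI.

```python
import socket


def gss_target_from_peer(sockaddr, reverse_lookup=None, service="host"):
    """Build a host-based service name from the connected peer address.

    sockaddr is what sock.getpeername() returns, e.g. ("192.0.2.10", 22).
    reverse_lookup is a hypothetical hook so the logic can be exercised
    without touching real DNS.
    """
    if reverse_lookup is None:
        def reverse_lookup(addr):
            # NI_NAMEREQD makes the call fail instead of silently
            # returning the numeric address when no PTR record exists.
            host, _port = socket.getnameinfo(addr, socket.NI_NAMEREQD)
            return host
    try:
        host = reverse_lookup(sockaddr)
    except OSError:
        # No usable reverse mapping: fall back to the IP we are connected to.
        host = sockaddr[0]
    return f"{service}@{host.lower()}"
```

In a client this would be called as gss_target_from_peer(sock.getpeername()) after connect(), so the name given to GSSAPI always belongs to the machine that will receive the ticket.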
While in principle we could change the reverse DNS of the cluster
machines to point to the cluster name, this would introduce confusion
for everything that already knows which exact host to connect to.
This is a client-side issue, so no amount of patching on the server will
make this issue go away. In addition, we need to convince vendors to
provide patches to their deployed "legacy" versions, which is made
difficult by the fact that this is not fixed "upstream". We seem to have
convinced Red Hat that this is an issue.
A two-line patch is at https://bugzilla.mindrot.org/attachment.cgi?id=1202.
https://bugzilla.mindrot.org/show_bug.cgi?id=1008 also has a more
elaborate version by Simon that introduces a new config option.
I'd be happy to forward-port either to 4.7, if there is a chance that
this will get applied one day.
Sorry for the lengthy post, thanks for your time.
Jan