AcceptEnv LANG LC_* vs available locales

Jochen Bern Jochen.Bern at binect.de
Sun May 1 08:42:20 AEST 2022


On 28.04.22 11:05, Ingo Schwarze wrote:
> Jochen Bern wrote on Thu, Apr 28, 2022 at 08:53:32AM +0200:
>> And here I thought that, after OpenVPN painstakingly retrofitting one
>> into their data channel setup, "when you need an agreement, you need an
>> explicit negotiation phase" would be a commonly accepted tenet now ...
> 
> This would be a nice to have if it were possible.
> But calling an impossible task "a commonly accepted tenet"
> feels unwise to me.

It was not "impossible" for the developers of OpenVPN, and they did
*not* have the advantage of one side *already* having a mechanism to
pass the pertinent information to the other, as OpenSSH does with the
env var forwarding for the (client-side) locales.

> And if the client says they want "zh_CN.UTF-16" but the server only
> knows how to do "C" and "en_US.UTF-8", then what are you going to do,
> as just one example among a host of tricky combinations?  In fact,
> the only safe option in such a situation is to reject the connection.

If that's your opinion, go ahead and implement it. My suggestion of a
server-side helper that evaluates the client-side-set and
server-side-available locales to reach *some* decision *does* support
that. It even allows you to send back a US-ASCII error message *telling*
the user what the problem is before summarily kicking him out. But don't
expect the rest of the planet to join you in waiting for the *perfect*,
totally loophole-free solution you seem to insist on.

Which is not to say, mind, that such a thing will not come EVENTUALLY.
Unicode seems well poised to prove itself the be-all and end-all framework
for transmitting and outputting text, short of quipus or a first
contact situation; I expect terminals to gravitate more and more towards
implementing it, increasing the pressure on the consortium to include
two codepoints for "shift to control" and "shift back" in its
standards, laying the groundwork for a *global* algorithm by which
terminals can tell control data from content.

(And considering that easily more than half of the control sequences 
*do* have an effect on the rendering of the "printable" text, who says 
that standardizing them in their entirety is forever outside the 
consortium's purview, even?)

No guarantees, though, that any of that is going to come full circle
within the lifetime of anyone present. Hence the question of what *we*
can do to improve the situation, if only partially.
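
For concreteness, the decision step of the helper suggested above might
be sketched like this in Python. Everything in it is an assumption made
for illustration: the names normalize() and choose_locale(), the rule
that "en_US.UTF-8" and "en_US.utf8" count as the same locale, and the
policy of falling back to "C" with a US-ASCII notice. The list of
available locales is passed in as a plain list; on a real server it
would presumably come from something like the output of "locale -a".

```python
def normalize(name):
    """Fold a locale name for comparison, e.g. 'en_US.UTF-8' -> 'en_us.utf8'.

    Heuristic only: locale names other than "C" and "POSIX" are
    implementation-defined, so two systems may disagree in ways that
    this folding does not catch.
    """
    rest, _, modifier = name.partition("@")
    base, _, codeset = rest.partition(".")
    folded = base.lower()
    if codeset:
        folded += "." + codeset.lower().replace("-", "").replace("_", "")
    if modifier:
        folded += "@" + modifier.lower()
    return folded


def choose_locale(requested, available):
    """Return (locale_to_use, notice); notice is None on an exact match.

    Hypothetical policy: use the server's spelling of the requested
    locale if it is installed, otherwise fall back to "C" and return a
    pure US-ASCII notice that can be shown to the user.
    """
    by_folded = {normalize(a): a for a in available}
    match = by_folded.get(normalize(requested))
    if match is not None:
        return match, None
    notice = "locale %r is not available on this server; using 'C'" % requested
    # Escape anything non-ASCII so the notice is safe on any terminal.
    return "C", notice.encode("ascii", "backslashreplace").decode("ascii")
```

A stricter site could just as well return a rejection instead of the
"C" fallback; the point is only that *some* decision gets made where
both sides' information is available.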

> And even if a safe setting existed in every case, which it does not,
> it would be a complex and open-ended task to figure out what that setting
> is on a given machine to be compatible with an arbitrary locale name
> received over the wire, since POSIX explicitly says that the meaning
> of locale names (apart from "C" and "POSIX") is implementation-defined.

That situation is, however, not going to improve if you go and tell 
everyone that "it's nobody else's business" if they choose to keep 
inventing their very own locale XY_äöü for language XY. And the way it 
is right now, chances are that such maintainers will *never even hear* 
about the interoperability problems they cause. Even if an interop 
standards project (like one to write said helper) cannot *solve* 
everything, it can at least *demonstrate* that XY_äöü causes 
significantly more problems than XY_ok, and *why* that is.

> So for every operating system and every possible subset of locales that
> might be installed on a server, OpenSSH maintainers would have to maintain

... absolutely nothing. SSH is merely the means connecting a terminal 
(some device to present "printable" data) to a - likely remote - 
shell/application (producing such data), and can be replaced by pretty
much anything from ye olde serial cable to whatever post-quantum
communications protocol minimizes in-transit bit decay by Dark Energy. 
All that is required from OpenSSH is to transport the information "this 
is a remote connection" and "these are the other side's capabilities", 
and setting the SSH_* and LC_* env vars on the server side already does 
that nicely.

I'm using the term "(external) helper" instead of "code (to be 
integrated into OpenSSH)" for a *reason*, you know.
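
To illustrate how little such an external helper needs from OpenSSH,
here is a minimal Python sketch of its input side. The function name
session_info() is made up for this example. It assumes that sshd has set
SSH_CONNECTION for the session and that the locale variables were
forwarded (SendEnv on the client, AcceptEnv on the server); nothing
else from OpenSSH is used.

```python
import os


def session_info(env):
    """Return (is_remote, locale_vars) from an environment mapping.

    is_remote:   True if SSH_CONNECTION is present in the environment.
    locale_vars: LANG plus every LC_* variable found in env.
    """
    is_remote = "SSH_CONNECTION" in env
    locale_vars = {
        key: value
        for key, value in env.items()
        if key == "LANG" or key.startswith("LC_")
    }
    return is_remote, locale_vars


if __name__ == "__main__":
    remote, wanted = session_info(os.environ)
    print("remote session:", remote)
    for key in sorted(wanted):
        print(key, "=", wanted[key])
```

What the helper then *does* with that information is entirely its own
business and can evolve without touching OpenSSH.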

> Users might even install their own, personal locale, so a locale string
> might even indicate a locale that is non-standard even for the operating
> system the client happens to be running...

Well, breaking interoperability between "themselves" (user and their DIY 
software) and their *own* computer should teach them to be wary of 
distribs that do not bother about cross-platform interop *real* fast ...

... hmmm, I wonder, would they actually be able to vote with their feet 
and install a port of a better-interop locale from another distrib over 
an isolationist one that came with their locally installed distrib ... ?

> Since they have a german locale set on the
> client side, they type "ja" when the program on the server asks whether
> they really want to delete the file.  But the server does not have a
> german locale installed, so "ja" actually means "no", the file remains,
> and the secret information is accidentally disclosed without the poor
> user even suspecting something might have gone wrong.

Fine, let's assume for a moment that being prompted with

> Effacer le fichier ULTRAGEHEIM.txt? [o/N] _

fails to inform the user that the server is not exactly talking German
to him. Let's further assume that we're not talking about a yes/no
decision (FYI, I have yet to find a German localization asking
"j/N ?" that would not also accept the C/English "y" for a "yes"), but
something that's not as trivial to translate - say, the user enters an
amount thinking of € but the software takes it as ¥.

Now how is *that* an argument in *favor* of your proposal of having the 
server *downright ignore* the locale the user has tried to set!? A user 
*that* dependent on a never-changing interface, and robbed of the option 
of adapting the server's setup, would need to have a dedicated immutable 
server to use for the rest of his (work)life. (Preferably one that still 
accepts amounts in DM instead of €, I suppose.)

Regards,
-- 
Jochen Bern
Systemingenieur

Binect GmbH