Phasing out forwarding of locale settings

Sat Sep 11 22:23:58 AEST 2021

Hi Jochen,

Jochen Bern wrote on Fri, Sep 10, 2021 at 11:24:56PM +0200:
> On 10.09.21 12:36, Ingo Schwarze wrote:
>> Jochen Bern wrote on Thu, Sep 09, 2021 at 08:28:27PM +0200:

>>> What you could ask for *here* is that OpenSSH stops supporting SendEnv /
>>> AcceptEnv altogether - but I have a hunch that you'll need a much more
>>> convincing case to get *that* thermonuclear solution.

>> I realize you may not be serious about this

> Well, *half*-joking, actually. When passing on a user's locale is seen
> primarily as a potential attack vector,

That is not what i was trying to say.  Quite to the contrary, i
said "Neither passing nor not passing these variables does anything
to improve security."  What surprises me is that people discuss
passing irrelevant data while at the same time apparently ignoring
the actual problem.

> worse than forcing every which user to try to master the CLI in the
> one language chosen by the admin,

That is definitely not required in any case whatsoever.

When you first log into a machine, for security reasons, you have
to check *anyway* what the default locale(1) settings are (even
though i do not doubt many users neglect doing that).  If these
locale(1) settings are unsafe with the terminal mode you are using
on any of the machines you will be connecting from, then you have
to define the necessary LC_* variables in the adequate shell startup
files in your home directory on the target machine, such that they
are safe with all terminal modes you will use on machines you are
connecting from.  From that point onward, whatever locale(1) defaults
the sysop on the target machine may have chosen no longer apply to you.
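
As a rough illustration of that check, here is a minimal Python sketch
that reports which locale name governs character encoding in a given
environment and whether it names a UTF-8 codeset.  The function names
are hypothetical, the fallback to "C" when nothing is set is an
assumption (POSIX leaves that default to the implementation), and the
test only looks at the name, not at whether the locale is installed.

```python
import os

def effective_ctype(env=None):
    """Return the locale name that governs character encoding (LC_CTYPE)."""
    env = os.environ if env is None else env
    # POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # unset or empty variables are skipped
            return value
    return "C"  # assumed fallback; the real default is implementation-defined

def uses_utf8(locale_name):
    """True if the locale name carries a UTF-8 codeset suffix."""
    # Drop an optional @modifier, then take the part after the dot.
    base = locale_name.split("@", 1)[0]
    codeset = base.partition(".")[2]
    return codeset.lower().replace("-", "") == "utf8"

if __name__ == "__main__":
    name = effective_ctype()
    print(name, "UTF-8" if uses_utf8(name) else "not UTF-8")
```

Run on the target machine after the shell startup files have been
read, this shows whether a personal LC_* setting really took
precedence over the system default.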

When connecting from a new machine, you need to check your terminal
mode before connecting and consider whether it is safe with the
locale(1) settings you chose earlier on the target machine, and
change the terminal mode if it is not safe, before you connect.  In
the relatively unusual case that this is not possible, if you trust
the sysop of the target machine enough to not put malicious code
into global shell startup files, you can connect even with an unsafe
terminal mode, but then the first thing you do on the target machine
after connecting is of course to adjust the remote locale(1) manually
for this unusual connection before you start any real work on the
remote machine.

Obviously, none of this can be automated, because it depends on various
factors that none of the involved programs and machines can inspect,
in particular:
 * Which terminals or terminal emulators and which modes you use
   on *all* the machines you may be connecting from (the target
   machine can know nothing about that, and even the shell you
   are starting the ssh(1) client from cannot really know that
   reliably).
 * Which locales are available on the target machine (none of the
   machines you are connecting from can know that).

> then I don't quite see what setting *would* be considered safe
> enough for forwarding.

Again, SendEnv / AcceptEnv cannot *make* any of this safe.
Users need to use their brains to make their connections safe.

And *if* users use SendEnv / AcceptEnv - typically for reasons other
than safety - then again, they have to use their brains to make
sure what they do afterwards is safe with the SendEnv / AcceptEnv
settings they chose earlier.

> *Especially* not $TERM with all its historic baggage, I guess.

At least $TERM is usually set by the terminal emulator, so it usually
matches the terminal you are really using on the client side.
Besides, the ssh_config(5) manual explains that passing it is
required by the protocol, and it is indeed not clear to me how
a pseudo terminal on the server should behave without it.

[...]
>> Needless
>> to say, passing LC_* is usually *not* useful because defining it
>> statically on both sides is usually simpler and more robust.

> I have experienced the communication between a German NOC/ops and a
> French dev team in an enterprise that had a "just have everyone speak
> English" policy. If we hadn't all been IT professionals with years of
> experience in pre-locales computers, and using the same prod platform
> CLIs consequently would have been as much of a stutterfest as the phone
> calls were ...

I don't really see the problem here.  In that company, you would
obviously set all computers to a default of LC_ALL=en_US.UTF-8
(or en_GB.UTF-8 if that is what you prefer for some reason) and
tell all employees to make sure all their terminals run in UTF-8
mode all the time, on all company and private computers they use
for connecting to company equipment.  Problem completely solved
without passing any environment variables around.

If any individual employee desires to ignore company policy, they
can still set LC_ALL=de_DE.UTF-8 or even LC_ALL=ja_JP.UTF-8 to their
heart's content in their personal shell initialization files on all
company and private computers where they want that, and they are
still safe as long as they stick to the *.UTF-8 part.

>> I believe i said this before, but it seems people missed it:
>> What is discussed here has security implications.  Specifically,
>> if the shell on the server uses a locale that does not match the
>> mode the client terminal or terminal emulator is using, the client
>> is susceptible to terminal state corruption attacks

> While forcing the server to make (fixed global) choices *without* having
> any information on the client software's status and the user's native
> language will avoid any mismatch ... seriously?

No, it will not, and i didn't intend to claim that.

What matters is how people behave, not whether these variables are
passed around or not.

>> Neither passing nor not passing these variables does anything to
>> improve security.  You are passing *the wrong data*.  What matters
>> is the mode the client terminal is running in.

> I consider it *very* vital that the login on some remote machine doesn't
> suddenly talk to me in Turkish just because the guy who installed the OS
> liked that better; it's a DoS attack on me as much as falsifying
> filenames in the "ls" output is.

When first logging into a new machine, you have to check the locale(1)
settings anyway.  You have to do that even if you use SendEnv -
because you don't know beforehand whether that new machine will
AcceptEnv, and even if it does, whether it has the locale that you
send to it installed.  Also remember that locale names are not
standardized, so your preferred locale might be installed, but using
a different name.
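
To illustrate the naming problem, here is a minimal Python sketch that
looks up a wanted locale in the output of "locale -a" while ignoring
one common spelling difference, "UTF-8" versus "utf8".  The function
names are hypothetical, and folding the codeset spelling is only one
assumed kind of mismatch; other naming differences are not handled.

```python
import subprocess

def canonical(name):
    """Fold the codeset spelling so 'en_US.UTF-8' and 'en_US.utf8' compare equal."""
    lang, dot, rest = name.partition(".")
    codeset, at, modifier = rest.partition("@")
    return lang + dot + codeset.lower().replace("-", "") + at + modifier

def find_installed(wanted, available):
    """Return the name this system uses for `wanted`, or None if it is absent."""
    target = canonical(wanted)
    for name in available:
        if canonical(name) == target:
            return name
    return None

def installed_locales():
    """List installed locales via `locale -a`; meant to run on the target machine."""
    out = subprocess.run(["locale", "-a"], capture_output=True,
                         text=True, check=True)
    return out.stdout.split()

if __name__ == "__main__":
    print(find_installed("en_US.UTF-8", installed_locales()))
```

A result of None means the locale sent by the client cannot take
effect under that name, whether or not the server is willing to
AcceptEnv.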

> (And yes, my current employer had a *number* of IT guys of Turkish
> origin in its early days, so that's not a completely outlandish scenario.)
> 
> But tell me this: If it is so important that the terminal mode be
> communicated correctly, why doesn't any terminal software I know at
> least reflect the *current* mode into $TERM, which already *is* both
> earmarked for the purpose of passing information about the terminal, and
> special-cased by OpenSSH?

That is a good question.  I cannot really make an authoritative statement
about that because i do not maintain any terminal emulator package.

Then again, let me speculate that designing such a feature might
not be trivial for more than one reason.  Firstly, at least some
terminal emulators - including xterm(1) - support changing the mode
interactively, and i am not aware of any way to change environment
variables in child processes that were started earlier.  On top of
that, even though there is historical precedent for providing more
than one alternative $TERM setting for the same kind of terminal,
and while it might be possible to communicate some aspects of
terminal settings of some terminal emulator implementations via
$TERM, terminal settings can be numerous, and i'm not sure all that
can reasonably be communicated via $TERM.  In particular, the
character set and encoding expected by the terminal matters for
what we are discussing, and i'm not sure that can in general be
communicated through $TERM.  At least it would need careful design
to not go overboard.
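
The point about child processes can be demonstrated with a small
Python sketch.  The variable name DEMO_TERM_MODE is made up for
illustration and is not something any terminal emulator sets; the
parent here merely stands in for a terminal emulator whose mode
changes after it has started a shell.

```python
import os
import subprocess
import sys

def child_view_after_parent_change(name="DEMO_TERM_MODE"):
    """Start a child, change the variable in the parent, then ask the child."""
    os.environ[name] = "utf-8"
    # The child receives a copy of the environment at the moment it starts.
    child = subprocess.Popen(
        [sys.executable, "-c",
         "import os, sys; sys.stdin.readline(); print(os.environ[sys.argv[1]])",
         name],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
    # The parent switches modes later; the running child never sees this.
    os.environ[name] = "latin-1"
    out, _ = child.communicate("go\n")
    return out.strip()

if __name__ == "__main__":
    print(child_view_after_parent_change())
```

The child still reports the value that was current when it was
started, which is why a mode change made interactively in the terminal
emulator cannot reach an already running shell through the
environment.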

And even if some terminal emulator implemented something like that,
users would still have to use their brains and check whether all the
terminals they are using actually do that and whether the servers
they connect to actually use the settings being passed correctly,
so i don't think it would fundamentally change the situation that
blindly relying on passing or not passing some environment variables
is insufficient.

Besides, even if xterm(1) could somehow bypass the local shell
and the ssh(1) client program and communicate to the server that
it is using traditional 8-bit ISO-Latin-1 mode (which is a typical
example of an unsafe mode no matter what the remote locale is),
what is the server supposed to do about that?  Setting an ISO-Latin-1
locale on the server wouldn't help much, and on top of that, such
a locale is not even likely to be installed on a server nowadays.
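
One way such a mismatch can bite is sketched below in Python, for the
single case of a UTF-8 locale on the server and an 8-bit terminal on
the client.  It assumes that the terminal honours 8-bit C1 control
codes (bytes 0x80 to 0x9F), among them 0x9B, the single-byte form of
CSI.  The function name and the file name are made up for
illustration.

```python
def c1_bytes_in_utf8(text):
    """Return (offset, byte) pairs of UTF-8 bytes in the C1 range 0x80-0x9F."""
    data = text.encode("utf-8")
    return [(i, b) for i, b in enumerate(data) if 0x80 <= b <= 0x9F]

if __name__ == "__main__":
    # A file name that a UTF-8 server prints without any escaping,
    # because every character in it is printable in a UTF-8 locale.
    name = "Û2J.txt"
    for offset, byte in c1_bytes_in_utf8(name):
        print(f"offset {offset}: 0x{byte:02X}")
```

The letter U+00DB is encoded as the bytes 0xC3 0x9B.  A terminal in
8-bit mode that interprets 0x9B as CSI would read the following "2J"
as the rest of an erase-display sequence instead of printing it, even
though the server considered the whole name harmless.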

I still think that passing data around that is closely related to an
actual problem, but not really contributing to a solution of that
problem, is not a smart idea.  Focussing the discussion on whether
or not irrelevant data should be passed by default does not seem
like the ideal kind of discussion either.

Yours,
  Ingo