ssh host keys on cloned virtual machines

Nico Kadel-Garcia nkadel at gmail.com
Tue Feb 28 17:22:14 AEDT 2023


On Mon, Feb 27, 2023 at 8:33 PM Thorsten Glaser <t.glaser at tarent.de> wrote:
>
> On Mon, 27 Feb 2023, Nico Kadel-Garcia wrote:
>
> >> > does any one of you have a best practice on renewing ssh host keys on cloned
> >> > machines?
> >>
> >> Yes: not cloning machines.
> >
> >Good luck with *that*. Building VM's from media is a far, far too
> >lengthy process for production deployment, especially for auto-scaling
> >clusters.
>
> (It’s “VMs”, no genitive apostrophe.)

OK, point.

> What media? debootstrap + a local mirror = fast.
> In fact, much faster than cloning, possibly large, filesystems,
> unless you use CoW, which you don’t because then you’re overcommitting.

Sure, I was doing that sort of "local build into a chroot cage" stunt
in 1999. It's re-inventing the wheel, and using a 3-D printer to
make it, when there is already a very broad variety of off-site VM
images, and well-defined tools for deploying them directly. I
suspect that most of us have better things to do with our time than
maintain a local mirror when our friends in every cloud center on
the planet have already done the work.

> >> There’s too many things to take care of for these. The VM UUID in
> […]
>
> >That's what the "sysprep" procedure is for when generating reference
> >VM images, and "cloud-utils" for setting up new VMs from images, at
>
> What guarantees you “sysprep” and “cloud-utils” find everything that
> needs to be changed?

What makes you think your customized, hand-written, internal versions
of such tools are better and will work more reliably than a
consistently and effectively used open source tool?
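
For what it's worth, the open source tooling's job on a clone's first
boot can be sketched in a few lines of cloud-init configuration
(illustrative only; the key types listed here are an example, not a
recommendation):

    #cloud-config
    # Throw away any host keys baked into the reference image and
    # generate fresh ones on the clone's first boot.
    ssh_deletekeys: true
    ssh_genkeytypes: [ed25519, rsa]

virt-sysprep's "ssh-hostkeys" operation covers the same ground from
the image-preparation side, if memory serves.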

> (I’m not sure where inode generation numbers are (still) a concern,
> on what filesystems, anyway. They only come into play with NFS,
> AFAIK, though, so that limits this issue. When they come into play,
> however, they’re hard to change without doing a newfs(8)…)

They exist for other filesystems. I've not really gone digging into
them; they Just Work(tm) with the imaging tools applied to the VM
images.

> >If people really feel the need for robust random number services,
> >they've got other problems. I'd suggest they either apply an init
> >script to reset whatever they feel they need on every reboot, or find
>
> I think you’re downplaying a very real problem here, as an aside.

In the last 35 years, I've only seen someone care much about the
RNG..... twice. And those hosts wound up with physical random number
generators on PCI cards, and that was years ago.

> >The more host-by-host customization, admittedly, the more billable
> >hours and the more you put yourself personally into each and every
> >step. But it doesn't scale
>
> Huh? Scripting that creation from scratch is a job done once that
> scales very well. debootstrap is reasonably fast, installation of
> additional packages can be fast as well (since it’s a new image,
> use eatmydata or the dpkg option I’ve not yet remembered).

I've been the guy who had to do it with large deployments, up to
about 20,000 hosts of quite varied hardware from different vendors
with different specs. I do believe they replaced my tools after about
20 years, when someone found an effective open source tool. That work
especially included kernel updates to support the newer platforms. I
have stories from when someone deployed out-of-date OS images
remotely on top of the vendor image we gave hardware vendors for
initial deployment.

Being able to use a vendor's existing tools, such as every cloud
provider's tooling, saves a *lot* of time that would otherwise go
into re-inventing such wheels.

> And, given the system’s all-new, I believe this is even more
> reliable than cloning something customised, then trying to
> adapt *that* to the new requirements.

There are trade-offs. One is that skew in the OS-building tools can
lead to skew among the images. Another is that what you describe does
not scale automatically and horizontally for commercial auto-scaling
structures, which are almost always VM-image based these days, and
which wind up either with identical host keys, or with different host
keys behind the same re-allocated IP address; it's work to try to
resolve either problem. Much, much simpler, and more stable, to
simply ignore known_hosts and spend your time on the management of
user public keys, which is generally the far greater risk.
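
To be concrete about "ignore known_hosts": the sort of client-side
settings I have in mind look roughly like this (a sketch only; the
host pattern is made up, and whether the trade-off is acceptable is a
policy decision):

    # ssh_config fragment for a deployment/automation account
    Host *.autoscale.example.com
        StrictHostKeyChecking no
        UserKnownHostsFile /dev/null
        LogLevel ERROR

The LogLevel line just quiets the "Permanently added ..." noise that
would otherwise flood automation logs.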

> >, and you will eventually be told to stop wasting your
> >time if your manager is attentive to how much time you're burning on
> >each deployment.
>
> If I’ve scripted the image creation, it’s no more work than
> a cloning approach.

Been there, done that, and it keeps needing tweaking.

> >> This is even more true as every new machine tends to get just the
> >> little bit of difference from the old ones that is easier to make
> >> when not cloning (such as different filesystem layout, software).
> >
> >And *that* is one of the big reasons for virtualization based
> >deployments, so people can stop caring about the physical subtleties.
>
> ?!?!?!
>
> How does that translate into needing, say, 8 GiB HDD for some VMs but
> 32 GiB HDD for some others?

Consistently create small images. Expand them to use the available
disk space with an init script embedded in the image. Remember when I
mentioned 20,000 hosts at a time? It's admittedly similar to the work
needed for deploying images to new hardware.
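
The expansion step itself is tiny; something along these lines in a
first-boot script, assuming cloud-utils' growpart and an ext4 root on
/dev/sda1 (device name and filesystem are assumptions, adjust per
platform):

    #!/bin/sh
    # Grow the root partition to fill the whole disk, then grow
    # the filesystem inside it to match.
    growpart /dev/sda 1
    resize2fs /dev/sda1

On most cloud images, cloud-init's growpart/resizefs modules already
do this automatically, as far as I recall.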

> This has *NOTHING* to do with physical vs virtual platforms.

Virtual platforms label the disks and partitions fairly consistently.
Convincing /etc/fstab to work for newly deployed hardware can be....
tricky, if the hardware's drive naming differs from what the image
expects. Been there, done that, have scar tissue from the Promise
SATA controller drivers that renumbered the /dev/sd* drives in the
kernel to pretend that their add-on card had the first labeled
drives. Drove me *nuts* untangling that one, because it depended on
which kernel you used.
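
The usual way out these days, for what it's worth, is to stop naming
/dev/sd* in /etc/fstab at all and mount by filesystem UUID or label,
so the kernel's enumeration order stops mattering. Roughly (the UUID
and label here are invented):

    # /etc/fstab -- entries independent of device probe order
    UUID=0a1b2c3d-aaaa-bbbb-cccc-0123456789ab  /      ext4  defaults  0  1
    LABEL=data                                 /data  ext4  defaults  0  2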

> >predicted, nor was reverse DNS likely to work at all which was its own
> >distinct burden for logging *on* those remote servers.
>
> Maybe invest your time into fixing infrastructure then…

The reverse DNS was not my infrastructure to fix. When you host
servers remotely, convincing the remote datacenter to do reverse DNS
correctly is.... not always an effective use of time.

> >> (Fun fact on the side, while doing admin stuff at $dayjob, I even
> […]
>
> >You probably don't work on the same scale I've worked, or had to
>
> No, not for that. If I had to do it at larger scale I would have
> scripted it. I didn’t so it turned out to be cheaper, work-time-wise,
> to do part of the steps by hand the few times I needed to do it.
> I don’t admin stuff at work for others any more, so that point is
> moot. But I did want to share this as an anecdote: when scaling very
> small, the “stupid” solution may be better than a clever one.

Yeah, for one-offs or small tasks it can just be faster to use a few
lines of shell or even a manual step. I've been dealing with bulky
environments where infrastructure as code is vital. Spending the time
to ensure individualized host keys, when there's a significant chance
of IP re-use and conflicting host keys to clean up, is... well, time
better spent elsewhere. It's why tools like "ansible" typically
disable the known_hosts file at run time with just the ssh_config
settings I mentioned. They don't have the time to manually validate
SSH host key conflicts when deploying new servers.
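
Concretely, for ansible that's one line of configuration (or the
matching environment variable), roughly:

    # ansible.cfg
    [defaults]
    # Don't prompt on, or record, unknown SSH host keys.
    host_key_checking = False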

> >It's also a great way to stretch
> >our billable hours with very familiar tasks which only you know how to
> >do.
>
> I don’t need to do that. Besides, I’m employed, not freelancing,
> so I don’t even have to care about billable hours.
>
> bye,
> //mirabilos

Well, good for you. I'm sad to say I've seen people chortling over
how very, very, very clever they were with deliberately hand-tuned
setups to assert their complete mastery over their turf, and I've
been brought in a few times to stabilize the mess when they left.
It's led me to a lot of "keep it very, very simple" steps like "don't
bother using known_hosts".

