-
My module to deploy Kubernetes clusters on Proxmox using Talos is now documented, and published on GitHub:
https://github.com/jpetazzo/taloprox/
Last step, perhaps write a blog post about all this? 🤔
-
Good news, everyone (especially me): the issue turned out to be both logical and... not so logical.
Formerly, we were booting the Talos nodes from a disk image coming from the Talos factory. That disk image had all the configuration we wanted; in particular, it had the "nocloud" flavor, meaning: "hey, I'm going to give you a bunch of information (including your IP address) in a particular way", in this case through a tiny filesystem on a virtual block device.
But now, we're booting from an ISO image. We can't *run* from an ISO image (although, technically, since Talos is immutable... I guess we should be able to? I wonder if that's possible?), so in the Talos MachineConfig, we pass an "install" block to say, "hey, install Talos on this particular disk". And in that block, there is an "image" parameter, to tell which installer image you want to use.
Naively, I thought that omitting this parameter meant "infer the image from the ISO" (i.e., use a nocloud image). I was wrong! It picks a different image. In this case, the "metal" image. And the metal image doesn't give a damn about the nocloud metadata, and just does DHCP in that case. So it makes sense!
...But also... Since I booted from a *nocloud* image, why can't it default to a *nocloud* installer? No idea.
Anyways, I changed my MachineConfig template to include the correct image and now we're back in business. Clusters are up and running.
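For illustration, here is a minimal sketch of what setting that "image" parameter explicitly could look like, written as a small Python helper that builds a MachineConfig patch. The `machine.install.disk` and `machine.install.image` fields are the ones described above; the helper name, the disk path, and the image reference are placeholders (assumptions), not the values used in the actual template.

```python
import json


def install_patch(disk, installer_image):
    """Build a MachineConfig patch that pins the installer image.

    Without an explicit image, the install step falls back to a default
    installer (the "metal" one in the situation described above).
    """
    return {
        "machine": {
            "install": {
                "disk": disk,
                "image": installer_image,
            }
        }
    }


# Placeholder values: substitute the real target disk and the reference
# of the nocloud installer image that matches the ISO you booted from.
patch = install_patch("/dev/sda", "registry.example/nocloud-installer:v0.0.0")
print(json.dumps(patch, indent=2))
```

The output is JSON, which is also valid YAML, so it can be merged into a config template or handed to whatever tooling applies the patches.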
So now I can go back to writing docs and perhaps publishing this module, ... but also I need to pack for my trip to Tennessee. So we'll see :)
-
Of course, I can't be trusted to take notes properly, so I haven't properly documented progress on this thread 😅
But here is what happened since last time...
I moved all that to a module that I intend to publish. This led me to investigate how to set computed default values for the module inputs. For instance, I want to be able to specify IPv4 and IPv6 subnets, but if no subnet is specified, I want to pick a random one.
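The "use what was given, otherwise pick something random" logic can be sketched like this in Python. This is only an illustration of the idea, not how the module does it: the function names, the choice of a /24 inside 10.0.0.0/8, and the choice of a /64 inside the fd00::/8 unique local range are all assumptions.

```python
import ipaddress
import random


def pick_ipv4_subnet(subnet=None, rng=random):
    """Return the requested IPv4 subnet, or a random /24 in 10.0.0.0/8."""
    if subnet is not None:
        return ipaddress.ip_network(subnet)
    second = rng.randrange(256)
    third = rng.randrange(256)
    return ipaddress.ip_network(f"10.{second}.{third}.0/24")


def pick_ipv6_subnet(subnet=None, rng=random):
    """Return the requested IPv6 subnet, or a random /64 in fd00::/8."""
    if subnet is not None:
        return ipaddress.ip_network(subnet)
    global_id = rng.getrandbits(40)  # 40 random bits after the fd prefix
    subnet_id = rng.getrandbits(16)  # 16 more bits to get down to a /64
    prefix = (0xFD << 120) | (global_id << 80) | (subnet_id << 64)
    return ipaddress.IPv6Network((prefix, 64))


print(pick_ipv4_subnet("192.168.5.0/24"))  # explicit value is kept as-is
print(pick_ipv4_subnet())                  # random /24
print(pick_ipv6_subnet())                  # random /64
```

One thing to watch for with this approach: the random value has to be generated once and then remembered, otherwise every run would pick a new subnet.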
I also added a bunch of documentation.
And then I tested everything using a Proxmox token instead of SSH access, and ... of course it broke, because I was importing a disk image (downloading a raw disk image from the Talos image factory) and that requires SSH access. Because the Proxmox API is annoying like that.
(I didn't think that'd be an issue because that particular feature wasn't listed in the bpg provider under "stuff that requires SSH access".)
So I'm now refactoring everything to install from an ISO image instead (since that doesn't require SSH access), but of course, yak shaving happened: when installing from the Talos ISO image, once the VM reboots into the installed system, instead of using the static IP address passed by Proxmox in the "nocloud" payload, it now obtains an address from DHCP. Which means that cluster bootstrap doesn't work anymore.
I'm now pondering options:
- switching back to raw disk provisioning (and requiring SSH access for my module to work)
- passing IP addresses in the Talos MachineConfig (that should definitely work, right?)
- finding out if there is a way to tell Talos to use the nocloud payload even when rebooting (actually kexec-ing) the disk install
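To make the second option more concrete, here is a rough sketch of a per-node MachineConfig patch carrying a static address, built in Python. It assumes the `machine.network.interfaces` layout of the Talos v1alpha1 config (with `addresses`, `routes` and `dhcp`); the interface name `eth0`, the helper name, and the addresses are made up for the example.

```python
import ipaddress
import json


def static_network_patch(address, gateway, interface="eth0"):
    """Build a MachineConfig patch with a static address and default route.

    `address` is in CIDR form (e.g. "10.0.0.11/24"); the gateway must be
    inside the same network.
    """
    iface = ipaddress.ip_interface(address)
    gw = ipaddress.ip_address(gateway)
    if gw not in iface.network:
        raise ValueError(f"gateway {gw} is not in {iface.network}")
    default_route = "0.0.0.0/0" if iface.version == 4 else "::/0"
    return {
        "machine": {
            "network": {
                "interfaces": [
                    {
                        "interface": interface,  # assumed NIC name
                        "dhcp": False,
                        "addresses": [str(iface)],
                        "routes": [
                            {"network": default_route, "gateway": str(gw)}
                        ],
                    }
                ]
            }
        }
    }


print(json.dumps(static_network_patch("10.0.0.11/24", "10.0.0.1"), indent=2))
```

The downside of this option is that the addresses end up in two places (the Proxmox "nocloud" payload and the MachineConfig), so they need to be generated from the same source to stay in sync.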
-
OH: "what's that song about... you know... being a refrigerator repairman?
— money for nothing?
— yeah!"