home.social

#serveroutage — Public Fediverse posts

Live and recent posts from across the Fediverse tagged #serveroutage, aggregated by home.social.

  1. When two Hetzner servers died at the same time

    On May 12, 2026, two of my Arch Linux + LUKS servers at Hetzner became unreachable at the same moment. Both had been running for 4+ months without issue. Both had received the same pacman -Syyu the day before, but had stayed on the old kernel until the morning the websites stopped responding. I rebooted — SSH never came back. nmap -Pn -p 22 showed filtered from anywhere. No ping. No banner. The Hetzner Robot panel insisted the hardware was fine.

    Several hours went into hypotheses that turned out to be wrong:

    • The encryptssh initcpio hook referencing a /usr/lib/initcpio/udev/11-dm-initramfs.rules file that no longer exists. Real bug, no boot impact — the initramfs rebuilds anyway.
    • PermitRootLogin no in sshd_config. Real misconfiguration, fixed it, didn’t help. A refusing sshd shows closed, not filtered.
    • Predictable interface-naming drift after the systemd 260 upgrade. Patched the .network config to match by MAC. Useful hardening; not the cause.
    • Stale GRUB stage1 + core.img in the MBR. Arch never re-runs grub-install after a grub package upgrade. Refreshed it. Still filtered.
    • Kernel 7.0.5 regression. Downgraded to 6.18.3, the kernel that had run for 4 months. Still filtered. So the kernel itself wasn’t it either.

    The clue was in the persistent journal: a single recorded boot from December 31 to May 12 10:13 UTC, and absolutely nothing after. Every reboot since the upgrade was failing before systemd-journald could flush to disk — so the failure had to be in the initramfs, before the root filesystem was even mounted.

    What it almost certainly was

    Hetzner Dedicated servers configure the initramfs network with ip=dhcp on the kernel command line. That depends on Hetzner’s DHCP server replying to whatever request format the current kernel sends. Somewhere between kernel 6.18 / iproute2 6.18 and kernel 7.0 / iproute2 7.0, the request format changed enough that Hetzner’s DHCP stopped responding. Effects:

    • Old kernel at runtime kept the interface already configured (Phase A — 32 hours of healthy operation after the package upgrade).
    • New kernel cold-boots, hits DHCP, never gets an IP, dropbear cannot listen, port 22 stays filtered.

    Hetzner’s own documentation has been quietly moving away from ip=dhcp toward static IPv4 in the kernel command line. The fix is exactly that:

    GRUB_CMDLINE_LINUX="cryptdevice=/dev/md1:cryptroot ip=A.B.C.D::GATEWAY:255.255.255.255:hostname:eth0:none"
    

    One line in /etc/default/grub, grub-mkconfig, reboot. No more dependency on Hetzner’s DHCP responding to whatever your current kernel sends.

    Why it matters for anyone running this stack

    If you run Arch on Hetzner Dedicated with full-disk encryption and remote unlock via dropbear, the ip=dhcp shipped by installimage is a latent bug. It can keep working for years and then break overnight, on every machine you have, after a routine pacman -Syyu. The static-IP version is what Hetzner now recommends and removes the entire dependency.

    Tooling

    While debugging, I turned the whole rescue / chroot / diagnose / fix workflow into a Python CLI (hal) — including hal fix static-ip, which derives the static cmdline directly from your existing systemd-networkd .network file:

    github.com/kevinveenbirkenbach/hetzner-arch-luks

    Single command, idempotent, reversible (the original /etc/default/grub is backed up to .hal-backup). If you’re on this stack, switch to static IP before the next kernel upgrade catches you.

    #ArchLinux #bootFailure #debugging #DevOps #DHCP #Dropbear #fullDiskEncryption #GRUB #Hetzner #initramfs #kernelUpgrade #Linux #LUKS #mkinitcpio #pacman #postmortem #PythonCLI #serverOutage #sysadmin #systemdNetworkd
  2. When two Hetzner servers died at the same time

    On May 12, 2026, two of my Arch Linux + LUKS servers at Hetzner became unreachable at the same moment. Both had been running for 4+ months without issue. Both had received the same pacman -Syyu the day before, but had stayed on the old kernel until the morning the websites stopped responding. I rebooted — SSH never came back. nmap -Pn -p 22 showed filtered from anywhere. No ping. No banner. The Hetzner Robot panel insisted the hardware was fine.

    Several hours went into hypotheses that turned out to be wrong:

    • The encryptssh initcpio hook referencing a /usr/lib/initcpio/udev/11-dm-initramfs.rules file that no longer exists. Real bug, no boot impact — the initramfs rebuilds anyway.
    • PermitRootLogin no in sshd_config. Real misconfiguration, fixed it, didn’t help. A refusing sshd shows closed, not filtered.
    • Predictable interface-naming drift after the systemd 260 upgrade. Patched the .network config to match by MAC. Useful hardening; not the cause.
    • Stale GRUB stage1 + core.img in the MBR. Arch never re-runs grub-install after a grub package upgrade. Refreshed it. Still filtered.
    • Kernel 7.0.5 regression. Downgraded to 6.18.3, the kernel that had run for 4 months. Still filtered. So the kernel itself wasn’t it either.

    The clue was in the persistent journal: a single recorded boot from December 31 to May 12 10:13 UTC, and absolutely nothing after. Every reboot since the upgrade was failing before systemd-journald could flush to disk — so the failure had to be in the initramfs, before the root filesystem was even mounted.

    What it almost certainly was

    Hetzner Dedicated servers configure the initramfs network with ip=dhcp on the kernel command line. That depends on Hetzner’s DHCP server replying to whatever request format the current kernel sends. Somewhere between kernel 6.18 / iproute2 6.18 and kernel 7.0 / iproute2 7.0, the request format changed enough that Hetzner’s DHCP stopped responding. Effects:

    • Old kernel at runtime kept the interface already configured (Phase A — 32 hours of healthy operation after the package upgrade).
    • New kernel cold-boots, hits DHCP, never gets an IP, dropbear cannot listen, port 22 stays filtered.

    Hetzner’s own documentation has been quietly moving away from ip=dhcp toward static IPv4 in the kernel command line. The fix is exactly that:

    GRUB_CMDLINE_LINUX="cryptdevice=/dev/md1:cryptroot ip=A.B.C.D::GATEWAY:255.255.255.255:hostname:eth0:none"
    

    One line in /etc/default/grub, grub-mkconfig, reboot. No more dependency on Hetzner’s DHCP responding to whatever your current kernel sends.

    Why it matters for anyone running this stack

    If you run Arch on Hetzner Dedicated with full-disk encryption and remote unlock via dropbear, the ip=dhcp shipped by installimage is a latent bug. It can keep working for years and then break overnight, on every machine you have, after a routine pacman -Syyu. The static-IP version is what Hetzner now recommends and removes the entire dependency.

    Tooling

    While debugging, I turned the whole rescue / chroot / diagnose / fix workflow into a Python CLI (hal) — including hal fix static-ip, which derives the static cmdline directly from your existing systemd-networkd .network file:

    github.com/kevinveenbirkenbach/hetzner-arch-luks

    Single command, idempotent, reversible (the original /etc/default/grub is backed up to .hal-backup). If you’re on this stack, switch to static IP before the next kernel upgrade catches you.

    #ArchLinux #bootFailure #debugging #DevOps #DHCP #Dropbear #fullDiskEncryption #GRUB #Hetzner #initramfs #kernelUpgrade #Linux #LUKS #mkinitcpio #pacman #postmortem #PythonCLI #serverOutage #sysadmin #systemdNetworkd
  3. When two Hetzner servers died at the same time

    On May 12, 2026, two of my Arch Linux + LUKS servers at Hetzner became unreachable at the same moment. Both had been running for 4+ months without issue. Both had received the same pacman -Syyu the day before, but had stayed on the old kernel until the morning the websites stopped responding. I rebooted — SSH never came back. nmap -Pn -p 22 showed filtered from anywhere. No ping. No banner. The Hetzner Robot panel insisted the hardware was fine.

    Several hours went into hypotheses that turned out to be wrong:

    • The encryptssh initcpio hook referencing a /usr/lib/initcpio/udev/11-dm-initramfs.rules file that no longer exists. Real bug, no boot impact — the initramfs rebuilds anyway.
    • PermitRootLogin no in sshd_config. Real misconfiguration, fixed it, didn’t help. A refusing sshd shows closed, not filtered.
    • Predictable interface-naming drift after the systemd 260 upgrade. Patched the .network config to match by MAC. Useful hardening; not the cause.
    • Stale GRUB stage1 + core.img in the MBR. Arch never re-runs grub-install after a grub package upgrade. Refreshed it. Still filtered.
    • Kernel 7.0.5 regression. Downgraded to 6.18.3, the kernel that had run for 4 months. Still filtered. So the kernel itself wasn’t it either.

    The clue was in the persistent journal: a single recorded boot from December 31 to May 12 10:13 UTC, and absolutely nothing after. Every reboot since the upgrade was failing before systemd-journald could flush to disk — so the failure had to be in the initramfs, before the root filesystem was even mounted.

    What it almost certainly was

    Hetzner Dedicated servers configure the initramfs network with ip=dhcp on the kernel command line. That depends on Hetzner’s DHCP server replying to whatever request format the current kernel sends. Somewhere between kernel 6.18 / iproute2 6.18 and kernel 7.0 / iproute2 7.0, the request format changed enough that Hetzner’s DHCP stopped responding. Effects:

    • Old kernel at runtime kept the interface already configured (Phase A — 32 hours of healthy operation after the package upgrade).
    • New kernel cold-boots, hits DHCP, never gets an IP, dropbear cannot listen, port 22 stays filtered.

    Hetzner’s own documentation has been quietly moving away from ip=dhcp toward static IPv4 in the kernel command line. The fix is exactly that:

    GRUB_CMDLINE_LINUX="cryptdevice=/dev/md1:cryptroot ip=A.B.C.D::GATEWAY:255.255.255.255:hostname:eth0:none"
    

    One line in /etc/default/grub, grub-mkconfig, reboot. No more dependency on Hetzner’s DHCP responding to whatever your current kernel sends.

    Why it matters for anyone running this stack

    If you run Arch on Hetzner Dedicated with full-disk encryption and remote unlock via dropbear, the ip=dhcp shipped by installimage is a latent bug. It can keep working for years and then break overnight, on every machine you have, after a routine pacman -Syyu. The static-IP version is what Hetzner now recommends and removes the entire dependency.

    Tooling

    While debugging, I turned the whole rescue / chroot / diagnose / fix workflow into a Python CLI (hal) — including hal fix static-ip, which derives the static cmdline directly from your existing systemd-networkd .network file:

    github.com/kevinveenbirkenbach/hetzner-arch-luks

    Single command, idempotent, reversible (the original /etc/default/grub is backed up to .hal-backup). If you’re on this stack, switch to static IP before the next kernel upgrade catches you.

    #ArchLinux #bootFailure #debugging #DevOps #DHCP #Dropbear #fullDiskEncryption #GRUB #Hetzner #initramfs #kernelUpgrade #Linux #LUKS #mkinitcpio #pacman #postmortem #PythonCLI #serverOutage #sysadmin #systemdNetworkd
  4. When two Hetzner servers died at the same time

    On May 12, 2026, two of my Arch Linux + LUKS servers at Hetzner became unreachable at the same moment. Both had been running for 4+ months without issue. Both had received the same pacman -Syyu the day before, but had stayed on the old kernel until the morning the websites stopped responding. I rebooted — SSH never came back. nmap -Pn -p 22 showed filtered from anywhere. No ping. No banner. The Hetzner Robot panel insisted the hardware was fine.

    Several hours went into hypotheses that turned out to be wrong:

    • The encryptssh initcpio hook referencing a /usr/lib/initcpio/udev/11-dm-initramfs.rules file that no longer exists. Real bug, no boot impact — the initramfs rebuilds anyway.
    • PermitRootLogin no in sshd_config. Real misconfiguration, fixed it, didn’t help. A refusing sshd shows closed, not filtered.
    • Predictable interface-naming drift after the systemd 260 upgrade. Patched the .network config to match by MAC. Useful hardening; not the cause.
    • Stale GRUB stage1 + core.img in the MBR. Arch never re-runs grub-install after a grub package upgrade. Refreshed it. Still filtered.
    • Kernel 7.0.5 regression. Downgraded to 6.18.3, the kernel that had run for 4 months. Still filtered. So the kernel itself wasn’t it either.

    The clue was in the persistent journal: a single recorded boot from December 31 to May 12 10:13 UTC, and absolutely nothing after. Every reboot since the upgrade was failing before systemd-journald could flush to disk — so the failure had to be in the initramfs, before the root filesystem was even mounted.

    What it almost certainly was

    Hetzner Dedicated servers configure the initramfs network with ip=dhcp on the kernel command line. That depends on Hetzner’s DHCP server replying to whatever request format the current kernel sends. Somewhere between kernel 6.18 / iproute2 6.18 and kernel 7.0 / iproute2 7.0, the request format changed enough that Hetzner’s DHCP stopped responding. Effects:

    • Old kernel at runtime kept the interface already configured (Phase A — 32 hours of healthy operation after the package upgrade).
    • New kernel cold-boots, hits DHCP, never gets an IP, dropbear cannot listen, port 22 stays filtered.

    Hetzner’s own documentation has been quietly moving away from ip=dhcp toward static IPv4 in the kernel command line. The fix is exactly that:

    GRUB_CMDLINE_LINUX="cryptdevice=/dev/md1:cryptroot ip=A.B.C.D::GATEWAY:255.255.255.255:hostname:eth0:none"
    

    One line in /etc/default/grub, grub-mkconfig, reboot. No more dependency on Hetzner’s DHCP responding to whatever your current kernel sends.

    Why it matters for anyone running this stack

    If you run Arch on Hetzner Dedicated with full-disk encryption and remote unlock via dropbear, the ip=dhcp shipped by installimage is a latent bug. It can keep working for years and then break overnight, on every machine you have, after a routine pacman -Syyu. The static-IP version is what Hetzner now recommends and removes the entire dependency.

    Tooling

    While debugging, I turned the whole rescue / chroot / diagnose / fix workflow into a Python CLI (hal) — including hal fix static-ip, which derives the static cmdline directly from your existing systemd-networkd .network file:

    github.com/kevinveenbirkenbach/hetzner-arch-luks

    Single command, idempotent, reversible (the original /etc/default/grub is backed up to .hal-backup). If you’re on this stack, switch to static IP before the next kernel upgrade catches you.

    #ArchLinux #bootFailure #debugging #DevOps #DHCP #Dropbear #fullDiskEncryption #GRUB #Hetzner #initramfs #kernelUpgrade #Linux #LUKS #mkinitcpio #pacman #postmortem #PythonCLI #serverOutage #sysadmin #systemdNetworkd
  5. When two Hetzner servers died at the same time

    On May 12, 2026, two of my Arch Linux + LUKS servers at Hetzner became unreachable at the same moment. Both had been running for 4+ months without issue. Both had received the same pacman -Syyu the day before, but had stayed on the old kernel until the morning the websites stopped responding. I rebooted — SSH never came back. nmap -Pn -p 22 showed filtered from anywhere. No ping. No banner. The Hetzner Robot panel insisted the hardware was fine.

    Several hours went into hypotheses that turned out to be wrong:

    • The encryptssh initcpio hook referencing a /usr/lib/initcpio/udev/11-dm-initramfs.rules file that no longer exists. Real bug, no boot impact — the initramfs rebuilds anyway.
    • PermitRootLogin no in sshd_config. Real misconfiguration, fixed it, didn’t help. A refusing sshd shows closed, not filtered.
    • Predictable interface-naming drift after the systemd 260 upgrade. Patched the .network config to match by MAC. Useful hardening; not the cause.
    • Stale GRUB stage1 + core.img in the MBR. Arch never re-runs grub-install after a grub package upgrade. Refreshed it. Still filtered.
    • Kernel 7.0.5 regression. Downgraded to 6.18.3, the kernel that had run for 4 months. Still filtered. So the kernel itself wasn’t it either.

    The clue was in the persistent journal: a single recorded boot from December 31 to May 12 10:13 UTC, and absolutely nothing after. Every reboot since the upgrade was failing before systemd-journald could flush to disk — so the failure had to be in the initramfs, before the root filesystem was even mounted.

    What it almost certainly was

    Hetzner Dedicated servers configure the initramfs network with ip=dhcp on the kernel command line. That depends on Hetzner’s DHCP server replying to whatever request format the current kernel sends. Somewhere between kernel 6.18 / iproute2 6.18 and kernel 7.0 / iproute2 7.0, the request format changed enough that Hetzner’s DHCP stopped responding. Effects:

    • Old kernel at runtime kept the interface already configured (Phase A — 32 hours of healthy operation after the package upgrade).
    • New kernel cold-boots, hits DHCP, never gets an IP, dropbear cannot listen, port 22 stays filtered.

    Hetzner’s own documentation has been quietly moving away from ip=dhcp toward static IPv4 in the kernel command line. The fix is exactly that:

    GRUB_CMDLINE_LINUX="cryptdevice=/dev/md1:cryptroot ip=A.B.C.D::GATEWAY:255.255.255.255:hostname:eth0:none"
    

    One line in /etc/default/grub, grub-mkconfig, reboot. No more dependency on Hetzner’s DHCP responding to whatever your current kernel sends.

    Why it matters for anyone running this stack

    If you run Arch on Hetzner Dedicated with full-disk encryption and remote unlock via dropbear, the ip=dhcp shipped by installimage is a latent bug. It can keep working for years and then break overnight, on every machine you have, after a routine pacman -Syyu. The static-IP version is what Hetzner now recommends and removes the entire dependency.

    Tooling

    While debugging, I turned the whole rescue / chroot / diagnose / fix workflow into a Python CLI (hal) — including hal fix static-ip, which derives the static cmdline directly from your existing systemd-networkd .network file:

    github.com/kevinveenbirkenbach/hetzner-arch-luks

    Single command, idempotent, reversible (the original /etc/default/grub is backed up to .hal-backup). If you’re on this stack, switch to static IP before the next kernel upgrade catches you.

    #ArchLinux #bootFailure #debugging #DevOps #DHCP #Dropbear #fullDiskEncryption #GRUB #Hetzner #initramfs #kernelUpgrade #Linux #LUKS #mkinitcpio #pacman #postmortem #PythonCLI #serverOutage #sysadmin #systemdNetworkd
  6. Reddit faced a massive global outage, blocking browsing and login for thousands and sparking frustration as communication lagged. Ongoing stability issues raise big concerns for the platform’s future.

    #RedditOutage #RedditDown #TechIssues #PlatformCrash #ServerOutage #SocialMediaNews #TECHi

    Read Full Article Here :- techi.com/is-reddit-down-today

  7. #Linux #Update #KDENeon #Ubuntu #ServerOutage

    Moin.
    Gestern abend war mal wieder "Technik die begeistert" Moment... Nicht begeistert. Ubuntu hat (ob er heute morgen immer noch besteht, weiß ich erst wenn ich es nochmal versucht habe) einen Serverausfall, über den einen Teil der Updates von KDE Neon laufen, weil das ja Ubuntu-Unterbau ist. Erst mal die Bug-Reports lesen müssen, um das rauszufinden. Warum kann man bei einem so gravierenden Problem nicht eine zentrale News dazu raushauen? 😤🫤
    Dazu zusätzlich hatte die Telekom gestern auch ein Internetproblem hier in meiner Region.
    Ich habe gestern mal wieder fluchend vor meinem Laptop gesessen, bis ich raus gefunden hatte, dass nicht meine Geräte an dem Chaos schuld sind... Mmpf.

    Nachtrag: Eben versucht, das Update ist immer noch nicht möglich, weil der Ubuntu-Server immer noch nicht erreichbar ist. Na toll. Naja, dann halt heute nicht mehr. Werd da mal in ein paar Tagen rein schauen. 🤷‍♀️

  8. Just a heads up #Joomla peeps. Many Joomla official sites are down with Rochen having the hiccups this morning. We hope to be back ASAP.
    #ServerOutage #Sorry #Glitch

  9. Does anyone know what happened to swisstoots.ch? They seem to be down right now, and have been for over a week, but I haven't been able to find any information for this person who's been asking about it: it's like they just walked away and let the server fail without saying anything...
    #SwissToots #outage #ServerOutage #Mastodon #meta

  10. @thedanielsgenealogy Hi Dan and thanks for spreading the word - planning a follow-up to talk Masto accessibility soon but may take some time.

    Quick word for anyone accessing the above - TsoHost who hosts my site is experiencing a datacentre issue right now, so the entire site for The Parchment Rustler is down until TsoHost gets that fixed. Please do check back all if you were trying to catch my article. Thanks! #TechWoes #TechFail #ServerOutage