The website has been rebooted, but not everything is working 100% yet. So far, it runs on a 3-node Kubernetes cluster on Talos Linux. The cluster uses distributed storage managed by the Piraeus Operator and runs in an IPv6-only environment with the Cilium CNI and Gateway API enabled. Backups run using a combination of Velero and a local self-hosted MinIO instance.

Outstanding Issues

There are still a couple of issues to resolve:

  1. DNS64/NAT64

    • CoreDNS still sometimes returns A records instead of DNS64-synthesized AAAA records. As a result, some image pulls fail when the registry only publishes IPv4 (A) records.
  2. IaC Deployment - Mostly Complete

    • There is a mix of Terraform, Helm, Kustomize, and ArgoCD. The homelab repo needs some cleanup and consolidation to streamline the deployment process.
  3. Virtualized Compute

    • Currently, Talos Linux runs in VMs inside Proxmox. Next steps include replacing Proxmox with bare-metal Talos Linux and looking into KubeVirt for any remaining virtualized workloads.
  4. IPv4 Accessibility

    • I'm using Cloudflare’s proxy to serve IPv4-only clients from the IPv6-only ingress. This works, but it adds complexity and an external dependency for anything other than standard HTTP/HTTPS connections.
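
For reference on the DNS64 issue in item 1: DNS64 synthesis, as described in RFC 6052, embeds the IPv4 address from an A record into the low 32 bits of a /96 NAT64 prefix. The Python sketch below illustrates that mapping. It assumes the well-known prefix 64:ff9b::/96, which may not be the prefix this cluster uses, and the function names are hypothetical helpers, not part of CoreDNS.

```python
import ipaddress

# Assumption: the RFC 6052 well-known prefix. The actual NAT64 prefix
# in this cluster is not stated in the post and may differ.
NAT64_PREFIX = ipaddress.IPv6Network("64:ff9b::/96")


def synthesize_aaaa(ipv4, prefix=NAT64_PREFIX):
    """Embed an IPv4 address in the low 32 bits of a /96 NAT64 prefix."""
    if prefix.prefixlen != 96:
        raise ValueError("this sketch only handles /96 prefixes")
    v4 = ipaddress.IPv4Address(ipv4)
    return ipaddress.IPv6Address(int(prefix.network_address) | int(v4))


def is_synthesized(addr, prefix=NAT64_PREFIX):
    """Return True if an IPv6 address falls inside the NAT64 prefix."""
    return ipaddress.IPv6Address(addr) in prefix


if __name__ == "__main__":
    # 192.0.2.33 is a documentation address, used here only as an example.
    print(synthesize_aaaa("192.0.2.33"))
```

A check like `is_synthesized` might help when debugging a failing pull: if a lookup from inside the cluster returns an address outside the NAT64 prefix (or only an A record), the answer was not synthesized by DNS64.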

The Good News

But there is good news! We do have a full-stack Kubernetes cluster running. Backups have been tested, and my Immich app has been successfully restored (full documentation still needed).

Let's Encrypt certificates and Gateway API have been working well! This website is a testament to that. (Disregard the Cloudflare SSL cert; I’m using their proxy to simplify serving IPv4-only clients from my IPv6-only ingress.)

What’s Next

Upcoming posts will dive deeper into:

  • The Talos Linux experience and migration from Proxmox
  • Gateway API configuration and best practices
  • KubeVirt - for added virtual machine management
  • Renovate - for automated dependency updates
  • Disaster recovery testing with Velero

The infrastructure might not be perfect yet, but it’s running production workloads and improving. Stay tuned for more technical deep dives as I work through each challenge.