Tailscale’d Into Homelabbing
33 points by ugur
Not sure if you are asking rhetorically, but I have some opinions and workflows I would like to share:
How should I set up a proper backup system?
Which apps should I try next, maybe things like Jellyfin, Paperless-ngx, or some bookmarking tools?
How can I improve the observability of my home server so I can easily track resource usage?
Is there anything I can do about power outages so that if my computer shuts down, it can start back up automatically?
Where is your Tailscale hosted? If it is hosted on the same machine as everything else, it could cause problems: everything dies, and you can't get in to fix it out of band. I have one Tailscale instance set up on my server, and one on a Raspberry Pi as a redundant node in case the server dies completely. My personal next step would be to move from Tailscale to a truly open source alternative.
Hey, thanks for the answers :)
No, it was not rhetorical.
I am definitely going to examine Proxmox and learn more about it. The only thing I know about it is that it's mostly used for managing VMs, and also used by VPS companies a lot. I know that it's kind of like a standard. My first concern about it would be whether it adds a considerable amount of overhead or not. Other than that, I don't have much to say about it. Possibly in one of the upcoming posts, if I decide to use it, I will explain why, otherwise I will explain why I did not go with it. But I first need to research a bit more.
So, Jellyfin and Paperless-ngx are the obvious next things to try. Gitea seems cool, but I don't see the need for it right now, maybe sometime later. So, you host your Matrix instance on your homelab. Is it solely for chatting with people on the same tailnet, or is it exposed to the internet as well, and you are able to talk with people from other Matrix instances?
I mean, it's actually okay for the server to not be available to me occasionally. Most of my important programs being local-first kind of solves that problem. But I would still like to be able to make my computer automatically start as soon as the electrical connection is available again.
By the way, what do you mean by "Tailscale hosted"? Currently, I have Tailscale clients installed on my devices, and I rely on their coordination server. I've seen Headscale as an alternative to Tailscale but have not looked it up in detail yet. If I can use Headscale as the coordination server and still use the Tailscale clients (like using Vaultwarden as a server for Bitwarden), I see no reason not to do so. :)
Exactly, my goal is to move to something like Headscale in the near future. My question is whether your subnet-forwarding Tailscale client is hosted on your server, where the rest of your services are, or somewhere else.
As for Matrix: I have services that are open to the public, including my website, my Matrix site and some other things. My Matrix server is federated, so yes, I can talk to anyone using Matrix on it. On my homeserver I have about 10 active users. It's not much of a homelab any more, but a homeprod.
Oh, if I understood correctly, you are talking about setting up a subnet router? Well, I have not set one up so far. At the moment, I only have a few devices, and all of them have a Tailscale client (daemon) installed and running. The only thing I have done so far regarding Tailscale was basically installing and running tailscaled on my devices and configuring the ACL.
Yeah, it seems like your setup has evolved to some other level lol. Very cool. I am the only one in my tailnet for now. Maybe I can also share some services with my friends in the future when I am ready for it. :D
...I didn't know you could do that. Well, I mean, I now understand the way you have done it, but I use Tailscale basically just as an easy-to-use VPN. I have a Tailscale node that opens up one entire network/subnet, so I can access all my services in that subnet from outside, and not just those that are active. This enables me to basically feel at home wherever I am.
But I would still like to be able to make my computer automatically start as soon as the electrical connection is available again.
I put a lot of effort into this because I'm frequently remote these days.
Do note that Tailscale eats up the entirety of the CGNAT range and there's no way to work around this if you want to keep the Tailscale v4 addresses or subnet router (iirc the official response to this was to disable Tailscale IPv4 entirely and only use IPv6, which does not work with subnet router).
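To make the conflict concrete, here is a minimal sketch that checks whether a network you already route overlaps the CGNAT shared address space (100.64.0.0/10, reserved by RFC 6598). The helper name `overlaps_cgnat` and the sample networks are made up for illustration; it only shows the address arithmetic and says nothing about how Tailscale's firewall rules are actually implemented.

```python
import ipaddress

# 100.64.0.0/10 is the shared address space reserved for CGNAT (RFC 6598).
CGNAT = ipaddress.ip_network("100.64.0.0/10")


def overlaps_cgnat(cidr: str) -> bool:
    """Return True if the given IPv4 network shares any address with the CGNAT range."""
    # strict=False tolerates host bits being set, e.g. "100.64.12.7/24".
    return ipaddress.ip_network(cidr, strict=False).overlaps(CGNAT)


if __name__ == "__main__":
    # Hypothetical networks, purely for illustration.
    for cidr in ["100.64.12.0/24", "10.0.0.0/8", "100.128.0.0/16"]:
        print(cidr, "overlaps CGNAT:", overlaps_cgnat(cidr))
```

If any network you depend on (an ISP handing out CGNAT addresses, for example) reports an overlap, that is the situation described above.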
I used to do hard patching around this [0]. However, this does not work well if you want to preserve the source IP address when the destination is more than one hop away (e.g. user outside your tailnet -> your ingress Tailscale node (NAT to Tailscale v4 in CGNAT) -> the target node in tailnet). I have this issue because I was using Tailscale to handle internal routing on some nodes that do not have a BGP session. I submitted a very rough PR [1] on dynamically injecting fw rules to not drop all CGNAT traffic, but they don't seem to want it (I tested it on my ~10 node deployment)... Not only this, multicast is also not supported (i.e. I can't run babel)...
Now I've decided to use ranet [2] to make an IPsec mesh and strip Tailscale away.
[0] https://lobste.rs/s/2pi9sn/de_escalating_tailscale_cgnat_conflict
I'm so glad everyone is catching the homelab bug. I've been documenting my adventure, self-hosted of course: https://docs.eblu.me. My stack mirrors what a lot of others have talked about already in the comments. Since OP talked about power management, I think they might like the card about my power chain: https://docs.eblu.me/reference/infrastructure/power
AC Grid (120V) → Anker SOLIX F2000 → CyberPower CP1000PFCLCD → Homelab
Not yet documented (gotta fix that next) is that I also have a Honda EU2200i 2.2 kW generator and a 600 W solar panel that can all plug in to the F2000 simultaneously. The CP1000 provides clean sine wave power at something like ~5 ms cutover but only for ~5 minutes, meanwhile the F2000 supplies approximately 8 hours of power with a ~50 ms cutover. The F2000 can be preferentially charged by the solar panel to reduce costs, and just sip on AC mains (or the generator during a power outage). It's... way overkill. But I love it! I do plug-out tests regularly and let the F2000 drain down to ~10% at least once a month.
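For anyone sizing a similar chain, the runtime estimate is simple arithmetic: usable watt-hours divided by the load. The sketch below is a rough back-of-the-envelope helper; the capacity, load, and inverter efficiency in the example are illustrative assumptions, not measurements from this setup.

```python
def runtime_hours(capacity_wh: float, load_w: float, efficiency: float = 0.85) -> float:
    """Estimate how long a battery can carry a constant load.

    capacity_wh: nameplate battery capacity in watt-hours
    load_w:      average draw of the equipment in watts
    efficiency:  assumed fraction of capacity that survives inverter losses
    """
    if load_w <= 0:
        raise ValueError("load_w must be positive")
    return capacity_wh * efficiency / load_w


if __name__ == "__main__":
    # Illustrative round numbers only: a 2000 Wh power station and a 200 W homelab.
    print(f"{runtime_hours(2000, 200):.1f} hours")
```

Measuring the real draw with a plug-in power meter and running a plug-out test is still the only way to know the actual figure.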
Great writeup and setup! Though there is one thing I do not get: they say they are concerned about somebody else owning the VPS and being able to access their devices. Technically this is still possible with Tailscale. Going off this post here https://tailscale.com/blog/how-tailscale-works the Tailscale coordination server is not run by you, and therefore it can be compromised and can add additional public keys for nodes you do not own to your private mesh network. ACLs might also be changed that way.
Don't get me wrong. I think Tailscale is awesome and I'm pretty sure they have designed their systems to limit the chances of this ever happening, but if you are really paranoid about something like this (like I am ;-) you should be aware of this.
Thanks! The post you shared was one of the few posts I read before setting up Tailscale. As I understand it (could also be wrong), if you don't do anything additional with your devices other than setting up tailscaled and adding them to your tailnet, the worst thing that can happen (even if you set your ACLs to not limit anything) is that they can act as if they are in the same network, basically similar to being on a LAN.
So, as long as you are not using things like Tailscale SSH (giving Tailscale permission to SSH into your devices), or some other feature I might not be aware of, I felt like it should not be less secure than just setting up my services on a publicly available server (this could still be wrong).
Anyways, your comment actually gave me an idea for a new post. Researching and thinking through what could potentially go wrong with Tailscale, especially with the coordination server. :)
It's a paid feature sadly, but they do offer Tailnet Lock which, to quote the page, means "even if Tailscale were malicious or Tailscale infrastructure hacked, attackers can't send or receive traffic in your tailnet."
Tailscale lock is not a paid feature? The link you provided mentions that it is available on the “Personal” plan which is their free plan. I had it when I was running my home on their free tier.
Tailscale was the thing that really accelerated my homelab usage too.
I use Uptime Kuma for outage notifications and Beszel for perf monitoring. Beszel in particular is great because you see a breakdown of activity per docker container without needing to configure anything. You can also add agents to multiple machines, so the Beszel instance on my home server also monitors my VPS.
In terms of services, I’m using
For backups, I have a couple of custom docker images: one for megacmd (the commandline for Mega cloud storage) and one that runs a simple rsync mirror to an external drive.
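The rsync half of that could look roughly like the sketch below. It is not the actual image described above, just an assumption about what a "simple rsync mirror" does; the paths and the function name `rsync_mirror_cmd` are hypothetical. It only builds the command, so you can inspect it (or add `--dry-run`) before handing it to `subprocess.run`.

```python
from pathlib import PurePosixPath


def rsync_mirror_cmd(src: PurePosixPath, dest: PurePosixPath, dry_run: bool = False) -> list[str]:
    """Build an rsync command that makes dest an exact mirror of src."""
    # --archive preserves permissions, times and symlinks;
    # --delete removes files from dest that no longer exist in src.
    cmd = ["rsync", "--archive", "--delete"]
    if dry_run:
        cmd.append("--dry-run")
    # The trailing slash on the source copies its contents, not the directory itself.
    cmd += [f"{src}/", str(dest)]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths; pass the list to subprocess.run(cmd, check=True) to execute.
    print(" ".join(rsync_mirror_cmd(PurePosixPath("/srv/data"), PurePosixPath("/mnt/external/data"), dry_run=True)))
```

Note that `--delete` makes this a mirror, not a versioned backup: a file deleted by mistake disappears from the external drive on the next run.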
I've been meaning to check out Pulse for a while. Beszel I've found over time is a little brittle with regards to GPU monitoring, otherwise it's super for a quick healthcheck.
Oh, Pulse definitely looks interesting! Especially if it’ll do uptime as well as perf. I’ll have to have a play with it this weekend.
I literally had Grafana + Prometheus + Node Exporter + Prometheus Podman Exporter doing what Beszel + its agent does out-of-the-box. That was a really nice way of simplifying my homelab setup. Thanks for the tip.
Definitely going to look into Beszel and Uptime Kuma. Audiobookshelf also seems really good, and apparently it has a mobile client too, so I will give that a try. I will probably use rsync for syncing to an external drive as well. But I still have not figured out what to do about an additional cloud backup. I am looking for a solution that would let me both send files incrementally (only the changed parts) and encrypt them (zero trust toward the cloud provider). I have come across restic, but have not dug into it further.
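For what it's worth, restic covers both requirements: it splits files into deduplicated chunks, so later runs only upload what changed, and it encrypts the repository on the client before anything leaves the machine. A minimal sketch of a nightly job, assuming a repository that was already created with `restic init`; the repository URL, paths, and retention numbers are placeholders, not recommendations.

```python
def restic_backup_cmds(
    repo: str,
    paths: list[str],
    password_file: str,
    keep_daily: int = 7,
    keep_weekly: int = 4,
) -> list[list[str]]:
    """Build the restic commands for one backup run followed by retention cleanup."""
    base = ["restic", "--repo", repo, "--password-file", password_file]
    # "backup" uploads only chunks the repository does not already contain.
    backup = base + ["backup", *paths]
    # "forget --prune" drops old snapshots and reclaims the space they used.
    forget = base + [
        "forget",
        "--keep-daily", str(keep_daily),
        "--keep-weekly", str(keep_weekly),
        "--prune",
    ]
    return [backup, forget]


if __name__ == "__main__":
    # Hypothetical repository and paths; run each list with subprocess.run(cmd, check=True).
    for cmd in restic_backup_cmds(
        repo="sftp:backup@example.org:/srv/restic-repo",
        paths=["/srv/data", "/etc"],
        password_file="/root/.restic-password",
    ):
        print(" ".join(cmd))
```

Keep a copy of the repository password somewhere outside the server, because without it the backup cannot be decrypted.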
I use mega.nz for cloud storage. It’s encrypted at rest, though there have been rumours about it being govt compromised (I’m ok with that threat model).
There are probably better/cheaper pure backup services, but we use it like dropbox from all our devices so are ok with some compromise. It’s also pretty good in that it does file versioning for you - being able to grab an older version of a file has saved me a couple of times.
I run Wireguard in my router. What are the advantages of Tailscale over that?
Regarding observability, I've recently set up a couple of jails with VictoriaMetrics and vmagent (similar purpose as Prometheus) and Grafana. Installing node_exporter in your nodes and a couple of dashboards already gives you good visibility on the most important metrics.
Then I fell down the observability hole… I wanted to centralize logs storage, so I installed VictoriaLogs. Then, I decided that it would be nice to enrich firewall logs with geolocation, and there comes Fluent Bit. Ah, but I need alerts! vmalert to define the rules and Alertmanager to actually deliver notifications (via email for now). Ntfy is a very nice project if you want push notifications. And there I am, looking for a way to bridge Alertmanager and Ntfy…
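One way such a bridge could work: point an Alertmanager `webhook_configs` receiver at a tiny service that turns each alert in the JSON payload into an ntfy publish request (an HTTP POST with `Title`, `Priority` and `Tags` headers). The sketch below shows only the translation and the publish call. The topic URL is a placeholder, and the mapping from a `severity` label to an ntfy priority is my own assumption about how the alert rules are labelled.

```python
import urllib.request

NTFY_URL = "https://ntfy.example.org/homelab-alerts"  # hypothetical topic URL

# Assumed convention: alert rules carry a "severity" label.
PRIORITY = {"critical": "urgent", "warning": "high"}


def to_ntfy_messages(payload: dict) -> list[dict]:
    """Translate an Alertmanager webhook payload into ntfy message fields."""
    messages = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        status = alert.get("status", "firing")
        name = labels.get("alertname", "alert")
        firing = status == "firing"
        messages.append({
            "title": f"[{status.upper()}] {name}",
            # Resolved alerts are demoted so they do not page anyone.
            "priority": PRIORITY.get(labels.get("severity", ""), "default") if firing else "low",
            "tags": "rotating_light" if firing else "white_check_mark",
            "body": annotations.get("summary") or annotations.get("description") or name,
        })
    return messages


def publish(message: dict, url: str = NTFY_URL) -> int:
    """POST one message to an ntfy topic and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=message["body"].encode("utf-8"),
        method="POST",
        headers={
            "Title": message["title"],
            "Priority": message["priority"],
            "Tags": message["tags"],
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Wrapping `to_ntfy_messages` and `publish` in any small HTTP handler that accepts Alertmanager's POST would complete the bridge.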
The observability space is so… “sophisticated”. I'd happily hear about your setup.
What are the advantages of Tailscale over that?
The convenience (or inconvenience in some situations). They manage the keys, they provide coordination servers, you don't generally need to modify the fw rules, you can easily share nodes, etc.
Then I fell down the observability hole…
You can scrape metrics over Tailscale. I do that, and Grafana automatically picks up new nodes and scrapes whatever metrics I defined with the NixOS Grafana provisioning knobs.
Tailscale is the IPv6 Internet we all deserve but have never gotten. I like their approach very much.