• 1 Post
  • 26 Comments
Joined 26 days ago
Cake day: September 9th, 2025

  • I originally thought one of the drives in my RAID1 array was failing, but I noticed that copying data was yielding btrfs corruption errors on both drives that could not be fixed with a scrub, and I was also getting btrfs corruption errors on the root volume. I figured it would be quite an odd coincidence if my main SSD and 2 hard disks all went bad, and I happened upon an article talking about how corrupt data can also occur if the RAM is bad. I also ran SMART tests and everything came back with a clean bill of health. So, I installed and booted into Memtest86+ and it immediately started showing errors on the single 16 GiB stick I was using. I happened to have a spare stick of a different brand, and that one passed the memory test with flying colors. After swapping it in, all the corruption errors went away and everything has been working perfectly ever since.

    I will also say that legacy file systems like ext4 with no data checksums wouldn’t even complain about corrupt data. I originally had ext4 on my main drive and at one point thought my OS install had gone bad, so I reinstalled with btrfs on top of LUKS. At that point I started seeing corruption errors on the main drive too, and it occurred to me that 3 different drives could not possibly all have had a hardware failure, so something else must be going on. I had also been using ext4 and mdadm for my RAID1 before migrating it to btrfs a while back. As far back as a year ago, I noticed that certain installers, etc. that previously worked no longer worked. It happened infrequently and didn’t really register with me as a potential hardware problem at the time, but I think the RAM was actually progressively going bad for quite a while. btrfs with regular scrubs would’ve made it abundantly clear much sooner that I had files getting corrupted and that something was wrong.
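To make the checksum point concrete, here is a minimal Python sketch of the idea, not of how btrfs is actually implemented. It assumes a toy "block store" that keeps a CRC-32 next to each block (`zlib.crc32` is used purely for illustration), and the helper names `write_block`, `read_block`, and `flip_bit` are made up for this example.

```python
import zlib


def write_block(data: bytes) -> tuple[bytes, int]:
    """Store a block together with a checksum computed at write time."""
    return data, zlib.crc32(data)


def read_block(data: bytes, stored_checksum: int) -> bytes:
    """Return the block only if it still matches its stored checksum."""
    if zlib.crc32(data) != stored_checksum:
        raise IOError("checksum mismatch: block is corrupt")
    return data


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate a single flipped bit, e.g. from faulty hardware."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


original = b"some file contents worth keeping"
block, checksum = write_block(original)
damaged = flip_bit(block, 13)

# Without checksums, the damaged bytes are simply handed back to the caller.
silently_corrupt = damaged

# With checksums, the same damage is detected when the block is read.
try:
    read_block(damaged, checksum)
    detected = False
except IOError:
    detected = True
```

Note that a checksum only catches damage that happens after it was computed. If the data is already corrupt in memory when the checksum is calculated, the bad data and its checksum will agree.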

    So, I’m quite convinced at this point that RAID is not a backup, even with the abilities of btrfs to self-heal, and simply copying data elsewhere is not a backup either, because in both cases something like bad RAM can destroy data during the copying process, whereas older snapshots in the cloud will survive such a hardware failure. Older backups that weren’t copied with faulty RAM may be fine as well, but you’re taking a chance that a recent update may overwrite good data with bad data. I was previously using Rclone for most backups while testing Restic with daily, weekly, and monthly snapshots for a small subset of important data over the last few months. After finding some data that was only recoverable from a previous Restic snapshot, I’ve since switched to using Restic exclusively for anything important enough for cloud backups. I was mainly concerned about the space requirements of keeping historical snapshots, and I’m still working on tweaking retention policies and taking separate snapshots of different directories, with different retention policies according to my risk tolerance for each directory I’m backing up. For some things, I think even local btrfs snapshots would suffice, with the understanding that they reduce recovery time but aren’t really a backup. However, any irreplaceable data really needs monthly Restic snapshots in the cloud. I suppose if you don’t have something like btrfs scrubs to alert you that you have a problem, even snapshots from months ago may have an unnoticed problem.
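As a rough illustration of what a daily/weekly/monthly retention policy does, here is a small Python sketch. It is a simplified model written for this comment, not Restic's actual implementation: the function `select_snapshots` and its parameters are hypothetical, and it only assumes the general rule that for each of the most recent N days, weeks, or months that have snapshots, the newest snapshot in that period is kept.

```python
from datetime import datetime, timedelta


def select_snapshots(timestamps, keep_daily=7, keep_weekly=4, keep_monthly=12):
    """Return the set of snapshot timestamps this simplified policy keeps."""
    newest_first = sorted(timestamps, reverse=True)
    rules = [
        (keep_daily, lambda t: t.date()),                      # one per day
        (keep_weekly, lambda t: tuple(t.isocalendar())[:2]),   # one per ISO week
        (keep_monthly, lambda t: (t.year, t.month)),           # one per month
    ]
    kept = set()
    for limit, bucket_of in rules:
        seen = []
        for t in newest_first:
            bucket = bucket_of(t)
            if bucket in seen:
                continue  # a newer snapshot already represents this period
            if len(seen) >= limit:
                break  # enough periods covered for this rule
            seen.append(bucket)
            kept.add(t)
    return kept


# Hypothetical history: one snapshot per day at noon, 2 Aug to 30 Sep 2025.
first = datetime(2025, 8, 2, 12, 0)
history = [first + timedelta(days=i) for i in range(60)]

kept = select_snapshots(history, keep_daily=7, keep_weekly=4, keep_monthly=3)
for t in sorted(kept):
    print(t.date())
```

The space concern comes down to how many periods each rule covers: here 60 daily snapshots shrink to a handful, while the end-of-August snapshot still survives as a monthly restore point.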







  • I use k3s and enjoy benefits like the following over bare metal:

    • Configuration as code where my whole setup is version controlled in git
    • Containers and avoiding dependency hell
    • Built-in reverse proxy with the Traefik ingress controller. Combined with DNS on my OpenWrt router, all of my self-hosted apps can be accessed via appname.lan (e.g., jellyfin.lan, forgejo.lan)
    • Declarative network policies with Calico, mainly to make sure nothing phones home
    • Managing secrets securely in git with Bitnami Sealed Secrets
    • Liveness probes that automatically “turn it off and on again” when something goes wrong

    These are just some of the benefits, even with a single server. Add more servers and the benefits increase.
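For anyone wondering what the liveness probe bullet means in practice, here is a toy Python model of the "turn it off and on again" logic. It is only a sketch of the idea, not Kubernetes code: `count_restarts` is a made-up function, and the only thing borrowed from Kubernetes is the notion of a failure threshold (the `failureThreshold` probe setting), where a container is restarted after that many consecutive failed probes. Timing settings such as probe intervals and startup delays are ignored.

```python
def count_restarts(probe_results, failure_threshold=3):
    """Count restarts triggered by runs of consecutive failed probes.

    probe_results is a sequence of booleans, True meaning the probe passed.
    """
    restarts = 0
    consecutive_failures = 0
    for healthy in probe_results:
        if healthy:
            consecutive_failures = 0  # any success resets the failure streak
            continue
        consecutive_failures += 1
        if consecutive_failures >= failure_threshold:
            restarts += 1
            consecutive_failures = 0  # the restarted container starts fresh
    return restarts


# A single blip is tolerated; three failures in a row trigger one restart.
results = [True, False, True, False, False, False, True]
restart_count = count_restarts(results, failure_threshold=3)
print(restart_count)
```

The threshold is the design choice that matters: set it too low and a brief hiccup causes needless restarts, too high and a hung app stays down longer.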

    Edit:

    Sorry, I realize this post is asking why go bare metal, not why k3s and containers are great. 😬







  • melfie@lemy.lol (OP) to Technology@lemmy.world: Marketing Doesn't Work on Nerds
    16 days ago

    Well, reading the replies to this post, it became clear to me that the title is provocative but not accurate. Sure, nerds don’t like ads and are generally annoyed by inflated and unsubstantiated claims, but it’s inaccurate to say that marketing doesn’t work on nerds. Many people who read the title obviously recognized this and came here to set the record straight, hence my reference to Cunningham’s Law. I’m sure others who originally agreed with the title came around to a different understanding, like I did, after reading the comments and reflecting. “Hey, maybe I’m not immune to marketing after all.”

    Overall, I feel like I’ve been called out on my bullshit in this post and am wiser as a result. Hope others had the same learning experience. Maybe I’m a jagoff as well for being so openly reflective about it.


  • melfie@lemy.lol (OP) to Technology@lemmy.world: Marketing Doesn't Work on Nerds
    16 days ago

    As OP, I have to admit this post unintentionally leverages Cunningham’s Law as its main marketing tactic, as do many other popular posts on Lemmy. Post something that sounds correct on the surface but is demonstrably false, and you will get hundreds of nerds clicking on it saying, “that’s bullshit; let me set this fucker straight!” 🤣