Does lemmy have any communities dedicated to archiving/hoarding data?
“backups”? Pray tell, fine sir and or madam, what is that?
You know there are only two kinds of people: those who do backups and those who haven’t lost a hard drive or data yet. Also: RAID is not a backup.
Still remember the PSU blast taking out my main drive plus my backup drive in like 2001. I thought I was so good because I at least had a backup 😑. Those were the days 🤷🏻‍♀️
That sounds like an adventure!
Yeah, that was me learning that a dinky PSU is your worst enemy. I upgraded my SO’s old Duron to an Athlon for work, which drew more power…
My condolences! That said Athlons were late 90s (?) cool.
I stumbled across this sort of fascinating area of doomsday prepping a few weeks back.
A nice addition to that: don’t just make it a USB stick, make it a Raspberry Pi. Then you’d have a reasonably low-powered computer you could easily take with you.
Not suggesting this one as it seems a bit expensive to me, but https://www.prepperdisk.com/products/prepper-disk-premium-over-512gb-of-survival-content?view=sl-8978CA41
Just built one of these myself. I went with an NVMe M.2 drive instead of an SD card to avoid data corruption. I know SD cards are fine if you don’t write to them a lot, but if you want to update or add your own stuff, that scares me. Plus NVMe is just so much faster.
How would you access the info if electricity permanently goes out?
You find a generator, or solar panels, or wind mill, or water turbine, or a bicycle hooked up to a generator.
If electricity permanently goes out then we’re in a scavenger situation and it is time to start taking apart things that are no longer necessary to build the things that are.
Pretty much what Sinthesis said; USB power brick and/or solar panels. Both at the ready and tested. Also got a big ass battery backup that will charge off solar panels.
Curious about the mindset of the one (so far) person who has downvoted this post. What is there to dislike about archiving Linux and Wikipedia? 🤔
They are probably using a phone app which allows you to swipe sideways to downvote and also using screen gestures to ‘go back’. I’ve accidentally downvoted things this way.
They firmly believe that: Real men don’t do backup, they cry instead.
I also recommend downloading “Flashpoint archive” to have flash games and animations to stay entertained.
There is a 4gb version and a 2.3TB version.
That’s quite the range
When I downloaded it years ago it was 1.8TB. It’s crazy how big the archive is. The smaller one is just so it’s accessible to most people.
Is that Flash exclusive or do they accept other games from that era?
I’m not sure, but I do think it’s just flash
FWIW :
fabien@debian2080ti:/media/fabien/slowdisk$ ls -lhS offline_prep/
total 341G
-rw-r--r-- 1 fabien fabien 103G Jul 6 2024 wikipedia_en_all_maxi_2024-01.zim
-rw-r--r-- 1 fabien fabien 81G Apr 22 2023 gutenberg_mul_all_2023-04.zim
-rw-r--r-- 1 fabien fabien 75G Jul 7 2024 stackoverflow.com_en_all_2023-11.zim
-rw-r--r-- 1 fabien fabien 74G Mar 10 2024 planet-240304.osm.pbf
-rw-r--r-- 1 fabien fabien 3.8G Oct 18 06:55 debian-13.1.0-amd64-DVD-1.iso
-rw-r--r-- 1 fabien fabien 2.6G May 7 2023 ifixit_en_all_2023-04.zim
-rw-r--r-- 1 fabien fabien 1.6G May 7 2023 developer.mozilla.org_en_all_2023-02.zim
-rw-r--r-- 1 fabien fabien 931M May 7 2023 diy.stackexchange.com_en_all_2023-03.zim
-rw-r--r-- 1 fabien fabien 808M Jun 5 2023 wikivoyage_en_all_maxi_2023-05.zim
-rw-r--r-- 1 fabien fabien 296M Apr 30 2023 raspberrypi.stackexchange.com_en_all_2022-11.zim
-rw-r--r-- 1 fabien fabien 131M May 7 2023 rapsberry_pi_docs_2023-01.zim
-rw-r--r-- 1 fabien fabien 100M May 7 2023 100r-off-the-grid_en_2022-06.zim
-rw-r--r-- 1 fabien fabien 61M May 7 2023 quantumcomputing.stackexchange.com_en_all_2022-11.zim
-rw-r--r-- 1 fabien fabien 45M May 7 2023 computergraphics.stackexchange.com_en_all_2022-11.zim
-rw-r--r-- 1 fabien fabien 37M May 7 2023 wordnet_en_all_2023-04.zim
-rw-r--r-- 1 fabien fabien 23M Jul 17 2023 kiwix-tools_linux-armv6-3.5.0-1.tar.gz
-rw-r--r-- 1 fabien fabien 16M Oct 6 21:32 be-stib-gtfs.zip
-rw-r--r-- 1 fabien fabien 3.8M Oct 6 21:32 be-sncb-gtfs.zip
-rw-r--r-- 1 fabien fabien 2.3M May 7 2023 termux_en_all_maxi_2022-12.zim
-rw-r--r-- 1 fabien fabien 1.9M May 7 2023 kiwix-firefox_3.8.0.xpi
But if you want the easier version, just get Kiwix on whatever device is in front of you right now (yes, even a mobile phone, assuming you have the space), then get whatever content you need.
If you need a bit of help, I recorded TechSovereignty at home, episode 11 - Offline Wikipedia, Kiwix and checksums with a friend just 3 weeks ago.
I also wrote, and occasionally update, https://fabien.benetou.fr/Content/Vademecum and coded https://git.benetou.fr/utopiah/offline-octopus but tbh KDE Connect is much better now.
The point though is that setting up such a repository takes minutes. If you don’t have the space, buy a 512 GB microSD for 50 EUR, put the files on it, stuff it in a drawer and move on. If you want, update it every 3 months or whenever you feel like it.
TL;DR: takes longer to write such a meme than actually do it.
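Since checksums came up above, here is a minimal sketch of verifying a downloaded archive before stuffing it in a drawer. It assumes the download source publishes a SHA-256 value for each file, either as a bare hex string or as a "digest filename" line; the function names are made up for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks
    so that a 100 GB archive never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_checksum(path, published):
    """Compare a file against a published checksum.

    `published` may be a bare hex digest or a "digest  filename" line
    (assumed format); only the first whitespace-separated token is used.
    """
    expected = published.split()[0].lower()
    return sha256_of_file(path) == expected
```

Calling `matches_checksum("wikipedia_en_all_maxi_2024-01.zim", published_line)` returns `True` only if the file on disk is bit-for-bit what the publisher hashed, which is worth checking again after the card has sat unused for a year.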
Whoa, what are all those things you have?
Commenting inline :
-rw-r--r-- 1 fabien fabien 103G Jul 6 2024 wikipedia_en_all_maxi_2024-01.zim # encyclopedia Wikipedia English with images and more
-rw-r--r-- 1 fabien fabien 81G Apr 22 2023 gutenberg_mul_all_2023-04.zim # Project Gutenberg, book collection in multiple languages
-rw-r--r-- 1 fabien fabien 75G Jul 7 2024 stackoverflow.com_en_all_2023-11.zim # StackOverflow, programming questions and answers
-rw-r--r-- 1 fabien fabien 74G Mar 10 2024 planet-240304.osm.pbf # OpenStreetMap low resolution for the whole World
-rw-r--r-- 1 fabien fabien 3.8G Oct 18 06:55 debian-13.1.0-amd64-DVD-1.iso # Debian base ISO
-rw-r--r-- 1 fabien fabien 2.6G May 7 2023 ifixit_en_all_2023-04.zim # iFixit collection of guides to fix appliances
-rw-r--r-- 1 fabien fabien 1.6G May 7 2023 developer.mozilla.org_en_all_2023-02.zim # Web development documentation
-rw-r--r-- 1 fabien fabien 931M May 7 2023 diy.stackexchange.com_en_all_2023-03.zim # Do It Yourself Q&A
-rw-r--r-- 1 fabien fabien 808M Jun 5 2023 wikivoyage_en_all_maxi_2023-05.zim # WikiVoyage, the version of Wikipedia for traveling
-rw-r--r-- 1 fabien fabien 296M Apr 30 2023 raspberrypi.stackexchange.com_en_all_2022-11.zim # Raspberry Pi Q&A
-rw-r--r-- 1 fabien fabien 131M May 7 2023 rapsberry_pi_docs_2023-01.zim # Raspberry Pi documentation
-rw-r--r-- 1 fabien fabien 100M May 7 2023 100r-off-the-grid_en_2022-06.zim # Off the grid documents
-rw-r--r-- 1 fabien fabien 61M May 7 2023 quantumcomputing.stackexchange.com_en_all_2022-11.zim # Quantum computer Q&A
-rw-r--r-- 1 fabien fabien 45M May 7 2023 computergraphics.stackexchange.com_en_all_2022-11.zim # Computer graphics Q&A
-rw-r--r-- 1 fabien fabien 37M May 7 2023 wordnet_en_all_2023-04.zim # Graph of words in English
-rw-r--r-- 1 fabien fabien 23M Jul 17 2023 kiwix-tools_linux-armv6-3.5.0-1.tar.gz # Kiwix to read .zim files
-rw-r--r-- 1 fabien fabien 16M Oct 6 21:32 be-stib-gtfs.zip # public transport database in Brussels, Belgium
-rw-r--r-- 1 fabien fabien 3.8M Oct 6 21:32 be-sncb-gtfs.zip # train transport database in Belgium
-rw-r--r-- 1 fabien fabien 2.3M May 7 2023 termux_en_all_maxi_2022-12.zim # Termux, Linux tooling on Android, documentation in English
-rw-r--r-- 1 fabien fabien 1.9M May 7 2023 kiwix-firefox_3.8.0.xpi # Kiwix Web Extension for the Firefox browser
By the way, there’s now a Wikipedia 2025 snapshot.
I am currently trying to fit that on my phone somehow. I wish I could just omit the index database at the end, which it seems can’t be split. I have to keep it, but once the file is split up it doesn’t work anyway (search is broken that way), see https://github.com/openzim/zim-tools/issues/295.
My phone can only do FAT32 for SD cards… For the 2024 Wikipedia, that seems to be around 18 GiB of wasted space.
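For context on the FAT32 problem: a single file on FAT32 cannot exceed 2^32 - 1 bytes (just under 4 GiB), so a large archive has to be cut into parts. Below is a rough sketch of a plain byte-wise split. The `aa`, `ab`, ... suffix scheme mimics the Unix `split` tool, and whether a given reader accepts parts cut this way is an assumption. It does nothing about the unsplittable search index described above.

```python
import string
from pathlib import Path

FAT32_MAX_FILE_BYTES = 2**32 - 1  # largest single file FAT32 can store


def chunk_count(total_bytes, chunk_bytes=FAT32_MAX_FILE_BYTES):
    """Number of parts needed to hold total_bytes (ceiling division)."""
    return -(-total_bytes // chunk_bytes)


def part_suffix(index):
    """Two-letter suffix in the style of split(1): aa, ab, ..., zz."""
    if not 0 <= index < 26 * 26:
        raise ValueError("two-letter suffixes only cover 676 parts")
    letters = string.ascii_lowercase
    return letters[index // 26] + letters[index % 26]


def split_file(source, chunk_bytes=FAT32_MAX_FILE_BYTES, buffer_bytes=1024 * 1024):
    """Write source as source+aa, source+ab, ... next to the original.

    Data is streamed through a small buffer, so memory use stays flat.
    Returns the list of part paths in order.
    """
    source = Path(source)
    parts = []
    with open(source, "rb") as src:
        index = 0
        while True:
            block = src.read(min(buffer_bytes, chunk_bytes))
            if not block:
                break  # nothing left, do not create an empty part
            part_path = source.with_name(source.name + part_suffix(index))
            with open(part_path, "wb") as dst:
                written = 0
                while block:
                    dst.write(block)
                    written += len(block)
                    remaining = chunk_bytes - written
                    if remaining == 0:
                        break  # this part is full
                    block = src.read(min(buffer_bytes, remaining))
            parts.append(part_path)
            index += 1
    return parts
```

By this arithmetic, `chunk_count(103 * 2**30)` says a 103 GiB file needs 26 parts, so the card needs room for the full size of the archive either way.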
Thanks, updating (~20min) accordingly.
FWIW I have a CMF Nothing 1 and I can put a 500 GB microSD in it.
I’ve got a Ulefone Armor 24. It can take a 1 TB microSD, but only FAT32. Why a Linux-based OS can only do FAT32 on the card, despite supporting other filesystems on internal storage, is beyond me.
Weird, assuming you have Android 13 it should be usable at least as exFAT and thus can be large enough
Unfortunately, this is rather dependent on manufacturer (or rather how much they can fuck up).
Android 14, but without exFAT support.
I tried several: exFAT, ext4, F2FS, NTFS. Nothing other than FAT32 works.
Yeah, not gonna lie, I think I heard someone in a YouTube video a while back talk about how the entirety of Wikipedia takes up like 200 gigs or something like that, and it got me seriously considering actually making that offline backup. Shit is scary when countries like the UK are basically blocking you from having easy access to knowledge.
Yeah, it’s surprisingly small when it’s compressed if you exclude things like images and media. It’s just text, after all. But the high level of compression requires special software to actually read without uncompressing the entire archive. There are dedicated devices you can get, which pretty much only do that. Like there are literal Wikipedia readers, where you just give it an archive file and it’ll allow you to search for and read articles.
If you remove topics you are not interested in, it can shrink even more.
UKGOV haven’t started on things like Wikipedia yet. They know kids use it for school and blinded by ideology though they are, even they can see there’d be an enormous backlash if they blocked it any time soon.
If that’s going to happen at all, I doubt it would be before the next election. That’s whether Labour get re-elected or the Tories make an unexpected comeback. You can tell how far Labour have fallen in the eyes of their party faithful when they’ve taken a Tory-drafted policy and made it their own.
Ironically, the up-and-coming third-option fascist party have said they’re going to repeal the Online Safety Act. They have other fish to fry if they get in, and they’ll want to keep their preferred demographic(s) happy while they do it.
I assume that eventually something like the OSA would come back to “protect the children”. They love the current US President.
None of this is hopeful. Take this as more of a rant.
I’m certain that when UK forces DigitalID upon the nation it will be a requirement for access to every website
I can answer one part of your question. Yes, it’s not as big as you think it is.

does this include images?
With images, it is 111.08 GB
Compressed or uncompressed? Can it be directly read?
Can be read directly, like normal Wikipedia.
That’s very nice. Does it also include other languages, or would that take more space?
This is English only. Other languages are downloaded separately, though they typically take less space.
Nice.
How about when previous versions of pages are included (excluding images)?
Not sure, I don’t have that option. I can imagine it’s not much more, if proper version history management is involved.
Yeah, seems like there’s nothing as simple as something similar to a git clone available.
One would probably have to download multiple full copies from different times and then merge them with deduplication, to get that answer.
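A toy sketch of that merge-with-deduplication idea: store each page text once under its content hash, and keep one small manifest per snapshot that maps titles to hashes. The in-memory dicts stand in for real dumps, and this deduplicates whole pages only (no deltas between revisions), so treat it as an illustration of the principle rather than a size estimate.

```python
import hashlib


def merge_snapshots(snapshots):
    """Merge several full copies into one content-addressed store.

    snapshots: {snapshot_label: {page_title: page_text}}
    Returns (blobs, manifests) where
      blobs:     {sha256_hex: page_text}, each distinct text stored once
      manifests: {snapshot_label: {page_title: sha256_hex}}
    """
    blobs = {}
    manifests = {}
    for label, pages in snapshots.items():
        manifest = {}
        for title, text in pages.items():
            key = hashlib.sha256(text.encode("utf-8")).hexdigest()
            blobs.setdefault(key, text)  # unchanged pages are not stored again
            manifest[title] = key
        manifests[label] = manifest
    return blobs, manifests


def page_at(blobs, manifests, label, title):
    """Look up the text of a page as it was in a given snapshot."""
    return blobs[manifests[label][title]]
```

With two snapshots of two pages each where only one page changed, `blobs` ends up holding three texts instead of four, which is the saving the comment above is guessing at.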
Sorry, I’m out of the loop. Is there something particular that triggered this that I missed?
gestures broadly
The broad censorship of government data in the US, combined with the recent political attacks on Wikipedia caused me to download the whole English Wikipedia earlier this year. Guessing OP is similar
Not sure why they’d download Debian with all packages though
Edit: I should mention it’s less about a potential loss of Wikipedia as it is a personal source of truth on politically sensitive topics that get censored, or turned to propaganda by bots
For example, the Wounded Knee Massacre. Pete Hegseth has recently been calling it the “Battle of Wounded Knee”. I wouldn’t be surprised if the current administration went to war with Wikipedia and forced them to 1) change articles they disagree with, and 2) hide those changes from the history.
My rationale with Debian is that distros are kind of like portals to entire compendiums of free and open-source software. With the increasing attacks on vpns in particular right now, I’m concerned there are any number of programs we take for granted that we might not have access to soon.
The internet is already deeply enshittified. There is a real possibility that it will no longer be a free and open web in any capacity soon. So it’s past time to make archives, and start setting up meshnets.
I had downloaded the full (no pictures) Wikipedia earlier this year for exactly this reason. This thread told me about kiwix, which is awesome, so I downloaded the “Wikipedia .08” using kiwix, which is the best 45,000 articles from Wikipedia with pictures and it’s 7G, very manageable, has most topics anyone would care about.
Well for starters, teachers have had to start telling students that .gov websites are no longer considered credible sources for research.
Nice!
Nothing in particular that I’m aware of, just a growing recognition that things are very much not well in the US these days.
Last year I bought a hard copy of my favorite webcomic in case the website goes down.
To paraphrase Stan Lee here, comics are like boobs. They look good on the internet, but there is just something special about holding them in your hands.
Which webcomic?
Girls with Slingshots, it ended over a decade ago, but I still love the characters. I realized if the author dies and stops renewing the website it could disappear. As a foundational part of my early twenties I couldn’t accept that.
I’ll have to check it out. Thanks for the recommendation.
I would love to have a small Wikipedia browser that can survive the apocalypse.
E-ink display, mini keyboard and touchpad, multiple ways/ports to transfer info, All wrapped up in a heavy duty equipment case that’s able to survive a building collapses and burns in an earthquake, that’s shielded from EMP.
You mean like the wiki reader:

I used it as an ebook reader until the screen gave out.
Sounds like the beginning of a proper Hitchhiker’s Guide to the Galaxy.
Actually, having something telling me Don’t Panic in big friendly letters would help my mental health…
I would love to have a small Wikipedia browser that can survive the apocalypse.
I’ve got the full 120 GB Wikipedia dump running in Kiwix on a Raspberry Pi Zero. Works great (surprisingly)
E-ink display, mini keyboard
Have been using a Minimal Phone for a few months now which has both of those. Can connect to the Pi easily.
multiple ways/ports to transfer info,
Add a USB-C hub (or add a hub to the Pi) and you’re set
All wrapped up in a heavy duty equipment case that’s able to survive a building collapses and burns in an earthquake, that’s shielded from EMP.
And that’s where I’m limited - My 3D printer can only do so much lol. 😆
I’ve been working on a side project this week with an Orange Pi Zero 2W (a Pi Zero “clone” but with better specs). It’s got the Kiwix+Wikipedia setup like my older Pi (described above) plus a bunch of other neat stuff. It’s kind of a combination travel router, portable web app server, party box, and extremely over-engineered Bluetooth speaker all-in-one. Hoping to put together a show-and-tell post about it when I get the last of it squared away.
Very interested in your setup for that opi2w. I have one that is being retired from Pi-hole duty that I’ll be doing something similar with. I also want to add an SDR to it so it can pull GhostNet, JS8Call and the like.
Ooh, I haven’t tried RTL-SDR on it yet, but I think I’m nearing capacity on what it can do at once lol.
Here’s the block diagram for it (in spoiler below). Everything’s up and running except the Bluetooth Receiver -> Snapcast path (it works on the bench but I don’t have the scripting/automation done yet). I’m also adding an SMA connector for an external antenna, but the new base part is still printing. The photo shows it “as is” as of this writing.
SSL for the web apps was a PITA since I wanted real certs. Had to make a wildcard domain under my main hobby domain, so all my apps are like “https://{APP_NAME}.mobile.mydomain.xyz/”
As soon as I can get the Bluetooth + Pulseaudio scripting done, I’m gonna try to do a write up and maybe a show/tell post.
Block Diagram

Current Case

If you do this please share your IP so I can use your backup too
You can find me at ::1
Unlike OP, I’m not some hacker trying to get your IP address. I just need your regular address? :)
Wait, isn’t there an offline copy of a part of Wikipedia? The article
Just buy yourself a nice printer with enough ink and do it yourself ;)
It could cost a bit if you wanted to keep it up to date.
We need all repos to be stored offline, plus documentation for troubleshooting.
For the first, I have no idea how much space we will need. Most Linux packages are pretty light, no? But there are A LOT of them…
The second is easy. I heard someone say the entirety of Wikipedia is 200 GB, which should be doable. Don’t forget the technical wikis too: Debian, Gentoo, Arch.
The official USBs of Trixie fit all 28 DVDs of AMD64 on a 256GiB USB stick
https://www.linuxcollections.com/products/debian/debianusb.htm?id=51007
You’d probably want the 512GiB with all the sources for a real backup in this scenario
Get out of my mind.