Allegedly most valuable company on the planet in all of history (can’t afford books). Allegedly not a bubble or fraud.
Are you suggesting that there is a use case for piracy that has less to do with saving money than it does with convenience and easy access to media in one place?
Will they be sued per book?
It’s not stealing when corpos do it.
Meta torrented their training data from the pirate bay. Hell, Spotify initially built their catalog from pirated music. They all do this shit. Corporations are built to steal our shit and sell it back to us. This isn’t any different from pumping oil out of public lands and selling it back to us.
No, because the lawyer cohort will destroy them.
Holy shit the greed knows no bounds.
Wait, so piracy is theft?
Not if it’s the rich guys doing it.
But…why?
Just torrent it? It would be so funny if this ended with Nvidia getting robbed.
Hmm, so Nvidia is training LLMs as well. Are they going to compete with their customers now too? Like Anthropic and Cursor?
Good. Can’t wait for the bubble to pop.
AA might be digging their own grave. Over time the knowledge gets accumulated in the hands of a select few, and then they’re gonna block people from accessing pirated sites like AA or, even worse, AA gets shut down due to lack of traffic.
It’s a really good thought. IMO what they will be producing with AI won’t be knowledge, it will be slop.
There is always gonna be an indie writer, a local at the pub singing. They can’t stop people creating. Download or buy analog copies of the stuff you like and store it. We don’t have to be a slave to the mainstream dream… I will say, though, it’s hard changing habits… but for me, it starts with me.
So the amended complaint alleges that Nvidia used/stored/copied/obtained/distributed copyrighted works (including the plaintiffs’), either through datasets available on Hugging Face (‘Books3’, featured in both ‘The Pile’ and ‘SlimPajama’) or by pirating from shadow libraries (like Anna’s Archive), to train multiple LLMs (primarily their ‘NeMo Megatron’ series), and that it distributed the copyrighted data through the ‘NeMo Megatron Framework’; data which was ultimately sourced from shadow libraries.
It’s quite an interesting read actually, especially the link to this Anna’s Archive blog post, which it grossly pulls out of context, as the plaintiffs clearly despise the shadow libraries too, since those libraries ultimately provided access to their copyrighted material.
Especially the part: “Most (but not all!) US-based companies reconsidered once they realized the illegal nature of our work. By contrast, Chinese firms have enthusiastically embraced our collection, apparently untroubled by its legality.” That makes me wonder if that’s the reason why models like DeepSeek initially blew Western models out of the water.
Allegedly, but holy shit if true. Hard to explain yourself out of that one.






