AI image training dataset found to include child sexual abuse imagery

misk@sopuli.xyz · 10 months ago

Communist@lemmy.ml · 10 months ago

How could this even happen by accident?

kromem@lemmy.world · 10 months ago

Because it has five billion images?

The images potentially at issue make up less than one percent of one percent of one percent of the total. That is under 0.0001%, or fewer than about 5,000 images out of five billion.

sir_reginald@lemmy.world · edit-2 10 months ago

Removing these images from the open web has been a headache for webmasters and admins for years on sites that host user-uploaded images.

If the millions of images in the training data were automatically scraped from the internet, I don’t find it surprising that there was CSAM among them.