LAION, the nonprofit behind a widely used AI training dataset, has removed the dataset after researchers found that its web scraper had gathered child sexual abuse material, or CSAM.
The discovery was made by researchers at the Stanford Internet Observatory and reported by 404 Media.
Large language models and AI image generators like Stable Diffusion, Midjourney, and DALL-E rely on massive datasets for training. Many of those datasets, like LAION-5B, include images scraped from the internet.
Many of those images depicted harm…