• 2 Posts
  • 70 Comments
Joined 1 year ago
Cake day: June 16th, 2023

  • I’m not an expert, but I would say a diffusion model is less likely to spit out training data completely intact. LLMs and diffusion models work in very different ways.

    LLMs work by predicting the next statistically likely token: they take all of the previous text and predict what comes next based on it. So if you can trick the model into a state where the most likely next tokens are something verbatim from the training data, that’s what you get.

    Diffusion models work by taking a randomly generated latent, combining it with the CLIP encoding of the user’s prompt, and then trying to turn that random information into a new latent. The VAE then decodes that latent into something a human can see, because the latents the model works with are meaningless numbers to humans.

    In other words, there’s a lot more randomness to deal with in a diffusion model. You could probably get a specific source image back if you specially crafted a latent and a prompt. One guy did exactly that, basically by running img2img on a specific image from the training set and giving it a prompt to spit the same image out again. But that required having the original image in the first place, so it’s not really a weakness in the same way this was for GPT.
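
To make the next-token point concrete, here is a toy sketch, not a real LLM. It assumes a tiny lookup-table "model" that always picks the most frequent next word after the last two words; the names `train` and `generate` and the one-sentence corpus are made up for illustration. The point is only that deterministic next-token prediction will replay memorized text once the prefix lands on it.

```python
from collections import Counter, defaultdict

def train(tokens, order=2):
    """Count which token follows each run of `order` tokens."""
    table = defaultdict(Counter)
    for i in range(len(tokens) - order):
        context = tuple(tokens[i:i + order])
        table[context][tokens[i + order]] += 1
    return table

def generate(table, prefix, max_new=20, order=2):
    """Greedy decoding: always append the most frequent next token."""
    out = list(prefix)
    for _ in range(max_new):
        context = tuple(out[-order:])
        if context not in table:
            break  # nothing learned for this context, so stop
        out.append(table[context].most_common(1)[0][0])
    return out

# Hypothetical "training data": a single memorized sentence.
corpus = "the quick brown fox jumps over the lazy dog".split()
table = train(corpus)

# Starting from the first two words, the model replays the rest verbatim.
result = generate(table, ["the", "quick"])
print(" ".join(result))
```

A real LLM is vastly bigger and usually samples instead of always taking the top token, so this only shows the mechanism. A diffusion model, by contrast, starts from a fresh random latent on every run, so a prompt alone does not pin the output down in the same way.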

  • For those in the US: learn how to file your own taxes. It’s really simple for the large majority of people, and usually just consists of copying numbers into boxes off a sheet your employer made for you. After you’ve done it once, you’ll probably get it done in less than half an hour each time after that.

    You can do it for free on a ton of sites unless you make significant income. FreeTaxUSA is typically the most highly recommended one.