• Lvxferre@mander.xyz
    7 months ago

    That’s perhaps why image generators are comparatively better than text generators*. But there’s still something off: going by your example, it seems the model cannot reliably use clues like position to understand “this is a «leg»”. And I don’t know much about image generators, but I think they’re still statistics- and probability-based.