Hi!

While I really enjoy seeing so many people being accommodating to people with disabilities, I find manually transcribing every image I post very tiring.

I thought I could at least use some sort of AI to help with image transcripts, though that could probably be put to better use by the actual person with the disability.

So that’s the question: should I skip transcribing an image, or let an AI do it?

  • originalucifer@moist.catsweat.com
    1 month ago

    Personally, this is the kind of laser-focused tooling it’s good for. LLMs are going to be critical to assisting the disabled in many contexts.

  • hendrik@palaver.p3x.de
    1 month ago

    I’d ask someone who needs these transcriptions first. I tend more towards “Nay”. I mean if they want AI transcriptions, I guess they could just run their own AI. And that way they get to choose between human and AI ones. I’m kind of against flooding the internet with AI content as long as the recipients can do it themselves.

  • x74sys@programming.dev
    1 month ago

    In my opinion, no. It has to be heavily curated. You’re not saving yourself a lot of work if you have to read it word by word (and probably correct stuff) anyway.

    I think just one very short sentence describing what’s on there (it doesn’t have to be detailed) is a lot better than whatever an LLM will give you.

  • Kierunkowy74@piefed.zip
    1 month ago

    Check the output, as it may be less accurate than your own effort.

    AI is able to extensively describe a photo, like those published on !pics@lemmy.world, but it fails at seeing what part of it is actually important, or at recognising the point of a meme. It will save you many keystrokes, but the result will probably still need to be manually corrected.

  • Doorknob@lemmy.world
    1 month ago

    By transcribing, do you mean describing what is in a picture, or transcribing text in a picture?

    For the former, I can’t really imagine an image you couldn’t describe for accessibility within a sentence, and for the latter, OCR could do the job equally well.

    I’m not saying this to just push the view that neural networks are no good for anything btw. For translation, for example, or text to speech/speech to text, I genuinely think they’re a revelation, and they need very little compute to perform those functions.

  • Auster@thebrainbin.org
    1 month ago

    Imo it’s a good use. But do make sure you read the outputs thoroughly. Even hand-made OCR tools can go crazy sometimes. Also, if the AI can be fully offline / self-hosted, that’s even better imo.

  • placebo@lemmy.zip
    1 month ago

    AI is great for this. We shouldn’t put people with disabilities at a disadvantage because of the anti-AI hysteria.

  • qaz@lemmy.world
    1 month ago

    I’d say go ahead, but make sure it produces accurate enough results, and add something like [AI Transcribed] in front so people can take the potential for additional errors into consideration when reading it.

    Also, if you’re using an online service, make sure it’s one that doesn’t use your images as training data. Many (probably almost all) artists / photographers won’t appreciate that.

  • vala@lemmy.dbzer0.com
    1 month ago

    You have a unique advantage over a vision-impaired person in using AI for this: if the generated text is wrong, you will know and can correct it.

  • KatherinaReichelt@feddit.org
    1 month ago

    I think that technology can really help us here. OCR on images is mostly solved. If you know what PaddleOCR can do, those people on Mastodon who are whining about others not including an image description for a screenshot seem really annoying. It is possible to do this directly on your computer at no cost and without the need for beefy hardware. So there is no need to try to force everyone else to include transcriptions for screenshots, and no need to attack other people: just do it yourself and enjoy the text on the screenshot.

    This also kind of applies to AI image descriptions. Try it: put an image into Gemini and ask it to describe it. You will be surprised. AI can totally give you a workable description of an image. The problem here is that those AI tools can get quite expensive when you are using them a lot, and that many disabled people do not have much money. So in my opinion it is totally ok to include AI image descriptions.

    I think that there are too many people in the fediverse who do not know the current state of the technology and hate AI for maybe the right reasons, but who are missing out on how it could help them.

  • Tamlyn@lemmy.zip
    1 month ago

    A lot of artists don’t want their art to be used by AI. You can’t prevent that if you let AI summarize your images. So don’t use AI for that.

    • Gonzako@lemmy.world (OP)
      1 month ago

      I was actually thinking of using a self-hosted LLM for these tasks. I wanna dig into it again, and I’ve got access to computers on the cheap.