Research Findings:

  • reCAPTCHA v2 is not effective in preventing bots and fraud, despite its intended purpose
  • reCAPTCHA v2 can be defeated by bots 70-100% of the time
  • reCAPTCHA v3, the latest version, is also vulnerable to attacks and has been beaten 97% of the time
  • reCAPTCHA interactions impose a significant cost on users, with an estimated 819 million hours of human time spent on reCAPTCHA over 13 years, which corresponds to at least $6.1 billion USD in wages
  • Google has potentially profited $888 billion from cookies [created by reCAPTCHA sessions] and $8.75–32.3 billion per sale of its total labeled data set
  • Google should bear the cost of detecting bots, rather than shifting it to users
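As a rough check on the cost figure in the list above, the two quoted numbers imply an hourly rate of about $7.45. The sketch below only divides the figures as quoted; the variable names are illustrative, and the implied rate is an inference from those two figures, not a number restated from the paper.

```python
# Figures quoted in the findings above.
HOURS_SPENT = 819e6       # human hours spent on reCAPTCHA over 13 years
WAGE_VALUE_USD = 6.1e9    # reported lower bound on the value of that time
YEARS = 13

# Implied hourly rate (inferred here, not taken from the paper).
implied_hourly_rate = WAGE_VALUE_USD / HOURS_SPENT

# Average number of hours spent per year over the period.
hours_per_year = HOURS_SPENT / YEARS

print(f"Implied wage: ${implied_hourly_rate:.2f}/hour")
print(f"Average load: {hours_per_year / 1e6:.0f} million hours/year")
```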

“The conclusion can be extended that the true purpose of reCAPTCHA v2 is a free image-labeling labor and tracking cookie farm for advertising and data profit masquerading as a security service,” the paper declares.

In a statement provided to The Register after this story was filed, a Google spokesperson said: “reCAPTCHA user data is not used for any other purpose than to improve the reCAPTCHA service, which the terms of service make clear. Further, a majority of our user base have moved to reCAPTCHA v3, which improves fraud detection with invisible scoring. Even if a site were still on the previous generation of the product, reCAPTCHA v2 visual challenge images are all pre-labeled and user input plays no role in image labeling.”

  • someguy3@lemmy.world
    arrow-up
    60
    ·
    edit-2
    3 months ago

    I kinda figured. It was annoying to do one, but then they wanted you to do two or three and that’s absurd. Whenever it comes up now, I usually just close out.

    • Bezier@suppo.fi
      arrow-up
      22
      ·
      3 months ago

      they wanted you to do two or three and that’s absurd

      Yea how about 20

      • LucidNightmare@lemm.ee
        arrow-up
        26
        ·
        3 months ago

        VPN? Google will just go in a loop with these things, so I just stopped using Google completely.

        • Bezier@suppo.fi
          arrow-up
          5
          ·
          edit-2
          3 months ago

          No. But it’s also not like I get 20 constantly, it was just the worst I’ve seen. Usually it’s 2 to 5, I think.

          I assume they’re just collecting data on how many users are willing to do.

          • LucidNightmare@lemm.ee
            arrow-up
            9
            ·
            3 months ago

            One time I did five in a row, because I use VPNs for everything, and realized after the 5th time that it would have been easier to just use bing so I do that first now. Google has turned into my last last resort, which is quite funny, because that’s where Bing used to be. Lmao

        • I Cast Fist@programming.dev
          arrow-up
          4
          ·
          3 months ago

          Whenever I’m on a private window the captchas just keep on coming. Trying to reset your Steam password via the program will also trigger an infinite loop of captchas, you HAVE to use a browser.

      • sramder@lemmy.world
        arrow-up
        2
        ·
        3 months ago

        I tried to order some components on Digikey a few months ago and I’m still mentally scarred. Probably did a few hundred of those things over the course of 2 weeks.

      • Dudewitbow@lemmy.zip
        arrow-up
        2
        ·
        3 months ago

        if you have to do that many, you either have some privacy setting on or are on a flagged IP assigned by a VPN

          • catloaf@lemm.ee
            arrow-up
            3
            arrow-down
            1
            ·
            3 months ago

            Most people don’t, most bots do. You look more like a bot, so you get extra challenges.

          • Dudewitbow@lemmy.zip
            arrow-up
            2
            arrow-down
            1
            ·
            3 months ago

            it’s abnormal to them because VPNs are often also used by bad actors. your use is not abnormal, but there are other people misusing it, making it worse for everyone else.

            • Landsharkgun@midwest.social
              arrow-up
              3
              arrow-down
              1
              ·
              3 months ago

              Wow, way to blame individuals who take basic precautions instead of the corporations who are blatantly invading your privacy. Good job making the world a better place, bud.

              • Dudewitbow@lemmy.zip
                arrow-up
                1
                ·
                3 months ago

                point to where i blame the individuals. the blame is clearly on the bad actors (e.g. bots)

    • Fisch@discuss.tchncs.de
      arrow-up
      8
      ·
      3 months ago

      Some captchas have also just gotten obvious AI training. “Click on the living being in this image”, “Select every image of the same object as in this example image”. And the images you have to select look obviously AI generated.

    • dinckel@lemmy.world
      arrow-up
      3
      ·
      3 months ago

      At a certain point I did like 10 of them, and then ended up closing the page, cause it never let me in, all because I was on a vpn

    • CosmoNova@lemmy.world
      arrow-up
      2
      ·
      3 months ago

      Funny thing is they stop asking if you do them really slowly. Almost as if to tell you, you’re too inefficient to even be an unpaid intern or something. Anyway, if they annoy you, take your time.

  • gradyp@awful.systems
    arrow-up
    58
    ·
    3 months ago

    I honestly thought it was common knowledge that these things were essentially free labor for training AI.

    • dan@upvote.au
      arrow-up
      24
      ·
      3 months ago

      The original reCAPTCHA from Carnegie Mellon University was helping to digitize books. It showed one known word and one unknown word, and if enough people answered the second word with the same answer, that’d be marked as the correct value.
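The two-word scheme described in the comment above can be sketched roughly as follows. This is only an illustration: the function names, the `min_votes` and `min_share` thresholds, and the sample words are all assumptions, not details of the real service.

```python
from collections import Counter

def grade_response(known_answer, typed_known, typed_unknown, votes):
    """Grade one response in the two-word scheme.

    The user passes only if the known (control) word matches. When they
    pass, their reading of the unknown word is recorded as a vote.
    """
    passed = typed_known.strip().lower() == known_answer.lower()
    if passed:
        votes[typed_unknown.strip().lower()] += 1
    return passed

def consensus(votes, min_votes=3, min_share=0.6):
    """Return the agreed reading of the unknown word, or None if undecided."""
    total = sum(votes.values())
    if total < min_votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= min_share else None

votes = Counter()
grade_response("morning", "morning", "upon", votes)
grade_response("morning", "Morning", "upon", votes)
grade_response("morning", "mourning", "open", votes)  # fails the control word, vote ignored
grade_response("morning", "morning", "upon", votes)
print(consensus(votes))
```

Only users who get the control word right contribute a vote, which is what lets the unknown word be labeled without anyone knowing its answer in advance.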

      • thrawn@lemmy.world
        arrow-up
        8
        ·
        3 months ago

        It’s basically always been outsourcing labor while checking. I guess they don’t want to provide that service for free.

        But now that it doesn’t work, all it does is attempt to source free labor by refusing to show what you want to see. Cloudflare’s verification doesn’t show the puzzle because it’s not trying to make money off you.

        Also, the books one reminds me of 4chan’s attempt to hijack it. Wasn’t a fan of the way they did it, but the intent was interesting.

        • lud@lemm.ee
          arrow-up
          2
          ·
          3 months ago

          V3 of the Google one doesn’t always show a puzzle to you. In fact it’s designed to not be noticed by users at all. Whether that is successful or not is a different discussion.

          • thrawn@lemmy.world
            arrow-up
            3
            ·
            3 months ago

            It might well be if it’s being used, but the site itself still uses v2 a lot. I get the picture one a lot when searching things up.

            That actually makes me feel all the more strongly that it’s just there to extract free labor— they have something else, but still use v2 for what seems like most purposes

            • lud@lemm.ee
              arrow-up
              1
              ·
              3 months ago

              the site

              What site?

              I assume it’s up to the website owner to implement V3 and not Google. V3 also has puzzles but only when it’s not sure. I rarely see captchas so I don’t really have anything to complain about.

              • xuv@lemmy.blahaj.zone
                arrow-up
                2
                ·
                3 months ago

                I expect they mean the site google.com, because that’s been my experience. Whenever I get captcha’d there for using a VPN (which is getting more and more common), I always see the Maps image style captcha. Like 60% of the time it tells me I’m wrong anyway and I just give up.

                • thrawn@lemmy.world
                  arrow-up
                  2
                  ·
                  3 months ago

                  Yeah my b, I get captcha’d for VPN use. It’s almost always the “train our self driving car” one, and it tells me I’m wrong all the time too. Very frustrating

  • serenissi@lemmy.world
    arrow-up
    37
    arrow-down
    2
    ·
    3 months ago

    The objective of reCAPTCHA (or any captcha) isn’t to detect bots. It is more of stopping automated requests and rate limiting. The captcha is ‘defeated’ if the time complexity to solve it, whether human or bot, is less than what is expected. Now humans are very slow, hence they can’t beat them anyway.

    • smb@lemmy.ml
      arrow-up
      9
      ·
      3 months ago

      […] reCAPTCHA […] isn’t to detect bots. It is more of stopping automated requests […]

      which is bots. bots do automated requests, and every automated request doer can also be called a bot (e.g. web crawlers are called bots too and, if kind, also respect robots.txt, which has “bots” in its name for this very reason; “bots” is short for “robots”). use of different words does not change the reality behind it, but may add the fact of someone trying something on the other.

      • serenissi@lemmy.world
        arrow-up
        1
        ·
        3 months ago

        There isn’t a good way to classify human users with scripts without adding too much friction to normal use. Also bots are sometimes welcome and useful, it’s a problem when someone tries to mine data in large volume or effectively DoS the server.

        Forget bots, there exist centers in India and other countries where you can employ humans to do ‘automated things’ (youtube like count, watch hour for example) at the same expense of bots. There are similar CAPTCHA services too. Good luck with those :)

        Only rate limiting is the effective option.

        • smb@lemmy.ml
          arrow-up
          1
          ·
          3 months ago

          Only rate limiting is the effective option.

          i doubt that. you could maybe ratelimit per IP, and the abusers will change their IP whenever needed. if you ratelimit the whole service over all users in the world, then your service becomes useless just as quickly as your ratelimiter becomes effective. if you ratelimit actions of logged in users, then your ratelimiting is limited by your ability to identify fake or duplicate accounts, where captchas are not helpful at all.

          “at the same expense of bots”: they might be cheap, but i doubt that anyway, bots don’t need sleep.

          i was answering about that wording (that captchas were “not” about bots but about “stopping automated requests”) and that automated requests “are” bots instead.

          call centers are neither bots nor automated requests (the opposite IS their advantage) and thus have no relation to what i was specifically saying in reply to that post that suggested automated requests and bots would be different things in this context.

          i wasn’t talking about effectiveness of captchas either or if bots should be banned or not, only about bots being automated requests (and vice versa) from the perspective of the platform stopping bots. and that trying to use different words for things (claiming like “X isn’t X, it is really U!”* or automated requests aren’t bots) does not change the reality of the thing itself.

          *) unrelated to any (a-)social media platform

          • serenissi@lemmy.world
            arrow-up
            1
            ·
            3 months ago

            stopping automated requests

            yeah my bad. I meant too many automated requests. Both humans and bots generate spam, and the issue is a high influx of it. Legitimate users also use bots and by no means is that harmful. That way you do not encounter a captcha every time you visit any google page, nor do a couple of scraping scripts cause a problem. Recaptcha (or hcaptcha, say) triggers when there is a high volume of requests coming from the same IP. Instead of blocking everyone out to protect their servers, they might allow slower requests so legitimate users face minimal hindrance.

            Most google services nowadays require accounts with stronger (like cell phone) verification so automated spam isn’t a big deal.
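The per-IP triggering described in this comment could look something like the sliding-window counter below. It is a sketch under stated assumptions: the class name, the `max_requests` and `window_seconds` limits, and the "allow"/"challenge" labels are hypothetical and do not describe how reCAPTCHA or hCaptcha actually decide.

```python
import time
from collections import defaultdict, deque

class PerIpThrottle:
    """Sliding-window request counter keyed by client IP (hypothetical helper)."""

    def __init__(self, max_requests=5, window_seconds=10.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of recent requests

    def decide(self, ip, now=None):
        """Return "allow" for normal traffic, "challenge" once the IP exceeds the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        hits.append(now)
        return "allow" if len(hits) <= self.max_requests else "challenge"

throttle = PerIpThrottle(max_requests=3, window_seconds=10.0)
decisions = [throttle.decide("203.0.113.7", now=t) for t in (0, 1, 2, 3, 20)]
print(decisions)
```

Slow clients never see a challenge here, while a burst from one address does. As pointed out earlier in the thread, a counter keyed on IP does nothing against clients that rotate addresses.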

            • smb@lemmy.ml
              arrow-up
              0
              ·
              3 months ago

              since bots are better at solving captchas and humanoid services exist that solve them, the only ones negatively affected by captchas are regular legitimate users. the bad guys use bots or services and are done. regular users have to endure while no security is added, and for the influx i guess it is much more like with the better lock on the front door: if your lock is a bit better than that of your neighbour, theirs is more likely to be force-opened than yours. it might help you, but it’s not a real but only a relative and also very subjective feeling of “security”.

              being slower than the wolves also isn’t as bad as long as you are not the slowest in your group (some people say)… so doing a bit more than others always is a good choice (just better don’t put that bar too low like using crowdsnakeoil for anything)

              • serenissi@lemmy.world
                arrow-up
                1
                ·
                3 months ago

                the bad guys use bots or services and are done. regular users have to endure while no security is added

                put in other words, common users can’t easily become a ‘bad guy’, i.e. the cost of attack is higher, hence a lower number of script kiddies and automated attacks. You want to reduce the number. These protections are nothing for botnet owners or other high profile bad actors.

                ps: recaptcha (or captcha in general) isn’t a security feature. At most it can be a safety feature.

                • smb@lemmy.ml
                  arrow-up
                  1
                  ·
                  3 months ago

                  isn’t a security feature. At most it can be a safety feature.

                  o,O

    • tb_@lemmy.world
      arrow-up
      8
      arrow-down
      1
      ·
      3 months ago

      I thought captchas worked in a way where they provided some known good examples, some known bad examples, and a few examples which aren’t certain yet. Then the model is trained depending on whether the user selects the uncertain examples.

      Also it’s very evident what’s being trained. First it was obscured words for OCR, then Google Maps screenshots for detecting things, now you see them with clearly machine-generated images.

  • umbraroze@lemmy.world
    arrow-up
    32
    arrow-down
    1
    ·
    3 months ago

    reCAPTCHA is exploiting users for profit

    Well duh.

    reCAPTCHA started out as a clever way to improve the quality of OCRing books for Distributed Proofreaders / Project Gutenberg. You know, giving to the community, improving access to public-domain texts. Then Google acquired them. Text CAPTCHAs got phased out. No more of that stuff, just computer vision rubbish to improve Google’s own AI models and services.

    If they had continued to depend on tasks that directly help community, Google would at least have had to constantly make sure the community’s concerns are met. But if they only have to answer to themselves for the quality of the data and nobody else even gets to see it, well, of course it turned into yet another mildly neglected Google project.

    • dan@upvote.au
      arrow-up
      7
      ·
      3 months ago

      Then Google acquired them. Text CAPTCHAs got phased out

      Google kept the text version for five years after the acquisition though. They used it to digitize books on Google Books, to allow full-text search of their book archive.

  • Churbleyimyam@lemm.ee
    arrow-up
    28
    ·
    3 months ago

    Getting served a captcha often results in me closing the tab. I’m not doing stupid puzzles for you.

          • hddsx@lemmy.ca
            arrow-up
            4
            ·
            3 months ago

            What do you mean? I am a fleshy human and do fleshy human things like being made of flesh.

            • xavier666@lemm.ee
              arrow-up
              1
              ·
              3 months ago

              Time to take a knife and check for sure

              Seriously /s Don’t harm yourself!

              • hddsx@lemmy.ca
                arrow-up
                1
                ·
                3 months ago

                I disassembled my tail using a knife and it reassembled itself. Based on new data, my name is Rafael Cruz.

              • AlolanYoda@mander.xyz
                arrow-up
                1
                ·
                3 months ago

                Harm yourself?

                Take the knife and harm the people responsible for this travesty. The laws of robotics prevent robots from harming humans: if you manage to harm them, then that means either you’re human or they’re not!

      • tyler@programming.dev
        arrow-up
        0
        ·
        3 months ago

        It knows they’re wrong which is why I don’t really think this article is accurate. Is it training if it already has the answers? Probably not.

        • MajinBlayze@lemmy.world
          arrow-up
          2
          ·
          edit-2
          3 months ago

          That’s why it gives you a panel of 9 images. It would have a high confidence on some images, and a low confidence on others. When you pick the correct images and don’t pick incorrect ones it uses the ones it’s confident about as “validation” while taking the feedback on low confidence images to update the training data.

          What this does mean in practice is that the only ones actually being “graded” are the ones bots can solve anyway.
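The grading idea in this comment can be sketched as below, assuming each tile carries a model probability. The function name `grade_panel`, the 0.9 confidence cut-off, and the sample probabilities are illustrative assumptions; nothing here is taken from Google's implementation.

```python
def grade_panel(tiles, selected, high_confidence=0.9):
    """Grade a 3x3 image panel.

    tiles: list of (tile_id, model_probability_that_tile_matches).
    selected: set of tile_ids the user clicked.
    Returns (passed, feedback), where feedback maps uncertain tile_ids to
    the user's answer (True if selected).
    """
    passed = True
    feedback = {}
    for tile_id, prob in tiles:
        clicked = tile_id in selected
        if prob >= high_confidence:
            passed = passed and clicked        # confident match: must be selected
        elif prob <= 1 - high_confidence:
            passed = passed and not clicked    # confident non-match: must be left alone
        else:
            feedback[tile_id] = clicked        # uncertain: not graded, kept as a label
    # Only keep the uncertain labels when the graded tiles were answered correctly.
    return passed, (feedback if passed else {})

tiles = [(0, 0.98), (1, 0.03), (2, 0.55), (3, 0.95), (4, 0.01),
         (5, 0.40), (6, 0.02), (7, 0.97), (8, 0.05)]
passed, feedback = grade_panel(tiles, selected={0, 2, 3, 7})
print(passed, feedback)
```

In this sketch only tiles 2 and 5 produce new labels, and only the tiles the model is already sure about decide whether the user passes.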

        • AmidFuror@fedia.io
          arrow-up
          1
          ·
          3 months ago

          My understanding is different from others here. I thought they served the same Captcha to many people at once and used the majority response to decide who is answering correctly.

          • catloaf@lemm.ee
            arrow-up
            1
            ·
            3 months ago

            That’s true, or at least it used to be back when they were using it for OCR. I have no reason to believe it’s changed.

        • Vox@lemmy.world
          arrow-up
          0
          ·
          3 months ago

          It’s why they ask you to do multiple, 1-2 of them are the control group, they are training on the others

          • tyler@programming.dev
            arrow-up
            0
            ·
            3 months ago

            You’re implying they give you multiple. I hardly ever get multiple, pretty much only if I ‘fail’ the first one.

            • Miaou@jlai.lu
              arrow-up
              1
              ·
              3 months ago

              If they have a good fingerprint on you they don’t need the control group. That’s why you get 5+ captchas when using a VPN/Tor.

    • snooggums@midwest.social
      arrow-up
      1
      ·
      3 months ago

      I haven’t done an image one in years for the same reason.

      My general internet usage has plummeted between ads and captchas and all the other modern website bullshit, which is why I am here so much.

    • BangCrash@lemmy.world
      arrow-up
      9
      ·
      3 months ago

      There’s platforms that do that.

      I can pay a service to auto solve captcha and anything that can’t be solved will be pushed to a human to solve.

      Never actually used it but it was interesting learning it existed

    • Appoxo@lemmy.dbzer0.com
      arrow-up
      1
      arrow-down
      1
      ·
      edit-2
      3 months ago

      In case you didn’t know: This is already a thing with pictures slowly fading in for selecting stuff like traffic cones or buses.

  • HiramFromTheChi@lemmy.world
    arrow-up
    17
    arrow-down
    1
    ·
    3 months ago

    There’s nothing that can express my disdain for Google’s reCaptcha.

    😒 We’re training its AI models
    😒 It’s free labor for Google
    😒 Sometimes it wants the corner of an object, sometimes it doesn’t
    😒 Wildly inconsistent
    😒 Always blurry and hard to see
    😒 Seemingly endless
    😒 It’s the robot asking us humans if we’re the robots

  • Mubelotix@jlai.lu
    arrow-up
    11
    ·
    3 months ago

    I bypassed 35000 google recaptcha v2 using bots. Don’t ever rely on this for security

          • Mubelotix@jlai.lu
            arrow-up
            2
            ·
            2 months ago

            It’s a custom extension solving my very specific problem on a specific internal website. It was never meant for you to use it, it’s just there to serve as inspiration to others

      • Gizmokid2005@lemmy.world
        arrow-up
        10
        ·
        3 months ago

        Except, that’s most of its ad copy on Google’s own website?

        reCAPTCHA uses an advanced risk analysis engine and adaptive challenges to keep malicious software from engaging in abusive activities on your website. Meanwhile, legitimate users will be able to login, make purchases, view pages, or create accounts and fake users will be blocked.

        It’s literally billed as a security measure for a website.

        https://www.google.com/recaptcha/about/

        • theherk@lemmy.world
          arrow-up
          1
          arrow-down
          7
          ·
          3 months ago

          I see your perspective, but I don’t consider that security in the context of software, which may also explain why they don’t use that word, though I readily admit that it is technically security of a sort. The term usually implies authentication, authorization, and isolation.

  • Blackmist@feddit.uk
    arrow-up
    10
    ·
    3 months ago

    I thought the whole point of reCaptcha was to provide a reliable set of data to train bots. Entering a fuzzy scanned word, identifying bikes and traffic lights, etc.

    The fact that they’ve now got that, and the bots are trained is hardly a surprise.

    Without captchas the problem of spambots would still be a million times worse.

        • sugar_in_your_tea@sh.itjust.works
          arrow-up
          2
          ·
          3 months ago

          No, it tracks things like mouse movements to see if it looks human or like a bot. Humans don’t move the mouse in a straight line, there’s some jitter and whatnot, whereas bots will look quite a bit different.
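One simple way to express the "straight line versus jitter" idea from this comment is a straightness ratio over the recorded pointer positions. This is a toy measure chosen for illustration; the function name and sample paths are assumptions, and it is not known to be what reCAPTCHA computes.

```python
import math

def straightness(points):
    """Ratio of straight-line distance to travelled path length (1.0 = perfectly straight)."""
    path = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if path == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / path

# A scripted pointer moving along a diagonal, and a hand-like path with some wobble.
scripted = [(0, 0), (25, 25), (50, 50), (100, 100)]
wobbly = [(0, 0), (22, 31), (55, 47), (71, 80), (100, 100)]
print(straightness(scripted), straightness(wobbly))
```

A single ratio like this is trivial for a script to imitate by adding noise to its path, so it could only ever be one weak signal among many.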

          • Vlyn@lemmy.zip
            arrow-up
            2
            ·
            3 months ago

            That’s super easy to fake for a bot…

            It’s a ton more than mouse movement. Lots of browser fingerprinting for example and tracking.

  • Flying Squid@lemmy.world
    arrow-up
    10
    ·
    3 months ago

    I had to deal with one yesterday that wouldn’t let me in no matter what I did.

    So it isn’t even good at figuring out who isn’t a robot.

    • icedterminal@lemmy.world
      arrow-up
      5
      ·
      3 months ago

      Solving too fast. I shit you not. Sometimes you have to go really slow. Like you’re 80 and can’t see very well trying to discern what’s in those boxes.

  • Petter1@lemm.ee
    arrow-up
    11
    arrow-down
    2
    ·
    3 months ago

    Why is that no news to me? How did so many people not know that? Should I have spread the word more, even if all the people I told that were like “yea, yea, of course, but, what can I do? 🤷🏻‍♀️”?