Around the same time, Cloudflare’s chief technology officer Dane Knecht explained in an apologetic X post that a latent bug was responsible.

“In short, a latent bug in a service underpinning our bot mitigation capability started to crash after a routine configuration change we made. That cascaded into a broad degradation to our network and other services. This was not an attack,” Knecht wrote, referring to a bug that went undetected in testing and had not previously caused a failure.

  • AldinTheMage@ttrpg.network
    4 months ago

    So the actual outage comes down to pre-allocating memory, but not actually having error handling to gracefully fail if that limit is or will be exceeded… Bad day for whoever shows up on the git blame for that function
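To make the point above concrete, here is a minimal sketch of "preallocate with a hard limit, but fail gracefully when the limit is exceeded". It is not Cloudflare's code: `FeatureTable`, `LoadError`, `apply_config`, and the `MAX_FEATURES` value are all hypothetical names and numbers chosen for illustration. The idea is only that an oversized input comes back as an `Err` the caller can handle, rather than a panic that takes the process down.

```rust
use std::fmt;

/// Hypothetical hard cap on entries; the buffer is sized for this up front.
const MAX_FEATURES: usize = 200;

#[derive(Debug, PartialEq)]
enum LoadError {
    /// The incoming config has more entries than the preallocated capacity.
    TooManyFeatures { got: usize, limit: usize },
}

impl fmt::Display for LoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LoadError::TooManyFeatures { got, limit } => {
                write!(f, "{got} features exceeds the limit of {limit}")
            }
        }
    }
}

/// Preallocated storage for feature names (illustrative only).
struct FeatureTable {
    names: Vec<String>,
}

impl FeatureTable {
    fn new() -> Self {
        // Reserve the full capacity once so normal loads never reallocate.
        FeatureTable { names: Vec::with_capacity(MAX_FEATURES) }
    }

    /// Replace the table contents, refusing oversized input instead of panicking.
    fn load(&mut self, incoming: &[String]) -> Result<(), LoadError> {
        if incoming.len() > MAX_FEATURES {
            // Check before touching the existing data so a bad input
            // leaves the last good table in place.
            return Err(LoadError::TooManyFeatures {
                got: incoming.len(),
                limit: MAX_FEATURES,
            });
        }
        self.names.clear();
        self.names.extend_from_slice(incoming);
        Ok(())
    }
}

/// Apply a new config; on failure, log and keep the previous table.
/// Returns true if the new config was accepted.
fn apply_config(table: &mut FeatureTable, incoming: &[String]) -> bool {
    match table.load(incoming) {
        Ok(()) => true,
        Err(e) => {
            eprintln!("config rejected, keeping previous table: {e}");
            false
        }
    }
}
```

With this shape, calling `apply_config` with an oversized list returns `false` and the table keeps its earlier contents, whereas calling `.unwrap()` on the result of `load` would abort on the same input.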

    • hue2hri19@lemmy.sdf.org
      4 months ago

      This is the wrong take. Git blame only shows who wrote the line. What about the people who reviewed the code?

      • sugar_in_your_tea@sh.itjust.works
        4 months ago

        If you have reasonable practices, git blame will show you the original ticket, a link to the code review, and relevant information about the change.