• 0 Posts
  • 20 Comments
Joined 1 year ago
Cake day: June 13th, 2023

  • Except that’s not even how most bus systems work: most of them are majority-funded by taxes, with fares originally meant to serve as a stopgap but slowly converted into a profit engine (usually after privatization). Fares are a way to gatekeep a service your taxes already pay for, which, I would argue, is itself a form of theft.

    As an example, check out the latest MTA report: only 26% of funding comes from fares, and that one’s on the higher end of what I’ve seen (NYC public transit, picked as the example as it’s recently been in the news for issues with fare evasion).

    All that aside, it’s also worth noting that fare increases are extremely unpopular and it’s not easy to raise them without risking serious backlash (e.g. the mass protests in Chile a few years back that were set off in part by fare hikes).


  • This is a classic case of “tech journalism”… If you follow the sources, the data and its methodology come from the CBECI, whose latest update lists a range of 75–384 TWh. (Note that the “2%” cited in the parent article compares the global power consumption of the Bitcoin network to the US electrical grid, i.e. a bad-faith comparison.) It explicitly states:

    The upper-bound estimate corresponds to the absolute maximum power demand of the Bitcoin network. While useful for providing a quantifiable maximum, it is a purely hypothetical value that is non-viable for various reasons…

    Which, of course, is the estimate that the journalists use for this piece.

    There are also a bunch of likely issues with the methodology, as its estimate is largely based on the number of ASICs produced; the assumption that “mining nodes (‘miners’) are rational economic agents that only use profitable hardware” and that any amount of profit is sufficient to keep a mining operation running; and many others. To its credit, it does a pretty good job of disclosing a lot of these flaws within the link above.
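
    To see why those hardware assumptions dominate the result, here’s a toy back-of-the-envelope sketch of that style of estimate (all numbers made up for illustration; this is not the CBECI’s actual model): power draw is roughly network hashrate times the assumed efficiency of the hardware doing the hashing, so “everyone runs the newest ASICs” and “any still-profitable ASIC counts” give wildly different totals.

    ```python
    # Toy illustration (made-up numbers): power ≈ network hashrate × assumed ASIC efficiency.
    network_hashrate_th_s = 500e6   # hypothetical network hashrate in TH/s
    best_case_j_per_th = 20         # assume only the newest, most efficient ASICs
    worst_case_j_per_th = 100       # assume any still-profitable (older) ASICs also run

    def annual_twh(hashrate_th_s, j_per_th):
        watts = hashrate_th_s * j_per_th  # J/s == W
        return watts * 8760 / 1e12        # W × hours/year = Wh/year; 1e12 Wh = 1 TWh

    print(annual_twh(network_hashrate_th_s, best_case_j_per_th))   # lower-bound style estimate
    print(annual_twh(network_hashrate_th_s, worst_case_j_per_th))  # upper-bound style estimate
    ```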

    My goal is just to call out bad/lazy journalism and what I assume are oil/gas distraction tactics. Electricity is only ~38% of US energy consumption, so even that maximum bound of 2% (for the entire global Bitcoin network) is practically negligible once you contextualize it against total energy use.
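
    For a sense of scale, a quick sanity check using only the two figures cited above:

    ```python
    # Back-of-the-envelope contextualization using the figures cited in this comment.
    bitcoin_vs_us_electricity = 0.02    # upper-bound global Bitcoin use vs. US electricity
    electricity_share_of_energy = 0.38  # electricity's share of total US energy consumption

    share_of_total_us_energy = bitcoin_vs_us_electricity * electricity_share_of_energy
    print(f"{share_of_total_us_energy:.2%} of total US energy consumption")  # ~0.76%
    ```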




  • No, that’s the current legal precedent within the US.

    Kelly v. Arriba Soft

    The court opinion:

    “The Court finds two of the four factors weigh in favor of fair use, and two weigh against it. The first and fourth factors (character of use and lack of market harm) weigh in favor of a fair use finding because of the established importance of search engines and the “transformative” nature of using reduced versions of images to organize and provide access to them. The second and third factors (creative nature of the work and amount or substantiality of copying) weigh against fair use.”

    That “compression is transformative” principle has been pretty solidly enshrined as precedent at this point (e.g. Perfect 10, Inc. v. Amazon.com, Inc.), though with no real guidelines as to how much transformation is required.

    The major argument over whether the sort of LLM training in the parent article still constitutes fair use depends on whether there is “market harm” or whether the “substantiality of copying” is especially egregious (note that these are the two fronts the NYT is taking). There is precedent for copying of style not being fair use (Dr. Seuss Enters., L.P. v. Penguin Books USA, Inc.), which I suspect is why the NYT is approaching it the way they are…

    Now, all that being said, my personal opinion is fuck the US legal system and fuck copyright. There is no solution to the core issues surrounding this topic that isn’t inherently contradictory and/or just a corporate power grab. However, the “techbro idiots” are “right” and you’re not, but it’s because they are idiots who are largely detached from any sort of material reality and see no problem with subjecting the rest of us to their insanity.





  • The academic name for the field is quite literally “machine learning”.

    You are incorrect that these systems are unable to create/be creative; you are correct that creativity != consciousness (which is an extremely poorly defined concept to begin with…); and you are partially correct about how the underlying statistical models work. What you’re missing is that by defining a probabilistic model over objects you can “think”/“be creative”, because these models don’t need to see a “blue hexagonal strawberry” in order to think about what that might mean and imagine what it looks like.
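
    For a concrete illustration of that kind of composition, here’s a minimal sketch using an off-the-shelf text-to-image diffusion model (also just a learned probabilistic model). It uses HuggingFace’s diffusers library; the specific checkpoint name is only an example and assumes you can download it:

    ```python
    # Sketch: composing concepts the model has only ever seen separately.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
    ).to("cuda")

    # There is presumably no training image of this; the model composes "blue",
    # "hexagonal", and "strawberry" from what it learned about each concept.
    image = pipe("a photo of a blue hexagonal strawberry").images[0]
    image.save("blue_hexagonal_strawberry.png")
    ```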

    I would recommend this paper for further reading on the topic, and would point out that you are again correct that existing AI systems are far from human level on the proposed challenges, but they are inarguably able to “think”, “learn”, and “creatively” solve those problems.

    The person you’re responding to isn’t trying to pick a fight; they’re trying to show you that you have bought whole cloth into a logical fallacy and are being extremely defensive about it to your own detriment.

    That’s nothing to be embarrassed about; the “LLMs can’t be creative because nothing is original, so everything is a derivative work” line is a dedicated propaganda effort to further expand copyright and capital consolidation.


  • I partially agree with you, but I think you’re missing the end goal of Facebook et al.

    As HughJanus pointed out, it’s not really any different from a person reading a book, and by that reasoning using copyrighted material to train models like these falls well within the existing framework of “fair use”.

    However, that depends entirely on “the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes.” I agree completely with you that OpenAI’s products/business (the most blatant violator) easily violates “fair use” under that clause. However, they’re doing it, at least partially, to “force the issue” on the open question of “how much can public information be privatized?”, with the goal of further privatizing and increasing commercial applications of raw data.

    As you pointed out, LLMs can only create facsimiles and not the original work, and by that logic they can’t exactly replicate the inputs either.

    No, I don’t think artists can claim that they own any and all “cheap facsimiles” of their works, but by that same reasoning none of the models produced should be the enforceable property of any individual/company either.

    For further reading check out:

    • Kelly v. Arriba Soft Corporation on why “thumbnails” (and by extension LLMs, “eigen-images”, etc.) are inherently transformative and constitute fair use.
    • Bridgeport Music, Inc. v. Dimension Films for the negative impacts that ruling has had and how it still doesn’t protect artists from their work being used to train an LLM.
    • “Variational auto-encoders” for an understanding of how the latest LLMs actually do achieve a significant amount of “originality” and, I would argue, are able to be minimally creative (see the sketch below).
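
    On that last point, here’s a minimal sketch of the VAE idea (PyTorch, toy dimensions): the model encodes inputs into a distribution over a latent space and generates by sampling from that space and decoding, so outputs are drawn from the learned distribution rather than copied from any single training example.

    ```python
    # Minimal VAE sketch: generation = sample a latent vector, decode it.
    import torch
    import torch.nn as nn

    class TinyVAE(nn.Module):
        def __init__(self, data_dim=784, latent_dim=16):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU())
            self.to_mu = nn.Linear(256, latent_dim)      # mean of q(z|x)
            self.to_logvar = nn.Linear(256, latent_dim)  # log-variance of q(z|x)
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 256), nn.ReLU(),
                nn.Linear(256, data_dim), nn.Sigmoid(),
            )

        def forward(self, x):
            h = self.encoder(x)
            mu, logvar = self.to_mu(h), self.to_logvar(h)
            # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
            return self.decoder(z), mu, logvar

    vae = TinyVAE()
    # “Originality” in this framing: draw a fresh z from the prior and decode it,
    # producing an output that never appeared in the training set.
    with torch.no_grad():
        novel_sample = vae.decoder(torch.randn(1, 16))
    print(novel_sample.shape)  # torch.Size([1, 784])
    ```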