Technology

A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.

Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.

This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.

founded 2 years ago


379

Google says AI systems should be able to mine publishers’ work unless companies opt out, turning copyright law on its head (www.theguardian.com)

submitted 1 year ago by [email protected] to c/[email protected]


In its submission to the Australian government’s review of the regulatory framework around AI, Google said that copyright law should be altered to allow for generative AI systems to scrape the internet.


[–] [email protected] 6 points 1 year ago (7 children)

This is a sentiment I've heard but haven't been able to understand. What new risk in expressing your thoughts, prose, or poetry online didn't exist before but exists now that LLMs are scraping them? How would corporations exploit your work through data scraping in a way that would demotivate you from expressing it at all? Because I know tone doesn't come across well in text, I want to clarify that these are genuine questions: my answers to them seem to be very different from many people's, and I'd like to understand where that difference in perspective comes from.

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago) (6 children)

I think this largely boils down to the time scales required. A person copying your work has a minimum amount of time it takes them to do that, even when it's just copy and paste. An LLM can copy thousands of different developers' code, for instance, and completely launder the license. That's not OK. Why would we allow machines to commit fraud when we don't allow people to?

[–] [email protected] 2 points 1 year ago (4 children)

Except that isn't exactly how neural networks learn. They aren't exactly copying work, they're learning patterns in how humans make those works in order to imitate them. The legal argument these companies are making is that the results from using AI are transformative enough that they qualify as totally new and unique works, and it looks as if that might end up becoming law, depending on how the lawsuits currently going through the courts turn out.

To be clear, technically an LLM doesn't copy any of the data, nor does it store any data from the works it learns from.

[–] [email protected] 1 points 1 year ago

Except what it produces is very similar or identical to some copyrighted works licensed under the LGPL, like in this case. You don't have to copy a whole program to plagiarize someone.
