The relevant part is probably > ...works of the author, for her own benefit with...

i_like_apis · on Nov 1, 2022

Training on copyrighted material is legal. As it should be, IMO.

jfk13 · on Nov 1, 2022

Is that established law (in what jurisdiction?), or just your opinion?

btilly · on Nov 1, 2022

I am pretty sure that it is not established law, but I am pretty sure that that is how it will work out. US provisions for fair use make training models likely OK, and the EU is carving out exemptions for it. See https://valohai.com/blog/copyright-laws-and-machine-learning... for more.

The question of whether the output of the model itself counts as a derivative work, though, is rather more complex. In the case of GitHub Copilot it has proven very adept at spitting out large chunks of clearly copyrighted code with no warning that it has done so. And lawsuits are being filed over this.

But in the case of the visual artwork, I'm pretty sure that it is going to be ruled not derivative. Because while it is imitative, you cannot produce anything that anyone can say is a copy of X.

But as ML continues to operate, we'll get cases that are ever closer to the arbitrary line we are trying to maintain about what is and is not a copyright violation. And I'm sure that any criteria that the courts try to lay down are not going to age well.

zuminator · on Nov 1, 2022

In the music industry, even the tiniest sound sample used in a work entitles the original creator to compensation. It might come to pass that using any portion of an author's work in your training data will confer certain rights on the author over the AI-generated product. You'll have to perpetually keep records of your training data for commercially available work, lest you be sued. A whole bureaucracy will evolve: a Getty Curated Training Data, public domain training data sets. Basically the same sorts of issues that we've had over the past 30 years, except replacing "internet" with "AI."

And if the past is any guide, the forces of capital will prevail commercially, but after aborted attempts to rein them in with lawsuits, hobbyists and kids on social media will be mostly ignored by rights holders.

i_like_apis · on Nov 1, 2022

That would be silly.

It’s about outputs, not inputs. If you create copyrighted material you must compensate.

One day a robot is going to be walking around the world. Will it have to pay someone every time it glimpses a video / book / logo / car design / etc?

sinity · on Nov 2, 2022

> One day a robot is going to be walking around the world. Will it have to pay someone every time it glimpses a video / book / logo / car design / etc?

Looking at all of the people happily deciding to interpret copyright as widely as possible is kinda horrifying in this context: https://www.youtube.com/watch?v=IFe9wiDfb0E

> Your stored mind contains sections from 124,564 copyrighted works. In order to continue remembering these copyrighted works, a licensing fee of $18,000 per month is required.

> Would you like to continue remembering these works?

> [you have insufficient funds to pay this licencing fee]

> Thank you. Please stand by.

> [Copyrighted works are being deleted]

> Welcome to Life. Do you wish to continue?

i_like_apis · on Nov 2, 2022

Yes exactly :)

Somehow I trust the law is going to work out ok though, despite all the hot takes from people who don’t really understand ML or copyright.

It’s scary to see so many people not understanding the distinction between an input and an output.

i_like_apis · on Nov 1, 2022

In the EU it’s established law. In the US it’s basically true, but it will be playing out in the courts a little in the future.

LastTrain · on Nov 1, 2022

We don’t know that yet.