But we're not talking about obfuscation; we're talking about the data you're describing not being there. If you ask the AI to spit out a 1-for-1 copy of Hollie Mengert's work, it can't. I suspect it can't spit out coherent individual pieces of it either (I might be wrong about that, since I haven't run this Stable Diffusion model myself). It spits out content in her style.
You generally cannot copyright a style.
If it spits out entire chunks of pre-existing works, that's an entirely different story; but what it seems to do is (via the learning and subsequent diffusion process) receive an input like "Wonder Woman on a hill" and (to falsely anthropomorphize a giant math puzzle) say "I know what a Wonder Woman looks like, and I know that 'correct pictures' have some certain ratios of straight lines and angles and tend to use some particular color triplets, so I'm biasing the thing that matches to my 'Wonder Woman' shape structure with those lines, angles, and colors." The result is a picture Hollie Mengert has never drawn, which an observer could assume is done by her because the style is so spot-on.
And aping an artist's style is not illegal for humans, and we have no law making it illegal for machines. Whether it should be illegal is an interesting question, but it would require new law to make it so.
I'm not claiming the problem is identical to pre-existing problems in the copyright space, just that it's sufficiently similar that it shouldn't pose a significant challenge for legal scholars, IMHO. Existing copyright law not only forbids verbatim reproduction; it also requires that derivative works not prejudice the original author, and grants authors the power to authorize or reject derivations:
https://en.wikipedia.org/wiki/Derivative_work
Your anthropomorphic analogy falls flat on its face because the algorithm does not "know" anything, not in any sense of the word "know" that applies to sentient and rational creatures. The algorithm embeds an association between the text "Wonder woman" and actual artistic representations of Wonder Woman included in the prior art it is trained on. When prompted, it can reproduce one of them (see the Copilot failure where it spat out verbatim copyrighted code, comments included) or remix such representations and integrate them into the output. That's plain as day a derivative work.
The particular case you are referring to, style extraction, could be considered fair use, assuming you can technically separate the base visual model from the output style and can prove the training data for the output module is distilled into abstract, statistical quantities pertaining to that style, such as color palette, stroke weight, etc. That sounds like a tall order, and I would consider any AI model trained on copyrighted works tainted until that burden of proof is satisfied.
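To make concrete what "distilled into abstract, statistical quantities" might look like, here's a toy sketch. This is my own illustration, not any real system's style pipeline; the descriptor fields (mean color, per-channel range) are assumptions chosen for simplicity. The point is that the summary retains no individual pixels from the source image:

```python
# Toy "style descriptor": reduce an image to per-channel statistics.
# Purely illustrative -- not how Stable Diffusion or any real style
# model actually represents style.

def style_descriptor(pixels):
    """Summarize a list of (r, g, b) pixels as per-channel mean and range."""
    n = len(pixels)
    means = tuple(sum(p[c] for p in pixels) / n for c in range(3))
    ranges = tuple(max(p[c] for p in pixels) - min(p[c] for p in pixels)
                   for c in range(3))
    return {"mean_color": means, "channel_range": ranges}

# A 4-pixel dummy "image"; no original pixel value survives in the output.
img = [(200, 40, 40), (220, 60, 60), (180, 30, 30), (210, 50, 50)]
print(style_descriptor(img))
# → {'mean_color': (202.5, 45.0, 45.0), 'channel_range': (40, 30, 30)}
```

Whether real diffusion models reduce training data to anything this cleanly "abstract" is exactly the burden of proof being argued about.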
Isn't the fact that it can faithfully simulate, in the style of the author, works the author has never created proof enough that the style is disjoint from the trained content?
Hollie Mengert never rendered the streetscape in the article, but DreamBooth did it in her style.
If we're talking criminal copyright infringement, why is the burden of proof on the defendant to show statistical abstraction if the plaintiff can't prove the AI generates works she has made? (Again, if it is possible to get DreamBooth to kick out Hollie's original work, or substantial portions of it, I'd be inclined to agree with your way of thinking, but I haven't seen that yet).
> embeds an association between the text "Wonder woman" and actual artistic representations of Wonder woman included in the prior art it is trained on
If I understand how it works correctly, no, it does not. In fact, Mengert's rendering of Wonder Woman differs from the one DreamBooth kicked out, if you look up the work she did for "Winner Takes All! (DC Super Hero Girls)". This is because DreamBooth's approach is to retrain Stable Diffusion on new information while preserving the old; since Stable Diffusion already had an encoding of what Wonder Woman looks like from a mélange of sources, its resulting rendering is neither Mengert's nor any of the other sources', but a synthesis of them all.
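That "retrain but preserve" idea can be caricatured in one parameter. This is a sketch of my own, not DreamBooth's actual objective: fine-tuning pulls a weight toward a new target while a prior-preservation term pulls it back toward the pretrained value, so the result lands between the two — a synthesis rather than a copy of either:

```python
# One-parameter caricature of prior-preserving fine-tuning.
# Not real DreamBooth; lam, lr, and the quadratic losses are illustrative.

def fine_tune(w_pretrained, target_new, lam=1.0, lr=0.1, steps=200):
    """Gradient descent on (w - target_new)^2 + lam*(w - w_pretrained)^2."""
    w = w_pretrained
    for _ in range(steps):
        grad = 2 * (w - target_new) + 2 * lam * (w - w_pretrained)
        w -= lr * grad
    return w

w_old = 0.0  # stand-in for Stable Diffusion's prior rendering
w_new = 1.0  # stand-in for the new artist's rendering
print(round(fine_tune(w_old, w_new, lam=1.0), 3))
# → 0.5  (converges midway: neither the prior nor the new data alone)
```

With equal weighting the minimizer sits exactly halfway; raising `lam` would keep the model closer to its pretrained behavior, which is the preservation effect described above.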