I have written previously about the hot war between content creators and the “foundation model” AI companies such as OpenAI, maker of ChatGPT, which have hoovered up as much content as they could over the past couple of years and continue to do so with an insatiable thirst. That war has not cooled down.

Great gluttonous swathes of text, images, videos, audio are still being inhaled. At times, it seems as though the whole corpus of human intercourse is buried in these synthetic brains somewhere, even as individual works are hard to find, having been stripped down to indecipherable numbers and statistical relationships.

So it was with a fair amount of consternation and teeth-gnashing that two eagerly awaited court decisions came down in the last few weeks. Artificial intelligence won. It was no contest.

Both judges ruled firmly that training AI systems on human-created works does not breach copyright. The decisions in the two cases were not identical, but they were close enough that I need discuss only one of them.

It was a class action complaint launched against AI company Anthropic by plaintiffs (and authors) Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson. Their central claim: Anthropic knowingly downloaded and copied plaintiffs’ works without permission—using them directly to train its AI—whilst profiting from those unauthorised uses. And more broadly, that Anthropic had built a multibillion-dollar business by stealing hundreds of thousands of copyrighted books.

Rather than obtaining permission and paying a fair price for the creations it exploits, the complaint alleged, Anthropic pirated them.

Enormous implications

We pause here because the authors were claiming breach of copyright not only for their own works, but for all the intellectual property that had been used to train Anthropic’s AI. This case had enormous implications.

A previous similar case against Meta had also failed, so this Anthropic case was a key test of the strength of the copyright arguments, this time before a new judge and with a new legal team.

Which brings us to a concept called “fair use”. This is a legal doctrine that allows the unlicensed use of copyrighted material, weighed against four very specific factors. So as not to get too far into the legal weeds (for which I am not qualified), I will say only that the defendants successfully argued two of these “escape” factors.

The first is the matter of “transformation”. If you copy a work directly and then use it unchanged, you are subject to copyright restrictions. But if you “transform” that work, the use can qualify as fair.

AI training does exactly that: it disassembles the original work and feeds it into a massive statistical machine. The work is never meant to be reused in its original form (even though careful prompting can sometimes force a model to reproduce it).

The second is the matter of “market harm”. The plaintiffs could not show that their book sales suffered (or anyone else’s, for that matter).

The third and fourth factors (how creative the work is and how much of it is used) may well have weighed in the authors’ favour, but the court only had to balance the four factors, and the balance came down in favour of the AI companies.

But there’s more

But there was more. In the great hoovering of 2022–2024, much of the training data was illegally sourced from pirate sites and other shady venues. This is an entirely different matter, and the court determined that separate actions could be brought with respect to content that was not legally acquired.

If AI companies source the content legally, they are home free. They do not have to credit or inform or pay the creators. That door now seems firmly shut, and experts do not think that an appeal will bear fruit.

This decision is personal. I write columns and books, and there is a good likelihood that in some tiny corner of ChatGPT lie a few strands of my work’s DNA. My visceral reaction is one of outrage, but perhaps it should be one of gratification that I am a small contributor to the vast sum of knowledge being built by these systems.

Has this matter been put to bed? I doubt it, because there are other IP scenarios that will no doubt rear their ugly heads soon. For instance, AI systems will be able to ingest and decode chemical signatures, allowing companies to quickly recreate patented drugs that cost fortunes to develop.

Using the same arguments that defeated the writers in this case, will these companies be able to claim “fair use” if they transform the drugs into something even better? Will they be able to argue that there was no market harm if their drugs are better than the original?

The real question is whether there is any chance of protecting any intellectual property at all as AI sprints towards a world in which it is superintelligent and beyond both our ken and control.

The views of the writer are not necessarily the views of the Daily Friend or the IRR.

Steven Boykey Sidley is a professor of practice at the University of Johannesburg, columnist-at-large for Daily Maverick, and a partner at Bridge Capital. His new book, “It’s Mine: How the Crypto Industry is Redefining Ownership”, is published by Maverick451 in SA and Legend Times Group in the UK/EU, and is available now. His columns can be found at https://substack.com/@stevenboykeysidley