Another AI Scraping Copyright Case in Canada: News Media Companies Sue OpenAI

Image: Shutterstock (AI assisted)

First, I heard it on the radio. The word “copyright” caught my attention because that’s a word seldom heard on the morning news. Then the news stories started to appear, first on Canadian Press, which was “largely” accurate, then on the CBC, Globe and Mail, even the New York Times. A consortium of Canadian media, including the Toronto Star, Postmedia, the Globe and Mail and the CBC/Radio-Canada, is suing OpenAI in Ontario Superior Court for copyright infringement and for violating their Terms of Use. The publishers are seeking CAD 20,000 per infringement plus an injunction to prevent further infringement. The case largely parallels a similar one in the US brought by the New York Times against OpenAI and its largest investor, Microsoft, which I wrote about earlier this year (When Giants Wrestle, the Earth Moves (NYT v OpenAI/Microsoft)).

Despite what the press articles state, this is not the first case in Canada where copyright infringement has been alleged as a result of data being scraped to use in AI applications, as I noted last week. However, it is the first case where news organizations have gone after an AI development company. It also has nothing to do with the Online News Act as stated in the Canadian Press report. In fact, it is the absence of legislation in Canada regarding copyright and AI that is partly responsible for this being fought out in the courts.

OpenAI in its statement invoked “fair use” and “related international copyright principles” to justify its behaviour. The fact that the US fair use doctrine does not apply in Canada, combined with the closed nature of fair dealing exceptions and the lack of a Text and Data Mining exception in Canadian law, could prove troublesome for OpenAI. It also has the effrontery to state that it offers “opt out” options for news publishers. When you are taking someone’s proprietary content without permission or payment, it is an insult to tell them they can always opt out. To steal, and then to tell your victim to request that you not steal again, is hardly the way ethical companies operate.

One question to be decided is whether the scraped content falls under copyright, as it is a well-established principle that the “news of the day” is not subject to copyright protection (see Do News Publishers “Own” the News?). News media may not have a monopoly over reporting on what is happening in, say, Gaza, but they certainly have the rights to their expression of what is happening through their coverage. OpenAI has also apparently said that its web crawlers are just “reading” publicly available material, as a human being would do. However, reading and copying are two different things, although proving reproduction may be difficult given the unwillingness of OpenAI to disclose its training methods, an issue that has come up in the New York Times case. “Publicly available” is irrelevant, since being publicly available on the internet, or in a library, or anywhere else, does not justify copyright infringement.

In their suit, the plaintiffs are also alleging circumvention of a TPM (technological protection measure, sometimes referred to as a digital lock, which puts content behind a paywall). This is a separate violation of the Copyright Act. In addition, they are alleging violation of their Terms of Use, which are linked to their websites. When a user accesses material on the publishers’ websites, they must agree to the Terms of Use which, among other things, state that the content to be accessed is for the “personal, non-commercial use of individual users only, and may not be reproduced or used other than as permitted under the Terms of Use”, unless consent is given.

The publishers state that OpenAI was well aware of the need to pay for their content and to obtain permission to use it. That is essentially the position also taken by the New York Times. OpenAI has reached licensing agreements with some publishers, including the Associated Press, Axel Springer (Business Insider, Politico), the Financial Times, the publishers of People, Better Homes and Gardens and other titles, News Corp (Wall Street Journal and many others), The Atlantic, and others. But not with the New York Times, obviously (negotiations broke down, leading to the current lawsuit), and not with any of the Canadian media bringing suit. A licensing agreement acceptable to both parties will be the likely outcome of this case. As the US-based Copyright Alliance has pointed out, generative AI licensing isn’t just possible, it’s essential.

There is a vacuum when it comes to legislation in Canada, and elsewhere, regarding the intersection of copyright and AI development. Various models are being experimented with, from the “throw copyright under the bus” model in Singapore to a more nuanced model in Japan, to uncertainty elsewhere. Australia has just produced a Senate report in response to its public consultation on the issue. Among its recommendations, the Select Committee Report on Adopting Artificial Intelligence called for changes that would ensure copyright holders are compensated for use of their material, while tech firms would be forced to reveal what copyrighted works they used to train their AI models. Canada initiated a public consultation on the topic last year, and the Australian Committee’s recommendations with respect to copyrighted content are essentially what the Canadian copyright community asked for. However, since receiving input in January of this year and publishing the submissions in June, the Canadian government has released no further information. A conclusion similar to the recommendations in Australia would be welcome.

Canadian creators and rightsholders are waiting for some action. Meanwhile the only alternative is to toss the issue to the courts to adjudicate.

(c) Hugh Stephens, 2024. All Rights Reserved.

Author: hughstephensblog

I am a former Canadian foreign service officer and a retired executive with Time Warner. In both capacities I worked for many years in Asia. I have been writing this copyright blog since 2016, and recently published a book "In Defence of Copyright" to raise awareness of the importance of good copyright protection in Canada and globally. It is written from and for the layman's perspective (not a legal text or scholarly work), illustrated with some of the unusual copyright stories drawn from the blog. Available on Amazon and local book stores.
