How AI Can Destroy Local Journalism

Image: Shutterstock.com

We all know that local journalism is under extreme pressure. Long-established regional newspapers are closing or being turned into little more than franchise operations, where a bare-bones local newsroom contributes a modicum of local news to a paper fleshed out with filler from national wire services or parent publications. The regional titles in Canada owned by Postmedia are a prime example of this phenomenon. Some digital startups have helped fill the gap, but they too are struggling. Many readers seem reluctant to pay for news through subscriptions, while small online publications are forced to compete with everyone from Google and Facebook on down for ad dollars.

One of the supposed remedies, at least in Canada, has been to create various funds to support local journalism. There is a range of programs: the government-funded Local Journalism Initiative, launched in 2019 to encourage local news production in “news deserts” and underserved communities and administered by News Media Canada (an industry group); tax credits to offset journalist salaries for organizations that qualify as a “Qualified Canadian Journalism Organization” (QCJO); and, most recently, the Google-funded $100 million fund for media outlets. The latter was established as a result of Google’s deal with the government to obtain a five-year exemption from designation under the Online News Act. According to recent reports, the Google fund’s annual contribution to a journalist’s salary will be in the range of C$13,000 to C$20,000, depending on how many applying journalistic enterprises are deemed to qualify.

All these Band-Aid measures are designed to stanch the loss of print and digital publications, whose disappearance has created news deserts in many parts of Canada and the US. A 2023 report from Northwestern University’s School of Journalism, as reported by Forbes, estimated that almost 3,000 of the approximately 9,000 print newspapers in the US had ceased publication since 2005; in 2023, newspapers were closing at an average rate of 2.5 per week. In my home province of British Columbia alone, Global News reported that the number of daily papers dropped from 36 to just 13 in the six years from 2010 to 2016. According to the Local News Research Project, housed at Toronto Metropolitan University’s School of Journalism, over 500 local radio, TV, print and online news operations have closed in 345 communities across Canada since 2008. During that time, around 200 new local news outlets, many of them exclusively digital offerings, launched in 152 communities. However, just one opened in 2023. The decline of local news in Canada was well documented in a study released in February of this year by the Public Policy Forum, “The Lost Estate”. The study examines a number of possible remedies, including philanthropic engagement, community foundations, better targeting of government support programs, and increased government advertising in local media.

The reasons for the decline of local news are many: the rise of social media, the migration of ad revenues to giant online platforms like Google and Meta, the reluctance of a younger generation of consumers to pay for news (especially digitally delivered content), unauthorized password sharing (even by government!), the rising cost of print, and so on. Newly launched digital outlets have tried to fill the gap, but they struggle to attract sufficient ad revenue or subscriptions. In Canada, some will qualify for tax credits or funding from Google, but that will depend on whether they employ at least two full-time journalists. For many of these digital outlets, the principal modus operandi is to aggregate content from other sources, provide a quick rewritten summary, and then insert a link to the original source, thus avoiding copyright infringement. There is some, but not a lot of, original local journalism.

I have no doubt that, with the growing use of AI, some of this initial screening is done with artificial intelligence. AI may even be used to create the summaries and rewrites. This saves time and money, but unfortunately cuts down on the need for real journalists. The evaluation of AI’s utility in local journalism is typical of its use in many other areas, from screenwriting to auditing to medical diagnosis. It can be a useful tool to enhance productivity, but often at a cost somewhere else, such as employment. Even if an experienced employee can use AI to enhance their work, AI could eliminate the entry-level and training jobs that develop the very experience required to do so. These challenges are not new, nor are they unique to local journalism.

What is new is the use of AI to take the aggregation model followed by many small local online journals a step further and go nationwide, a development discovered and highlighted by the Nieman Lab. The Nieman Lab is part of the Nieman Foundation for Journalism, established at Harvard in 1938. It administers what is proclaimed to be the oldest fellowship program for journalists in the world. The Foundation also publishes a quarterly magazine, Nieman Reports, dedicated to critical examination of journalism, along with other journalism-related programs. In an excellent piece of investigative sleuthing, the Lab discovered that “Good Daily”, which operates in 47 states and 355 towns and cities across the US, targeting small-town America, is run by just one person, Mathew Henderson, armed with an AI program. If this seems hard to believe, read on.

According to Nieman’s investigation, Henderson operates his “media empire” out of New York. He uses an AI bot to scan the news daily in each local market. The AI program curates the most relevant stories, summarizes them, edits and approves the copy, formats it into a newsletter, and publishes it. The same day! Readers in these 355 towns are led to believe that each is a local publication. Each carries local testimonials, although the same testimonials, slightly tweaked, appear in various editions around the country. The publications also share the same “About” information and the same mission, which is “to make local news more accessible and highlight extraordinary people in our community.” Henderson claims his automated newsletters actually help local publications by driving traffic to them. This is the same argument put forward by Meta when it refuses to pay for the news content it uses on its platform to attract and retain viewers (and sell ads).

Henderson’s business model is to sell advertising and solicit donations from readers. The advertising pie is not infinite, so his “local” newsletters are obviously one more source of competition for local media chasing ad dollars. Just to be clear, there is nothing illegal about what Henderson is doing. Aggregating content and linking to it is not a violation of copyright law. The lack of full disclosure is a bit disconcerting, but I doubt there is anything illegal about operating behind a corporate veil. What is problematic is the cannibalistic nature of his business model, enabled by AI.

This kind of operation, fuelled by AI, can only function if there is local content to aggregate. The Good Daily publications contribute nothing, absolutely zero, to content creation. They, like Facebook, are the ultimate cannibalistic free riders. They can continue to operate successfully, free riding on content created by others, so long as those “others” remain in the business of producing content. However, the more successful businesses like Good Daily become, the less viable the local journalism sector that produces the content their free-riding competitor subsists on will be. In the end, the result will be an ouroboros (a great Scrabble word, by the way): the AI-driven aggregator will eventually devour the very source of its content. That may be a few years down the road, but logically that is where this leads. It’s like eating your seed grain.

Is there a solution? Many remedies have been proposed, but if there is a silver bullet, I don’t know what it is. For many in the media, getting the big platforms that benefit from media content to contribute financially to journalism is one avenue, but as we have seen in Canada, Australia, and California, the platforms will pull out all the stops to avoid being required to do so, especially Meta, which has thumbed its nose at Canada and, now, Australia.

Government subsidies, such as Canada’s Local Journalism Initiative, are resisted by many journalists lest the industry become beholden to government handouts. Holding government to account is one of the key functions of the media, the so-called “Fourth Estate”. Can a subsidized media be trusted to do so? On the other hand, without subsidies, will there be any viable local media left? Maintaining its independence is one reason why Blacklock’s Reporter, the small Ottawa-based online publication locked in a David and Goliath struggle with the Government of Canada over the government’s abuse of password sharing, opposes initiatives like the Online News Act. (For the record, that is their view, not mine, but it is a position I respect.) Another option could be tweaking tax laws to encourage businesses to place ads in local media rather than with the giant international platforms, but in the end business will, and should, be able to spend its ad dollars where it believes it will get the best results.

For every potential remedy to the problem of keeping local journalism alive, there is a potential downside, just as there is with AI generally. For all its potential advantages, AI also has the potential to destroy journalism. Good Daily may be the canary in the coal mine: a fully legal but particularly egregious use of AI driving yet another nail into the coffin of local journalism.

© Hugh Stephens, 2025. All Rights Reserved.

Digital Platforms and News Content: Australia Takes Off the Gloves

Image: Shutterstock

Canada infamously tried to take a leaf from Australia’s book in dealing with large internet platforms, like Google and Meta, that benefit from news media content without paying for it. In 2023, Canada passed the Online News Act (Bill C-18), a Canadian version of Australia’s News Media Bargaining Code. The Australian approach, enacted in early 2021, was initially very successful. Rod Sims, the author of the Code from his then position as Chair of the Australian Competition and Consumer Commission (ACCC), testified before the Canadian House of Commons committee studying C-18, pointing to the success of the initiative in generating some AUD200 million annually in financial support for Australian media. Although there had been pushback from both Google and Meta in Australia, both eventually came onside, especially after the spectacular flop of Meta’s news blackout campaign. The threat of being designated under the Australian Code was enough to get the two platforms to negotiate agreements with most Australian media, in the form of funding to support journalistic output. As a result of the agreements reached, neither platform was designated under the Code, and thus neither was subject to “final offer” arbitration imposed by the ACCC.

Canada thought it would “improve” on the Australian precedent by making the process somewhat more transparent in terms of funding offers, and by requiring the platforms to self-designate. Whether it was the tweaked Canadian legislation or, more likely, a reappraisal of the value and cost of the agreements (particularly once it became apparent that the Australian precedent was likely to be followed elsewhere, with Canada the first out of the gate), both platforms dug in. Meta in particular refused to engage with the Canadian process and declared that it would “comply” with the legislation by removing all links to Canadian media. That is not what the Government of Canada or Canadian media had in mind when Bill C-18 was introduced. Meta has held that line, although the extent to which it is fully complying is under review by the regulator, the CRTC. Various workarounds to allow news content to appear on Facebook have been employed by both Facebook users and some news providers, and Meta seems willing to turn a blind eye. Why that doesn’t trigger the Online News Act requirement to reach funding agreements with news content providers is a question that cries out for a response. (CRTC take note: We are waiting).

Google was slightly more amenable to striking a deal with the Government of Canada, agreeing that, in return for exemption from the legislation, it would contribute $100 million (CAD) annually for five years (adjusted for inflation) to a fund supporting qualified Canadian media enterprises. The $100 million subsumes existing contributions Google was already making to some Canadian journalism programs, so the net result is not $100 million in new money. Google has now begun to disburse this funding through the Canadian Journalism Collective (CJC), an entity established by what could legitimately be called “non-mainstream media”, i.e., many small digital startups. The CJC was selected by Google as the executing agency for its funding, thus snubbing News Media Canada, the organization representing the major media enterprises. There are likely to be disputes over whether some of the “little guys” actually qualify as bona fide journalists. The more mouths there are to feed, the less there is for each supplicant, and the big players are not happy to see the Google revenue stream diluted.

Meanwhile, back in Oz, Meta has announced that it will not renew the media agreements it reached back in 2022 once they expire; many will end this year. It appears Meta has decided to adopt a consistent global position by insisting that news media content provides it with no value. Zero. None. And therefore it will not pay a cent. In part, this is to head off similar moves in the US, where news media providers would like to bring in an arrangement similar to those instituted in Australia and Canada. A separate initiative in California ended up with an outcome close to the one in Canada, with Google reluctantly agreeing to contribute funding to local journalism while Meta walked away. The Australian government has seen where this is heading, and it is not happy. It is taking the gloves off.

The Albanese government has announced measures to require that any internet company that refuses to negotiate with publishers, or that removes news from its platform, will be forced to pay regardless. This is the big stick to counter Meta. What happens next is a consultation process, beginning now, to determine how what is being called a “news bargaining incentive” will actually be applied, retroactive to January 1. All digital platforms with annual revenues of AUD250 million or more will likely be subject to it. This expands the net to include ByteDance (TikTok) and Microsoft (Bing, LinkedIn) as well as Meta and Google. Google has already said it will carry on and renew the deals it signed with Australian media, allowing it to remain exempt under the Code. Meta is not backing down.

The so-called “incentive” will take the form of a discounted penalty or fine. The initial proposal is that companies that sign deals amounting to 90% or more of the fine that would otherwise be levied will be exempt. In other words: find ways to strike deals that, in the end, will save you money. Simply refusing to carry news content, as Meta has done in Canada, will not let a designated platform off the hook, a significant variation from the Canadian legislation, which many have criticized as flawed. Rod Sims, now back in academia, fully supports the new incentive initiative. It appears the only way Meta can avoid payment is by closing its business in Australia. Given Meta’s track record, this might even be a card it is prepared to play. One can expect it to pull out all the stops to oppose the “incentive”, from legal challenges to threatening a pullout to seeking the support of the Trump Administration.
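By way of a rough sketch, the threshold test described in the initial proposal can be expressed as follows. To be clear, this is purely illustrative: the 90% figure comes from early reporting on the proposal, while the charge amounts and the function itself are my own assumptions, not the actual mechanism, which remains subject to consultation.

```python
# Hypothetical sketch of the proposed "news bargaining incentive" offset.
# The 90% threshold reflects the initial proposal as reported; all other
# details (charge amounts, the exemption test) are illustrative assumptions.

def is_exempt(base_charge: float, deal_total: float,
              threshold: float = 0.90) -> bool:
    """A designated platform would be exempt if the deals it signs with
    publishers reach the threshold share of the charge it would otherwise
    face."""
    return deal_total >= threshold * base_charge

# A platform facing a hypothetical AUD 100M charge:
print(is_exempt(100.0, 95.0))  # deals worth 95% of the charge: exempt
print(is_exempt(100.0, 50.0))  # deals worth only 50%: still liable
```

The point of the design, as the proposal frames it, is that signing deals is always cheaper than paying the full charge.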

What position will the Trump Administration adopt? Donald Trump certainly has no love for the news media, as evidenced by his current and threatened lawsuits against US media outlets over coverage he doesn’t like. On the other hand, News Corp, which has strong holdings in Australia, has in recent years built its reputation and business model on catering to Trump’s vanity and desires. Trump is also no fan of Facebook, but Mark Zuckerberg has smelled the coffee, donating $1 million to Trump’s inauguration after kissing the ring by dining with the President-elect at Mar-a-Lago. So, in the end, who knows where the US government will land on this question? All I can say to the Australian government’s expressed intention to deal head-on with Meta’s scorched-earth tactics is “good on ya, mate”. I wish Canada had the gumption to do the same.

© Hugh Stephens, 2025. All Rights Reserved.

Looking Back at 2024: It’s All About AI and Copyright (And a Few Other Things)

Image: Shutterstock

A retrospective on the year now coming to a close is what one expects at this time of year, so I will try not to disappoint. However, when I look back at the copyright developments I wrote about in 2024, the dominant issues that jump out are AI, AI and AI. You can’t read or think about copyright without Artificial Intelligence, or to be more precise, Generative Artificial Intelligence (GAI), occupying most of the space, despite the many other issues on the copyright agenda. The mantra of “AI, AI and AI”, as in “Location, Location and Location”, is apt because there are at least three important copyright dimensions to AI: the training of AI models; copyright protection for outputs generated by AI; and infringement of copyright by works created with or by AI. Of the three, the use of copyrighted content for AI training is the most salient.

Last year in my year-ender, I also discussed AI and the numerous lawsuits emerging as rightsholders pushed back against having their content vacuumed up by AI developers to train their algorithms. Those lawsuits have only multiplied. At last count, there were more than 30 cases in the US, ranging from big media vs big AI (New York Times v OpenAI/Microsoft) to class action suits brought by artists and authors, as well as litigation in the UK, the EU, and now Canada (see here and here). And that is just on the input side.

In terms of output, i.e., whether works produced by an AI can be copyrighted, there are a couple of interesting cases in the US where applications for copyright registration have been refused by the US Copyright Office (USCO) for lack of human creativity. A couple of months ago, I discussed two such high-profile cases, one brought by Stephen Thaler and the other by Jason Allen. To date, the USCO is not budging, although it is undertaking an extensive study of the issue. Part 1 of the study, on digital replicas, was published in July of this year. The next part, on copyrightability, is expected in January, with the issues of ingestion for training and licensing to follow in Q1 2025.

While the USCO has to date denied applications for copyright registration of AI-generated works, the Canadian copyright office (CIPO, the Canadian Intellectual Property Office) has been caught up in a problem of its own making. Canadian copyright registration is granted automatically, so long as tombstone data and the prescribed fee are provided; the work for which registration is sought is not examined. As a result, copyright certificates have been issued for works created by AI, notwithstanding the general presumption that copyright protection is accorded only to human-created works (although this is not explicitly stated in the Act). In July, a legal challenge was launched against copyright registrant Ankit Sahni, who had successfully registered a work with CIPO claiming an AI as co-author. The case was brought by the Canadian Internet Policy and Public Interest Clinic (CIPPIC) at the University of Ottawa, as I wrote about here (Canadian Copyright Registration and AI-Created Works: It’s Time to Close the Loophole).

While the courts in the US, UK, Canada and elsewhere are grappling with various issues related to AI and copyright, governments are studying the issue.

In Australia, the Select Committee on Adopting Artificial Intelligence issued its final report in November. While the report was wide-ranging, three of its recommendations related to copyright:

Engagement with the creative industry to address the unauthorized use of their works by AI developers and tech companies;

Transparency in training data, requiring AI developers to disclose the use of copyrighted works in training datasets and to ensure proper licensing and payment for those works; and

Remuneration for AI outputs, with an appropriate mechanism to be determined through further consultation.

These are important principles, but how they will be implemented in practice remains to be determined.

In Canada, a consultation on AI and copyright was launched late in 2023, with submissions due by January 15, 2024. The Canadian cultural community put forth three key demands:

No weakening of copyright protection for works currently protected (i.e., no exception for text and data mining that would allow copyrighted works to be used without authorization to train AI systems);

Copyright must continue to protect only works created by humans (AI-generated works should not qualify); and

AI developers must be required to be transparent and to disclose what works have been ingested as part of the training process.

Submissions to the consultation were published in mid-year, but since then there has been no apparent action. Given the current political crisis facing the Trudeau government, none is expected in the near term, although the issue will inevitably have to be addressed after the general election in 2025.

While the EU has already established some parameters for the use of copyrighted materials in AI training, the new UK Labour government is taking another run at the issue after various proposals in Britain to find a modus vivendi between the AI and content industries went nowhere under the Tories. The current UK discussion paper on Copyright and Artificial Intelligence, which seems excessively tilted in favour of the AI industry, has aroused plenty of controversy. While it says some of the right things, such as proclaiming that one objective of the consultation is to “support…right holders’ control of their content and ability to be remunerated for its use”, the thrust of the paper is to find ways to encourage the AI industry to undertake more research in the UK by establishing a more permissive regime for the use of copyrighted content. It is based on three self-declared principles (notice how these things always seem to come in threes?):

Control: Right holders should have control over, and be able to license and seek remuneration for, the use of their content by AI models;

Access: AI developers should be able to access and use large volumes of online content to train their models easily, lawfully, and without infringing copyright; and

Transparency: The copyright framework should be clear and make sense to its users, with greater transparency about the works used to train AI models and their outputs.

These three objectives then lead to what is clearly the preferred solution:

“A data mining exception which allows right holders to reserve their rights, underpinned by supporting measures on transparency”

Fine in principle, but the devil is always in the details, and the details in this case revolve around transparency (how detailed? in what form? what about content already taken?) and, in particular, reservation of rights, aka “opting out”. This is easy to proclaim in principle but difficult to do in practice. British creators are up in arms, led by artists such as Paul McCartney and supported by the creative industries in the US. The British composer Ed Newton-Rex has penned a brilliant satire explaining how AI development in the UK will work if the current proposal is enacted. The problem with an opt-out solution is essentially twofold: it doesn’t deal with content already absorbed by AI developers, and it would be cumbersome, if not impossible, for many rightsholders to use.

Other governments have addressed the issue in different ways. Singapore has taken a very loose approach toward copyright protection, putting its thumb firmly on the scale in favour of AI developers. It is currently considering additional proposals that would strip even more protection from rights-holders, who are pushing back strongly. Japan had been widely and incorrectly reported to have been on the same path, resulting in a welcome clarification this year from the Agency for Cultural Affairs regarding the limits of Japan’s text and data mining (TDM) exception.

While AI dominated the copyright agenda in 2024, there were other issues relating to copyright and the copyright industries that I wrote about. The ongoing question of payment for news content by large digital platforms continued to play out in different ways. In Canada, the struggle between the government and US tech giants Google and Meta was finally “resolved” (after a fashion) at the end of last year. Google agreed to “voluntarily” pay $100 million annually into a fund for Canadian journalism in return for being exempted from the Online News Act (ONA), while Meta called the government’s bluff by blocking Canadian news providers from its platform, thus, in theory, avoiding being subject to the ONA. However, Meta applies a very subjective interpretation of what constitutes Canadian news content, allowing some news providers to post, while many users have found workarounds, as documented by McGill’s Media Ecosystem Observatory. The CRTC has investigated, but the issue remains unresolved.

Meanwhile, in Australia, it seems that Meta intends to go down the same road of blocking news, announcing that it will not renew the content deals it initially signed with Australian media in response to Australia’s News Media Bargaining Code, the model on which Canada’s legislation was based. Unlike in Canada, the Australian government is planning a robust response. (More on this in a future blog post.) Finally, on the same topic, California (which had threatened to introduce its own version of legislation requiring digital platforms to compensate news content providers) emerged with an outcome very similar to Canada’s, with Google offering up some funding (although proportionally less than in Canada) while Meta appears to have walked away.

Controlled Digital Lending (CDL) was another copyright issue finally settled (in the US) in 2024. The Internet Archive had lost a lawsuit brought against it by a consortium of publishers, who argued that the digital copying of their works constituted copyright infringement, notwithstanding the Archive’s theory that it was simply lending a digital version of a legally obtained physical work held by it (or by someone associated with it). This year, the Archive lost its appeal, and in December the deadline for further appeals expired, effectively ending the saga. Whether Canadian university libraries, some of which are avid devotees of CDL, will take note remains to be seen.

The issue of circumventing a TPM (“Technological Protection Measure”), commonly referred to as a “digital lock” and often represented by a password allowing access to content behind a paywall, was also front and centre in Canada this year. Blacklock’s Reporter is a digital research service that sells access to its content and protects it with a paywall, as is common for many online content providers, like magazines and newspapers. In Blacklock’s Reporter v Attorney General for Canada, the Federal Court found that an employee of Parks Canada who shared a single Blacklock’s subscription with a number of other employees by giving them the password did not infringe Blacklock’s copyright, since the employee did not circumvent the TPM within the meaning of the law and the purpose of the sharing was “research”, a specified fair dealing purpose.

Despite the hoo-ha of anti-copyright commentators asserting the Court had found that “digital lock rules do not trump fair dealing”, it was equally clear the Court had ruled that fair dealing does not trump digital locks (TPMs). The Court did not undermine the protection afforded to businesses that protect their content through TPMs. Rather, it determined that sharing a licitly obtained password did not constitute circumvention as defined in the Act, as I explained here (Fair Dealing, Passwords and Technological Protection Measures (TPMs) in Canada: Federal Court Confirms Fair Dealing Does Not Trump TPMs (Digital Lock Rules)). Although the Court did not legitimize circumvention of a TPM for fair dealing purposes, contrary to claims stating the opposite, its acceptance of password sharing is an outcome that legal experts have disagreed with (as do I, for what it is worth). The law is very clear that fair dealing cannot be used as a pretext or a defence against violation of the anti-circumvention provisions of the Copyright Act. The decision is now under appeal by Blacklock’s.

Finally, the last copyright point of note for 2024 is that this year marked the bicentenary of the introduction of the first copyright legislation in Canada, in the Assembly of Lower Canada, in 1824. It also marked the centenary of the entry into force of the first truly Canadian Copyright Act on January 1, 1924. These two hundred years of domestic copyright history are worth celebrating. The first legislation was introduced “for the Encouragement of Learning”, so that more local school texts would be written and printed. Given the current standoff between the secondary and post-secondary educational establishment and Canadian authors and their copyright collective over licence payments for the use of copyrighted works in teaching, one wonders whether we have really learned anything about the role copyright plays in our society. (Copyright and Education in Canada: Have We Learned Nothing in the Past Two Centuries? (From the “Encouragement of Learning” to the “Great Education Free Ride”)).

Leaving that question with you to ponder, gentle Reader, is probably a good way to end this look back over the past 12 months. Stay tuned for more commentary on copyright developments in 2025.

© Hugh Stephens, 2024. All Rights Reserved.

Another AI Scraping Copyright Case in Canada: News Media Companies Sue OpenAI

Image: Shutterstock (AI assisted)

First, I heard it on the radio. The word “copyright” caught my attention because that’s a word seldom heard on the morning news. Then the news stories started to appear, first from Canadian Press, which was “largely” accurate, then the CBC, the Globe and Mail, even the New York Times. A consortium of Canadian media, including the Toronto Star, Postmedia, the Globe and Mail, and CBC/Radio-Canada, is suing OpenAI in the Ontario Superior Court for copyright infringement and for violating their Terms of Use. The publishers are seeking CAD20,000 per infringement plus an injunction to prevent further infringement. The case largely parallels a similar one in the US brought by the New York Times against OpenAI and its largest investor, Microsoft, which I wrote about earlier this year (When Giants Wrestle, the Earth Moves (NYT v OpenAI/Microsoft)).

Despite what the press articles state, this is not the first case in Canada in which copyright infringement has been alleged as a result of data being scraped for use in AI applications, as I noted last week. However, it is the first in which news organizations have gone after an AI development company. It also has nothing to do with the Online News Act, contrary to what the Canadian Press report stated. In fact, it is the absence of legislation in Canada regarding copyright and AI that is partly responsible for this fight ending up in the courts.

OpenAI in its statement cited “fair use” and “related international copyright principles” to justify its behaviour. The fact that the US fair use doctrine does not apply in Canada, combined with the closed nature of Canada’s fair dealing exceptions and the lack of a Text and Data Mining exception in Canadian law, could prove troublesome for OpenAI. The company also has the effrontery to state that it offers “opt out” options for news publishers. When you are taking someone’s proprietary content without permission or payment, it is an insult to tell them they can always opt out. To steal, and then to tell your victim to request that you not steal again, is hardly the way ethical companies operate.

One question to be decided is whether the scraped content qualifies for copyright protection, since it is a well-established principle that the “news of the day” is not subject to copyright (see Do News Publishers “Own” the News?). News media may not have a monopoly over reporting on what is happening in, say, Gaza, but they certainly have the rights to their expression of what is happening through their coverage. OpenAI has also apparently said that its web crawlers are just “reading” publicly available material, as a human being would do. However, reading and copying are two different things, although proving reproduction may be difficult given the unwillingness of OpenAI to disclose its training methods, an issue that has also come up in the New York Times case. “Publicly available” is beside the point: being publicly available on the internet, in a library, or anywhere else does not justify copyright infringement.

In their suit, the plaintiffs are also alleging circumvention of a TPM (technological protection measure, sometimes referred to as a digital lock, which puts content behind a paywall). This is a separate violation of the Copyright Act. In addition, they are alleging violation of their Terms of Use, which are linked to their websites. When a user accesses material on the publishers’ websites, they must agree to the Terms of Use which, among other things, state that the content to be accessed is for the “personal, non-commercial use of individual users only, and may not be reproduced or used other than as permitted under the Terms of Use”, unless consent is given.

The publishers state that OpenAI was well aware of the need to pay for their content and to obtain permission to use it. That is essentially the position also taken by the New York Times. OpenAI has reached licensing agreements with some publishers including the Associated Press, Axel Springer (Business Insider, Politico), the Financial Times, the publishers of People, Better Homes and Gardens and other titles, News Corp (Wall Street Journal and many others), The Atlantic, and others. But not the New York Times obviously (negotiations broke down, leading to the current lawsuit) and not with any of the Canadian media bringing suit. A licensing agreement acceptable to both parties will be the likely outcome of this case. As the US-based Copyright Alliance has pointed out, generative AI licensing isn’t just possible, it’s essential.

There is a vacuum when it comes to legislation in Canada, and elsewhere, regarding the intersection of copyright and AI development. Various models are being experimented with, from the “throw copyright under the bus” model in Singapore, to a more nuanced model in Japan, to uncertainty elsewhere. Australia has just produced a Senate report in response to its public consultation on the issue. Among its recommendations, the Select Committee Report on Adopting Artificial Intelligence called for changes that would ensure copyright holders are compensated for use of their material, while tech firms would be forced to reveal what copyrighted works they used to train their AI models. Canada initiated a public consultation on the topic last year, and the Australian Committee’s recommendations with respect to copyrighted content are essentially what the Canadian copyright community asked for. However, since receiving input in January of this year and publishing the submissions received in June, the Canadian government has released no further information. A conclusion similar to the recommendations in Australia would be welcome.

Canadian creators and rightsholders are waiting for some action. Meanwhile the only alternative is to toss the issue to the courts to adjudicate.

© Hugh Stephens, 2024. All Rights Reserved.

After Blocking News in Canada, Meta Challenges Australia (Again)

Image: Shutterstock via AI modification

It was inevitable. After Meta pulled the plug on news content on its platform in Canada as its way of complying with the obligations of the Online News Act, Australia, the model that Canada sought to emulate, was surely next in line. On March 1, Meta announced that it plans to stop paying publishers of news content in Australia, and will not renew its current agreements with Australian media once they expire. Most will expire this year.

Canada had modelled its Online News Act (Bill C-18) on Australia’s News Media Bargaining Code, albeit with “improvements”. Rod Sims, who was head of the Australian Competition and Consumer Commission (ACCC) at the time the Commission designed the Code (later incorporated into legislation as the Treasury Laws Amendment (News Media and Digital Platforms Mandatory Bargaining Code) Act 2021), was invited to testify before the Canadian Parliamentary committee examining Bill C-18. In his testimony, Sims talked about the success of the Code, its benefits for not just large media players but also many smaller “country” outlets, estimating the benefits to be north of A$200 million per year to journalism in Australia. He added that the institution of the Code “has transformed the journalism landscape in Australia. It’s gone from pessimism to optimism.”

Inspired by the results of the Australian legislation (which, by the way, ended up not designating either Google or Facebook under the Code, since they managed to come to sufficient “voluntary” agreements with Australian media), Canada moved ahead, basing its legislation on the Australian law but adding a couple of additional features. One was to increase transparency with regard to deals that would be struck under the law. Another was to require self-designation by platforms (while making it apparent that only Meta/Facebook and Google met the criteria), allowing them an exemption if they reached acceptable deals with media. In this way, the companies could not avoid designation and would be subject to the law, something they strongly opposed, even though both had already engaged in voluntary programs on their own terms to provide some financial support to selected media outlets.

Just as happened in Australia (see “Google’s Latest ‘Stoush’ with Australia: What’s the Lesson from Germany’s Failed Effort?” and “Facebook in Australia: READY, FIRE, AIM”), both platforms pushed back strongly against the draft legislation, threatening to block news for Canadian users. (Facebook briefly and disastrously blocked news for Australian users during its campaign against the Code, but ultimately backed down.) First, in the fall of 2022, Facebook said it might have to block postings of news on its Canadian platform, followed by Google, which threatened to block search for Canadian news in Canada by Canadian users. By the summer of 2023, when the Online News Act became law without any of the amendments proposed by the platforms, Meta upped the ante by declaring that it would end news availability on Facebook and Instagram for all users in Canada before the Act took effect, set for December 2023. Again, just as in Australia, Canadian government leaders were public in their condemnation, accusing Meta of threatening and irresponsible behaviour. Alas, it was all to no avail. Meta had apparently already made its decision not to provide financial support for news content in Canada and to end the few existing agreements it had undertaken in the past. At the time, it indicated it would be taking similar action elsewhere. Rather than submit to the legislation by negotiating with media entities, it complied (in letter if not in spirit) by blocking links to Canadian media. Negotiations with Google continued, and eventually a compromise of sorts was reached whereby Google agreed to contribute to a fund that would be used to support journalism in Canada.

This was a somewhat Pyrrhic victory (the fund will be about $100 million, less than half what had previously been estimated), but a victory nonetheless in the eyes of at least some of the news media. One can debate the overall success of the legislation (see MediaPolicy.ca’s The Online News Act is law: a buzzer-beater win or epic miscalculation?), but along with more government financial support, the Google-funded pot will be welcomed by many smaller media outfits. Ironically, establishing a fund rather than requiring negotiations between the platforms and media for payment for content was an early proposal by some commentators. Now it has come to pass, more by accident than design. Criteria for disbursing from the fund have been tweaked so that broadcast media, and in particular the CBC, which employ the bulk of news journalists in the country, get less than their proportional share would otherwise indicate.

The lesson for Canada, and now for Australia, is that the big digital platforms will not hesitate to play hardball if they feel their global interests are threatened. While Australia, followed by Canada, was first off the mark with legislation designed to level the playing field between a stressed journalism sector and the monolithic platforms, the response of the platforms was governed more by potential precedent than by the specifics of those markets. The existence of draft legislation at the federal level in the US (the Journalism Competition and Preservation Act, aka JCPA), as well as at the state level in California and Illinois, has not escaped the attention of Meta and Google. (Even the watered-down compromise settlement that Google made with Canada has led to some lip-smacking speculation in the US as to the amount of funding that could flow to US media.) It appears that Meta, in the face of cost cutting and loss of market share in 2022, made a business decision that if it had to pay for access to news content, it would do without. Whether this was a wise decision remains to be seen. It may or may not affect Meta’s bottom line, but it will leave the platform as a purveyor of less-than-reliable information from nonprofessional sources. However, doing the socially responsible thing as opposed to maximizing profits by cutting costs is not what Meta is about. Having made its decision, it will need to unwind its commitments to Australian media, which it is now in the process of doing.

What does this mean for Australia and what can the Australian government do about it? Writing about this in Canada’s National Post, Rod Sims, now professor at the Crawford School of Public Policy at ANU, outlined some choices the Australian government needs to face. It could move to designate Meta under the Bargaining Code and force it into the negotiation and arbitration process. That would likely lead to Meta taking precisely the action that it took in Canada. The government could amend the legislation, but to what end? It could publicly criticize Meta, accusing it of unfairness and bad behaviour. It has already done this, with Prime Minister Albanese saying that what Meta is doing is “not the Australian way”. That will have zero influence on Mark Zuckerberg and the people who run Meta.

At the end of the day, Australia can stand up to Meta and let the chips fall where they may, or it can allow Meta to free ride on Australian news content, accepting that there may be social benefits in allowing this to happen. A recent report by the Australian Broadcasting Corporation (ABC) points out that Facebook is the largest social media platform for general news, and half of Facebook’s users in Australia report using the platform for news. (Regardless, Meta’s bean counters assign news no value to the platform.) According to a University of Canberra report cited by the ABC, 45% of Australians get their news from social media, as opposed to less than 20% from print sources. The largest source of news is still TV, at 58%. (The numbers sum to more than 100 because many consumers get their news from more than one source.) In one sampling, 14% of Australians got their news from Instagram! While I find this personally appalling (indirectly revealing my age), that is the reality of our society today. Better that consumers find reliable, curated news somewhere, but we still need to recognize that responsible journalism needs to be paid for. Meta, apparently, has no desire to be a part of that equation. Without the infusion of responsible, curated journalism, Facebook will become an even greater home for misinformation than it already is. But does Meta care? Clearly not. Consumers need to be encouraged to find their news elsewhere. Easier said than done.

The Australian government is no doubt pondering how to respond in the best interest of Australia. Allowing Meta to wriggle out from its obligations under the Bargaining Code would not necessarily undermine the deals struck with Google, which appears to have accepted that its overall interest is best served by some form of accommodation. Having Microsoft, which has publicly stated it is willing to subject itself to both the Australian and Canadian legislation, breathing down its neck is undoubtedly a factor. Even if the Google deal won’t be undone, it is still galling that Meta can get away with it. Canada had to swallow that reality, yet stood up to Meta. What will Australia do? It’s a tough call.

© Hugh Stephens, 2024. All Rights Reserved.

When Giants Wrestle, the Earth Moves (NYT v OpenAI/Microsoft)

Image:www.shutterstock.com

There is no better way to start the New Year, 2024, than with a commentary on Artificial Intelligence (AI) and copyright. It was the big emerging issue in 2023 and is going to be even bigger in 2024. The unlicensed and unauthorized reproduction of copyright-protected material to train AI “machines”, in the process often producing content that directly competes in the market with the original material, is the Achilles heel of AI development. To date, no one knows whether it is legal to do so, in the US or elsewhere, as the issue is still before the courts. The cases brought to date by artists, writers and image content purveyors like Getty Images have not always been the strongest or best thought out. In one instance, the plaintiffs had not even registered the copyright on some of the works for which they were claiming infringement, a fatal flaw in the US, where registration is a sine qua non for bringing an infringement case. That may have been the most egregious example of a rookie error, but in general the artists’ and writers’ cases have not gone too well so far, although the process continues. Some cases are on stronger grounds than others. Here is a good summary. The Getty Images case will be an interesting one to watch. And now the New York Times has weighed in with a billion-dollar suit against OpenAI and Microsoft. The big guys are now at the table and the sleeves are rolled up. The giants are wrestling.

What is at issue could be nothing less than the survival of the news media and the ability of individual creators to protect and monetize their work. It could also open a pathway to legitimacy for the burgeoning AI industry. The ultimate solution is surely not to put a halt to AI development, nor to put content creators out of business. It is to find a modus vivendi between the needs of AI developers to ingest content in order to train algorithms that will “create” (sort of) content–assembled from vast swathes of input–and the rights of content creators. While training sets are generally very large, some of the input can be very creator-specific and the output very creator-competitive. This is where the New York Times comes in.

The Times, like any enterprise, needs to be paid for the content it creates in order to stay in business and create yet more content. If its expensively acquired “product”, whether news, lifestyle, cooking, book reviews or any of the other content that Times’ readers crave and are willing to pay for, can be obtained for free through an AI algorithm (“What is the most popular brunch recipe in the NYT using eggs, bacon and spinach?”, or “What does Thomas Friedman think of…?”), this creates a huge disincentive to go to the source and undermines journalism’s business model, already under severe stress and threat.

The Times is one of the few journals that has managed to thrive, relatively speaking, in the new digital age at a time when many of its competitors are dying on the vine. According to Press Gazette, the New York Times is the leading paywalled news publisher, with 9.4 million subscribers. (Wall Street Journal and Washington Post are numbers two and three respectively). You need to pay to read the Times, and why not? But paying for access does not give you the right to copy the content, especially for commercial purposes. (The Times offers various licensing agreements for reproduction of its content, with cost dependent on use). Technically, all it takes is one subscription from OpenAI and the content of the Times is laid bare to the reproduction machines, the “large language models”, or LLMs, used by the AI developers. The Times has now thrown down the gauntlet. Its legal complaint, 69 pages long, makes compelling reading. If there ever was a “smoking gun” putting the spotlight directly on the holus-bolus copying and ingestion of copyright protected proprietary content in order to produce an unfair directly-competing commercial product that harms the original source, this is it. It’s a far cry from earlier copyright infringement cases brought by some artists and writers.

While you can read the complaint yourself if you are interested (recommended reading), let me tease out a few of the highlights. After setting out the well-proven case for the excellence of its journalism, the Times’ complaint notes that while the defendants engaged in widespread copying from many sources, they gave Times’ content particular emphasis when building their LLMs, thus revealing a preference that recognized the value of that content. The result was a free ride on the journalism produced at great expense by the Times, using Times’ content to build “substitutive products” without permission or payment.

Not only does ChatGPT at times regurgitate the Times’ content verbatim, or closely summarize it while mimicking its style; at other times it wrongly attributes false information to the Times. This is referred to in AI circles as “hallucination”, something the complaint labels misinformation that undermines the credibility of the Times’ reporting and its reputation. Hallucination is a particularly dangerous element of AI-produced content. Rather than admitting it doesn’t know the answer, the AI algorithm simply makes one up, complete with false references and attributions, all of which make it very difficult for the average reader to separate fact from fiction. This misinformation is the basis of the Times’ complaint for trademark dilution, which accompanies the various copyright-related complaints of infringement. Concrete examples of such misinformation are provided in the complaint.

So too is ample evidence of users exploiting ChatGPT to pierce the Times’ paywall by asking for the completion of stories that have been blocked for non-subscribers. There are concrete examples of carefully researched restaurant and product reviews that have been replicated virtually verbatim. Not only is the Times’ subscription model undermined, but the value it derives from reader-linked product referrals on its own platform bleeds to Bing when the product is accessed through Microsoft search enabled by ChatGPT. Examples are given of full news articles based on extensive Times’ investigative reporting being reproduced by ChatGPT with only the slightest variations. These are not composite news reports of what is happening in Gaza, for example, but a word-for-word lifting of a Times’ analysis of what Hamas knew about Israeli military intelligence. The Times’ complaint makes for chilling reading. AI’s hand has been caught firmly in the cookie jar.

What does the Times want out of all of this? The complaint does not specify a dollar amount, while noting the billions in increased valuation that has accrued to OpenAI and Microsoft as a result of ChatGPT. However, it asks for statutory and compensatory damages, “restitution, disgorgement, and any other relief that may be permitted by law or equity” as well as destruction of all LLM models incorporating New York Times’ content, plus, of course, costs. If the Times gets its way, this will be a huge setback for AI development as well as for OpenAI and Microsoft, but of course it may not come to that. The complaint notes that the Times had tried to reach a licensing deal with the defendants. OpenAI cried foul, expressing “disappointment”, and noting that they had been having “productive” and “constructive” discussions with the Times over licensing content. However, to me this is a bit like stealing the cookies, getting caught red-handed and offering to negotiate to pay for them, then crying foul when your offer is rebuffed. The Times has just massively upped the ante, making the potential licensing fees much more valuable.

The irony is that the use of NYT material by OpenAI or indeed other platforms like Google or Facebook potentially brings some advantage and drives some business to the Times, while obviously also providing commercial benefits to the AI program, search engines or social media platforms. The real question will be how that proprietary content is used, and how much is paid to use it. A similar issue is being played out in another context, most recently in Canada with Bill C-18 where news media content providers wanted the big platforms (Google and Meta/Facebook) that derive benefit from using or indexing that content to pay for accessing it. The result in Canada was both a standoff and a compromise. Facebook blocked Canadian news content rather than pay for it, while Google agreed to create a fund for access by the news media in return for being exempted from the Canadian legislation.

The NYT-OpenAI/Microsoft lawsuit is a different iteration of the same principle. Businesses that gain commercial advantage from using proprietary content of others should contribute to the creation of that content, either through licensing or some other means such as a media fund. The most logical outcome of the Times’ lawsuit is almost certainly going to be a licensing agreement. Given the seemingly unstoppable wave of AI development, meaningful licensing agreements would seem to be the best way to ensure fairness and balance of interests going forward.  

A Goliath like the New York Times is in a much better position to make this happen than a disparate group of writers and artists. Indeed, there are logistical challenges in being able to license the works of tens of thousands of content creators. In an earlier blog post, I postulated that perhaps copyright collectives might find a role for themselves in this area in future. In my view, ultimately the only logical solution to the conundrum of respecting rights-holders while facilitating the development of AI is to find common ground through fair and balanced licensing solutions. The wrestling giants of the NYT and Microsoft may help show the way.

© Hugh Stephens, 2024. All Rights Reserved.