
Singapore already has one of the most permissive text and data mining (TDM) exceptions found anywhere in copyright law. It allows AI developers to ingest copyrighted content for AI training subject to only a few, minimal limitations. These provisions were introduced in 2021, when several changes were made to the Copyright Act, some, frankly, better than others from the perspective of rightsholders. The TDM exception is not among the more positive outcomes. It is made worse by a provision in Singapore law that, for all practical purposes, stops rightsholders from using contract terms to prevent the unlicensed appropriation of their content for the commercial benefit of third parties, namely AI developers.
The term used to describe text and data mining in Singaporean law is “computational data analysis”. As explained in this law firm blog post, this is defined as including:
“using a computer program to identify, extract and analyse information or data from the work; (or) using the work as an example of a type of information or data to improve the functioning of a computer program in relation to that type of information or data – a specific example being the use of images to train a computer program to recognise images”
The exception also permits supplying the works to other persons, provided this is for the purpose of “(i) verifying the results of the computational data analysis carried out by the latter; or (ii) collaborative research or study relating to the purpose of such analysis carried out by the latter.”
This is simply commercial, unlicensed web scraping of copyrighted content for AI development by another name, with very few limitations.
Singapore is now contemplating opening that door even wider by permitting the circumvention of digital locks (aka Technological Protection Measures, or TPMs) to allow text and data mining for AI training, as I wrote about here (Singapore’s New Copyright Act Three Years On: There’s No Need to Open the AI Exception Door Even Wider). This proposal needs serious reconsideration, as it would tilt the copyright balance sharply in favour of AI platforms to the detriment of rightsholders.
Another example of Singapore’s increasingly permissive approach to copyright protection is its undermining of the sanctity of contracts by limiting contractual terms that prevent unlicensed and unauthorized use of copyrighted content. In other words, the Act limits the ability of contractual terms to protect against, or override, the TDM exceptions. Rightsholders cannot “contract out” of the exceptions.
This provision was included when the Copyright Act was last updated (2021). The list of exceptions that cannot be restricted by contract was expanded to include the exception for computational data analysis, as well as a couple of other uses: use of a work in judicial proceedings or for legal advice, and the functions of galleries, libraries, archives, and museums, the so-called GLAM sector. The computational data analysis exception is the one of concern. It requires that, to be enforceable, a contract that limits the ability of a user to scrape content must be individually negotiated. In other words, standard contract terms on websites that limit use (so-called shrink-wrap or clickwrap agreements) cannot be used to override the TDM exception. This has the effect of rendering standard contractual terms virtually unenforceable. Enforceable restrictions become the exception rather than the norm.
The term “shrink-wrap agreement” was originally applied to the preprinted agreement included as part of the packaging containing a software program. By opening the packaging, the user agreed to comply with the licence terms of the software. The term has since been expanded to include “clickwrap agreements” that take effect when a user accepts the terms and conditions of a website. Such an agreement can specify that the content about to be accessed may be used only for certain purposes or under certain conditions. One of these conditions could be a restriction on the unlicensed use of content for AI development. The Singapore legislation eliminates what is a standard practice used by rightsholders in many parts of the world to protect and control use of their content. It also means that robots.txt files, which rightsholders use to signal that their content should not be freely scraped (compliance is voluntary), are unlikely to be respected in Singapore. Robots.txt limitations are often included in clickwrap agreements.
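For readers unfamiliar with how the robots.txt signal works in practice, the following is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the crawler name `ExampleAIBot` and the URL are all hypothetical, chosen only for illustration. The point it makes is that the check happens entirely on the crawler's side: nothing in the mechanism stops a crawler that simply skips it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve: it asks one
# (made-up) AI crawler to stay away and allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/1"  # placeholder URL
    # A well-behaved crawler runs this check before every request.
    # A crawler that ignores the answer faces no technical barrier.
    print("ExampleAIBot allowed:", may_fetch(ROBOTS_TXT, "ExampleAIBot", url))
    print("SomeBrowser allowed:", may_fetch(ROBOTS_TXT, "SomeBrowser", url))
```

This is why the enforceability of accompanying contract terms matters: the file itself expresses a preference, and it is the terms of use that would give that preference legal force.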
Not only does the Singapore law broadly undermine contractual terms and prevent “contracting out”, but its TDM exception is also very wide in application. In the UK, contractual terms likewise cannot override the TDM exception, but allowable TDM use is much narrower than in Singapore: the exception can be used only “for the sole purpose of research for a non-commercial purpose”. No such restriction exists in Singapore. In the EU, contractual terms can override the general TDM exception (Article 4), unless the unlicensed access is “conducted by research organisations and cultural heritage organisations” or is “for the purposes of scientific research” (Article 3). In these limited cases only, the contractual override does not apply. This still provides broader protection for rightsholders, and where the contractual override is disallowed, it is for very limited purposes. This is a much more nuanced approach than the one adopted by Singapore.
Contract law is generally seen as the oil that lubricates the wheels of business. In the digital age, shortcuts in the form of clickwrap agreements have been used to convey contractual terms to users. In some jurisdictions, explicit consent is required by clicking “I Agree”. Singapore’s current copyright legislation undermines the sanctity of contracts by imposing unrealistic conditions, particularly by limiting the ability of rightsholders to prevent web crawlers from ingesting copyrighted content without licence or permission. To say this is problematic is an understatement.
Singapore can do better. As an exemplar of rule of law in the region, it should be as assiduous in protecting the rights of copyright owners as it seems to be in advancing the interests of AI developers. The motivation, apparently, is to promote “innovation”. This is a misreading of what brings about innovation. True innovation comes from a partnership between rightsholders and users that protects and compensates rightsholders for the time, effort and investment they have put into developing content that is clearly of value to the AI community. That content should be licensed, or at the very least, rightsholders should be given the option to opt out through the ability to enforce contract terms, including overriding text and data mining exceptions when necessary.
© Hugh Stephens, 2024. All Rights Reserved.

