This is the second part of our four-part series on the EUIPO study on GenAI and copyright. Read parts 1, 3, and 4.

In this second part of our four-part series exploring the EUIPO study on GenAI and copyright, we set out our key takeaways regarding GenAI inputs, including findings on the evolving interpretation of the legal text and data mining (TDM) rights reservation regime and existing opt-out measures.

Key Takeaway 1: The EUIPO suggests that the ‘expressly’ requirement for TDM reservations could be broadly interpreted

Article 4 of the 2019 EU Copyright in the Digital Single Market Directive (CDSM Directive)[1] stipulates that a rightsholder wishing to opt their content out of TDM for commercial purposes must reserve their rights ‘expressly’. The EUIPO study suggests that this ‘expressly’ requirement could, based on the limited existing case law as well as industry and scholarship discourse, be interpreted broadly, which would mean that a valid opt-out need not:

  • reference specific works within a larger corpus;
  • explicitly target TDM as a specific use case (but rather could address broader or other web-scraping use cases);
  • reference any enabling legal provision (such as Article 4 of the CDSM); or
  • target specific potential TDM users.

However, the EUIPO notes that literature supporting a strict interpretation of the ‘expressly’ requirement also exists, which argues that reservations should be use- and content-specific and should reference the enabling legislation.

Key Takeaway 2: The EUIPO suggests that the ‘by the rightsholder’ requirement could include licensees and representatives

Pursuant to the Article 4 CDSM exception, a TDM opt-out in respect of any works protected by copyright must be communicated “by their rightsholder”. The study indicates that various legal principles may allow licensees or representatives of the ‘original’ rightsholder to communicate TDM reservations on their behalf, including via:

  • explicit assignments of authority to declare the opt-out;
  • existing delegated management of reproduction rights; and/or
  • implied authority through agency principles and licensee duties.

The EUIPO seems to believe that this approach would reflect the commercial reality of existing copyright management practices where intermediaries frequently act on behalf of rightsholders. For example, collecting societies, publishers and platform operators already manage various rights on behalf of rightsholders, often through licensing and onward-distribution arrangements. The study notes that strictly requiring the original creator/rightsholder personally to implement opt-outs could create inefficiencies in sectors (such as music, film and publishing) where content is typically managed through multi-layered licensing structures. Rightsholders should therefore be able to work through their commercial partners to protect their content. The study suggests that EU Member States’ courts would likely recognise these practical necessities when interpreting the “by the rightsholder” requirement, focusing on the substantive authority relationship rather than requiring direct action by the original rightsholder in all cases.

Key Takeaway 3: The EUIPO suggests that natural language reservations may satisfy the ‘appropriate manner’ requirement

Article 4 CDSM states that rightsholders must reserve their rights “in an appropriate manner”, which would include “machine-readable means” for content made publicly available online. The EUIPO study notes that the relationship between “human-readable” and “machine-readable” reservation mechanisms is particularly important in determining what constitutes a valid opt-out for online content. This relationship may be evolving as AI capabilities advance, with the EUIPO seemingly regarding the line between what is “human-readable” versus “machine-readable” as becoming increasingly blurred. The EUIPO’s report therefore seems to expand what qualifies as “machine-readable” beyond traditional technical protocols like robots.txt files or structured metadata.

In this context, the study specifically refers to the German LAION v Kneschke case[2], in which the Hamburg Regional Court suggested in obiter dicta that natural language reservations (such as human-readable website terms and conditions) could potentially meet the requirement for online content to be opted out via “machine-readable” means. The Court reasoned that since modern AI systems are capable of processing and understanding natural language, terms of service prohibiting web scraping should be considered sufficiently “machine-readable” for opt-out purposes, especially in light of the EU AI Act, which requires providers of general-purpose AI models to use “state-of-the-art technologies” to identify such reservations.[3]

This interpretation would mean that rightsholders may not need to implement additional technical solutions if existing terms of service or website notices already clearly express a reservation of rights against TDM activities. However, the EUIPO notes that while natural language reservations may be legally sufficient, they may not always be the most effective method from a practical enforcement perspective. The report indicates that a combination of approaches, including both natural language terms and technical measures, might provide the most comprehensive protection for rightsholders.

Key Takeaway 4: Both legal and technical opt-out measures are available, but limited in efficacy

The EUIPO study analyses two broad categories of opt-out measures, along with their specific limitations:

Legally-driven measures:

  1. Unilateral declarations: Unilateral declarations are direct statements made by rightsholders expressing their reservation of rights against TDM uses. These declarations can be made through various channels such as public announcements or statements on the rightsholder’s official website. The EUIPO study notes that a potential limitation of unilateral declarations is whether TDM users are able to have constructive knowledge of such declarations when they are communicated independently (i.e., not attached to a specific copy of a work).
  2. Databases listing unilateral declarations: These databases serve as centralised repositories where rightsholders can register their opt-out preferences in a more structured and accessible format. They aim to create a single reference point for AI developers to check for rights reservations.
  3. Licensing constraints: Licensing constraints involve specific contractual terms that prohibit or restrict TDM activities on licensed content. These work by establishing legal boundaries through contractual agreements between rightsholders and users of their content.
  4. Website terms and conditions: Website terms and conditions function as a form of legal notice that can include prohibitions against scraping, crawling, or other forms of automated content extraction from websites. They establish usage rules with which visitors to the site may be required to comply. The EUIPO study notes that one limitation of this measure is its location-specific nature: it only relates to copies of works hosted on a specific website to which the T&Cs apply, but not to copies on other websites.

Technically-driven measures:

1. Location-based tools (Robots.txt and TDMRep)

  • Robots.txt: Robots.txt currently serves as a de facto standard for managing location-based web crawling and scraping activities. It is a standardised, machine-readable file placed at the root of a website that provides instructions to web crawlers about which parts of the site should not be accessed or indexed. The EUIPO study identifies one potential limitation of this approach: the protocol relies entirely on “voluntary” compliance from web crawlers as it lacks any automated enforcement mechanism. Crawlers can be deliberately programmed to ignore or circumvent the access restrictions specified in the robots.txt file. This limitation is compounded by the fact that rightsholders often lack control over all domains and websites on which their content may be hosted, meaning they cannot guarantee that the relevant domain operator will implement appropriate robots.txt protocols to protect their content.
  • TDMRep: TDMRep is a newer technical standard specifically designed for TDM opt-outs, providing more granular control than robots.txt. It works by allowing rightsholders to specify different levels of permission for different types of TDM activities.
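
To illustrate how a well-behaved crawler might honour these two location-based signals, the following is a minimal Python sketch rather than a definitive implementation. It uses the standard library's robots.txt parser and, as an assumption based on our reading of the TDMRep specification, treats an HTTP response header named tdm-reservation with the value 1 as a rights reservation. The crawler name ExampleAIBot, the example.com URL and the sample robots.txt content are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def robots_allows(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit this crawler to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def tdmrep_reserved(headers: dict[str, str]) -> bool:
    """Return True if the response headers carry a TDM reservation.

    Assumption: the header name 'tdm-reservation' and the value '1'
    reflect our understanding of the TDMRep specification.
    """
    normalised = {name.lower(): value.strip() for name, value in headers.items()}
    return normalised.get("tdm-reservation") == "1"


# Hypothetical robots.txt that blocks one AI crawler and allows all others.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

url = "https://example.com/articles/1"
ai_bot_allowed = robots_allows(SAMPLE_ROBOTS_TXT, "ExampleAIBot", url)
other_bot_allowed = robots_allows(SAMPLE_ROBOTS_TXT, "OtherBot", url)
reserved = tdmrep_reserved({"TDM-Reservation": "1"})
```

Nothing in this sketch stops a crawler from simply skipping both checks, which is the voluntary-compliance limitation the study identifies.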

2. Asset-based solutions

  • Metadata: Metadata solutions embed rights reservation information directly within the content itself, using techniques such as digital watermarking or standardised metadata fields. This approach aims to ensure the reservation travels with the content even when it is distributed or republished elsewhere. The EUIPO study notes that some rightsholders have expressed concern regarding metadata being stripped from content before being processed by AI, and that further protections might be needed, particularly if embedding content is considered as a mitigating tool.
  • Centralised registries: Centralised registries provide comprehensive databases of works that are opted out of TDM uses. These systems work by creating authoritative reference points that AI developers can query before using content for training.
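
The two asset-based approaches can be sketched together in Python, purely as an illustration. The metadata field name tdm_reservation, the use of a SHA-256 fingerprint as the registry key and the in-memory registry are all our own assumptions, not features of any particular standard or registry described in the study.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Hypothetical registry key: the SHA-256 hash of the exact file bytes."""
    return hashlib.sha256(content).hexdigest()


def is_reserved(content: bytes, metadata: dict[str, str], registry: set[str]) -> bool:
    """Check embedded metadata first, then fall back to a registry lookup.

    Assumptions: 'tdm_reservation' is an illustrative metadata field name,
    and 'registry' stands in for a centralised opt-out database.
    """
    if metadata.get("tdm_reservation") == "1":
        return True
    return fingerprint(content) in registry


# A work whose metadata has been stripped but which is registered as opted out.
work = b"example image bytes"
registry = {fingerprint(work)}

stripped_but_registered = is_reserved(work, {}, registry)
tagged_but_unregistered = is_reserved(b"another work", {"tdm_reservation": "1"}, set())
neither = is_reserved(b"another work", {}, registry)
```

In this sketch the registry lookup still catches a work whose metadata has been stripped, which reflects the concern rightsholders raised. An exact hash would not match a re-encoded or cropped copy, so a real registry would likely need a more robust identifier.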

These approaches can be used alone or in combination as part of opt-out strategies, taking into account the potential limitations of each approach.

Key Takeaway 5: Standardised opt-out solutions are needed, with a potential role for public authorities

The EUIPO study acknowledges that no single opt-out mechanism has emerged as a clear and generally accepted standard since the introduction of the EU’s TDM opt-out regime. Current solutions may not consistently provide a straightforward means of technical enforcement. A one-size-fits-all approach may be inappropriate due to different content sectors having varying needs, which could mean that a hierarchy of measures may emerge with different levels of implicit authoritativeness. However, based on the EUIPO report, there appears to be broad consensus among stakeholders favouring standardised opt-out solutions. The EUIPO seems to view standardised opt-out measures as helping to provide a middle ground that could be both legally sufficient for rightsholders and more easily implementable at scale by AI developers, particularly in light of the EU AI Act’s requirement that providers of general-purpose AI models publish “a sufficiently detailed summary about the content used for training of the general-purpose AI model”[4]. The study suggests a potential role for national or supranational IP offices and authorities in facilitating this, for example through federated registries aggregating opt-out information, which could have the added benefit of increased trust and certainty in the ecosystem via public institution involvement.

In the next part of our series, we will consider the EUIPO study’s insights on GenAI output, focusing on the key issues raised and emerging technical solutions.


[1] Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market and amending Directives 96/9/EC and 2001/29/EC (https://eur-lex.europa.eu/eli/dir/2019/790/oj/eng).

[2] Landgericht Hamburg, 310 O 227/23 (27.09.2024).

[3] Article 53(1)(c) of Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence and amending Regulations (EC) No 300/2008, (EU) No 167/2013, (EU) No 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/2144 and Directives 2014/90/EU, (EU) 2016/797 and (EU) 2020/1828 (https://eur-lex.europa.eu/eli/reg/2024/1689/oj/eng).

[4] Article 53(1)(d) of Regulation (EU) 2024/1689 (https://eur-lex.europa.eu/eli/reg/2024/1689/oj/eng).