EU copyright law in relation to AI training models

Summary:

Big data mining and machine learning require the compilation of corpora (e.g. literary works, public domain material, data) that are often “available on the internet”. The collection stage is usually followed by the processing and annotation of the collected data, depending on the type of learning (supervised/unsupervised) and the purpose of the algorithm. Copyright law has a direct impact on this process, as the corpora could include works protected by copyright, and any digital copy, whether temporary or permanent, in whole or in part, direct or indirect, has the potential to infringe copyright (Art. 2 InfoSoc Directive). Furthermore, changes made to the collected material can amount to ‘adaptation’, and the relevant exceptions, such as those for research or text and data mining, might not sufficiently cover the activities of stakeholders in this area.

This project will analyse case studies on data scraping, natural language processing and computer vision to assess whether the current legal framework is well equipped for the development of AI applications, especially in the field of machine learning, and, if not, what kind of measures should be developed (legal reform, policy initiatives, licences and licence compatibility tools, etc.).

Funders:

Horizon 2020 – Rethinking digital copyright law for a culturally diverse, accessible, creative Europe (Grant Agreement No. 870626)

Team:

Thomas Margoni
Principal Investigator

Martin Kretschmer
Investigator

Pınar Oruç
Researcher

Duration:

2020-2022

Outputs:

Legal Approaches to Data: Scraping, Mining and Learning provides a gateway to project outputs, connected works and events.

Events:

UBDC and CREATe Workshop on the Legal approaches to Data: Scraping, Mining and Learning (27/05/2021)
Oruç, P. Report – UBDC and CREATe Workshop on the Legal approaches to Data: Scraping, Mining and Learning (07/07/2021).

Academic Outputs:

Kretschmer, M., Margoni, T., & Oruç, P. (2021). D3.6 Interim study on the state of harmonisation of the rights of reproduction and adaptation and connected exceptions. Zenodo.

Margoni, T. (2020). Text and Data Mining in Intellectual Property Law: Towards an Autonomous Classification of Computational Legal Methods. CREATe Working Paper 2020/1.
