

Research Blog Series: OpenMinTeD

Posted by CREATe Team


10 January 2018 (last modified March 18th, 2021)

Thomas Margoni reports on the project aiming to develop a registry for text and data mining services and tools, for the Research Blog Series.

OpenMinTeD (Open Mining Infrastructure for Text and Data) is an H2020 e-infrastructure project aiming to develop a registry for text and data mining (TDM) services and tools. This will allow researchers, research institutions and data providers to find, use and combine resources for TDM purposes, thereby enhancing the scientific playing field of the EU.

The project is run by a consortium of 16 EU partners. CREATe/University of Glasgow coordinates the legal interoperability activities, which are conducted mainly within working group three (WG3). WG3 is a team of more than 20 specialists with interdisciplinary backgrounds, under the scientific lead of Dr Thomas Margoni and coordinated by Dr Giulia Dore and other CREATe fellows.

WG3 has been investigating, on the one hand, the causes and the extent of the limits imposed on text and data mining by the law of copyright and related rights (e.g., the sui generis database right) and, on the other hand, the complex licensing framework in which the resources to be mined are set.

Regarding legal barriers, the main research questions seek to determine which resources are protected by copyright and related rights, and whether such resources can consequently be used in the absence of a specific legal or contractual authorisation.

Regarding licence compatibility, the research has focused on the lack of licence compatibility across resources. The team is developing a set of compatibility tools to help researchers navigate the confusion generated by non-standardised licensing terms and conditions.

Among the early findings of the project, it has emerged that interoperability and standardisation are indeed crucial for the full development of text and data mining. This is especially true in the fragmented and often inconsistent EU legal framework within which TDM researchers and platforms operate. To enhance TDM, and R&D more generally, in the EU, one of the project's recommendations is the introduction of a general and open exception under EU IP law for TDM and similar uses. This would be a decisive factor in enabling TDM activities in the EU in the same way they are already successfully carried out elsewhere in the world. Failing to do so would condemn the EU's scientific and socio-economic sectors to fall behind in such a strategic field. In light of the current discussions about EU copyright law reform, it is essential that the EU legislator understands the importance of making the applicable IP legal framework for TDM at least as competitive as that of other leading R&D countries such as the U.S., Singapore or Japan.

At the same time, in the short term, an appropriate breakdown of the applicable licensing terms and conditions appears to be the best available solution. Tools that can enable licence compatibility include the Licence Compatibility Matrix (soon to be released in beta version), which helps researchers determine whether the licensing terms of given resources are compatible and therefore allow the resources to be combined and remixed, as well as other supporting and training materials such as the Open Science fact-sheet and the Open Access FAQs.

Additional information on the project's status, current developments and results is available on its official website, while other outputs and events relating to the project are reported on the CREATe Blog and the OpenAire website.