Will the Data Act open up machine data and help achieve sustainability?

I am delighted to share the preprint of Guido Noto La Diega & Estelle Derclaye, ‘Opening Up Big Data for Sustainability: What Role for Database Rights in the Fourth Industrial Revolution?’ in Ole-Andreas Rognstad, Taina Pihlajarinne, Jukka Mähönen (eds), Promoting Sustainable Innovation and the Circular Economy: Legal and Economic Aspects (Routledge 2022), available at SSRN.

As hundreds of zettabytes of data are being generated by the bio-cyber-physical technologies of the Fourth Industrial Revolution (4IR),[1] it has become untenable to argue that the law needs to introduce incentives to produce more data.[2] Instead, as private corporations amass a wealth of data branded as ‘machine data’ to hide its value, we need incentives to open up enclosed data. The scarcity of resources is the main traditional justification for proprietary approaches.[3] The 4IR heralds an age of abundance of data,[4] which justifies – even calls for – a commons-inspired open data approach.[5]

Open data is no longer just the mantra of sparse communities of developers and designers. The EU has started to embrace open data in a number of recent legislative instruments and proposals that push public entities and citizens to open up their data for the common good. These are epitomised by the new Open Data Directive[6] and the Data Governance Act,[7] respectively. The former creates the conditions for better access to and re-use of public sector information in the belief that openness can benefit society at large.[8] The latter invites citizens to embrace a philosophy of data altruism on the grounds that better data sharing can bring not only economic value but also, and more significantly, ‘new ways for tackling societal challenges (e.g. climate change).’[9]

Liberating data from its enclosures is pivotal to the achievement of the UN Sustainable Development Goals (SDGs), as we will argue in the next section, but it is apparent that the real guardians of big data – the private corporations that are the key decision-makers in the 4IR – are not doing enough to facilitate the sharing and re-use of data in the public interest, including the pursuit of climate justice. The main instruments recently proposed to curb big tech power do not seem to embrace openness in any meaningful way. This is epitomised by the proposed AI Act. There, market surveillance authorities are granted access to the training, validation and testing datasets used by the provider of high-risk AI systems, and under certain conditions also to the relevant source code.[10] This is a very limited right to access, which does not allow the re-use of this data and code for the common good, including for sustainability purposes. Crucially, this is consistent with the pro-proprietary stance of the AI Act summed up in the promise that ‘(t)he increased transparency obligations will also not disproportionately affect the right to protection of intellectual property (…) since they will be limited only to the minimum necessary information for individuals to exercise their right to an effective remedy and to the necessary transparency towards supervision and enforcement authorities.’[11] While there may be instances where Intellectual Property (IP) reasons justify some limitations on the access to and re-use of big data held by corporations, it is our view that, in general, IP should not be used to hinder re-use of data to pursue the SDGs.

As governments and citizens are being asked to open up data, now is the time for private companies to open up as well. As the fight for sustainability has never been more pressing and as transparency rises as one of the pillars of global digital constitutionalism,[12] private power must do better and more to allow the access to and re-use of data for the common good. With this in mind, this paper aims to answer the following overarching research question: what does a sustainable legal framework for the achievement of the sustainability goals look like in the 4IR? First, we will illustrate the triple meaning of ‘data sustainability.’ Second, we will critically assess whether the database right (or ‘sui generis right’) can play a role in opening up corporate big data. Third, we will imagine what a sustainable framework for sustainable data governance may look like. This focus is justified by the fact that the Database Directive, often accused of creating an unjustified monopoly on data, is in the process of being reformed by the proposed Data Act.

Our key conclusions are:

• Despite a number of laws on open data and access rights, there are still insufficient incentives for private corporations to free data from their enclosures.

• The Data Act is overall a step in the right direction, but it needs to be amended to clarify that:

1. All 4IR databases are not protected by the sui generis right;

2. One dichotomy (obtained/created) is not replaced by another (produced/inferred), which would risk creating data monopolies over derived or inferred data;

3. All user rights and freedoms, and ‘openness’ obligations, prevail over any contrary contractual and technological measures.

[1] By 4IR we mean an umbrella term for those technologies such as Internet of Things (IoT), Artificial Intelligence (AI), blockchain, quantum computing, nanotechnologies, etc. whose increasing fusion and interaction is rewriting the boundaries between biological, physical, and digital (Klaus Schwab, The Fourth Industrial Revolution (Penguin 2017)). 4IR is often confused with Industry 4.0, which is more focused on the industrial applications of the IoT (‘smart manufacturing’); see e.g. Christoph Winterhalter, ‘A New Revolution in the Making’ (2018) November-December ISOfocus 3.

[2] ‘Volume of data/information created, captured, copied, and consumed worldwide from 2010 to 2025’ (Statista, 2022) < > accessed 12 January 2022.

[3] Mark A Lemley, ‘IP in a World Without Scarcity’ (2015) 90 NYU Law Review 461.

[4] See Andrea Ottolia and Cristiana Sappa, ‘A Topography of Data Commons: From Regulation to Private Dynamism’ [2021] GRUR International 1. The authors convincingly argue that a data commons approach is key to avoid excessive protection in the data-driven society. Unlike the authors, we aim at stating that the commons is the best solution to data governance in the 4IR.

[5] Data commons means not only open data but also collective practices such as the co-creation of data protection solutions to rebalance power between data subjects and data controllers. See Janis Wong, Tristan Henderson and Kirstie Ball, ‘Data Protection for the Common Good: Developing a Framework for a Data Protection-Focused Data Commons’ (2022) 4 Data & Policy 1.

[6] Directive (EU) 2019/1024 of the European Parliament and of the Council of 20 June 2019 on open data and the re-use of public sector information [2019] OJ L 172/56.

[7] Proposal for a Regulation of the European Parliament and of the Council on European data governance (Data Governance Act) COM/2020/767 final.

[8] Open Data Directive, recital 8.

[9] Data Governance Act, Explanatory Memorandum, para 3. See also recital 35.

[10] AI Act, art 64.

[11] AI Act, preamble, para 3.5.

[12] See e.g. Piotr Mikuli and Grzegorz Kuca (eds), Accountability and the Law: Rights, Authority and Transparency of Public Power (Routledge 2021).


I am Associate Professor of Intellectual Property Law and Privacy Law at the University of Stirling, Faculty of Arts and Humanities, where I lead the Media Law and Information Technology Law courses. I am an expert in the legal issues of the Internet of Things, Artificial Intelligence, cloud computing, robotics, and blockchain.

