Posted on Wednesday March 4, 2015

The Current Europeana Dataset

Where does it come from?

The Europeana Dataset

The Europeana Dataset brings together the digitised content of Europe’s galleries, libraries, museums, archives and audio-visual collections. Currently, the portal provides integrated access to over 40 million books, films, paintings, museum objects and archival documents from over 3,000 data providers.

The data can currently be searched via the portal or queried via an API. There is also an experimental SPARQL endpoint.
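
As a rough illustration of what querying via an API can look like, the sketch below only builds a search URL and does not send any request. The base URL and the parameter names (`wskey`, `query`, `rows`) are assumptions that should be checked against the current Europeana API documentation, and `YOUR_API_KEY` is a placeholder.

```python
from urllib.parse import urlencode

# Assumed endpoint and parameter names; verify against the API documentation.
SEARCH_ENDPOINT = "https://api.europeana.eu/record/v2/search.json"


def build_search_url(query, api_key, rows=10):
    """Return a search URL for the given query text. No request is sent."""
    params = {"wskey": api_key, "query": query, "rows": rows}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


url = build_search_url("Rembrandt self-portrait", api_key="YOUR_API_KEY", rows=5)
print(url)
```

The resulting URL could then be fetched with any HTTP client; keeping URL construction separate makes it easy to inspect the query before sending it.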

About the collection: selection policy

The current content policy of the Europeana Foundation is geared towards aggregating as much digital heritage metadata of European origin as possible. Europeana aggregates only metadata and thumbnails; it does not store the content itself, only links to content held by data providers. In 2011, the Europeana Foundation aggregated user-generated content (memorabilia and stories) for the first time, through the projects Europeana 1914-1918 and Europeana 1989; however, this constitutes a small part of the collection.

About the dataset: aggregation process

The Europeana Foundation has around 40 employees and is hosted by the National Library of the Netherlands in The Hague. With a team of that size, it cannot interact directly with all 3,000+ institutions at every stage. Instead, the Foundation relies on a network of partner aggregators, which collect, format and manage metadata from multiple data providers and provide services such as running their own portal and acting as a data provider to Europeana.

The Foundation works with three types of aggregators: national/regional, domain-level (libraries, museums) and theme-focused (Europeana Fashion, European Film Gateway). The largest domain-level aggregator for national and research libraries from Europe is The European Library, which is also the biggest data provider to Europeana. There are also specific aggregators for archives (Apex), and for film and television (EFG, EUscreen).

A full list of aggregators is available on Europeana Pro.

Harmonisation of the metadata

The metadata is collected from various types of aggregators from across Europe. Many different metadata standards are in use, so the metadata needs to be harmonised. Each of the aggregators converts its native format into a single interoperable format. At first, the Europeana Semantic Elements model, an extended Dublin Core model, was used as the interoperable metadata standard. This provided a first step towards making heterogeneous data compatible, but meant sacrificing some of the individual richness of particular data formats.

The Europeana Semantic Elements model was superseded by the Europeana Data Model. This is a richer model: it allows domain-specific metadata standards to be incorporated without loss of information, connects the data more closely to the semantic web, and provides meaningful links between Europe's cultural heritage data.
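
The trade-off of mapping everything onto one flat, Dublin Core-style set of fields can be sketched as follows. This is a minimal illustration, not the actual conversion used by any aggregator: the native field names, the two provider types and the `harmonise` helper are all hypothetical, and only the `dc:` element names come from Dublin Core.

```python
# Hypothetical crosswalks: native field name -> Dublin Core element.
# The native field names on the left are invented for illustration.
CROSSWALKS = {
    "library": {"main_title": "dc:title", "author": "dc:creator", "pub_year": "dc:date"},
    "museum": {"object_name": "dc:title", "maker": "dc:creator", "dated": "dc:date"},
}


def harmonise(record, provider_type):
    """Map a native record onto the common fields.

    Returns (harmonised, dropped). Fields with no mapping end up in
    `dropped`, which is where format-specific richness is lost.
    """
    crosswalk = CROSSWALKS[provider_type]
    harmonised, dropped = {}, {}
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is None:
            dropped[field] = value
        else:
            harmonised[target] = value
    return harmonised, dropped


museum_record = {
    "object_name": "Silver tankard",
    "maker": "Unknown",
    "dated": "c. 1650",
    "technique": "repoussé",  # no equivalent in the flat common schema
}
common, lost = harmonise(museum_record, "museum")
print(common)
print(lost)
```

In this sketch the museum-specific `technique` field has nowhere to go, which is the kind of loss a richer model is meant to avoid.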

The Europeana Foundation enriches the metadata it receives from aggregators with controlled lists of names, places and subjects.
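
Enrichment against a controlled list might look roughly like the sketch below, which normalises a free-text place value to a preferred label and identifier. The `PLACES` list, its placeholder identifiers, the `enrich_place` helper and the choice of `dc:coverage` as the place field are assumptions made for illustration; they do not describe the Foundation's actual enrichment process.

```python
# Hypothetical controlled list: variant spelling -> (preferred label, identifier).
# The identifiers are placeholders, not real vocabulary URIs.
PLACES = {
    "den haag": ("The Hague", "place:0001"),
    "the hague": ("The Hague", "place:0001"),
    "wien": ("Vienna", "place:0002"),
    "vienna": ("Vienna", "place:0002"),
}


def enrich_place(record, field="dc:coverage"):
    """Return a copy of the record, adding a normalised place if one matches."""
    enriched = dict(record)
    raw = record.get(field, "")
    match = PLACES.get(raw.strip().lower())
    if match is not None:
        label, identifier = match
        enriched["place_label"] = label
        enriched["place_id"] = identifier
    return enriched


record = {"dc:title": "City map", "dc:coverage": " Den Haag "}
print(enrich_place(record))
```

The original value is left in place and the normalised label is added alongside it, so records from different providers that spell a place differently can still be found together.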

Licensing and Re-use

All standardised metadata made available via Europeana is marked as Creative Commons Zero (CC0). This means the owners of the metadata have waived all rights in the data, and it can be used for any purpose, including both research and commercial use.

The Europeana Foundation also uses 13 standardised rights statements describing copyright, access and (re-)use of the content that its metadata links to. Rights statements can be split into four categories:

  • Public Domain - where copyright does not exist, has expired or has been waived and best practice guidelines for use apply
  • Creative Commons licences - where the rights holder grants permission to apply one of the six Creative Commons licences
  • Rights Reserved - where access to the objects is provided, and additional permissions are required for re-use
  • Unknown Status - where the rights are unknown, or the object is a legally recognised Orphan Work

Within the (non-commercial) research domain, any content from a museum, library or other data provider with a Public Domain mark or a Creative Commons licence can be used. The precise nature of this re-use (eg whether the work can be edited) depends on the exact licence being applied.
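
The rule of thumb above can be written down as a small filter over the four categories. This is only a sketch: the `RightsCategory` enum, the `usable_for_research` helper and the sample objects are hypothetical, and the exact licence attached to each object still determines what kind of re-use is allowed.

```python
from enum import Enum


class RightsCategory(Enum):
    """The four categories of rights statements described above."""
    PUBLIC_DOMAIN = "Public Domain"
    CREATIVE_COMMONS = "Creative Commons licences"
    RIGHTS_RESERVED = "Rights Reserved"
    UNKNOWN_STATUS = "Unknown Status"


# Categories usable for non-commercial research without asking for
# additional permission, following the rule of thumb in the text.
OPEN_FOR_RESEARCH = {RightsCategory.PUBLIC_DOMAIN, RightsCategory.CREATIVE_COMMONS}


def usable_for_research(category):
    """True if objects in this category can be re-used for research."""
    return category in OPEN_FOR_RESEARCH


# Hypothetical objects, invented for illustration.
objects = [
    ("17th-century map", RightsCategory.PUBLIC_DOMAIN),
    ("Newsreel footage", RightsCategory.RIGHTS_RESERVED),
    ("Museum photograph", RightsCategory.CREATIVE_COMMONS),
    ("Undated postcard", RightsCategory.UNKNOWN_STATUS),
]
reusable = [title for title, category in objects if usable_for_research(category)]
print(reusable)
```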

On the Europeana portal, the rights statements are accessible by clicking on the object you want to (re-)use. By clicking through the rights statement accompanying any object, you will reach a webpage that describes the rights and permissions in more detail.

Alastair Dunning, The Europeana Foundation

Ingeborg Verspille, Consortium of Europeana Research Libraries