European experiments

Experiment 1: Explore the SCAPE Project’s Matchbox tool

The experiment addressed a recurring issue in archiving digital assets: unwanted duplication of content. The problem applies to any kind of digital content, from images and audio files to videos and research data.
The SCAPE Project's Matchbox was tested as a tool that allows DCH practitioners to detect duplicates in digital images. KIK-IRPA (Belgium) and KANUT (Estonia) participated in this experiment. In particular, the objectives were to assess:

  • Ease of installation of the tool
  • Ease of use for digital librarians and archivists
  • Tool accuracy in detecting duplicate images

The results of the experiment indicated that the basic tool works well: the code is clearly written, and it is stable enough to handle the broken files included in the test. The tool looks promising, and there may be demand for it. However, several drawbacks need to be addressed before Matchbox can be used in a production environment.
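
To make the duplicate-detection task concrete, the simplest case is a byte-identical copy, which can be found by comparing checksums. The Python sketch below groups files by SHA-256 digest. It is only an illustration under stated assumptions: it does not reproduce Matchbox's own method, which is not described here, and the function names `file_digest` and `find_exact_duplicates` are hypothetical. Images that have been rescaled, cropped or re-encoded would not be caught by this approach.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root):
    """Group files under `root` by digest; return only groups of two or more files."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates` on a directory of images returns a list of groups, each holding the paths of files with identical content.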

Experiment 2: Investigate the SCIDIP-ES Project’s HAPPI platform

This experiment concerned the Handling Authenticity, Provenance and Persistent Identifiers (HAPPI) toolkit, developed by the SCIDIP-ES project. The toolkit supports the digital archivist in collecting the evidence history of the digital artefact to be archived, which forms part of its Preservation Description Information (PDI). The experiment consisted of two separate tests:

  • Deployment and Setup of SCIDIP-ES HAPPI in the EGI Federated Cloud Environment
  • Evaluation of the SCIDIP-ES HAPPI Data Model in the DCH community

In the first test, the HAPPI toolkit was successfully packaged into a virtual appliance and deployed on the EGI Federated Cloud infrastructure. This demonstrated how easy it is for small IT departments or IT-experienced individuals to install and provision the toolkit. Since its deployment and setup, HAPPI toolkit 1.5.0 has been running continuously without issues or interruptions of operation. This indicates a good level of maturity, both of the toolkit and of the underlying Cloud infrastructure. Moreover, the HAPPI toolkit instance does not integrate with the EGI authentication framework, demonstrating effective separation of infrastructure management authentication and infrastructure user authentication.

The second test is still ongoing, and the participants have agreed to continue it beyond the end of the DCH-RP project. Although final results therefore cannot yet be discussed, it is reasonable to assert that “HAPPI is a sample service for data provenance, facilitating repeatable science”, and that it could also be applied to the DCH-RP community, thanks to its generic provenance model based on OPM and PREMIS.

Experiment 3: Evaluate EUDAT storage services

The aim of this experiment was to verify whether and how services developed by EUDAT and at national level may address the needs of the DCH community. EUDAT’s B2SHARE and B2SAFE services were tested, respectively, as a solution for data publication and sharing and as a means of ensuring robust and reliable data replication that guards against data loss.

The detailed objectives of this experiment were:

  • Verify the usability of the EUDAT B2SHARE service for DCH communities in terms of the following requirements:
      • simple data upload and access
      • easy and effective data sharing
      • ensuring long-term referability of the data
  • Examine the usefulness of Europe-wide and national solutions for long-term data and metadata preservation. Usefulness is considered in the following aspects:
      • reliability of the long-term storage process
      • transparency of the data protection mechanisms from the point of view of the service that end-users interact with directly (e.g. the data publication and sharing service)

Preliminary results of the evaluation show that the B2SHARE service provides a solution suitable for the needs of small cultural institutions and of citizens acting as “publishers” or “curators”. However, there are bugs and limitations that prevent the service from being used in a production environment. More thorough testing is needed to detect further major and minor bugs, and users should be consulted on improving a number of functions.

Experiment 4: Re-evaluate the eCSG and remote Grid/Cloud storage services from PoC1

The experiment aimed to demonstrate that the use of the e-Culture Science Gateway (eCSG), and the procedure for automatically uploading data into Grid/Cloud storage, could be made simpler for general uploaders as well, given the unavoidable limitations of file management on external storage.
To ease uptake of the experiment and local support at ICCU, a side project was conducted to establish and configure an Identity Provider service at ICCU with the help of INFN Catania and GARR. The test successfully demonstrated that customised uploaders allow DCH institutions to use the eCSG to store their digital assets automatically. Moreover, it is now possible to create an uploader portlet that can be easily customised for different metadata schemas and formats. (See also Experiment 6)

Experiment 5: A long-term data preservation platform

This experiment explored the capabilities and service levels of data preservation in the Cloud with respect to standards, services and methods. It aimed to deploy a platform able to create and access community-specific, OAIS-compliant Information Packages (IPs).
The deployed system was based on open technologies, standards and recommendations such as Tomcat, ODE, SOLR and RDF (OAI-ORE). The system is up and running at http://kokum.fernuni-hagen.de:8080/JSFGui/start.jsf


For the first use case, data from OpenAire.eu, a metadata repository service, was used. The test initially focused on metadata package generation, including community metadata. The following steps were investigated:

  • Harvest, via OAI-PMH, a collection of data objects including metadata and supplementary data consisting of PDF documents
  • Focus on OAIS-compliant metadata packaging
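
The harvesting step can be sketched in Python as follows. This is a minimal illustration of an OAI-PMH `ListRecords` harvest, not the implementation used in the experiment: the helper names (`list_records_url`, `parse_list_records`, `harvest_identifiers`) are hypothetical, the repository base URL is left to the caller, and the `oai_dc` metadata prefix is an assumption. The `fetch` argument is any function that takes a URL and returns the response body, so the sketch does not depend on a particular HTTP client.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 protocol for response elements.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces all others except verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from one response page."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


def harvest_identifiers(base_url, fetch, metadata_prefix="oai_dc"):
    """Follow resumption tokens until the repository reports no further pages."""
    identifiers, token = [], None
    while True:
        url = list_records_url(base_url, metadata_prefix, resumption_token=token)
        page_ids, token = parse_list_records(fetch(url))
        identifiers.extend(page_ids)
        if not token:
            return identifiers
```

In practice `fetch` could wrap `urllib.request.urlopen`. The harvested records and their supplementary PDF documents would then be passed on to the packaging step.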

The results of the experiment will be investigated further with other use-case partners, using more complex supplementary data objects such as 3D visualisations.