The CENDARI White Book of Archives
File Type: PDF
Item Type: Report
Date: 2016
Access: openAccess
Citation: Jennifer Edmond, Jakub Beneš, Nataša Bulatović, Milica Knežević, Jörg Lehmann, Francesca Morselli, Andrei Zamoiski, 'The CENDARI White Book of Archives', [Report], 2016
Abstract:
Over its four-year timeline, the CENDARI project collected archival descriptions and metadata in various formats from a broad range of cultural heritage institutions. These data were drawn together and stored in a single repository. The repository contains curated data created manually by the CENDARI team, as well as data acquired either in spreadsheet format from small, ‘hidden’ archives or from large aggregators that have advanced data exchange tools in place.
While acquiring and curating heterogeneous data in a single repository is a technical challenge in itself, ingesting data into the CENDARI repository also makes it possible to process and index them through data extraction, entity recognition, semantic enhancement and other transformations. In this way the CENDARI project was able to act as a bridge between cultural heritage institutions and historical researchers: it drew together holdings from a broad range of institutions and enabled researchers to browse this heterogeneous content within a single search space.
This paper describes the range of ways in which the CENDARI project acquired data from cultural heritage institutions, together with the necessary technical background. By illustrating diverse data creation and acquisition strategies, multiple formats and technical solutions, and the strengths and drawbacks of a repository, this “White Book” aims to provide guidance, advice and best practices for archivists and cultural heritage institutions that collaborate, or plan to collaborate, with infrastructure projects.
Sponsor: European Union Framework Programme 7 (FP7)
Author's Homepage:
http://people.tcd.ie/edmondj
Author: Edmond, Jennifer; Beneš, Jakub; Bulatović, Nataša; Knežević, Milica; Lehmann, Jörg; Morselli, Francesca; Zamoiski, Andrei
Type of material: Report
Availability: Full text available