Team Digital Preservation
In June 2022, a dedicated team was established at the National Library of Norway to manage the preservation of its digital collection. The team is responsible for handling all types of digital material, whether digitized from analog sources or born-digital. This includes media formats such as websites, text documents, images, audio, and moving images.
Areas of responsibility include managing long-term digital preservation solutions and working across the entire process: ingest, quality control, storage, preservation, and access. Data selected for long-term preservation typically consists of large, high-quality files, as opposed to compressed access copies.
The Digital Preservation team collaborates closely with several other specialized media teams within the institution. In addition to receiving digital material covered by the Legal Deposit Act, the National Library of Norway also produces large volumes of data through digitization efforts. This includes both material from its own collections and from institutions across the archive, library, and museum (ALM) sector.
The team is a member of the Digital Preservation Coalition.
Organisation
The Digital Preservation team consists of 8 members.
The team reports to a committee of leaders responsible for this area at the National Library. The committee members are:
- IT Director (Product owner)
- Director of Digitalizing Cultural Heritage
- Head of Metadata Standards Development Section
- Head of IT Platform Section
The National Library’s digital collection in numbers
- Over 2 billion files
- More than 100 different file formats
- 18 Petabytes of data (that’s 18,000 Terabytes!) stored in 3 copies
- The largest single file is 2.5 Terabytes
- Daily ingest of new material averages over 6 Terabytes
Data volume by type
- Video and television: 22%
- Film: 21%
- Newspapers: 19%
- Web Archive: 16%
- Radio and audio: 12%
- Books: 8%
- Photos: 2%
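To make the percentages above more tangible, the following minimal sketch converts them into approximate volumes. It assumes the shares apply to the 18 Petabytes quoted above, uses 1 Petabyte = 1,000 Terabytes as the text does, and treats the daily ingest of "over 6 Terabytes" as a lower bound. The function names are illustrative only.

```python
# Figures quoted in the text; 1 PB = 1,000 TB, matching the text's own conversion.
TOTAL_PB = 18
DAILY_INGEST_TB = 6  # the text says "over 6 Terabytes", so this is a lower bound

# Share of total data volume per material type, in percent.
SHARE_PERCENT = {
    "Video and television": 22,
    "Film": 21,
    "Newspapers": 19,
    "Web Archive": 16,
    "Radio and audio": 12,
    "Books": 8,
    "Photos": 2,
}


def volume_by_type(total_pb, shares):
    """Return the approximate volume in PB for each material type."""
    return {name: total_pb * pct / 100 for name, pct in shares.items()}


def yearly_ingest_pb(daily_tb, days=365):
    """Return the yearly ingest in PB for a given daily ingest in TB."""
    return daily_tb * days / 1000


if __name__ == "__main__":
    # The listed shares should account for the whole collection.
    assert sum(SHARE_PERCENT.values()) == 100
    for name, pb in volume_by_type(TOTAL_PB, SHARE_PERCENT).items():
        print(f"{name}: about {pb:.2f} PB")
    print(f"Yearly ingest: at least {yearly_ingest_pb(DAILY_INGEST_TB):.2f} PB")
```

Under these assumptions, video and television alone account for roughly 4 Petabytes, and a daily ingest of 6 Terabytes adds a little over 2 Petabytes per year.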
Technologies used in digital preservation work
- Apache Kafka for sending messages between systems
- Apache NiFi for running the data flows that validate, move, and package data
- MariaDB as the database engine
- DROID and Siegfried for file format identification
- Grafana for statistics and reporting
- IBM High Performance Storage System (HPSS) as bit repository
- CentOS Linux as server platform
- CommonsIP for packaging and validating archival packages according to the E-ARK standard
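As an illustration of what file format identification involves, the sketch below calls Siegfried from Python and extracts the identified format of each file. The text does not describe how the team actually invokes the tool. The sketch assumes the `sf` command is installed and on the PATH, and that its `-json` output follows the layout of recent Siegfried releases. The function names and the sample report are hypothetical.

```python
import json
import subprocess


def parse_siegfried_json(text):
    """Map each filename to its list of (namespace, id, format name) matches."""
    report = json.loads(text)
    result = {}
    for entry in report.get("files", []):
        result[entry["filename"]] = [
            (match.get("ns"), match.get("id"), match.get("format"))
            for match in entry.get("matches", [])
        ]
    return result


def identify(path, sf_binary="sf"):
    """Run Siegfried on a file or directory and return the parsed matches.

    Assumes the Siegfried command-line tool is installed as `sf`.
    """
    completed = subprocess.run(
        [sf_binary, "-json", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_siegfried_json(completed.stdout)


# Hypothetical, trimmed-down report used only to demonstrate the parser.
SAMPLE_REPORT = """
{
  "files": [
    {
      "filename": "example.tif",
      "matches": [
        {"ns": "pronom", "id": "fmt/353", "format": "Tagged Image File Format"}
      ]
    }
  ]
}
"""

if __name__ == "__main__":
    for filename, matches in parse_siegfried_json(SAMPLE_REPORT).items():
        for namespace, identifier, name in matches:
            print(f"{filename}: {namespace} {identifier} ({name})")
```

Keeping the parsing separate from the subprocess call means the same function can process reports produced elsewhere, for example in a batch job.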