Digital Preservation Plan

The digital preservation plan for research data and related media at the SMNS covers collection-based research data, biodiversity monitoring, biobanking, and DNA sequencing data. The plan is reviewed at least once a year and is updated whenever one of its components changes.

The „Stabsstelle IT-Infrastruktur und Biodiversitätsinformatik“ (IT Infrastructure and Biodiversity Informatics staff unit) at the SMNS is responsible for backup, archiving, assessment, and recovery. The unit manages the different storage types and their recovery, the hardware at the SMNS, and the software for primary and secondary storage.

Primary storage is the curated storage level, i.e. storage that clients can access to manage their data; it consists of relational databases and files. Copies, database dumps, and backups are kept on secondary storage, which is hard-disk-based and runs on servers separate from primary storage. Two identical copies of these backups are stored in two different buildings of the SMNS.
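The requirement that both backup copies are identical can be checked byte for byte. The following is only a minimal sketch and not the procedure used at the SMNS: the function name `differing_files` and the idea of comparing two mounted directories are assumptions made for illustration.

```python
import filecmp
from pathlib import Path


def differing_files(copy_a: Path, copy_b: Path) -> list[str]:
    """Return relative paths that are missing from one copy or differ in content.

    copy_a and copy_b are hypothetical mount points of the two backup copies.
    """
    names_a = {p.relative_to(copy_a).as_posix() for p in copy_a.rglob("*") if p.is_file()}
    names_b = {p.relative_to(copy_b).as_posix() for p in copy_b.rglob("*") if p.is_file()}

    # Files present in only one of the two copies.
    problems = set(names_a ^ names_b)

    # Files present in both copies: compare contents, not just size and mtime.
    for name in names_a & names_b:
        if not filecmp.cmp(copy_a / name, copy_b / name, shallow=False):
            problems.add(name)

    return sorted(problems)
```

An empty result means the two copies hold the same files with the same contents.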

The data and files are copied and archived on the following schedule:

  1. primary storage: on change and hourly incremental backups
  2. secondary storage: daily
  3. long-term archiving: daily

Storage media are purchased from the institutional budget, supplemented by third-party funding.

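The archiving workflow at the SMNS is set up according to the OAIS (Open Archival Information System) standard. Submission Information Packages (SIPs) are preserved as they are received. Archival Information Packages (AIPs) are created from the curated original data on a regular basis.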

Disaster planning covers the following scenarios:

  1. Natural disaster:
    1. Flooding or fire: recovery from tape archives
    2. Overvoltage: overvoltage protection; if this fails and media failure occurs, see 3)
  2. Human failure:
    1. Deletion or overwriting of information: change tracking of editing steps (Diversity Workbench)
    2. Overwriting of backup copies or database dumps: files are stored with a timestamp in the filename, which prevents accidental overwriting
  3. Media failure:
    1. Aging of servers or storage media: hardware is renewed every five to six years
    2. Obsolescence of archiving media: tape drives remain backward compatible with the last-but-one tape version. One version cycle is 5 years, so readability is guaranteed for 10 years.
    3. Obsolescence of software: system applications are updated regularly. Original data is regularly saved as text files and can therefore be restored to other systems.
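The timestamp-in-filename safeguard listed under human failure can be sketched as follows. This is an illustration under assumptions, not the SMNS implementation: the naming pattern, the helper names `timestamped_name` and `write_dump`, and the use of UTC are all hypothetical.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def timestamped_name(stem: str, suffix: str, when: datetime) -> str:
    """Build a name such as 'collection_20240131T020000Z.sql' (assumed pattern)."""
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"{stem}_{stamp}{suffix}"


def write_dump(target_dir: Path, stem: str, suffix: str, payload: bytes,
               when: Optional[datetime] = None) -> Path:
    """Write payload to a new timestamped file and never overwrite an existing one."""
    when = when or datetime.now(timezone.utc)
    path = target_dir / timestamped_name(stem, suffix, when)
    # Mode "xb" raises FileExistsError if a file with this name already exists,
    # so even two dumps with the same timestamp cannot silently replace each other.
    with open(path, "xb") as handle:
        handle.write(payload)
    return path
```

Opening the file in exclusive-create mode adds a second safeguard on top of the unique name: a name collision results in an error instead of lost data.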

Recovery of files on failure:

  1. SQL Server: from backups via „Restore Database“
  2. PostgreSQL/MySQL: from database dumps
  3. Media: from backup copies

In case of file format obsolescence, i.e. when the system in use can no longer read the backup copies, the data are restored from the backup text files.

Plans for obsolescence:

The Biodiversity Informatics section at the SMNS is responsible for the integrity of digital files. Fixity checks are planned for the future: Tripwire will be set up to monitor file changes.
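The idea behind a fixity check can be illustrated with a checksum manifest. The sketch below is not Tripwire and does not reflect its configuration; it only shows the underlying principle using SHA-256 from the Python standard library. The function names and the manifest layout are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report files that are missing, newly added, or changed since the baseline."""
    shared = baseline.keys() & current.keys()
    return {
        "missing": sorted(baseline.keys() - current.keys()),
        "added": sorted(current.keys() - baseline.keys()),
        "changed": sorted(name for name in shared if baseline[name] != current[name]),
    }
```

A baseline manifest would be stored when files are archived and compared against a freshly built manifest at each check; any entry under "missing" or "changed" indicates a possible integrity problem.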