Digitization of SMNS's Collections

FIXME

Definition

Digitization at ZFMK: implementation and maintenance of a Digital Collection Catalogue (DCC).

The Catalogue contains all the information from the collection objects, annotated with information about places, persons, literature, and related research data (e.g. genetic barcodes).

Infrastructure

Rationale:

Practical challenges for researchers in data sharing: https://figshare.com/articles/Infographic_-_Practical_challenges_for_researchers_in_data_sharing/5996786

The Fourth Paradigm: Data-Intensive Scientific Discovery (https://www.microsoft.com/en-us/research/publication/fourth-paradigm-data-intensive-scientific-discovery/)

  1. "eScience is where IT meets scientists" (Fourth Paradigm), page XViX, fig. 2
  2. empirical science → machine supported data exploration
  3. Main problem: capturing data is easier than curating it (Jim Gray), pages xiii-xiv, xx

Images

  • Store images in P:\CollDig\<Section name>\Bilder, with one subdirectory per taxon.
  • The recommended image format is TIFF, at a resolution of at least 300 dpi at A4 size.
  • Image name: <collection number>_<running number>
  • Color depth should be 8 bit. Images intended for use in a scientific context, e.g. CT scans or histological images, may use 16 bit.
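The naming and resolution rules above can be checked mechanically. The following is a minimal sketch, not an agreed implementation. The character set allowed in a collection number, the ".tif"/".tiff" extension, and the function names are assumptions made for illustration. The section and taxon names in the usage note are hypothetical.

```python
import re
from pathlib import PureWindowsPath

# Root directory as given in the guideline above.
ROOT = PureWindowsPath(r"P:\CollDig")

# Assumption: a collection number consists of letters, digits, dots or
# hyphens, and the running number is a plain integer.
IMAGE_NAME = re.compile(
    r"^(?P<collection_number>[A-Za-z0-9.\-]+)_(?P<running_number>\d+)\.tiff?$",
    re.IGNORECASE,
)

MM_PER_INCH = 25.4
A4_MM = (210, 297)  # short side, long side


def parse_image_name(filename):
    """Return (collection number, running number), or None if the name does not fit."""
    match = IMAGE_NAME.match(filename)
    if match is None:
        return None
    return match.group("collection_number"), int(match.group("running_number"))


def image_path(section, taxon, collection_number, running_number):
    """Build the storage path <root>\\<section>\\Bilder\\<taxon>\\<name>.tif."""
    name = f"{collection_number}_{running_number}.tif"
    return ROOT / section / "Bilder" / taxon / name


def min_pixels_for_a4(dpi=300):
    """Pixel dimensions (short side, long side) of an A4 page at the given dpi."""
    return tuple(round(mm / MM_PER_INCH * dpi) for mm in A4_MM)
```

For example, `image_path("Herpetology", "Anura", "ZFMK-123", 2)` builds the path for the second image of a hypothetical specimen, and `min_pixels_for_a4()` gives the pixel size that corresponds to 300 dpi at A4.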

Person names

Taxon Names

Data Quality

Sampling-event datasets

Taken from: http://bid.gbif.org/en/community/data-quality/#sampling

Learn more about sampling-event and other classes of datasets currently supported on GBIF.org.

Darwin Core record details:

Term                                   Status
eventID                                Required
eventDate                              Required
countryCode                            Required
samplingProtocol                       Required
samplingSizeValue & samplingSizeUnit   Required
parentEventID                          Strongly recommended
samplingEffort                         Strongly recommended
locationID                             Strongly recommended
decimalLatitude & decimalLongitude     Strongly recommended
geodeticDatum                          Strongly recommended
coordinateUncertaintyInMeters          Strongly recommended
footprintWKT                           Strongly recommended
occurrenceStatus                       Strongly recommended
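To see how a record measures up against the required and strongly recommended terms listed above, a simple completeness check may help. This is a minimal sketch that assumes each event is available as a plain dictionary keyed by Darwin Core term names. The function name `check_event` is hypothetical, and the check only tests for presence, not for valid values.

```python
# Terms as listed in the GBIF recommendations above.
REQUIRED = [
    "eventID",
    "eventDate",
    "countryCode",
    "samplingProtocol",
    "samplingSizeValue",
    "samplingSizeUnit",
]
RECOMMENDED = [
    "parentEventID",
    "samplingEffort",
    "locationID",
    "decimalLatitude",
    "decimalLongitude",
    "geodeticDatum",
    "coordinateUncertaintyInMeters",
    "footprintWKT",
    "occurrenceStatus",
]


def is_missing(record, term):
    """A term counts as missing if it is absent, None, or blank."""
    value = record.get(term)
    return value is None or str(value).strip() == ""


def check_event(record):
    """Return the missing terms, grouped by status."""
    return {
        "required": [t for t in REQUIRED if is_missing(record, t)],
        "recommended": [t for t in RECOMMENDED if is_missing(record, t)],
    }
```

A record is publishable as a sampling-event dataset row only if the "required" list comes back empty. The "recommended" list shows which optional terms are still open.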

FIXME I (Björn) am quite unsure about these recommendations, because:

  • As long as we do not (or cannot) provide an easy-to-use infrastructure for external IDs, it is not practical to provide locationIDs, parentEventIDs, or eventIDs. There is a real danger that we collect a haystack of private IDs that can never be resolved to events, locations, etc. It would be much better to define a small set of relevant event parameters that must be provided in a standardized form: time, location in WGS 84 coordinates, and the names and institutions of the persons involved.
  • The Darwin Core Reference Guide provides vague example texts and examples of external references, but neither a real vocabulary nor a data model.
  • There is no applicable standard for describing sampling effort. Most of this information will be irrelevant, because other researchers are in completely different situations when doing their own research.
  • footprintWKT would require that our users know WKT (Well-Known Text).
  • occurrenceStatus is either present or absent. It provides no relevant information when we deal with collected specimens, but would be relevant for ecological or occurrence studies.
  • The recommendations are not sufficient for ecological or occurrence studies.
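The small set of standardized event parameters proposed in the first point could look roughly like the following. This is a sketch for discussion only. The class and field names are hypothetical, and the mapping to Darwin Core terms is one possible choice, not an agreed data model. Institutions are kept in the record but not exported, since the sketch assumes no particular Darwin Core term for them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Person:
    name: str
    institution: str


@dataclass(frozen=True)
class CollectingEvent:
    """Minimal event: time, WGS 84 location, and the persons involved."""

    event_date: date
    latitude: float   # decimal degrees, WGS 84
    longitude: float  # decimal degrees, WGS 84
    persons: tuple = ()

    def __post_init__(self):
        # Reject coordinates outside the valid WGS 84 ranges.
        if not -90 <= self.latitude <= 90:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180 <= self.longitude <= 180:
            raise ValueError(f"longitude out of range: {self.longitude}")

    def to_darwin_core(self):
        """Export to Darwin Core terms (assumed mapping, see lead-in)."""
        return {
            "eventDate": self.event_date.isoformat(),
            "decimalLatitude": self.latitude,
            "decimalLongitude": self.longitude,
            "geodeticDatum": "EPSG:4326",
            "recordedBy": " | ".join(p.name for p in self.persons),
        }
```

Because the record carries its own time, coordinates, and person names, it can be interpreted without resolving any private ID. Identifiers such as eventID could still be generated later, once an infrastructure for them exists.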