Digitization of SMNSs Collections
Definition
Digitization at ZFMK: implementation and maintenance of a Digital Collection Catalogue (DCC).
The Catalogue contains all the information from the collection objects, annotated with information about Places, Persons, Literature, and related research data (e.g. genetic barcodes).
Infrastructure
An infrastructure has been set up for the implementation of the DCC, consisting of:
Some Links
- ICDIG
- Deliverables
- WeDigBio
- Worldwide Engagement for Digitizing Biocollections
Rationale:
Practical challenges for researchers in data sharing: https://figshare.com/articles/Infographic_-_Practical_challenges_for_researchers_in_data_sharing/5996786
The Fourth Paradigm: Data-Intensive Scientific Discovery (https://www.microsoft.com/en-us/research/publication/fourth-paradigm-data-intensive-scientific-discovery/)
- „eScience is where IT meets scientists“ (Fourth Paradigm) page XViX, fig. 2
- empirical science → machine supported data exploration
- Main problem: capturing data is easier than curating it (Jim Gray), pages xiii-xiv, xx
Images
- Store images in P:\CollDig\<Section name>\Bilder. Create one subdirectory per taxon.
- The recommended image format is TIF. Resolution should be at least 300 dpi at A4 size.
- Image name: <collection number>_<running number>
- Color depth should be 8 bit. Images that will be used in a scientific context, e.g. CT scans or histological images, can have 16 bit.
- Current development in GFBio: https://gfbio.biowikifarm.net/internal/Subtask_5.4.3:_Support_for_multimedia_data
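The storage and naming rules above can be sketched in a few lines of Python. This is only an illustration, not an agreed tool: the function name `image_path`, the three-digit zero padding of the running number, the lowercase `tif` extension, and the example section, taxon, and collection number are all assumptions that the guideline does not specify.

```python
from pathlib import PureWindowsPath

# Root of the shared image store, as given in the guideline.
IMAGE_ROOT = PureWindowsPath(r"P:\CollDig")


def image_path(section, taxon, collection_number, running_number, extension="tif"):
    """Build the full path of an image file below the section's Bilder folder."""
    if running_number < 1:
        raise ValueError("running number must be 1 or greater")
    # Assumption: pad the running number to three digits so files sort correctly.
    file_name = f"{collection_number}_{running_number:03d}.{extension}"
    # One subdirectory per taxon, as required by the guideline.
    return IMAGE_ROOT / section / "Bilder" / taxon / file_name


# Hypothetical section name, taxon, and collection number.
print(image_path("Herpetology", "Lacerta agilis", "12345", 1))
```

`PureWindowsPath` is used so that the sketch produces Windows-style paths even when it is run on another operating system.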
Person names
Taxon Names
Data Quality
Sampling-event datasets
Taken from: http://bid.gbif.org/en/community/data-quality/#sampling
Learn more about sampling-event and other classes of datasets currently supported on GBIF.org.
Darwin Core record details
| Term | Status |
|---|---|
| eventID | Required |
| eventDate | Required |
| countryCode | Required |
| samplingProtocol | Required |
| sampleSizeValue & sampleSizeUnit | Required |
| parentEventID | Strongly recommended |
| samplingEffort | Strongly recommended |
| locationID | Strongly recommended |
| decimalLatitude & decimalLongitude | Strongly recommended |
| geodeticDatum | Strongly recommended |
| coordinateUncertaintyInMeters | Strongly recommended |
| footprintWKT | Strongly recommended |
| occurrenceStatus | Strongly recommended |
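As a rough illustration of how the term list above could be checked against a record, the following sketch reports which required and strongly recommended terms are missing or empty. The function name `missing_terms`, the dictionary-based record, and the example values are assumptions made for this sketch. The term names follow the Darwin Core spelling, in particular `sampleSizeValue` and `sampleSizeUnit`.

```python
# Terms and their status, taken from the GBIF sampling-event list.
REQUIRED = (
    "eventID", "eventDate", "countryCode", "samplingProtocol",
    "sampleSizeValue", "sampleSizeUnit",
)
STRONGLY_RECOMMENDED = (
    "parentEventID", "samplingEffort", "locationID",
    "decimalLatitude", "decimalLongitude", "geodeticDatum",
    "coordinateUncertaintyInMeters", "footprintWKT", "occurrenceStatus",
)


def missing_terms(record):
    """Return the terms that are absent or empty, grouped by status."""
    def is_missing(term):
        value = record.get(term)
        return value is None or str(value).strip() == ""

    return {
        "required": [t for t in REQUIRED if is_missing(t)],
        "strongly_recommended": [t for t in STRONGLY_RECOMMENDED if is_missing(t)],
    }


# Hypothetical record; all values are made up for illustration.
record = {
    "eventID": "ev-001",
    "eventDate": "2018-05-12",
    "countryCode": "DE",
    "samplingProtocol": "pitfall trap",
    "sampleSizeValue": "5",
    "sampleSizeUnit": "",  # an empty string counts as missing
    "decimalLatitude": "50.72",
    "decimalLongitude": "7.11",
    "geodeticDatum": "WGS84",
}
report = missing_terms(record)
print("Missing required terms:", report["required"])
print("Missing recommended terms:", report["strongly_recommended"])
```

The sketch only checks that a term is present. It does not validate values, for example whether `eventDate` is a well-formed date.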
I (Björn) am quite unsure about this kind of recommendation, for the following reasons:
- As long as we do not (or cannot) provide an easy-to-use infrastructure for external IDs, it is not practical to provide locationIDs, parentEventIDs, or eventIDs. There is a real danger that we collect a haystack of private IDs that can never be resolved to events, locations, etc. It would be much better to define a small set of relevant event parameters that must be provided in a standardized form: time, location in WGS 84 coordinates, and the names and institutions of the persons involved.
- The Darwin Core Reference Guide provides vague example texts and examples for external references, but neither a real vocabulary nor a data model.
- There is no applicable standard for describing sampling effort. Most of this information will be irrelevant anyway, because other researchers are in completely different situations when doing their own research.
- footprintWKT would require our users to know WKT.
- occurrenceStatus is either present or absent. It does not provide any relevant information when we deal with collected specimens, but would be relevant for ecological or occurrence studies.
- The recommendations are not sufficient for ecological or occurrence studies
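The small set of standardized event parameters proposed above (time, location in WGS 84 coordinates, names and institutions of the persons involved) could look roughly like the following sketch. The class and field names are hypothetical and do not describe an existing DCC data model. The `footprint_wkt` method only illustrates that a WKT point can be generated from the coordinates, so that users would not need to write WKT themselves.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class Person:
    name: str
    institution: str


@dataclass(frozen=True)
class CollectingEvent:
    time: datetime
    latitude: float   # decimal degrees, WGS 84
    longitude: float  # decimal degrees, WGS 84
    persons: Tuple[Person, ...]

    def __post_init__(self):
        # Reject values that cannot be valid WGS 84 coordinates.
        if not -90 <= self.latitude <= 90:
            raise ValueError("latitude must be between -90 and 90")
        if not -180 <= self.longitude <= 180:
            raise ValueError("longitude must be between -180 and 180")
        if not self.persons:
            raise ValueError("at least one person is required")

    def footprint_wkt(self):
        """Return the location as a WKT point (x = longitude, y = latitude)."""
        return f"POINT ({self.longitude} {self.latitude})"


# Illustrative values only.
event = CollectingEvent(
    time=datetime(2018, 5, 12, 9, 30),
    latitude=50.72,
    longitude=7.11,
    persons=(Person("Jane Doe", "ZFMK"),),
)
print(event.footprint_wkt())
```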