====== Digitization of SMNSs Collections ======

FIXME

====== Definition ======

Digitization at ZFMK: implementation and maintenance of a **Digital Collection Catalogue** (DCC). The Catalogue contains all the information from the collection objects, annotated with information about places, persons, literature, and related research data (e.g. genetic barcodes).

====== Infrastructure ======

An infrastructure has been set up for the implementation of the DCC, consisting of:

  * [[dwb:overview| Diversity Workbench framework]]
  * [[d_a_m:overview| Digital Asset Management System]] FIXME
  * [[biodivinf:digitization:portal| Digital Collection Portal]]
  * [[biodivinf:digitization:statistics| Digitization Statistics]]
  * [[biocase:overview| BioCASe Provider Software]]
  * [[biodivinf:digitization:material| Equipment for Digitization]]
  * [[biodivinf:digitization:withhold| Withhold demands]]
  * [[biodivinf:digitization:versioning| Versioning]]

===== Some Links =====

  ; ICEDIG : [[https://icedig.eu/content/deliverables|Deliverables]]
  ; WeDigBio : [[https://wedigbio.org/|Worldwide Engagement for Digitizing Biocollections]]

===== Rationale =====

Practical challenges for researchers in data sharing: https://figshare.com/articles/Infographic_-_Practical_challenges_for_researchers_in_data_sharing/5996786

The Fourth Paradigm: Data-Intensive Scientific Discovery (https://www.microsoft.com/en-us/research/publication/fourth-paradigm-data-intensive-scientific-discovery/)

  - "eScience is where IT meets scientists" (Fourth Paradigm), page XViX, fig. 2
  - empirical science -> machine-supported data exploration
  - Main problem: capturing data is easier than curating it (Jim Gray), pages xiii-xiv, xx

===== Images =====

  * Images in: P:\CollDig\
\Bilder. Create a subdirectory per taxon.
  * The recommended image format is TIF. Resolution should be at least 300 dpi at A4 size.
  * Image name: collectionnumber_runningnumber (collection number, underscore, running number)
  * Color depth should be 8 bit. Images that are going to be used in a scientific context, e.g. CT scans or histological images, can have 16 bit.
  * [[https://physalia.evolution.uni-bonn.de/dzviewer/index.html?imageURL=https://physalia.evolution.uni-bonn.de/dumping/ZFMK/ZFMK_53.jpg|DeepZoom-Viewer]]
  * Current development in GFBio: https://gfbio.biowikifarm.net/internal/Subtask_5.4.3:_Support_for_multimedia_data

===== Person names =====

  ; VIAF : http://viaf.org/viaf/partnerpages/DNB.html
  ; DNB : https://portal.dnb.de/opac.htm?query=per%3D%22M%C3%BCller%2C+Johann+Friedrich+Theodor%22+sortBy+ka%2Fsort.ascending&method=simpleSearch&cqlMode=true => Enter the ID in the URL field: http://d-nb.info/gnd/118737457

===== Taxon Names =====

  * [[https://bdj.pensoft.net/articles.php?id=9787&instance_id=3338935 | Taxonaut: an application software for comparative display of multiple taxonomies with a use case of GBIF Species API]]
  * [[https://www.gbif-uat.org/developer/species|GBIF Species API]]

===== Data Quality =====

==== Sampling-event datasets ====

Taken from: http://bid.gbif.org/en/community/data-quality/#sampling

Learn more about sampling-event and other classes of datasets currently supported on GBIF.org.

Darwin Core record details:

  ; //Term// : //Status//
  ; eventID : Required
  ; eventDate : Required
  ; countryCode : Required
  ; samplingProtocol : Required
  ; sampleSizeValue & sampleSizeUnit : Required
  ; parentEventID : Strongly recommended
  ; samplingEffort : Strongly recommended
  ; locationID : Strongly recommended
  ; decimalLatitude & decimalLongitude : Strongly recommended
  ; geodeticDatum : Strongly recommended
  ; coordinateUncertaintyInMeters : Strongly recommended
  ; footprintWKT : Strongly recommended
  ; occurrenceStatus : Strongly recommended

FIXME I (Björn) am quite unsure about recommendations of this kind
because:
  * As long as we do not (and cannot) provide an easy-to-use infrastructure for external IDs, it is not practical to provide locationIDs, parentEventIDs, or eventIDs. There is a real danger that we collect a haystack of private IDs that can never be resolved to events, locations, etc. It would be much better to define a small set of relevant event parameters that must be provided in a standardized form: time, location in WGS 84 coordinates, and names and institutions of the persons involved.
  * The [[http://tdwg.github.io/dwc/terms/index.htm|Darwin Core Reference Guide]] provides vague example texts and examples for external references, but neither a real vocabulary nor a data model.
  * There is no applicable standard for defining sampling effort. Most of this information will be irrelevant, because other researchers are in completely different situations when doing their own research.
  * footprintWKT would require that our users know WKT.
  * occurrenceStatus is either present or absent; it does not provide any relevant information when we deal with collected specimens, but would be relevant for ecological or occurrence studies.
  * The recommendations are not sufficient for ecological or occurrence studies.
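
Whatever one thinks of the recommendations, the term list for sampling-event datasets can be checked mechanically. The following is only a minimal sketch in Python, assuming a record is available as a plain dictionary keyed by Darwin Core term names. It uses the Darwin Core spellings ''sampleSizeValue'', ''sampleSizeUnit'' and ''decimalLongitude''. The function names and the example record are hypothetical and not part of any existing tool.

```python
# Term lists transcribed from the GBIF sampling-event table in this section.
REQUIRED = [
    "eventID", "eventDate", "countryCode", "samplingProtocol",
    "sampleSizeValue", "sampleSizeUnit",
]
RECOMMENDED = [
    "parentEventID", "samplingEffort", "locationID",
    "decimalLatitude", "decimalLongitude", "geodeticDatum",
    "coordinateUncertaintyInMeters", "footprintWKT", "occurrenceStatus",
]


def is_blank(value):
    """True if the value is missing or contains only whitespace."""
    return value is None or str(value).strip() == ""


def missing_terms(record, terms):
    """Return the terms (in list order) that are absent or blank in the record."""
    return [term for term in terms if is_blank(record.get(term))]


def check_event(record):
    """Report missing required and strongly recommended terms (hypothetical helper)."""
    return {
        "required": missing_terms(record, REQUIRED),
        "recommended": missing_terms(record, RECOMMENDED),
    }


# Made-up example record, for illustration only.
example = {
    "eventID": "example-event-1",
    "eventDate": "2018-05-14",
    "countryCode": "DE",
    "samplingProtocol": "Malaise trap",
    "sampleSizeUnit": "  ",        # blank, so it counts as missing
    "decimalLatitude": 50.72,
    "decimalLongitude": 7.11,
    "geodeticDatum": "WGS84",
}

report = check_event(example)
print("Missing required terms:   ", report["required"])
print("Missing recommended terms:", report["recommended"])
```

A check like this only tests whether a term is present. It cannot tell whether an eventID or locationID can actually be resolved, which is the concern raised in the list above.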