Peter Desmet
October 13, 2023

How we developed a data exchange format: Lessons learned from Camtrap DP

Talk at the TDWG 2023 annual conference - October 13, 2023.

Abstract: https://doi.org/10.3897/biss.7.111425

Recording: https://drive.google.com/file/d/1criNdDJqImxNbUpaS7yWBBRBzRz_h0iL/view?usp=sharing

Transcript

  1. How we developed a data exchange format: Lessons learned from Camtrap DP

    TDWG 2023, 13 October 2023
    Peter Desmet (0000-0002-8442-8025) & Jakub Bubnicki (0000-0002-2064-3113)
  2. What is Camtrap DP?

    - “Camera Trap Data Package”
    - Model and format to exchange camera trapping data
    - Designed to capture all essential data and metadata of a single camera trap study

    Desmet et al. (2021) bit.ly/camtrap-dp-tdwg-2021
  3. Why Camtrap DP?

    📈 Massive amounts of camera trap data
    ✅ Data management
    ✅ Data processing
    ❌ Data exchange, harmonization, publication
    ❌ Suitable standards

    Agouti, Wildlife Insights, TRAPPER, eMammal
    Bubnicki et al. (2023) doi.org/10.32942/X2BC8J
  4. Two guiding principles

    1. Camtrap DP should allow easy and interoperable data exchange
    2. Camtrap DP should be developed openly and collaboratively
  5. Use a simple data model

    - This is hard! 🤯
    - Three tables:
      - Deployments
      - Media
      - Observations
    - Supports wide range of:
      - Deployment designs
      - Classification techniques
      - Analytical use cases
  6. Build on Frictionless Standards

    - Open, generic specifications (JSON Schemas) to describe:
      - Datasets (Data Package)
      - Data files (Data Resource)
      - Table fields (Table Schema)
    - Simple, machine-usable & extensible
    - Camtrap DP = Frictionless DP
    - Existing software can be used to read and validate data

    specs.frictionlessdata.io
  7. Reuse existing standards

        {
          "name": "mediaID",
          "description": "Unique identifier of the media file.",
          "skos:broadMatch": "http://purl.org/dc/terms/identifier",
          "type": "string",
          "constraints": { "required": true, "unique": true }
        },
        {
          "name": "deploymentID",
          "description": "Identifier of the deployment …",
          "skos:broadMatch": "http://rs.tdwg.org/dwc/terms/parentEventID",
          "type": "string",
          "constraints": { "required": true }
        },
        {
          "name": "captureMethod",
          "description": "Method used to capture the media file.",
          "skos:broadMatch": "http://rs.tdwg.org/ac/terms/resourceCreationTechnique",
          "type": "string",
          "constraints": { "enum": ["motionDetection", "timeLapse"] }
        },
        {
          "name": "timestamp",
          "description": "Date and time at which the media file …",
          "skos:exactMatch": "http://ns.adobe.com/xap/1.0/CreateDate",
          "type": "datetime",
          "format": "%Y-%m-%dT%H:%M:%S%z",
          "constraints": { "required": true }
        }

    - Frictionless Standards:
      - Metadata terms, CSV dialect, field names, data types, required values, controlled values, etc.
    - TDWG and other standards:
      - Darwin Core
      - Audiovisual Core
      - Humboldt Extension
      - Dublin Core
      - DataCite Metadata Schema
    - Camtrap DP = Domain-specific and highly interoperable
  8. Develop openly

    - Code on GitHub:
      - Open source (MIT license)
      - Versioned (incl. semantic)
    - Collaboration on GitHub:
      - Suggestions as issues
      - Review, discuss, implement
      - Automated tests
    - Documentation on GitHub:
      - Website (updates automatically)
      - Example dataset

    github.com/tdwg/camtrap-dp
  9. Build into software

    - Entire ecosystem:
      - Management: Agouti, Trapper, other systems
      - Analysis: Camtraptor
      - Publication: GBIF IPT
    - Challenging to coordinate 😅
    - Feedback from adopters

    agouti.eu
    os-conservation.org/projects/trapper
    inbo.github.io/camtraptor
    gbif.org/ipt
  10. Embrace participation

    - Short, sticky name
    - Outreach:
      - Conferences & GBIF webinar
      - GBIF publication guide
    - Co-authors on paper
    - Use cases
    - Give it time, build trust

    Reyserhove et al. (2023) doi.org/10.35035/doc-0qzp-2x37
  11. Think about maintenance

    - Versioning:
      - Datasets point to a version of Camtrap DP
      - Camtraptor provides conversion
    - People:
      - Committed maintainers
      - TDWG community
    - Funding:
      - LifeWatch, OSCF, NLBIF
      - Biodiversa+
  12. Summary

    - Use a simple data model
    - Build on Frictionless Standards
    - Reuse existing standards
    - Develop openly
    - Build into software
    - Embrace participation
    - Think about maintenance
  13. Thank you

    camtrap-dp.tdwg.org

    Desmet P & Bubnicki J (2023) How we developed a data exchange format: Lessons learned from Camtrap DP. Presentation at TDWG 2023. https://bit.ly/camtrap-dp-tdwg-2023

    Open Science Conservation Fund
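
The field definitions on slide 7 carry enough information to check data mechanically, which is part of what "machine-usable" means on slide 6. The following is a minimal sketch in plain Python, not the Frictionless software and not necessarily how Camtrap DP tooling validates data. It hand-rolls checks for the constraints visible on that slide (`required`, `unique`, `enum`, and the datetime `format`). The names `MEDIA_FIELDS` and `validate_rows` and the two sample rows are hypothetical, introduced only for illustration; descriptions and `skos` mappings are left out for brevity.

```python
from datetime import datetime

# Field descriptors copied from slide 7 (descriptions and skos mappings omitted).
MEDIA_FIELDS = [
    {"name": "mediaID", "type": "string",
     "constraints": {"required": True, "unique": True}},
    {"name": "deploymentID", "type": "string",
     "constraints": {"required": True}},
    {"name": "captureMethod", "type": "string",
     "constraints": {"enum": ["motionDetection", "timeLapse"]}},
    {"name": "timestamp", "type": "datetime",
     "format": "%Y-%m-%dT%H:%M:%S%z",
     "constraints": {"required": True}},
]


def validate_rows(fields, rows):
    """Return a list of readable errors; an empty list means every row passes."""
    errors = []
    # Values seen so far per field, used for the "unique" constraint.
    seen = {field["name"]: set() for field in fields}
    for i, row in enumerate(rows, start=1):
        for field in fields:
            name = field["name"]
            constraints = field.get("constraints", {})
            value = row.get(name, "")
            if value == "":
                if constraints.get("required"):
                    errors.append(f"row {i}: {name} is required")
                continue  # empty optional values are not checked further
            if constraints.get("unique"):
                if value in seen[name]:
                    errors.append(f"row {i}: {name} '{value}' is not unique")
                seen[name].add(value)
            if "enum" in constraints and value not in constraints["enum"]:
                errors.append(f"row {i}: {name} '{value}' is not an allowed value")
            if field["type"] == "datetime":
                try:
                    datetime.strptime(value, field["format"])
                except ValueError:
                    errors.append(f"row {i}: {name} '{value}' does not match the format")
    return errors


# Hypothetical rows, invented for illustration only.
rows = [
    {"mediaID": "m1", "deploymentID": "d1", "captureMethod": "motionDetection",
     "timestamp": "2023-10-13T09:30:00+0200"},
    {"mediaID": "m1", "deploymentID": "", "captureMethod": "video",
     "timestamp": "13/10/2023"},
]
for error in validate_rows(MEDIA_FIELDS, rows):
    print(error)
```

The first sample row passes; the second triggers one error per field (duplicate `mediaID`, missing `deploymentID`, `captureMethod` outside the controlled values, and a `timestamp` that does not match the format). In practice, existing Frictionless software would do this work, as slide 6 notes.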