Open ideas, data and code sharing: epidemiologists should be in front!

Botanical epidemiologists have long been leaders in using mathematical, statistical and computational approaches to tackle theoretical and applied research problems. These skills have always distinguished us from other plant pathology disciplines and naturally allowed us to bring together quantitative researchers (e.g. mathematicians, statisticians, programmers), which has benefited our area of research. The availability of increasing amounts of data at scales from genomes to landscapes requires an even more diverse set of skills and an enhanced ability to interact widely to advance the field. Additionally, donors, governments and journals are pushing for increased transparency and reproducibility (Bond-Lamberty et al. 2016). Because of this, an open approach to science, including unconstrained access to and sharing of scientific content, data collections and computer code, is quickly becoming more accepted. Fostering open science attitudes within our research communities is expected to improve the reproducibility of research, in both its methods and its findings.
Adopting reproducible research practices pays off in three ways. First, it benefits us directly as researchers: between complicated analyses, rounds of review and revision, and questions that arrive years later about the data that were collected or the analysis that was conducted, it is extremely useful to be able to reproduce your own work quickly and easily. Second, it benefits the end user or reader, who can verify the validity of the methods used and recreate the analysis, which helps with knowledge transfer. Lastly, sharing work openly and making it discoverable can lead to collaborations. Although relatively few examples of reproducible research in plant pathology exist (Shah and Madden 2004), this is changing (Del Ponte 2018, Duku et al. 2016, Sparks et al. 2018).
To help facilitate this change, we founded the Open Plant Pathology community (Del Ponte and Sparks 2018), which aims to foster relationships between researchers and promote open, transparent and reproducible research using shared data and reusable software. Given our history of moving plant pathology forward with computational resources, botanical epidemiologists should be in front, leading the way for plant pathology in adopting these new methods.

References

Bond-Lamberty, B., Smith, A.P., & Bailey, V. 2016: Running an open experiment: transparency and reproducibility in soil and ecosystem science. Environ. Res. Lett., 11:084004.
Del Ponte, E.M. 2018: Reproducible report: Meta-analysis of relationships between white mold and soybean yield. [WWW document] URL https://emdelponte.github.io/paper-white-mold-meta-analysis. Cited 22 May 2018.
Del Ponte, E.M. & Sparks, A.H. 2018: Open Plant Pathology: a Community to Promote Open Science Practices Including Data, Code and Research Outcomes in Plant Pathology. [WWW document] URL https://www.openplantpathology.org/. Cited 22 May 2018.
Duku, C., Sparks, A.H. & Zwart, S.J. 2016: Spatial modelling of rice yield losses in Tanzania due to bacterial leaf blight and leaf blast in a changing climate. Clim. Change, 135:569-583.
Shah, D.A. & Madden, L.V. 2004: Nonparametric analysis of ordinal data in designed factorial experiments. Phytopathology, 94:33-43.
Sparks, A.H., Del Ponte, E.M., Everhart, S., Foster, Z.S.L. & Grünwald, N. 2018: Compendium of R code and data for ‘Status and Best Practices for Reproducible Research In Plant Pathology’. DOI: 10.5281/zenodo.1250665. Cited 22 May 2018.

Emerson M. Del Ponte

June 12, 2018

Transcript

  1. Emerson M. Del Ponte, Adam H. Sparks. Open ideas, data and code sharing: epidemiologists should be in front! OpenPlantPathology
  2. Simplified research life-cycle: Science; Collect; Analyze; Publish; Write; Review; Summarize; Reproduce; Re-analyze (meta-analysis); Share data; Open repository; Share code; open/free tools; Collaborative tools; Citation manager; Pre-prints; Open Access
  3. Why embrace Open Practices?
     - Sponsors/journals require data (standard in molecular)
     - Allows reproducibility (data and/or methods)
     - Technology (less cumbersome) is becoming available
     - Enhanced visibility/transparency
     - Multiple citable outcomes: data, code, manuscript, etc.
  4. Are data available? Sparks et al. (unpublished)
     0 - Not available
     1 - Upon request to authors
     2 - Online behind paywall
     3 - Free access
  5. Is code made available? Sparks et al. (unpublished)
     0 - Not available
     1 - Upon request to authors
     2 - Online behind paywall
     3 - Free archive
  6. Software citation? Sparks et al. (unpublished)
     0 - Not mentioned
     1 - Mentioned by name only
     2 - Cited with version number
     3 - Full citation (procs, package, etc.)
  7. Barriers to open practices?
     - Lack of interest/knowledge (supplemental rarely posted)
     - Low incentive/pressure (that may change!)
     - Takes (huge) time and effort
     - Documenting data and code
     - Versioning and maintaining code
     - FOBS: fear of being scooped?
     - Not valued/taught in our graduate programs
  8. Vision and mission: Open Plant Pathology (OPP) fosters a diverse community culture that values open, transparent and reproducible research using shared data and reusable software
  9. 1. Social Workspace: by creating a social network with a welcoming and sharing scientific community (openplantpathology.slack.com). Expanding network; Sharing knowledge; Brainstorming; Building capacity. More transparent, reproducible, efficient and reliable Plant Pathology research
  10. 2. Infrastructure to promote the initiative and publish community outcomes: Community chat; Websites; Directory; Blog; Data catalog; Data repository; Code; Files
  11. Channels: #general, #welcome, #forum; outcome-oriented: #teaching, #r_package_development; subject-matter: #epidemictheory, #genomics, #reproducibility
      How to join?
      - Drop us an email to get an invitation to join Slack: [email protected]
      - Check in at the member directory database
      - Join channels
      - Create public/private channels
      - Introduce yourself!
      - Join other channels
  12. After joining
      - join the conversation in #slack
      - share your ideas!
      - ask a question or provide an answer
      - write an OPP Note
      - propose workshops
      - collaborate on a paper with others
  13. Planned activities: OPP members
      - Population Genomics in R
      - Introduction to R for Plant Pathologists
      - Introduction to Multivariate Statistics Using R
  14. Data availability
      - EM Del Ponte, AH Sparks (2018). Compendium of R code and data for 'Open ideas, data and code sharing: epidemiologists should be in front!'. Accessed 09 Jun 2018. Online at https://doi.org/10.5281/zenodo.1286101
      - https://openplantpathology.github.io/OPP.at.IEW12/
      - Sparks, A.H., Del Ponte, E.M., Everhart, S., Foster, Z.S.L., Grünwald, N. (2018). Compendium of R code and data for ‘Status and Best Practices for Reproducible Research In Plant Pathology’. Accessed 08 Jun 2018. Online at https://doi.org/10.5281/zenodo.1250665
      - https://openplantpathology.github.io/Reproducibility_in_Plant_Pathology/