Lindsey Heagy
February 11, 2020

reproducibility as the engine of science: tools for reproducible research

Presented at the workshop: "Scientific publication beyond the text: Sharing research objects"


Thanks to Rowan Cockett, Chris Holdgraf and Fernando Pérez for helping shape these ideas and this presentation.

  1. hello (a bit about me)
    geophysical inversions
    open-source software
    open research & education
    geoscience + data science +
  2. questions in the geosciences
    Observations / Data
    Theory & Ideas
    Simulations, Computation
    After Hamman, 2018
    Image: EMAG2: Earth Magnetic Anomaly Grid (2-arc-minute resolution). Image credit: Dom Fournier (toolkit.geosci.xyz)
  3. evolving research outputs & audiences
    Variety of “consumers”:
    • peers
    • students
    • decision makers & the public
    Drives diversity in outputs:
    • journal publications
    • web apps
    • educational resources
    • ...
  4. interactive, exploratory computing
    a community of people and an ecosystem of open tools and standards for interactive computing
  5. JupyterLab: a grand unified theory of Jupyter
    Huge Team Effort!
    C. Colbert, S. Corlay, A. Darian, B. Granger, J. Grout, P. Ivanov, I. Rose, S. Silvester, C. Willing, J. Zosa-Forde …
  6. JupyterLab is extensible: FlyBrainLab
    An Interactive Computing Platform for the Fly Brain
    BIONET Group, Columbia University, http://www.bionet.ee.columbia.edu
    Aurel A. Lazar (PI), Tingkai Liu, Mehmet K. Turkcan, Chung-Heng Yeh, Yiyin Zhou
    http://fruitflybrain.org
  7. JupyterHub distributions
    • The Littlest JupyterHub: tljh.jupyter.org
    • JupyterHub on Kubernetes: z2jh.jupyter.org
    A pre-configured JupyterHub setup with sensible defaults and lots of documentation, fit for many use-cases
  8. Zero to JupyterHub for Kubernetes (z2jh.jupyter.org)
    • Scalable in both users and in resources
    • Uses Docker for environment management
    • Agnostic to the provider and hardware configuration
  9. Harnessing the power of cloud computing to study the whole Earth interactively
    Interactivity
    Distributed computing
    Data models / numerics
  10. Jupyter meets the Earth: an NSF grant (2M / 3Y)!
    Fernando Pérez, Joe Hamman, Laurel Larsen, Kevin Paul, Lindsey Heagy, Chris Holdgraf, Yuvi Panda
    Research use-cases:
    • Climate data analysis
    • Hydrology
    • Geophysics
    Tech developments:
    • Data discovery
    • Interactivity
    • Cloud/HPC infrastructure
    For more: http://bit.ly/jupytearth
  11. the science more than the paper
    “An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.”
    -- Buckheit and Donoho (paraphrasing Claerbout), WaveLab and Reproducible Research, 1995
  12. the science more than the paper
    “An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.” (and a place to run the code?)
    -- Buckheit and Donoho (paraphrasing Claerbout), WaveLab and Reproducible Research, 1995
  13. New development: publishing executable books
    QuantEcon, IAB, Jupyter Book
    PDF, HTML, ...
    execution and text content sync
    citations, cross-refs, rich metadata
  14. Groundwater in Myanmar
    • Bring DC resistivity equipment to Mon State
    • Train local stakeholders
    • Provide open-source software and educational resources
  15. Reaching new audiences
    Diverse research outputs:
    • Papers
    • Notebooks
    • Apps
    • Web-based textbooks
    “Consumers” of science:
    • Scientists
    • Students
    • Public
  16. Blurring the line between scientists and audience?
    • Open tools are
      ◦ accessible
      ◦ explorable
      ◦ extensible
  17. An open ecosystem supports the engine of science
    • Open tools are a starting point for…
      ◦ reproducibility of work
      ◦ collaboration at the level of computation
      ◦ extension of ideas
    • And provide a trajectory for “consumers” to become creators
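
One small way to capture part of the "complete set of instructions which generated the figures" from the Buckheit and Donoho quote (slides 11 and 12) is to save a short manifest next to each figure. The Python sketch below is an illustration only and was not part of the talk: the function name `build_manifest`, the script name `make_figure.py`, the sample input bytes, and the pinned package version are all hypothetical. It uses only the standard library and records the interpreter version, the platform, and a hash of the input data.

```python
import hashlib
import json
import platform
import sys


def build_manifest(script_name, input_bytes, packages=None):
    """Describe the environment and inputs behind one figure.

    `packages` is an optional mapping of package name to version string,
    supplied by the caller (this sketch does not inspect the environment).
    """
    return {
        "script": script_name,
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        # Hash of the raw input so a reader can check they have the same data.
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        # Sorted so the manifest is stable from run to run.
        "packages": dict(sorted((packages or {}).items())),
    }


if __name__ == "__main__":
    manifest = build_manifest(
        "make_figure.py",              # hypothetical script name
        b"1.0,2.0,3.0\n",              # stand-in for the real input data
        packages={"numpy": "1.18.1"},  # hypothetical pinned version
    )
    print(json.dumps(manifest, indent=2))
```

A manifest like this does not replace a shared, runnable environment (the "place to run the code" raised on slide 12); it only makes it easier to tell whether two runs started from the same inputs and interpreter.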