
Pangeo Tutorial - Ocean Sciences 2020

Ryan Abernathey
February 17, 2020

Slides for tutorial at 2020 Ocean Sciences Meeting:
https://agu.confex.com/agu/osm20/meetingapp.cgi/Session/85251

Transcript

  1. Pangeo Tutorial

    Ryan Abernathey (Columbia / LDEO), Chelle Gentemann (Farallon Institute). Diagram labels: Cloud / HPC; Distributed storage
  2. What Drives Progress in Earth Science?
  3. What Drives Progress in Earth Science?

    New Ideas / Hypotheses
  4. What Drives Progress in Earth Science?

    New Ideas / Hypotheses

    Journal excerpt shown on the slide:

    $$E = \frac{\rho_0 |U|}{\pi} \int_{|f|/|U|}^{N/|U|} P_{1D}(k) \sqrt{N^2 - |U|^2 k^2} \, \sqrt{|U|^2 k^2 - f^2} \, dk, \qquad (3)$$

    where $\mathbf{k} = (k, l)$ is now the wavenumber in the reference frame along and across the mean flow $U$ and

    $$P_{1D}(k) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \frac{|k|}{|\mathbf{k}|} P_{2D}(k, l) \, dl \qquad (4)$$

    is the effective one-dimensional (1D) topographic spectrum. Hence, the wave radiation from 2D topography reduces to an equivalent problem of wave radiation from 1D topography with the effective spectrum given by $P_{1D}(k)$. The effective 1D spectrum captures the effects of 2D

    c. Bottom topography. Simulations are configured with multiscale topography characterized by small-scale abyssal hills a few kilometers wide based on multibeam observations from Drake Passage. The topographic spectrum associated with abyssal hills is well described by an anisotropic parametric representation proposed by Goff and Jordan (1988):

    $$P_{2D}(k, l) = \frac{2\pi H^2 (\mu - 2)}{k_0 l_0} \left( 1 + \frac{k^2}{k_0^2} + \frac{l^2}{l_0^2} \right)^{-\mu/2}, \qquad (5)$$

    where $k_0$ and $l_0$ set the wavenumbers of the large hills, $\mu$ is the high-wavenumber spectral slope, related to the pa-

    FIG. 3. Averaged profiles of (left) stratification (s$^{-1}$) and (right) flow speed (m s$^{-1}$) in the bottom 2 km from observations (gray), initial condition in the simulations (black), and final state in 2D (blue) and 3D (red) simulations.
  5. What Drives Progress in Earth Science?

    New Ideas / Hypotheses; New Observations. (Journal excerpt repeated from slide 4.)
  6. What Drives Progress in Earth Science?

    New Ideas / Hypotheses; New Observations. (Journal excerpt repeated from slide 4.)
  7. What Drives Progress in Earth Science?

    New Ideas / Hypotheses; New Observations; New Simulations. (Journal excerpt repeated from slide 4.)
  8. What Drives Progress in Earth Science?

    New Ideas / Hypotheses; New Observations; New Simulations. (Journal excerpt repeated from slide 4.)
  9. What Drives Progress in Earth Science?

    New Ideas / Hypotheses; New Observations; New Simulations. (Journal excerpt repeated from slide 4.)
  10. What Science do we want to do with Ocean Data?
  11. What Science do we want to do with Climate Data?

    Take the mean!
  12. What Science do we want to do with Climate Data?

    Analyze spatiotemporal variability
  13. What Science do we want to do with Climate Data?

    Machine learning! Image Credit: Manucharyan et al. (2020); https://doi.org/10.31223/osf.io/m8f3x

    [Figure: neural-network diagram from a manuscript submitted to Journal of Advances in Modeling Earth Systems (JAMES). Input: SSH at Day 0 and Day 20, 2x(32x32). Output: SSH at Day 10, 32x32. Labeled layers: Conv2D 2,32; Conv2D 32,64; Conv2D 64,128; Conv2D 128,256; Conv2D 256,512, each with BatchNorm, Leaky ReLU, MaxPool; residual blocks (Conv2D, Batch Norm, Leaky ReLU, with an identity Add); global average pooling; fully connected layer (1024). Leaky ReLU: y = max(0.1x, x).]
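The only formula legible on this slide is the activation, y = max(0.1x, x). As a minimal sketch, it can be written in plain Python as below; the function name `leaky_relu` and the `slope` parameter are illustrative choices, not taken from the cited manuscript.

```python
def leaky_relu(x: float, slope: float = 0.1) -> float:
    """Leaky ReLU as written on the slide: y = max(slope * x, x).

    For 0 <= slope <= 1 this passes positive inputs through unchanged
    and scales negative inputs by `slope`.
    """
    return max(slope * x, x)


if __name__ == "__main__":
    # Show the behaviour on a negative, a zero, and a positive input.
    for value in (-2.0, 0.0, 3.0):
        print(value, "->", leaky_relu(value))
```

Unlike a plain ReLU, negative inputs are scaled rather than set to zero, so the output still depends on them.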
  14. Download

    FTP / OPeNDAP / etc.
  15. Download

    MB. FTP / OPeNDAP / etc.
  16. Download

    GB. FTP / OPeNDAP / etc.
  17. Download

    TB. FTP / OPeNDAP / etc.
  18. Download

    PB. FTP / OPeNDAP / etc.
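A rough back-of-the-envelope calculation shows why the download model stops working somewhere between TB and PB. The sketch below assumes a sustained 100 Mbit/s connection and decimal (SI) byte sizes; both are illustrative assumptions, not figures from the slides, and real transfers add protocol overhead on top.

```python
# Decimal (SI) byte counts for the data volumes named on the slides.
SIZES_BYTES = {"MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15}

# Assumption for illustration only: a sustained 100 Mbit/s connection.
ASSUMED_LINK_BITS_PER_SECOND = 100 * 10**6


def download_seconds(num_bytes: int,
                     bits_per_second: int = ASSUMED_LINK_BITS_PER_SECOND) -> float:
    """Idealized transfer time: total bits divided by the link rate."""
    return num_bytes * 8 / bits_per_second


for label, size in SIZES_BYTES.items():
    seconds = download_seconds(size)
    print(f"{label}: {seconds:,.2f} s (~{seconds / 86400:,.2f} days)")
```

At the assumed rate a terabyte takes a little under a day, while a petabyte takes about 926 days, roughly two and a half years.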
  19. Never Mind … How?

    Let’s “bring the compute to the data”!
  20. Cloud Computing

    Data Provider’s $; Data Consumer’s $
  21. Cloud Computing

    Analysis Ready Data; Cloud Optimized Formats; Data Provider’s $; Data Consumer’s $
  22. Cloud Computing

    Analysis Ready Data; Cloud Optimized Formats; Data Provider’s $; Data Consumer’s $; Scalable Parallel Computing Frameworks
  23. Cloud Computing

    Analysis Ready Data; Cloud Optimized Formats; Data Provider’s $; Data Consumer’s $; Scalable Parallel Computing Frameworks
  24. What is Pangeo?

    “A community platform for Big Data geoscience”
    • Open Community
    • Open Source Software
    • Open Source Infrastructure
  25. Pangeo Community

    http://pangeo.io
  26. Pangeo Software Ecosystem

    Inspiration: Stephan Hoyer, Jake Vanderplas (SciPy 2015). Diagram label: SciPy
  27. Pangeo Architecture

    • Jupyter for interactive access to remote systems.
    • Cloud / HPC: a parallel computing system allows users to deploy clusters of compute nodes for data processing. Dask tells the nodes what to do.
    • Xarray provides data structures and an intuitive interface for interacting with datasets.
    • Distributed storage: “Analysis Ready Data” stored on globally available distributed storage.
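The division of labour on this slide (chunked data in storage, workers that each process a piece, a scheduler that combines the results) can be sketched with the Python standard library alone. This is not Dask or Xarray code: a thread pool stands in for the cluster, a list of floats stands in for the dataset, and every function name here is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple


def split_into_chunks(values: Sequence[float], chunk_size: int) -> List[Sequence[float]]:
    """Cut the data into consecutive pieces, as a chunked store would hold it."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def partial_sum_and_count(chunk: Sequence[float]) -> Tuple[float, int]:
    """Work done next to one chunk: only two small numbers leave the worker."""
    return sum(chunk), len(chunk)


def chunked_mean(values: Sequence[float], chunk_size: int, workers: int = 4) -> float:
    """Mean via map (per-chunk partial results) and reduce (combine them)."""
    chunks = split_into_chunks(values, chunk_size)
    if not chunks:
        raise ValueError("cannot take the mean of empty data")
    # The thread pool is a stand-in for a cluster of compute nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_and_count, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


data = [float(i) for i in range(1000)]  # made-up stand-in for a large dataset
print(chunked_mean(data, chunk_size=100))
```

Each worker returns only a sum and a count, so the bulk of the data never has to move to the machine that asks the question. This is the same idea as "bring the compute to the data".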
  28. Pangeo Deployments

    NASA Pleiades; ocean.pangeo.io; NCAR Cheyenne. http://pangeo.io/deployments.html
  29. Pangeo Cloud Data Catalog

    catalog.pangeo.io
  30. How can you USE Pangeo?

    • Don’t! If you don’t have big data, just use python, xarray, dask, etc. on your computer.
    • Install Pangeo on your own cluster: instructions at http://pangeo.io/setup_guides/hpc.html
    • Use Pangeo on NCAR’s Cheyenne: https://discourse.pangeo.io/t/jupyterhub-access-on-cheyenne/253
    • Use one of our cloud-based JupyterHubs: http://pangeo.io/deployments.html
      Note: we don’t really have funding (yet) to support the whole world.
    • Deploy your own Pangeo environment in the cloud: http://pangeo.io/setup_guides/cloud.html
      Costs $ - talk to your program manager!
  31. How can you Contribute?

    • Ask questions on our forum: discourse.pangeo.io
      What’s the best way to calculate X? Where can I find dataset Y?
    • Attend a weekly telecon or working-group meeting: pangeo.io/meeting-notes.html
    • Create your own binders and add them to our gallery: github.com/pangeo-gallery (website still under development)
    • Put data on the cloud: github.com/pangeo-data/pangeo-datastore
      (Note: cloud storage costs money! We are working with cloud providers, funding agencies, and data providers to bring more analysis-ready data to the cloud in a sustainable way.)
  32. Grass-Roots Adoption