• Data variables: used for computation
• Coordinates: describe data
• Indexes: align data
• Attributes: metadata ignored by operations
(Diagram label: + land_cover)
“netCDF meets pandas.DataFrame”
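The four parts of the data model can be sketched with a small toy dataset. The variable names, shapes, and values below are illustrative assumptions (only `land_cover` appears on the slide), not data from the talk.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical toy data: 3 time steps on a 2x2 grid.
times = pd.date_range("2014-01-01", periods=3)

ds = xr.Dataset(
    # Data variables: the arrays that computations act on.
    data_vars={
        "temperature": (("time", "y", "x"), np.arange(12.0).reshape(3, 2, 2)),
        "land_cover": (("y", "x"), np.array([[1, 2], [2, 3]])),
    },
    # Coordinates: labels that describe positions along each dimension.
    coords={"time": times, "y": [10.0, 20.0], "x": [100.0, 110.0]},
    # Attributes: free-form metadata that plays no part in the arithmetic.
    attrs={"title": "toy dataset"},
)

# Indexes: a coordinate named after its dimension is backed by a pandas index,
# which is what label-based selection and alignment rely on.
time_index = ds.indexes["time"]

# Arithmetic acts on the data values only.
shifted = ds["temperature"] + 1
```

Printing `ds` shows the dimensions, coordinates, data variables, and attributes as separate sections, which mirrors the breakdown on the slide.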
x.sum('time')
• Select values by label (or logical location) instead of integer location: x.sel(time='2014-01-01')
• Mathematical operations (e.g., x - y) vectorize across multiple dimensions (array broadcasting) based on dimension names, not shape.
• Deep dask.array integration
• Easily use the split-apply-combine paradigm with groupby: x.groupby('time.dayofyear').mean().
• Database-like alignment based on coordinate labels that smoothly handles missing values: x, y = xr.align(x, y, join='outer').
• Keep track of arbitrary metadata in the form of a Python dictionary: x.attrs.
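The calls listed above can be tried together on a small array. This is a minimal sketch: the array `x`, its `space` dimension, the weights `w`, and the overlapping slices `a` and `b` are made-up illustrations, and the dask integration is left out because it needs an extra dependency.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical example array: 4 daily time steps at 2 locations.
times = pd.date_range("2014-01-01", periods=4)
x = xr.DataArray(
    np.arange(8.0).reshape(4, 2),
    dims=("time", "space"),
    coords={"time": times, "space": ["a", "b"]},
    attrs={"units": "K"},
)

# Reduce over a dimension by name rather than by axis number.
total = x.sum("time")

# Select by coordinate label instead of integer position.
first_day = x.sel(time="2014-01-01")

# Broadcasting by dimension name: "level" does not exist in x,
# so the result gains it as a new dimension.
w = xr.DataArray([1.0, 2.0, 3.0], dims="level")
weighted = x * w

# Split-apply-combine: group by day of year, then average each group.
daily = x.groupby("time.dayofyear").mean()

# Outer alignment: the union of the time labels, with NaN where data is missing.
a = x.isel(time=slice(0, 3))
b = x.isel(time=slice(1, 4))
a2, b2 = xr.align(a, b, join="outer")

# Metadata lives in an ordinary dictionary.
units = x.attrs["units"]
```

After the outer join, `a2` and `b2` both cover all four dates; the last time step of `a2` and the first of `b2` are filled with NaN.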
for {discipline}” tutorials
• Larger gallery of examples
• Internal refactoring:
• Entry points to make things more pluggable / customizable
• Marketing! Xarray is underused in pydata ecosystem…