Python's Visualization Landscape (PyCon 2017)

So you want to visualize some data in Python: which library do you choose? From Matplotlib to Seaborn to Bokeh to Plotly, Python has a range of mature tools to create beautiful visualizations, each with their own strengths and weaknesses. In this talk I’ll give an overview of the landscape of dataviz tools in Python, as well as some deeper dives into a few, so that you can intelligently choose which library to turn to for any given visualization task.

Video: https://www.youtube.com/watch?v=FytuB8nFHPQ

Jake VanderPlas

May 21, 2017

Transcript

  1. @jakevdp
    Jake VanderPlas
    Python’s Visualization Landscape
    Jake VanderPlas
    @jakevdp #PyCon2017

  2. @jakevdp
    Jake VanderPlas
    [Python’s Visualization Landscape]
    From the abstract:
    “In this talk I’ll give an overview of the
    landscape of dataviz tools in Python . . .”

  11. @jakevdp
    Jake VanderPlas
    [Making Sense of the Deluge]

  12-35. @jakevdp
    Jake VanderPlas
    [Landscape diagram, built up over these slides. Libraries are added in this order:]
    matplotlib
    basemap/cartopy
    seaborn, pandas
    ggpy, networkx
    scikit-plot, Yellowbrick
    javascript
    bokeh, plotly
    bqplot, toyplot
    pythreejs, ipyvolume, ipyleaflet
    cufflinks
    d3js, mpld3
    Vega-Lite, Vega
    Altair, Vincent, d3po
    datashader
    Vaex
    holoviews
    OpenGL
    Glumpy, Vispy
    graphviz, graph-tool
    Lightning, GlueViz, YT, MayaVi, GR framework, PyQTgraph, pygal, chaco

  36. @jakevdp
    Jake VanderPlas
    Python’s Visualization Landscape
    [Landscape diagram: matplotlib, seaborn, pandas, ggpy, scikit-plot, Yellowbrick,
    networkx, basemap/cartopy, javascript, pythreejs, bqplot, bokeh, toyplot, plotly,
    ipyvolume, cufflinks, holoviews, datashader, d3js, mpld3, Altair, Vincent, OpenGL,
    Glumpy, Vispy, ipyleaflet, Lightning, GlueViz, YT, d3po, Vega-Lite, Vega, MayaVi,
    graphviz, GR framework, PyQTgraph, pygal, chaco, Vaex, graph-tool]

  37. @jakevdp
    Jake VanderPlas

  38. @jakevdp
    Jake VanderPlas
    How did we get here?

  39. @jakevdp
    Jake VanderPlas
    In the beginning was matplotlib*
    * well, actually… Python visualization
    existed before matplotlib, but was not
    very mature.

  43. @jakevdp
    Jake VanderPlas
    Plotting with Matplotlib
    Strengths:
    - Designed like MATLAB: switching was easy
    - Many rendering backends
    - Can reproduce just about any plot (with a bit of effort)
    - Well-tested, standard tool for over a decade

  44. @jakevdp
    Jake VanderPlas
    Matplotlib Gallery

  45. @jakevdp
    Jake VanderPlas
    Example: Statistical Data
    import pandas as pd
    iris = pd.read_csv('iris.csv')
    iris.head()
    Tidy data: i.e. rows are samples, columns are features

  46. @jakevdp
    Jake VanderPlas
    Just a simple visualization . . .
    “I want to scatter petal length vs. sepal length, and color by species”

  47. @jakevdp
    Jake VanderPlas
    Just a simple visualization . . .
    import matplotlib.pyplot as plt
    # one fixed color per species
    color_map = dict(zip(iris.species.unique(),
                         ['blue', 'green', 'red']))
    # one scatter call per species, so each gets its own legend entry
    for species, group in iris.groupby('species'):
        plt.scatter(group['petalLength'], group['sepalLength'],
                    color=color_map[species],
                    alpha=0.3, edgecolor=None,
                    label=species)
    plt.legend(frameon=True, title='species')
    plt.xlabel('petalLength')
    plt.ylabel('sepalLength')

  48. @jakevdp
    Jake VanderPlas
    Plotting with Matplotlib
    Strengths:
    - Designed like MATLAB: switching was easy
    - Many rendering backends
    - Can reproduce just about any plot with a bit of effort
    - Well-tested, standard tool for over a decade
    Weaknesses:
    - API is imperative & often overly verbose
    - Sometimes poor stylistic defaults
    - Poor support for web/interactive graphs
    - Often slow for large & complicated data

  49. @jakevdp
    Jake VanderPlas
    Everyone’s Goal:
    Improve on the weaknesses of matplotlib
    (without sacrificing the strengths!)

  50. @jakevdp
    Jake VanderPlas
    Building on Matplotlib. . .
    [Diagram: matplotlib, seaborn, pandas, ggpy, scikit-plot, Yellowbrick, networkx, basemap/cartopy]

  51. @jakevdp
    Jake VanderPlas
    Building on Matplotlib. . .
    Common Idea: Keep matplotlib as a versatile, well-tested backend, and provide a new domain-specific API.
    [Diagram: matplotlib, seaborn, pandas, ggpy, scikit-plot, Yellowbrick, networkx, basemap/cartopy]
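A minimal sketch of that common idea, under stated assumptions: the helper `scatter_by` is hypothetical (it is not part of matplotlib or of any library named here), and the records are made up. It hides the per-group loop behind a single data-oriented call while matplotlib still does the drawing.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def scatter_by(records, x, y, hue, ax=None):
    """Hypothetical helper: scatter x vs. y with one legend entry per `hue` value."""
    if ax is None:
        ax = plt.gca()
    # group rows by the value of the `hue` column, keeping first-seen order
    groups = {}
    for row in records:
        groups.setdefault(row[hue], []).append(row)
    # matplotlib remains the backend: one scatter call per group
    for label, rows in groups.items():
        ax.scatter([r[x] for r in rows], [r[y] for r in rows],
                   alpha=0.3, label=label)
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    ax.legend(title=hue)
    return ax


# made-up rows in tidy form: one sample per row, one feature per column
records = [
    {"petalLength": 1.4, "sepalLength": 5.1, "species": "setosa"},
    {"petalLength": 4.7, "sepalLength": 7.0, "species": "versicolor"},
    {"petalLength": 6.0, "sepalLength": 6.3, "species": "virginica"},
    {"petalLength": 1.3, "sepalLength": 4.7, "species": "setosa"},
]
ax = scatter_by(records, "petalLength", "sepalLength", "species")
```

The caller names columns rather than plotting steps; styling and legend handling live in one place, which is roughly the trade the wrapper libraries make.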

  53. @jakevdp
    Jake VanderPlas
    Pandas plotting API
    Key Features:
    - Pandas provides a DataFrame object
    - Also provides a simple API for plotting DataFrames

  54. @jakevdp
    Jake VanderPlas
    iris.plot.scatter('petalLength', 'petalWidth')

  55. @jakevdp
    Jake VanderPlas
    - More sophisticated statistical visualization tools have recently been added
    # pandas >= 0.20: the plotting helpers live in pandas.plotting (formerly pandas.tools.plotting)
    from pandas.plotting import andrews_curves
    andrews_curves(iris, 'species')

  56. @jakevdp
    Jake VanderPlas
    Seaborn: statistical data visualization
    http://seaborn.pydata.org
    Key Features:
    - Like Pandas, wraps matplotlib
    - Nice set of color palettes & plot styles
    - Focus on statistical visualization & modeling

  57. @jakevdp
    Jake VanderPlas
    Seaborn examples
    import seaborn as sns
    # x, y and data are passed by keyword, which recent seaborn releases require
    sns.lmplot(data=iris, x='petalLength', y='sepalWidth',
               hue='species', fit_reg=False)

  58. @jakevdp
    Jake VanderPlas
    Seaborn examples
    sns.pairplot(iris, hue='species')

  59. @jakevdp
    Jake VanderPlas
    Javascript-based Viz:
    [Diagram: javascript, pythreejs, bqplot, bokeh, toyplot, plotly, ipyvolume, cufflinks, ipyleaflet]

  60. @jakevdp
    Jake VanderPlas
    Javascript-based Viz:
    Common Idea: build a new API that produces a plot serialization (often JSON) that can be displayed in the browser (often in Jupyter notebooks)
    [Diagram: javascript, pythreejs, bqplot, bokeh, toyplot, plotly, ipyvolume, cufflinks, ipyleaflet]

  61. @jakevdp
    Jake VanderPlas
    Javascript-based Viz:
    [Diagram: javascript, pythreejs, bqplot, toyplot, ipyvolume, cufflinks, ipyleaflet]
    bokeh, plotly

  62. @jakevdp
    Jake VanderPlas
    Plotting with Bokeh

  63. @jakevdp
    Jake VanderPlas
    Bokeh Gallery

  64. @jakevdp
    Jake VanderPlas
    Plotting with Bokeh
    Advantages:
    - Web view/interactivity
    - Imperative and Declarative layer
    - Handles large and/or streaming datasets
    - Geographical visualization
    - Fully open source
    Disadvantages:
    - No vector output (need PDF/EPS? Sorry)
    - Newer tool with a smaller user-base than matplotlib
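For a feel of the imperative layer, here is a small sketch of a species-colored scatter in Bokeh. It assumes a reasonably recent Bokeh release (the `legend_label` keyword is not available in very old versions), and the measurements are made up rather than read from the iris file.

```python
from bokeh.plotting import figure, show

# made-up measurements standing in for the iris table: (petalLength, sepalLength)
samples = {
    "setosa": ([1.4, 1.3, 1.5], [5.1, 4.7, 4.6]),
    "versicolor": ([4.7, 4.5, 4.9], [7.0, 6.4, 6.9]),
    "virginica": ([6.0, 5.1, 5.9], [6.3, 5.8, 7.1]),
}
colors = {"setosa": "blue", "versicolor": "green", "virginica": "red"}

p = figure(title="petalLength vs. sepalLength",
           x_axis_label="petalLength", y_axis_label="sepalLength")
# one glyph per species so that each one gets a legend entry
for species, (petal, sepal) in samples.items():
    p.scatter(petal, sepal, size=8, alpha=0.3,
              color=colors[species], legend_label=species)

# show(p)  # uncomment to open the interactive plot in a browser or notebook
```

The loop mirrors the matplotlib version, but the result is a document rendered by JavaScript in the browser, which is where the pan and zoom tools come from.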

  65. @jakevdp
    Jake VanderPlas
    Basic Plotting with Plotly

  66. @jakevdp
    Jake VanderPlas
    Plotly Gallery

  67. @jakevdp
    Jake VanderPlas
    Plotting with Plotly
    Advantages:
    - Web view/interactivity
    - Multi-language support
    - 3D plotting capability
    - Animation capability
    - Geographical visualization
    Disadvantages:
    - Some features require a paid plan

  68. @jakevdp
    Jake VanderPlas
    Visualization for Larger Data . . .
    [Landscape diagram: matplotlib, seaborn, pandas, ggpy, scikit-plot, Yellowbrick,
    networkx, basemap/cartopy, javascript, pythreejs, bqplot, bokeh, toyplot, plotly,
    ipyvolume, cufflinks, holoviews, datashader, d3js, mpld3, Altair, Vincent, OpenGL,
    Glumpy, Vispy, ipyleaflet, Lightning, GlueViz, YT, d3po, Vega-Lite, Vega, MayaVi,
    graphviz, GR framework, PyQTgraph, pygal, chaco, Vaex, graph-tool]

  69. @jakevdp
    Jake VanderPlas
    Visualization for Larger Data . . .
    datashader
    [Landscape diagram: matplotlib, seaborn, pandas, ggpy, scikit-plot, Yellowbrick,
    networkx, basemap/cartopy, javascript, pythreejs, bqplot, bokeh, toyplot, plotly,
    ipyvolume, cufflinks, holoviews, d3js, mpld3, Altair, Vincent, OpenGL, Glumpy,
    Vispy, ipyleaflet, Lightning, GlueViz, YT, d3po, Vega-Lite, Vega, MayaVi, graphviz,
    GR framework, PyQTgraph, pygal, chaco, Vaex, graph-tool]

  70. @jakevdp
    Jake VanderPlas
    Datashader
    Fast server-side engine for dynamic data aggregation

  71. @jakevdp
    Jake VanderPlas
    Datashader
    - Compute layer that works with Bokeh
    - Rather than sending data to the client, it aggregates data and sends pixels.
    - Can handle interactive visualization of billions of rows.
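The "aggregate, then send pixels" idea can be sketched in plain Python. This is a toy illustration only: `aggregate` is a hypothetical function, not Datashader's API, and the point cloud is random.

```python
import random


def aggregate(xs, ys, width, height, x_range, y_range):
    """Count points per cell of a width x height pixel grid (toy server-side step)."""
    (x0, x1), (y0, y1) = x_range, y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        if not (x0 <= x < x1 and y0 <= y < y1):
            continue  # point lies outside the current viewport
        # map data coordinates to pixel indices, clamped against rounding at the edge
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        grid[row][col] += 1
    return grid


random.seed(0)
n = 100_000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [random.gauss(0, 1) for _ in range(n)]

# the client receives a fixed-size 40 x 20 grid, however many rows went in
image = aggregate(xs, ys, width=40, height=20, x_range=(-4, 4), y_range=(-4, 4))
```

The payload size depends on the viewport resolution rather than on `n`, and zooming only means re-running the aggregation with new ranges.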

  73. @jakevdp
    Jake VanderPlas
    Toward Declarative Visualization . . .
    d3js, javascript, bokeh, matplotlib, Altair
    [Rest of the landscape diagram: seaborn, pandas, ggpy, scikit-plot, Yellowbrick,
    networkx, basemap/cartopy, pythreejs, bqplot, toyplot, plotly, ipyvolume, cufflinks,
    holoviews, datashader, mpld3, Vincent, OpenGL, Glumpy, Vispy, ipyleaflet, Lightning,
    GlueViz, YT, d3po, Vega-Lite, Vega, MayaVi, graphviz, GR framework, PyQTgraph,
    pygal, chaco, Vaex, graph-tool]

  74. @jakevdp
    Jake VanderPlas
    Toward Declarative Visualization . . .
    d3js, javascript, bokeh, matplotlib, Altair, datashader
    [Rest of the landscape diagram: seaborn, pandas, ggpy, scikit-plot, Yellowbrick,
    networkx, basemap/cartopy, pythreejs, bqplot, toyplot, plotly, ipyvolume, cufflinks,
    holoviews, mpld3, Vincent, OpenGL, Glumpy, Vispy, ipyleaflet, Lightning, GlueViz,
    YT, d3po, Vega-Lite, Vega, MayaVi, graphviz, GR framework, PyQTgraph, pygal, chaco,
    Vaex, graph-tool]

  75. @jakevdp
    Jake VanderPlas
    Holoviews
    - Datasets themselves stored in objects that automatically produce intelligent visualizations
    - Composition & Interactivity via operator overloading
    - Renders to Bokeh, Datashader, and Matplotlib
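To make "composition via operator overloading" concrete, here is a toy sketch in plain Python. The classes are hypothetical and are not HoloViews' implementation; they only imitate the convention that `*` overlays elements on shared axes and `+` lays them out side by side.

```python
class Element:
    """Toy plot element (hypothetical; not a HoloViews class)."""

    def __init__(self, kind, data):
        self.kind = kind
        self.data = data

    def __mul__(self, other):
        # a * b: draw both on the same axes
        return Overlay(_items(self, Overlay) + _items(other, Overlay))

    def __add__(self, other):
        # a + b: place them next to each other
        return Layout(_items(self, Layout) + _items(other, Layout))

    def __repr__(self):
        return f"{self.kind}[{len(self.data)}]"


class Overlay(Element):
    def __init__(self, items):
        self.items = items

    def __repr__(self):
        return "Overlay(" + " * ".join(map(repr, self.items)) + ")"


class Layout(Element):
    def __init__(self, items):
        self.items = items

    def __repr__(self):
        return "Layout(" + " + ".join(map(repr, self.items)) + ")"


def _items(obj, container):
    """Unpack obj if it already is the given container, so a * b * c stays flat."""
    return list(obj.items) if isinstance(obj, container) else [obj]


points = Element("Scatter", [(1, 2), (2, 3), (3, 5)])
trend = Element("Curve", [(1, 2), (3, 5)])
hist = Element("Histogram", [2, 3, 5])

composed = points * trend + hist
print(composed)  # Layout(Overlay(Scatter[3] * Curve[2]) + Histogram[3])
```

Because the data live inside the objects, the composed result is still just a description; a renderer can be chosen afterwards.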

  76. @jakevdp
    Jake VanderPlas
    Holoviews
    - Can also handle geographic data & time series

  77. @jakevdp
    Jake VanderPlas
    What if instead of passing around pixels, we pass around visualization specifications plus data?
    Altair

  78. @jakevdp
    Jake VanderPlas
    What if instead of passing around pixels, we pass around visualization specifications plus data?
    “Declarative Visualization”
    Altair

  80. @jakevdp
    Jake VanderPlas
    Declarative Visualization:
    Viz for data science
    Declarative
    - Specify What should be done
    - Details determined automatically
    - Separates Specification from Execution
    Imperative
    - Specify How something should be done.
    - Must manually specify plotting steps
    - Specification & Execution intertwined.
    Declarative visualization lets you think about data and relationships, rather than incidental details.
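The separation of specification from execution can be sketched with a plain dictionary and a tiny interpreter. Everything below is hypothetical: the spec layout is loosely modeled on the JSON grammars discussed in this talk, and `render` is a toy text renderer, not any library's API.

```python
from collections import Counter

# specification: *what* to show (the field name and "count" aggregate are made up)
spec = {
    "mark": "bar",
    "encoding": {
        "x": {"field": "species"},
        "y": {"aggregate": "count"},
    },
}


def render(spec, rows):
    """Execution: turn a spec plus data into text bars (toy renderer)."""
    if spec["mark"] != "bar" or spec["encoding"]["y"].get("aggregate") != "count":
        raise NotImplementedError("this toy renderer only draws count bars")
    field = spec["encoding"]["x"]["field"]
    counts = Counter(row[field] for row in rows)
    width = max(len(str(key)) for key in counts)
    # one line per category, sorted by label
    return "\n".join(f"{str(key):<{width}} | {'#' * n}"
                     for key, n in sorted(counts.items()))


rows = [{"species": s}
        for s in ["setosa", "virginica", "setosa", "versicolor", "setosa"]]
print(render(spec, rows))
```

The spec never mentions loops, counters, or padding; those details belong to whichever renderer executes it, so the same spec could be handed to a different one unchanged.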

  81. #JSM2016
    Jake VanderPlas
    From D3 to Altair . . .
    (link to live version)

  82. #JSM2016
    Jake VanderPlas
    But working in D3 can be challenging . . .

  83. #JSM2016
    Jake VanderPlas
    Bar Chart: d3
    var margin = {top: 20, right: 20, bottom: 30, left: 40},
        width = 960 - margin.left - margin.right,
        height = 500 - margin.top - margin.bottom;
    var x = d3.scale.ordinal()
        .rangeRoundBands([0, width], .1);
    var y = d3.scale.linear()
        .range([height, 0]);
    var xAxis = d3.svg.axis()
        .scale(x)
        .orient("bottom");
    var yAxis = d3.svg.axis()
        .scale(y)
        .orient("left")
        .ticks(10, "%");
    var svg = d3.select("body").append("svg")
        .attr("width", width + margin.left + margin.right)
        .attr("height", height + margin.top + margin.bottom)
      .append("g")
        .attr("transform", "translate(" + margin.left + "," + margin.top + ")");
    d3.tsv("data.tsv", type, function(error, data) {
      if (error) throw error;
      x.domain(data.map(function(d) { return d.letter; }));
      y.domain([0, d3.max(data, function(d) { return d.frequency; })]);
      svg.append("g")
          .attr("class", "x axis")
          .attr("transform", "translate(0," + height + ")")
          .call(xAxis);
      svg.append("g")
          .attr("class", "y axis")
          .call(yAxis)
        .append("text")
          .attr("transform", "rotate(-90)")
          .attr("y", 6)
          .attr("dy", ".71em")
          .style("text-anchor", "end")
          .text("Frequency");
      svg.selectAll(".bar")
          .data(data)
        .enter().append("rect")
          .attr("class", "bar")
          .attr("x", function(d) { return x(d.letter); })
          .attr("width", x.rangeBand())
          .attr("y", function(d) { return y(d.frequency); })
          .attr("height", function(d) { return height - y(d.frequency); });
    });
    function type(d) {
      d.frequency = +d.frequency;
      return d;
    }
    D3 is a Javascript package that streamlines manipulation of objects on a webpage.

  84. #JSM2016
    Jake VanderPlas
    Bar Chart: Vega
    {
      "width": 400,
      "height": 200,
      "padding": {"top": 10, "left": 30, "bottom": 30, "right": 10},
      "data": [
        {
          "name": "table",
          "values": [
            {"x": 1, "y": 28}, {"x": 2, "y": 55},
            {"x": 3, "y": 43}, {"x": 4, "y": 91},
            {"x": 5, "y": 81}, {"x": 6, "y": 53},
            {"x": 7, "y": 19}, {"x": 8, "y": 87},
            {"x": 9, "y": 52}, {"x": 10, "y": 48},
            {"x": 11, "y": 24}, {"x": 12, "y": 49},
            {"x": 13, "y": 87}, {"x": 14, "y": 66},
            {"x": 15, "y": 17}, {"x": 16, "y": 27},
            {"x": 17, "y": 68}, {"x": 18, "y": 16},
            {"x": 19, "y": 49}, {"x": 20, "y": 15}
          ]
        }
      ],
      "scales": [
        {
          "name": "x",
          "type": "ordinal",
          "range": "width",
          "domain": {"data": "table", "field": "x"}
        },
        {
          "name": "y",
          "type": "linear",
          "range": "height",
          "domain": {"data": "table", "field": "y"},
          "nice": true
        }
      ],
      "axes": [
        {"type": "x", "scale": "x"},
        {"type": "y", "scale": "y"}
      ],
      "marks": [
        {
          "type": "rect",
          "from": {"data": "table"},
          "properties": {
            "enter": {
              "x": {"scale": "x", "field": "x"},
              "width": {"scale": "x", "band": true, "offset": -1},
              "y": {"scale": "y", "field": "y"},
              "y2": {"scale": "y", "value": 0}
            },
            "update": {
              "fill": {"value": "steelblue"}
    Vega is a detailed declarative specification for visualizations, built on D3.

  85. #JSM2016
    Jake VanderPlas
    Bar Chart: Vega-Lite
    {
      "description": "A simple bar chart with embedded data.",
      "data": {
        "values": [
          {"a": "A","b": 28}, {"a": "B","b": 55}, {"a": "C","b": 43},
          {"a": "D","b": 91}, {"a": "E","b": 81}, {"a": "F","b": 53},
          {"a": "G","b": 19}, {"a": "H","b": 87}, {"a": "I","b": 52}
        ]
      },
      "mark": "bar",
      "encoding": {
        "x": {"field": "a", "type": "ordinal"},
        "y": {"field": "b", "type": "quantitative"}
      }
    }
    Vega-Lite is a simpler declarative specification aimed at statistical visualization.

  86. #JSM2016
    Jake VanderPlas
    Bar Chart: Altair
    Altair is a Python API for creating Vega-Lite specifications.
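What "a Python API for creating specifications" means can be sketched with a small stand-in builder. `MiniChart` is hypothetical and is not Altair's implementation; the `field:O` and `field:Q` shorthand is assumed here for illustration. The sketch rebuilds part of the Vega-Lite bar chart shown above from chained method calls.

```python
import json


class MiniChart:
    """Hypothetical stand-in for a chart builder; not Altair's actual code."""

    TYPES = {"O": "ordinal", "Q": "quantitative", "N": "nominal"}

    def __init__(self, values):
        self.spec = {"data": {"values": values}}

    def mark_bar(self):
        self.spec["mark"] = "bar"
        return self  # returning self allows method chaining

    def encode(self, **channels):
        # expand "field:T" shorthand into {"field": ..., "type": ...}
        encoding = {}
        for channel, text in channels.items():
            field, code = text.split(":")
            encoding[channel] = {"field": field, "type": self.TYPES[code]}
        self.spec["encoding"] = encoding
        return self

    def to_json(self):
        return json.dumps(self.spec, indent=2)


values = [{"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43}]
chart = MiniChart(values).mark_bar().encode(x="a:O", y="b:Q")
print(chart.to_json())
```

The Python object does no drawing at all; its only output is the JSON document, which a Vega-Lite renderer in the browser can then display.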

  87. @jakevdp
    Jake VanderPlas
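    From Declarative API to declarative Grammar
    from altair import Chart
    # assumption: data is the iris URL that appears in the chart.to_dict() output
    data = 'https://vega.github.io/vega-datasets/data/iris.json'
    chart = Chart(data).mark_circle(
        opacity=0.3
    ).encode(
        x='petalLength:Q',
        y='sepalWidth:Q',
        color='species:N',
    )
    chart.display()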

  88. @jakevdp
    Jake VanderPlas
    From Declarative API to declarative Grammar
    >>> chart.to_dict()
    {'config': {'mark': {'opacity': 0.3}},
     'data': {'url': 'https://vega.github.io/vega-datasets/data/iris.json'},
     'encoding': {'color': {'field': 'species', 'type': 'nominal'},
                  'x': {'field': 'petalLength', 'type': 'quantitative'},
                  'y': {'field': 'sepalWidth', 'type': 'quantitative'}},
     'mark': 'circle'}

  89. #JSM2016
    Jake VanderPlas
    (Visualizations from jakevdp/altair-examples).

  90. #JSM2016
    Jake VanderPlas
    Coming Very Soon: Altair 2.0
    - Includes a Grammar of Interaction

  91. @jakevdp
    Jake VanderPlas
    Try Altair:
    http://github.com/ellisonbg/altair/
    $ conda install altair --channel conda-forge
    or
    $ pip install altair
    $ jupyter nbextension install --sys-prefix --py vega
    For a Jupyter notebook tutorial, type
    import altair
    altair.tutorial()

  92. @jakevdp
    Jake VanderPlas
    Python’s Visualization Landscape
    [Landscape diagram: matplotlib, seaborn, pandas, ggpy, scikit-plot, Yellowbrick,
    networkx, basemap/cartopy, javascript, pythreejs, bqplot, bokeh, toyplot, plotly,
    ipyvolume, cufflinks, holoviews, datashader, d3js, mpld3, Altair, Vincent, OpenGL,
    Glumpy, Vispy, ipyleaflet, Lightning, GlueViz, YT, d3po, Vega-Lite, Vega, MayaVi,
    graphviz, GR framework, PyQTgraph, pygal, chaco, Vaex, graph-tool]

  93. @jakevdp
    Jake VanderPlas
    Thank You!
    Email: [email protected]
    Twitter: @jakevdp
    Github: jakevdp
    Web: http://vanderplas.com/
    Blog: http://jakevdp.github.io/