What Can UX Designers Teach Us About Being Better Devops Practitioners?

maltzj
June 11, 2020

These are the slides for "What Can UX Designers Teach Us About Being Better Devops Practitioners?", presented at Agile + Devops West in June 2020.

Product for Internal teams: https://medium.com/@skamille/product-for-internal-platforms-9205c3a08142
Fundamentals of Design: How to Think Like A Designer: https://www.skillshare.com/classes/Fundamentals-of-Design-How-to-Think-Like-a-Designer/1986357063
Don’t Make Me Think: http://sensible.com/dmmt.html
Rocket Surgery Made Easy: http://sensible.com/rsme.html
Bootstrap: https://getbootstrap.com/
Ant.Design: https://ant.design/
Design Sprints: https://www.gv.com/sprint/

Transcript

  5. What Can UX Designers Teach Us
    About Being Better Devops
    Practitioners?
    Jonathan Maltz - June 11th, 2020

    View Slide

  6. 6
    Agenda
    Culture
    Process
    Tools

    View Slide

  7. 7
    Agenda
    Culture
    Tools

    View Slide

  8. Culture

    View Slide

  9. Build a culture of talking
    to users

    View Slide

  12. Infrastructure teams build
    products.

    View Slide

  13. 13
    What Are Your Products?
    Jenkins
    Kubernetes
    Pagerduty
    Splunk

    View Slide

  17. What Are Your Products?
    Jenkins: A product to help users reliably run CI jobs
    Kubernetes: A product to help users deploy + scale applications
    Pagerduty: A product to help users respond to incidents
    Splunk: A product to help users debug failures in their system

    View Slide

  18. 18
    Who are your users?

    View Slide

  19. 19
    Who are your users?
    Name: Just-Get-It-Done Jonathan
    Role: Software engineer
    Background:
    - Previously worked as a mobile engineer
    - Now building services for risk operators at
    Stripe
    - Occasionally goes to devops conferences
    Goals:
    - Get his services deployed easily
    - Help his team effectively use infra tools

    View Slide

  20. 20
    These products have interfaces
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      selector:
        matchLabels:
          app: nginx
      replicas: 2
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.14.2
            ports:
            - containerPort: 80

    View Slide

  21. 21
    These products have interfaces
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-deployment
    spec:
      selector:
        matchLabels:
          app: nginx
      replicas: 2
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx:1.14.2
            ports:
            - containerPort: 80
    kubectl apply -f service.yaml

    View Slide

  22. 22
    These products have interfaces
    service_image: stripe/test-service
    instance_size: r2.xlarge
    min_instances: 1
    max_instances: 100

    View Slide
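The contrast between the Kubernetes manifest and the four-field service config above can be sketched in code. This is a minimal illustration, not the tool from the talk: `to_deployment` is a hypothetical helper and the field mapping is an assumption. `min_instances` becomes `replicas`, while `instance_size` and `max_instances` have no direct field on a Deployment, so they are carried here as made-up `example.com/...` annotations.

```python
def to_deployment(config: dict) -> dict:
    """Expand a small service config into a Kubernetes Deployment manifest.

    Hypothetical helper for illustration; the field mapping is an assumption.
    """
    image = config["service_image"]
    # Derive a resource name from the image, e.g. "stripe/test-service" -> "test-service".
    name = image.split("/")[-1].split(":")[0]
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {
            "name": f"{name}-deployment",
            # Not real Deployment fields: kept as annotations for other tooling to read.
            "annotations": {
                "example.com/instance-size": config["instance_size"],
                "example.com/max-instances": str(config["max_instances"]),
            },
        },
        "spec": {
            "selector": {"matchLabels": labels},
            "replicas": config["min_instances"],
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


service_config = {
    "service_image": "stripe/test-service",
    "instance_size": "r2.xlarge",
    "min_instances": 1,
    "max_instances": 100,
}
manifest = to_deployment(service_config)
```

The user edits four lines; the generated manifest carries the boilerplate (selectors, labels, template nesting) that they would otherwise have to get right by hand.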

  24. Your users are all around
    you.

    View Slide

  25. Your users are all around
    you. Talk to them

    View Slide

  27. 27
    How to Talk to Users
    Find users
    Ask them questions
    Synthesize to make changes
    Repeat

    View Slide

  29. Get diversity

    View Slide

  30. Get diversity of role

    View Slide

  31. Get diversity of experience

    View Slide

  32. Get diversity of background

    View Slide

  33. 33
    How to Talk to Users
    Find users
    Ask them questions
    Synthesize to make changes
    Repeat

    View Slide

  34. Ask open-ended questions

    View Slide

  35. “Do you like using Jenkins
    for testing services?”

    View Slide

  37. “Can you talk to me about
    how you interacted with
    Jenkins on your last PR?”

    View Slide

  38. “Can you talk to me about
    how you interacted with
    Jenkins on your last PR?” ✅

    View Slide

  39. Users

    View Slide

  40. They are not

    View Slide

  41. They are not wrong

    View Slide

  42. 42
    How to Talk to Users
    Find users
    Ask them questions
    Synthesize to make changes
    Repeat

    View Slide

  43. “I’m not really sure what that
    does. Whenever I have a CI
    failure I have to spend 3
    minutes clicking around to
    find the stack trace.”

    View Slide

  44. “I’m not sure which buttons I
    can safely click here”

    View Slide

  45. “I often keep a few branches
    running because CI takes a
    long time.”

    View Slide

  49. Themes
    1. Speed of results
    2. Simple error reporting
    3. Ability to dig in as required

    View Slide

  54. 54
    How to Talk to Users
    Find users
    Ask them questions
    Synthesize to make changes
    Repeat

    View Slide

  62. Interview to understand needs
    Decide on approaches
    Create lightweight mock-ups
    Decide on final approach
    Bug bash

    View Slide

  63. Tools

    View Slide

  65. 65
    Major Principles
    Similarity Principle
    Proximity Principle
    Past Experience Principle
    Common Region

    View Slide

  67. 67
    Similarity Principle

    View Slide

  68. Similarity Principle - In Practice
    Load Balancer
    Web Application
    Queue Workers
    Data Warehouse
    MySQL
    Queue Consumers

    View Slide

  70. Similarity Principle - In Practice
    Load Balancer
    Web Application
    Queue Consumers
    Data Warehouse
    MySQL
    Queue Workers

    View Slide

  72. 72
    Major Principles
    Similarity Principle
    Proximity Principle
    Past Experience Principle
    Common Region

    View Slide

  73. 73
    Proximity Principle

    View Slide

  74. Proximity Principle - In Practice
    P99 response times
    Overall error rate
    Queue worker throughput
    Queue Size
    Underreplicated partitions
    Frontend error rate
    P50 response times
    Checkout fulfillment error rate
    P95 response times

    View Slide

  75. Proximity Principle - In Practice
    P99 response times
    Overall error rate
    Queue worker throughput
    Queue Size
    Underreplicated partitions
    Frontend error rate
    P95 response times
    P50 response times
    Checkout fulfillment error rate

    View Slide
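One way to apply the proximity principle to a dashboard like the one above is to sort panels so that related metrics sit next to each other. The following is a minimal sketch, assuming a hypothetical `family_of` rule that assigns each panel title to a family by keyword. The families and keywords are illustrative and not from the talk.

```python
from itertools import groupby

PANELS = [
    "P99 response times",
    "Overall error rate",
    "Queue worker throughput",
    "Queue Size",
    "Underreplicated partitions",
    "Frontend error rate",
    "P50 response times",
    "Checkout fulfillment error rate",
    "P95 response times",
]

# Hypothetical keyword -> family rules, checked in order.
FAMILY_KEYWORDS = [
    ("response times", "latency"),
    ("error rate", "errors"),
    ("queue", "queues"),
    ("partitions", "queues"),
]


def family_of(title: str) -> str:
    """Map a panel title to a metric family using the keyword rules."""
    lowered = title.lower()
    for keyword, family in FAMILY_KEYWORDS:
        if keyword in lowered:
            return family
    return "other"


def group_panels(titles):
    """Return {family: [titles]} with related panels placed together."""
    # sorted() is stable, so panels keep their original order within a family.
    ordered = sorted(titles, key=family_of)
    return {family: list(group) for family, group in groupby(ordered, key=family_of)}


layout = group_panels(PANELS)
```

Rendering the dashboard family by family keeps the three response-time panels adjacent, so a reader comparing percentiles does not have to scan the whole page.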

  76. 76
    Major Principles
    Similarity Principle
    Proximity Principle
    Past Experience Principle
    Common Region

    View Slide

  77. 77
    Past Experience Principle

    View Slide

  78. 78
    Past Experience Principle - In Practice
    Are you sure you want to delete this graph?
    Yes
    No

    View Slide

  80. 80
    Past Experience Principle - In Practice
    Are you sure you want to turn off this
    cluster?
    Yes
    No

    View Slide
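One reading of the two dialogs above is that an identical Yes/No prompt invites users to click through a high-impact action out of habit. The following is a minimal sketch of one common mitigation: asking the user to type the resource name before a high-impact action proceeds. The `confirm` function, the `high_impact` flag, and the resource names are hypothetical and not from the talk.

```python
def confirm(resource: str, answer: str, high_impact: bool = False) -> bool:
    """Return True if the user's answer confirms the action.

    Low-impact actions accept a plain "yes"; high-impact actions require the
    exact resource name, so a habitual "yes" does not go through.
    """
    if high_impact:
        return answer.strip() == resource
    return answer.strip().lower() in {"y", "yes"}


# Deleting a graph: a quick "yes" is enough.
delete_graph_ok = confirm("latency-graph", "yes")

# Turning off a cluster: "yes" is rejected, the typed name is accepted.
cluster_rejected = confirm("prod-cluster", "yes", high_impact=True)
cluster_accepted = confirm("prod-cluster", "prod-cluster", high_impact=True)
```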

  81. 81
    Major Principles
    Similarity Principle
    Proximity Principle
    Past Experience Principle
    Common Region

    View Slide

  82. 82
    Common Region

    View Slide

  85. 85
    Common Region - In Practice
    Your Deployments
    ● Deployment #1: aaaaaa
    ● Deployment #2: bbbbb
    ● Deployment #3: cccccc
    All Deployments
    ● Deployment #10: 123456
    ● Deployment #22: 987654
    ● Deployment #33: 024689

    View Slide
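The "Your Deployments" / "All Deployments" layout above can be produced by partitioning one list into labeled regions. In this minimal sketch the deployment numbers and hashes are copied from the slide, while the `owner` field, the owner names, and the `split_by_owner` and `render` helpers are assumptions for illustration. As in the slide's example, the second region here holds the deployments that do not belong to the current user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    number: int
    revision: str
    owner: str  # hypothetical field; the slide does not show who owns what


DEPLOYMENTS = [
    Deployment(1, "aaaaaa", "jonathan"),
    Deployment(10, "123456", "alice"),
    Deployment(2, "bbbbb", "jonathan"),
    Deployment(22, "987654", "bob"),
    Deployment(3, "cccccc", "jonathan"),
    Deployment(33, "024689", "alice"),
]


def split_by_owner(deployments, current_user):
    """Partition deployments into the two regions shown on the slide."""
    regions = {"Your Deployments": [], "All Deployments": []}
    for d in deployments:
        key = "Your Deployments" if d.owner == current_user else "All Deployments"
        regions[key].append(d)
    return regions


def render(regions):
    """Render each region as a heading followed by its bulleted entries."""
    lines = []
    for heading, items in regions.items():
        lines.append(heading)
        lines.extend(f"● Deployment #{d.number}: {d.revision}" for d in items)
    return "\n".join(lines)


regions = split_by_owner(DEPLOYMENTS, current_user="jonathan")
text = render(regions)
```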

  88. Design Systems

    View Slide

  90. Task: Build a tool to know if a service
    is healthy and create an incident if it’s
    not.

    View Slide

  91. 91
    What makes the service healthy?
    Responding reasonably quickly
    Users are seeing an acceptable level of errors
    Data is flowing to the data warehouse

    View Slide
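The three health criteria above could be turned into checks that feed the "create an incident" decision from the task. This is a minimal sketch with made-up thresholds and a placeholder `create_incident` callback; a real tool would call its incident system there. The metric field names and check labels are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Metrics:
    p99_response_ms: float          # response time, 99th percentile
    error_rate: float               # fraction of requests that fail, 0.0 to 1.0
    warehouse_delay_seconds: float  # age of the newest data in the warehouse


# Illustrative thresholds only; real values depend on the service.
MAX_P99_MS = 500.0
MAX_ERROR_RATE = 0.01
MAX_WAREHOUSE_DELAY_S = 300.0


def failing_checks(m: Metrics) -> List[str]:
    """Return the names of the health criteria the service currently fails."""
    failures = []
    if m.p99_response_ms > MAX_P99_MS:
        failures.append("Response Latency")
    if m.error_rate > MAX_ERROR_RATE:
        failures.append("Error Rates")
    if m.warehouse_delay_seconds > MAX_WAREHOUSE_DELAY_S:
        failures.append("Data Warehouse Delay")
    return failures


def check_service(m: Metrics, create_incident: Callable[[List[str]], None]) -> bool:
    """Return True if healthy; otherwise report the failing checks as an incident."""
    failures = failing_checks(m)
    if failures:
        create_incident(failures)
    return not failures


incidents = []
healthy = check_service(Metrics(120.0, 0.002, 45.0), incidents.append)
unhealthy = check_service(Metrics(120.0, 0.002, 900.0), incidents.append)
```

Reporting one status per criterion, rather than one graph per raw metric, lets the person on call see which promise to users is being broken before they open any chart.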

  92. Response times (p99)
    Response times (p95)
    Error Rates
    Checkout Errors
    Redshift Exporter Duration
    Redshift Delay (seconds)
    Create Incident

    View Slide

  93. Response times (p99)
    Response times (p95)
    Error Rates
    Checkout Errors
    Redshift Exporter Duration
    Redshift Delay (seconds)
    Create Incident
    Open
    Open Open Open
    Open Open

    View Slide

  95. Response Latency
    Response times (p99)
    Error Rates
    Total Errors
    Redshift Connector
    Redshift Delay (seconds)
    Create Incident
    Healthy Healthy Unhealthy
    Open Open Open

    View Slide

  97. Response Latency
    Response times (p99)
    Error Rates
    Total Errors
    Redshift Connector
    Redshift Delay (seconds)
    Create Incident

    Open Open Open

    View Slide

  100. Summary

    View Slide

  101. Three Takeaways

    View Slide

  102. Three Takeaways
    1. You’re building products. Your users are your colleagues

    View Slide

  103. Three Takeaways
    1. You’re building products. Your users are your colleagues
    2. Involve your users in open-ended conversations

    View Slide

  104. Three Takeaways
    1. You’re building products. Your customers are your colleagues
    2. Involve your customers in open-ended conversations
    3. You don’t need a design degree to make usable tools

    View Slide

  106. Jonathan Maltz - @maltzj

    View Slide

  107. Resources
    Product for Internal teams: https://medium.com/@skamille/product-for-internal-platforms-9205c3a08142
    Fundamentals of Design: How to Think Like A Designer: https://www.skillshare.com/classes/Fundamentals-of-Design-How-to-Think-Like-a-Designer/1986357063
    Don’t Make Me Think: http://sensible.com/dmmt.html
    Rocket Surgery Made Easy: http://sensible.com/rsme.html
    Bootstrap: https://getbootstrap.com/
    Ant.Design: https://ant.design/
    Design Sprints: https://www.gv.com/sprint/
    107

    View Slide
