Seeing Inside Your Service

Seeing Inside Your Service
Monitoring and logging in GCP
Amir Hermelin, Product Manager
Asaph Zemach, Engineering Manager

Agenda
1. Cloud Platform vision for monitoring and logging
2. New logs pipeline and viewer
3. The Time Series API in depth
4. Putting it all together using Compute Engine

Monitoring at Google
What we’ve learned
• Catch problems early, before they turn into user-visible outages.
• Noise is bad: reduce false positives!
• Alert on symptoms: will this affect users?

Monitoring Customer Perspective
Here’s what we’re hearing from...
• Absolutely nobody: “I love being woken up at 3am to reboot my VM instance”
• Lots of customers: “We had alerts but they were too noisy so we ended up just ignoring them.”
• Too many customers: “We produce too much logs volume to use the GAE logs viewer”
• Most customers: “I know we need better monitoring and alerting for our production services but who has the time (or expertise) to set it up?”

Monitoring Customer Perspective
You’re telling us we need to be...
• Actionable
• Scalable
• Easy to use
• Smart

Where We Are Going
We want dashboards and alerts that:
• Surface only relevant metrics and events
• Minimize false positives
• Automatically detect issues and help find related events
Timely and scalable metrics gathering, along with reliable, efficient logs collection and search, means YOU can connect the dots faster, minimize troubleshooting time and take immediate action.

Agenda
1. Cloud Platform vision for monitoring and logging
2. New logs pipeline and viewer
3. The Time Series API in depth
4. Putting it all together using Compute Engine

New Logs Pipeline and Viewer
[Diagram labels: App Engine, Logs Pipeline, Buffer, Cloud Storage, BigQuery, Logs viewer in Cloud Console]

Logs Viewer
[Screenshot. Source: Google data]

Logs Viewer Improvements
• Infinite scroll
• Automatically searches through logs until enough results are found
• Search supports both labels and regexp
• Suggests labels as you type
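The search behaviour listed above (label and regexp filters, scanning until enough results are found) can be illustrated with a minimal Python sketch. This is not the Logs Viewer implementation: the entry shape (a dict with "labels" and "message"), the function names and the paging model are all assumptions made for illustration.

```python
import re


def matches(entry, labels, pattern):
    """True if the entry carries every requested label and its message matches."""
    entry_labels = entry.get("labels", {})
    if any(entry_labels.get(key) != value for key, value in labels.items()):
        return False
    return pattern.search(entry.get("message", "")) is not None


def search_logs(pages, labels, regexp, wanted):
    """Scan pages of log entries in order and stop once `wanted` matches are found.

    `pages` is any iterable of lists of entries (hypothetical shape), so a lazy
    generator is only consumed as far as needed.
    """
    pattern = re.compile(regexp)
    results = []
    for page in pages:
        for entry in page:
            if matches(entry, labels, pattern):
                results.append(entry)
                if len(results) >= wanted:
                    return results
    return results
```

Because the pages are consumed lazily, a viewer built this way would only read as much of the log as it needs to fill the screen, which is the idea behind searching "until enough results are found".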

Collection of System Metrics
Periodically sample important system counters
[Diagram labels: Monitoring API, Monitoring Data, Google Metrics Store, Satisfied User]

Time Series Data in Cloud Console
[Screenshot. Source: Google data]

Metrics Available for Query
App Engine
• /http/server/pagespeed_response_count
• /http/server/response_count
• /http/server/response_latencies
• /http/server/response_style_count
• /http/server/dos_intercept_count
• /http/server/quota_denial_count
• /system/cpu/usage
• /system/network/pagespeed_sent_bytes_count
• /system/network/received_bytes_count
• /system/network/sent_bytes_count
Compute Engine
• /instance/uptime
• /instance/cpu/usage_time
• /instance/cpu/reserved_cores
• /instance/disk/read_ops_count
• /instance/disk/write_ops_count
• /instance/disk/read_bytes_count
• /instance/disk/write_bytes_count
• /instance/disk/read_latencies
• /instance/disk/write_latencies
• /instance/network/received_bytes_count
• /instance/network/sent_bytes_count
• /instance/network/received_packets_count
• /instance/network/sent_packets_count
• /firewall/dropped_bytes_count
• /firewall/dropped_packets_count

Metrics Read API Request
Example of what a read request looks like
GET https://www.googleapis.com/cloudmonitoring/ \  # Access monitoring API
    v2beta1/ \                                     # (that’s still in beta)
    projects/myproject/ \                          # for myproject
    timeseries/ \                                  # to get a time series of points
    compute.googleapis.com/ \                      # for the Compute Engine service
    /instance/cpu/usage_time                       # that has CPU usage by instance
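The way the request URL is assembled from its parts can be sketched in Python as follows. The helper name `timeseries_url` is hypothetical, and the sketch only joins the segments shown on the slide: it sends nothing, and a real request would also need authentication and possibly query parameters or metric-name encoding that the slide does not show.

```python
API_ROOT = "https://www.googleapis.com/cloudmonitoring"
API_VERSION = "v2beta1"  # beta version named on the slide


def timeseries_url(project, service, metric_path):
    """Join the URL segments from the slide, avoiding doubled slashes.

    Hypothetical helper: it builds the string only and performs no request.
    """
    segments = [
        API_ROOT,
        API_VERSION,
        "projects", project,      # which project to read from
        "timeseries",             # ask for a time series of points
        service,                  # e.g. the Compute Engine service
        metric_path,              # e.g. CPU usage by instance
    ]
    return "/".join(segment.strip("/") for segment in segments)


url = timeseries_url("myproject", "compute.googleapis.com", "/instance/cpu/usage_time")
print(url)
```

Stripping the slashes from each segment before joining avoids the doubled slash that would otherwise appear between the service name and the metric path.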

Metrics Read API Response
Example of what a read response looks like
{
  "kind": "cloudmonitoring#listTimeseriesResponse",
  ...
  "timeseries": [
    {
      "timeseriesDesc": {
        "project": "1016230248573",
        "metric": "compute.googleapis.com/instance/cpu/usage_time",
        "labels": {
          "cloud.googleapis.com/service": "compute.googleapis.com",
          ...
          "compute.googleapis.com/instance_name": "ae-engine-1-03-0"
        }
      },
      "points": [
        {
          "start": "2014-03-07T18:57:09.000Z",
          "end": "2014-03-20T00:13:13.000Z",
          "singularValue": 13400.60009765625
        },
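A response of this shape could be walked with a few lines of Python. The sketch below is an illustration, not an official client: `SAMPLE` is a trimmed copy of the slide's response, with the elided "..." parts simply left out and the brackets closed so that it parses, and the function names are hypothetical.

```python
import json
from datetime import datetime

# Trimmed copy of the slide's response; elided fields are omitted, not guessed.
SAMPLE = """
{
  "kind": "cloudmonitoring#listTimeseriesResponse",
  "timeseries": [
    {
      "timeseriesDesc": {
        "project": "1016230248573",
        "metric": "compute.googleapis.com/instance/cpu/usage_time",
        "labels": {
          "cloud.googleapis.com/service": "compute.googleapis.com",
          "compute.googleapis.com/instance_name": "ae-engine-1-03-0"
        }
      },
      "points": [
        {
          "start": "2014-03-07T18:57:09.000Z",
          "end": "2014-03-20T00:13:13.000Z",
          "singularValue": 13400.60009765625
        }
      ]
    }
  ]
}
"""


def parse_time(timestamp):
    """Parse the timestamp format used in the sample, e.g. 2014-03-07T18:57:09.000Z."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")


def summarize(response):
    """Flatten the response into (metric, instance, start, end, value) rows."""
    rows = []
    for series in response.get("timeseries", []):
        desc = series["timeseriesDesc"]
        instance = desc["labels"].get("compute.googleapis.com/instance_name")
        for point in series.get("points", []):
            rows.append((
                desc["metric"],
                instance,
                parse_time(point["start"]),
                parse_time(point["end"]),
                point["singularValue"],
            ))
    return rows


rows = summarize(json.loads(SAMPLE))
for row in rows:
    print(row)
```

Each row pairs a point with the labels of the series it belongs to, which is the form a dashboard or alerting rule would typically want.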

Example using GCE
[Diagram labels: CloudMemeBackEnd, Request Metrics, Google Metrics Store]

Complete Example: Demo

Upcoming Metrics and APIs
Platform Metrics under development
• Cloud SQL
• API usage metrics
• more coming...
API Features:
• Read
• List
• Create
• Write

Summary
1. Cloud Platform vision for monitoring and logging
2. New logs pipeline and viewer
3. The Time Series API in depth
4. Putting it all together using Compute Engine

Thank You and Questions
We’d LOVE your feedback and thoughts!
[email protected]

Kazunori Sato
