to reboot my VM instance”
“We had alerts but they were too noisy so we ended up just ignoring them.”
“We produce too much logs volume to use the GAE logs viewer”
“I know we need better monitoring and alerting for our production services but who has the time (or expertise) to set it up?”
Here’s what we’re hearing from...
Absolutely nobody
Lots of customers
Too many customers
Most customers
You’re telling us we need to be...
Actionable
Scalable
Easy to use
Smart
metrics and events
• Minimize false-positives
• Automatically detect issues and help find related events
Timely and scalable metrics gathering along with reliable, efficient logs collection and search means YOU can then connect the dots faster, minimize troubleshooting time and take immediate action.
Where We Are Going
\ # Access monitoring API
v2beta1/ \ # (that’s still in beta)
projects/myproject/ \ # for myproject
timeseries/ \ # to get a time series of points
compute.googleapis.com/ \ # for the Compute Engine service
/instance/cpu/usage_time # that has CPU usage by instance
Metrics Read API Request
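The annotated request above can be read as a path assembled from an API version, a project, and a metric name. A minimal Python sketch of that assembly follows. The function name `timeseries_path` is hypothetical, and percent-encoding the metric name so that its slashes stay inside a single path segment is an assumption of this sketch, not something the slide states. The host and HTTP method are not shown in the request above, so they are left out.

```python
from urllib.parse import quote


def timeseries_path(project: str, metric: str, api_version: str = "v2beta1") -> str:
    """Join the path segments annotated in the request above.

    Hypothetical helper: the segment order follows the slide; the
    percent-encoding of the metric name is an assumption.
    """
    # safe="" also encodes "/" so the metric stays one path segment (assumption)
    encoded_project = quote(project, safe="")
    encoded_metric = quote(metric, safe="")
    return f"{api_version}/projects/{encoded_project}/timeseries/{encoded_metric}"


if __name__ == "__main__":
    # Compute Engine CPU usage by instance, for the project named "myproject"
    print(timeseries_path("myproject", "compute.googleapis.com/instance/cpu/usage_time"))
```

The resulting path would be appended to whatever base URL the API is served from, which this sketch deliberately does not guess.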