Upgrade to Pro — share decks privately, control downloads, hide ads and more …

MicroXchg: Logging and Metrics in Microservice ...

MicroXchg: Logging and Metrics in Microservice Architectures

Alexander Heusingfeld

February 05, 2016
Tweet

More Decks by Alexander Heusingfeld

Other Decks in Technology

Transcript

  1. Don’t Fly Blind Logging and Metrics in Microservice Architectures Tammo

    van Lessen | [email protected] Alexander Heusingfeld | [email protected] #microxchg #logging #metrics www.innoQ.com
  2. If you’re able to treat every Bounded Context as a

    separately deployable, independent component… © innoQ/Roman Stranghöner
  3. … you’ll have a self-contained system - which can lead

    to a 
 microservice architecture Introduction to self-contained systems: https://www.innoq.com/de/links/self-contained-systems-infodeck/
  4. Requirements > Apply a well-thought logging concept > Aggregate logs

    in different formats from different systems > Search & Correlate > Visualize & Drill-down > Alerting
  5. Use Thread Contexts / MDCs %-5p: [%X{loginId}] %m%n ThreadContext.put("loginId", login);

    logger.error("Something bad happened!"); ThreadContext.clear(); + Layout: ERROR: [John Doe] Something bad happened! Log:
  6. Use Thread Contexts / MDCs { "@version" => "1", "@timestamp"

    => "2014-04-29T14:21:14.988-07:00", "logger" => "com.example.LogStashExampleTest", "level" => "ERROR", "thread" => "Test worker", "message" => "Something bad happened!", "Properties" => { "loginId" => "John Doe" } } ThreadContext.put("loginId", login); logger.error("Something bad happened!"); ThreadContext.clear(); + JSON Layout Log:
  7. Define QoS for Log Messages > Log messages may have

    different QoS > Use Markers and Filters to enable fine- grained routing of messages to dedicated appenders > Use Filters and Lookups to dynamically configure logging https://www.innoq.com/en/blog/per-request-debugging-with-log4j2/
  8. Requirements > Apply a well-thought logging concept > Aggregate logs

    in different formats from different systems > Search & Correlate > Visualize & Drill-down > Alerting
  9. Default ELK-Stack Setup Shipper / 
 Logstash Forwarder Storage &

    Search Visualize https://www.elastic.co/products/logstash Push
  10. Distributed Logstash Setup Shipper / 
 Logstash Forwarder Broker Indexer

    Storage & Search Visualize https://www.elastic.co/products/logstash Push Pull
  11. Requirements > Apply a well-thought logging concept > Aggregate logs

    in different formats from different systems > Search & Correlate > Visualize & Drill-down > Alerting
  12. Requirements > Apply a well-thought logging concept > Aggregate logs

    in different formats from different systems > Search & Correlate > Visualize & Drill-down > Alerting
  13. Filter Log Stream For Alerts input { … } filter

    { if [message] =~ /.*(CRITICAL|FATAL|ERROR|EXCEPTION).*/ { mutate { add_tag => "alarm" } } if [message] =~ /.*(?i)ignoreme.*/ { mutate { remove_tag => "alarm" } } } output { if [type] == "production" { if "alarm" in [tags] { pagerduty { description => "%{host} - %{log_level}: %{log_message}" details => { "timestamp" => "%{@timestamp}" "host" => "%{host}" "log_level" => "%{log_level}" "message" => "%{log_message}" "path" => "%{path}" } … } } } }
  14. Logging is cool… And I can use it to collect

    metrics as well, right? © http://www.flickr.com/photos/dkeats/3128150892/
  15. Logging is cool… And I can use it to collect

    metrics as well, right? Watch out! © http://www.flickr.com/photos/dkeats/3128150892/
  16. Gauges A gauge is an instrument that measures a value.

    © https://secure.flickr.com/photos/profilerehab/4974589604/
  17. Counters A counter is a simple incrementing and decrementing integer.

    © https://secure.flickr.com/photos/mwichary/2273099939/
  18. Meters A meter measures the rate at which a set

    of events occur. © https://www.flickr.com/photos/springfieldhomer/1244320899
  19. Timers A timer is a histogram over a duration. ©

    https://secure.flickr.com/photos/psd/4686988937/
  20. Distributed Metrics Architecture Measure Collect & Sample Store Query &

    Graph Anomaly Detection Alerting CEP Dashboards
  21. + producer unaware of target + multiple targets possible +

    flexible interval - might miss short-lived services - requires service-discovery P T P Push + event-based de-/registration + routable event stream + producer pushes when ready - producer aware of target - packet-loss might be missed Pull P T P vs.
  22. Some Recommendations > Think about what metrics are of importance

    for operating your application > Consider retention policies > Carefully design your dashboards > Think about non-standard graph types
  23. Conclusions > Create and document concepts for logging and metrics

    > Collect & aggregate distributed logs and metrics > Create dashboards tailored for your audience > Correlate your data to make conscious decisions > Don’t create your very own big data problem
  24. Prevent the apocalypse! Logging shows events. Metrics show state. Don't

    fly blind! © http://www.flickr.com/photos/pasukaru76/5067879762
  25. Tammo van Lessen | @taval [email protected] Alexander Heusingfeld | @goldstift

    [email protected] Thank you! Questions? Comments? innoQ Deutschland GmbH Krischerstr. 100 D-40789 Monheim am Rhein Germany Phone: +49 2173 3366-0 innoQ Schweiz GmbH Gewerbestr. 11 CH-6330 Cham Switzerland Phone: +41 41 743 0116 www.innoq.com Ohlauer Straße 43 D-10999 Berlin Germany Phone: +49 2173 3366-0 Ludwigstr. 180 E D-63067 Offenbach Germany Phone: +49 2173 3366-0 Kreuzstr. 16 D-80331 München Germany Telefon +49 2173 3366-0 https://www.innoq.com/en/talks/