that no one cares, and so breaking more windows costs nothing. (James Q. Wilson, George L. Kelling)

Redefining the alert channel

An unrepaired window says nobody cares… an untriaged alert says the same!

Image: AI-generated (OpenAI / Raycast)
quickly indicate the direction you should take when an alert comes in. As environments become more complex, not everyone on the team knows every system, and runbooks become an excellent way to spread knowledge.” (Mike Julian, Practical Monitoring, O’Reilly, 2017)

Having written procedures (aka runbooks) reduces siloing
Single Source of Truth: Proto File

    { message: [
        "A foo error occurred.",
        "Please check resources at https://example.com.",
        "If it's bar, please proceed",
        "with the standard workaround."
      ] }];
      RC00001 = 1[(ext.reason_code) = { message: [
        "5xx errors returned from foo's system.",
        "Their system might be experiencing issues.",
        "Check logs and if it's not temporary,",
        "contact the responsible party through PM.",
        "foo, Inc.",
        "080-1111-1111"
      ] }];
    }

• Reason Code: the unique code that ties an error to its runbook
• Message (Runbook): the runbook body attached to that reason code
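To make the "single source of truth" idea concrete, here is a minimal sketch of how the reason-code-to-runbook mapping could be mirrored in memory once it has been read from the proto file. The `Runbook` class, `build_registry` function, and the `RC` plus five digits format check are assumptions made for illustration; the slides do not show how the options are loaded or validated.

```python
import re
from dataclasses import dataclass

# Assumed format, inferred from the two codes on the slide (RC00000, RC00001).
REASON_CODE_PATTERN = re.compile(r"RC\d{5}")


@dataclass(frozen=True)
class Runbook:
    """Hypothetical in-memory mirror of one (ext.reason_code) option."""
    reason_code: str
    message: tuple[str, ...]

    def render(self) -> str:
        # Each element of the proto `message` list becomes one line of text.
        return "\n".join(self.message)


def build_registry(runbooks):
    """Index runbooks by reason code, rejecting malformed or duplicate codes."""
    registry = {}
    for runbook in runbooks:
        if not REASON_CODE_PATTERN.fullmatch(runbook.reason_code):
            raise ValueError(f"malformed reason code: {runbook.reason_code!r}")
        if runbook.reason_code in registry:
            raise ValueError(f"duplicate reason code: {runbook.reason_code!r}")
        registry[runbook.reason_code] = runbook
    return registry


# Sample data copied from the slide.
RUNBOOKS = build_registry([
    Runbook("RC00000", (
        "A foo error occurred.",
        "Please check resources at https://example.com.",
        "If it's bar, please proceed",
        "with the standard workaround.",
    )),
    Runbook("RC00001", (
        "5xx errors returned from foo's system.",
        "Their system might be experiencing issues.",
        "Check logs and if it's not temporary,",
        "contact the responsible party through PM.",
        "foo, Inc.",
        "080-1111-1111",
    )),
])
```

Because every consumer looks runbooks up through one registry keyed by the unique code, a duplicate or mistyped code fails at build time rather than surfacing as an alert with no instructions.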
Monitors as Code

    EDIT. --}}
    {{ log.message }}

    {{#is_match "log.attributes.error.reason_codes" "\"RC00000\"" }}
    reason_code: `RC00000`
    A foo error occurred.
    Please check resources at https://example.com.
    If it's bar, please proceed with the standard workaround.
    {{/is_match}}

    {{#is_match "log.attributes.error.reason_codes" "\"RC00001\"" }}
    reason_code: `RC00001`
    5xx errors returned from bar's system.
    Their system might be experiencing issues.
    Check the logs, and if it's not temporary, contact the responsible party via Slack.
    bar, Inc.
    555-1234-5678
    {{/is_match}}
    ...
    {{#is_alert}}
    @${notification_alert_channel}
    {{/is_alert}}

------------------------------------------
monitor.tf:
------------------------------------------

    resource "datadog_monitor" "application_error_alert" {
      ...
      message = templatefile(
        "${path.module}/messages/application_error_alert_message.pb.md",
        {
          notification_alert_channel = var.notification_alert_channel
        }
      )
    }

• Auto-generate the Monitor's Markdown message
• Auto-generate the is_match branches
• Load the result with templatefile

Auto-generation delivers complex, large-scale branching at zero cost.
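The generation step above can be sketched as a small script that turns a reason-code mapping into one `is_match` branch per code. This is an illustration under stated assumptions, not the speaker's generator: the `RUNBOOKS` dict stands in for data read from the proto file, and `render_branch`, `render_message`, and `escape_for_templatefile` are hypothetical names. The escaping step is included because Terraform's `templatefile` interprets `${` and `%{` sequences, so literal occurrences in runbook text have to be doubled.

```python
# Attribute path and quoting style are copied from the slide.
ATTRIBUTE = "log.attributes.error.reason_codes"

# Stand-in for the runbooks defined in the proto file (sample data from the slide).
RUNBOOKS = {
    "RC00001": [
        "5xx errors returned from foo's system.",
        "Their system might be experiencing issues.",
        "Check logs and if it's not temporary,",
        "contact the responsible party through PM.",
    ],
    "RC00000": [
        "A foo error occurred.",
        "Please check resources at https://example.com.",
    ],
}


def escape_for_templatefile(line: str) -> str:
    """Double Terraform template sequences so they are emitted literally."""
    return line.replace("${", "$${").replace("%{", "%%{")


def render_branch(code: str, lines: list[str]) -> str:
    """Render one {{#is_match}} ... {{/is_match}} block for a reason code."""
    # The pattern is the code wrapped in backslash-escaped quotes, as on the slide.
    open_tag = '{{#is_match "' + ATTRIBUTE + '" "\\"' + code + '\\"" }}'
    body = [escape_for_templatefile(line) for line in lines]
    return "\n".join([open_tag, f"reason_code: `{code}`", *body, "{{/is_match}}"])


def render_message(runbooks: dict[str, list[str]]) -> str:
    """Render the whole monitor message with branches in sorted code order."""
    parts = ["{{ log.message }}"]
    parts += [render_branch(code, runbooks[code]) for code in sorted(runbooks)]
    # Left unescaped on purpose: templatefile fills in this variable.
    parts.append("{{#is_alert}}\n@${notification_alert_channel}\n{{/is_alert}}")
    return "\n\n".join(parts) + "\n"


if __name__ == "__main__":
    print(render_message(RUNBOOKS))
```

Sorting the codes keeps the generated file stable between runs, so a diff of the Markdown shows only the runbooks that actually changed.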