Running Datadog on Kubernetes? You're Missing Out If You Don't Use Autodiscovery

Atsushi Tanaka

August 07, 2024

Transcript

  1. © 2024 Wantedly, Inc. $ whoami

    @bgpat / Atsushi Tanaka
    Wantedly, Inc., Infrastructure Engineer
    Kubernetes / Terraform
    SRE / Platform Engineering
    About 6 to 7 years of Datadog experience
  2. Where the configuration can be written

    Write it in annotations:
    • Pod: ad.datadoghq.com/<CONTAINER_IDENTIFIER>.checks
    • Service: ad.datadoghq.com/service.checks
    • Endpoint: ad.datadoghq.com/endpoints.checks

    (Features similar to Autodiscovery?)
    • Tag labels:
      ◦ tags.datadoghq.com/env
      ◦ tags.datadoghq.com/version
      ◦ tags.datadoghq.com/service
    • ConfigMap: applied to images that match ad_identifiers
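The annotation keys on the slide above follow a fixed pattern, so they can be generated rather than typed by hand. The following is a minimal sketch, assuming the JSON layout used in the deck's later examples (`init_config` plus `instances` under the integration name). The helper names `pod_check_annotation` and `service_check_annotation` are hypothetical and not part of any Datadog library.

```python
import json

AD_PREFIX = "ad.datadoghq.com"


def _check_body(integration, instances, init_config=None):
    """Serialize one check definition in the JSON shape used by the annotations."""
    body = {integration: {"init_config": init_config or {}, "instances": instances}}
    return json.dumps(body, indent=2)


def pod_check_annotation(container_name, integration, instances, init_config=None):
    """Return (key, value) for a Pod-level checks annotation (hypothetical helper)."""
    key = f"{AD_PREFIX}/{container_name}.checks"
    return key, _check_body(integration, instances, init_config)


def service_check_annotation(integration, instances, init_config=None):
    """Return (key, value) for a Service-level checks annotation (hypothetical helper)."""
    key = f"{AD_PREFIX}/service.checks"
    return key, _check_body(integration, instances, init_config)


if __name__ == "__main__":
    key, value = pod_check_annotation(
        "redis", "redisdb", [{"host": "%%host%%", "port": "%%port%%"}]
    )
    print(key)
    print(value)
```

Generating the value with `json.dumps` avoids hand-written JSON inside a YAML block scalar, which is an easy place to introduce quoting mistakes.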
  3. Available template variables

    https://docs.datadoghq.com/containers/guide/template_variables/
    • %%host%%, %%host_<NETWORK_NAME>%%
    • %%port%%, %%port_<NUMBER_X>%%, %%port_<NAME>%%
    • %%pid%%, %%hostname%%
    • %%env_<ENV_VAR>%%
    • %%kube_namespace%%, %%kube_pod_name%%, %%kube_pod_uid%%

    https://docs.datadoghq.com/ja/agent/configuration/secrets-management/
    • ENC[file@/path/to/file]
    • ENC[k8s_secret@some_namespace/some_name/a_key]
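To make the idea of template variables concrete, here is a small illustration of placeholder substitution. It is not the Agent's implementation: the function `resolve_templates` and the sample container values are hypothetical, and the sketch only handles simple `%%name%%` lookups from a dictionary.

```python
import re

# Matches placeholders such as %%host%% or %%env_REDIS_PASSWORD%%.
_TEMPLATE = re.compile(r"%%([A-Za-z0-9_]+)%%")


def resolve_templates(text, values):
    """Replace %%name%% placeholders using `values`; unknown names stay untouched."""

    def _substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))

    return _TEMPLATE.sub(_substitute, text)


if __name__ == "__main__":
    # Hypothetical values standing in for what would be discovered at runtime.
    discovered = {"host": "10.0.1.23", "port": 6379, "env_REDIS_PASSWORD": "s3cret"}
    template = '{"host": "%%host%%", "port": "%%port%%", "password": "%%env_REDIS_PASSWORD%%"}'
    print(resolve_templates(template, discovered))
```

Leaving unknown placeholders in place makes a missing value visible in the rendered configuration instead of silently producing an empty string.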
  4. Usage example

    apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        ad.datadoghq.com/redis.checks: |
          {
            "redisdb": {
              "init_config": {},
              "instances": [
                {
                  "host": "%%host%%",
                  "port": "%%port%%",
                  "password": "%%env_REDIS_PASSWORD%%"
                }
              ]
            }
          }
        ad.datadoghq.com/redis.logs: '[{"source":"redis"}]'
    spec:
      containers:
        - name: redis
    ︙

    Write it in the Pod's annotations.
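Two mistakes are easy to make with Pod annotations like the one above: a container identifier in the key that does not correspond to any container, and a value that is not valid JSON. The sketch below checks for both on a manifest already loaded into a Python dictionary. It assumes the identifier is expected to equal a container name in the Pod spec, as in the slide where both are `redis`; the function `lint_pod_annotations` is hypothetical.

```python
import json
import re

# Captures the container identifier from keys like ad.datadoghq.com/redis.checks.
_AD_KEY = re.compile(r"^ad\.datadoghq\.com/(?P<container>[^/]+)\.(?:checks|logs)$")


def lint_pod_annotations(pod):
    """Return a list of human-readable problems found in a Pod manifest dict."""
    problems = []
    annotations = pod.get("metadata", {}).get("annotations", {})
    container_names = {c["name"] for c in pod.get("spec", {}).get("containers", [])}

    for key, value in annotations.items():
        match = _AD_KEY.match(key)
        if match is None:
            continue  # Not one of the annotation keys this sketch looks at.
        container = match.group("container")
        if container not in container_names:
            problems.append(f"{key}: no container named {container!r}")
        try:
            json.loads(value)
        except json.JSONDecodeError as exc:
            problems.append(f"{key}: invalid JSON ({exc.msg})")
    return problems


if __name__ == "__main__":
    pod = {
        "metadata": {
            "annotations": {
                "ad.datadoghq.com/redis.checks": '{"redisdb": {"init_config": {}, "instances": []}}',
                "ad.datadoghq.com/redis.logs": '[{"source":"redis"}]',
            }
        },
        "spec": {"containers": [{"name": "redis"}]},
    }
    print(lint_pod_annotations(pod))
```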
  5. Usage example

    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        ad.datadoghq.com/service.checks: |
          {
            "redisdb": {
              "init_config": {},
              "instances": [
                {
                  "host": "%%host%%",
                  "port": "%%port%%",
                  "password": "ENC[k8s_secret@my-redis/redis-secret/password]"
                }
              ]
            }
          }
    ︙

    Write it in the Service's annotations.
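The `ENC[...]` value in the Service example packs a backend name and a location into one string. As a rough illustration of how such a handle breaks down, the sketch below parses the two forms listed in the deck (`file@...` and `k8s_secret@namespace/name/key`). The names `SecretHandle`, `parse_secret_handle`, and `split_k8s_secret` are hypothetical, and the sketch does not fetch any secret.

```python
import re
from typing import NamedTuple, Optional, Tuple


class SecretHandle(NamedTuple):
    backend: str  # e.g. "file" or "k8s_secret"
    target: str   # everything after the "@"


_ENC = re.compile(r"^ENC\[(?P<backend>[A-Za-z0-9_]+)@(?P<target>.+)\]$")


def parse_secret_handle(value: str) -> Optional[SecretHandle]:
    """Return the parsed handle, or None if `value` is not of the form ENC[backend@target]."""
    match = _ENC.match(value)
    if match is None:
        return None
    return SecretHandle(match["backend"], match["target"])


def split_k8s_secret(handle: SecretHandle) -> Tuple[str, str, str]:
    """Split a k8s_secret target into (namespace, secret name, key)."""
    if handle.backend != "k8s_secret":
        raise ValueError(f"not a k8s_secret handle: {handle.backend}")
    namespace, name, key = handle.target.split("/")
    return namespace, name, key


if __name__ == "__main__":
    handle = parse_secret_handle("ENC[k8s_secret@my-redis/redis-secret/password]")
    print(handle)
    print(split_k8s_secret(handle))
```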
  6. How to verify (debug)

    1. (When using the Cluster Agent) Find the Agent that is running the check
       i. Find the Cluster Agent leader
       ii. Run agent clusterchecks and look for the target check
       iii. Track down the target Agent's pod by hand
    2. Check the Agent's status
       i. Run agent status and look for the target check
  7. How to verify (debug): (when using the Cluster Agent) find the Agent that is running the check

    # Find the Cluster Agent leader
    $ kubectl -n default get cm datadog-agent-leader-election -o json | \
        jq '.metadata.annotations["control-plane.alpha.kubernetes.io/leader"] | fromjson.holderIdentity'
    "datadog-agent-cluster-agent-7474855779-z2zjf"

    # Run agent clusterchecks and look for the target check
    $ kubectl -n default exec datadog-agent-cluster-agent-7474855779-z2zjf -- agent clusterchecks
    ︙
    ===== Checks on i-0123456789abcdef0 =====
    === postgres check ===
    Configuration provider: kubernetes-services
    Configuration source: kube_services:kube_service://default/aurora-postgres
    Config for instance ID: postgres:65af62e418817e1e
    ︙

    # Track down the target Agent's pod (in Wantedly's environment it could be identified from providerID)
    $ kubectl get no -o json | \
        jq '.items[] | select(.spec.providerID | endswith("i-0123456789abcdef0")) | .metadata.name'
    "ip-10-3-96-189.ap-northeast-1.compute.internal"
    $ kube sandbox -n default get po \
        --field-selector spec.nodeName=ip-10-3-96-189.ap-northeast-1.compute.internal -l app=datadog-agent
    NAME                  READY   STATUS    RESTARTS   AGE
    datadog-agent-dncs6   3/3     Running   0          61m
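The two jq filters in the slide above can be hard to read at a glance. The sketch below performs the same two lookups in Python on JSON that has already been fetched with `kubectl ... -o json`. The function names are hypothetical, and the sample data is made up for illustration, including the availability zone in the `providerID`.

```python
import json

LEADER_ANNOTATION = "control-plane.alpha.kubernetes.io/leader"


def leader_identity(configmap):
    """Return holderIdentity from the leader-election ConfigMap (a parsed dict)."""
    raw = configmap["metadata"]["annotations"][LEADER_ANNOTATION]
    # The annotation value is itself a JSON string, hence the second parse.
    return json.loads(raw)["holderIdentity"]


def nodes_for_instance(node_list, instance_id):
    """Return names of nodes whose spec.providerID ends with `instance_id`."""
    return [
        node["metadata"]["name"]
        for node in node_list["items"]
        if node["spec"].get("providerID", "").endswith(instance_id)
    ]


if __name__ == "__main__":
    # Made-up sample data shaped like the kubectl JSON output.
    configmap = {
        "metadata": {
            "annotations": {
                LEADER_ANNOTATION: json.dumps(
                    {"holderIdentity": "datadog-agent-cluster-agent-7474855779-z2zjf"}
                )
            }
        }
    }
    nodes = {
        "items": [
            {
                "metadata": {"name": "ip-10-3-96-189.ap-northeast-1.compute.internal"},
                "spec": {"providerID": "aws:///ap-northeast-1a/i-0123456789abcdef0"},
            }
        ]
    }
    print(leader_identity(configmap))
    print(nodes_for_instance(nodes, "i-0123456789abcdef0"))
```

Matching on the suffix of `providerID` mirrors the `endswith` filter in the jq command; it relies on the instance ID being the last path segment, which held in the environment described in the talk.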
  8. How to verify (debug): check the Agent's status

    # Run agent status and look for the target check
    $ kubectl -n default exec datadog-agent-dncs6 -- agent status
    Defaulted container "agent" out of: agent, trace-agent, process-agent, init-volume (init), init-config (init)
    disable most components. It's recommended to use autoconfig_exclude_features and autoconfig_include_features to activate/deactivate features selectively
    Getting the status from the agent.
    ===============
    Agent (v7.54.0)
    ===============
    ︙
    postgres (18.2.2)
    -----------------
      Instance ID: postgres:65af62e418817e1e [OK]
      Configuration Source: kube_services:kube_service://default/aurora-postgres
      Total Runs: 33
      Metric Samples: Last Run: 10,503, Total: 346,599
      Events: Last Run: 0, Total: 0
      Database Monitoring Metadata Samples: Last Run: 1, Total: 2
      Service Checks: Last Run: 1, Total: 33
      Average Execution Time : 498ms
    ︙
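Scanning a long `agent status` dump by eye is tedious. As a rough aid, the sketch below pulls instance IDs and their bracketed status out of captured text. It assumes the line format seen in the excerpt above (`Instance ID: <id> [<STATUS>]`), which may differ between Agent versions; the function `instance_statuses` is hypothetical.

```python
import re

# Matches lines such as: "Instance ID: postgres:65af62e418817e1e [OK]"
_INSTANCE = re.compile(r"Instance ID:\s*(?P<id>\S+)\s*\[(?P<status>[A-Z]+)\]")


def instance_statuses(status_text):
    """Map each instance ID found in `status_text` to its bracketed status."""
    return {m["id"]: m["status"] for m in _INSTANCE.finditer(status_text)}


if __name__ == "__main__":
    sample = """
    postgres (18.2.2)
    -----------------
      Instance ID: postgres:65af62e418817e1e [OK]
      Configuration Source: kube_services:kube_service://default/aurora-postgres
      Total Runs: 33
    """
    for instance_id, status in instance_statuses(sample).items():
        print(instance_id, status)
```

The instance ID is the same one printed by `agent clusterchecks`, so it can be used to confirm that the check dispatched by the Cluster Agent is the one running on this Agent.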
  9. Summary

    • Kubernetes and Datadog work well together
    • You can choose to monitor either containers or services
    • Handy template variables are also available
    • Debugging is awkward (I'd love to hear of a better way)