Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Visualizing Your E-mail with Elastic Stack
Search
Kosho Owa
April 20, 2016
Technology
2
320
Visualizing Your E-mail with Elastic Stack
警視庁の犯罪・防犯情報提供サービス「メールけいしちょう」で受信したメッセージを Elasticsearch でインデックスし、Kibana で可視化する方法を紹介します。
Kosho Owa
April 20, 2016
Tweet
Share
More Decks by Kosho Owa
See All by Kosho Owa
Introducing Machine Learning for the Elastic Stack
kosho
2
12k
Elastic Stack X-Pack 5.0 for IT Security Workshop
kosho
1
310
Elastic Stack X-Pack 5.0 for IT Ops Workshop
kosho
0
330
[Developers Summit 2017] Anomaly Detection with the Elastic Stack
kosho
1
710
Anomaly Detection with the Elastic Stack
kosho
1
1.8k
Getting Started with Elastic Cloud and Beats for Log Analytics
kosho
0
100
Elastic{ON} Seminar Tokyo 2016 Product Update
kosho
0
170
Introducing Elastic Cloud
kosho
0
76
Gearing Up for Elastic Stack, X-Pack 5.0 Releases
kosho
0
150
Other Decks in Technology
See All in Technology
リーダーになったら未来を語れるようになろう/Speak the Future
sanogemaru
0
230
バイブコーディングと継続的デプロイメント
nwiizo
2
380
データエンジニアがこの先生きのこるには...?
10xinc
0
430
PythonとLLMで挑む、 4コマ漫画の構造化データ化
esuji5
1
120
Goに育てられ開発者向けセキュリティ事業を立ち上げた僕が今向き合う、AI × セキュリティの最前線 / Go Conference 2025
flatt_security
0
320
いま注目しているデータエンジニアリングの論点
ikkimiyazaki
0
570
タスクって今どうなってるの?3.14の新機能 asyncio ps と pstree でasyncioのデバッグを (PyCon JP 2025)
jrfk
1
220
"複雑なデータ処理 × 静的サイト" を両立させる、楽をするRails運用 / A low-effort Rails workflow that combines “Complex Data Processing × Static Sites”
hogelog
3
1.7k
ZOZOのAI活用実践〜社内基盤からサービス応用まで〜
zozotech
PRO
0
140
神回のメカニズムと再現方法/Mechanisms and Playbook for Kamikai scrumat2025
moriyuya
4
300
AIAgentの限界を超え、 現場を動かすWorkflowAgentの設計と実践
miyatakoji
0
110
Goを使ってTDDを体験しよう!
chiroruxx
1
250
Featured
See All Featured
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.4k
Rebuilding a faster, lazier Slack
samanthasiow
84
9.2k
For a Future-Friendly Web
brad_frost
180
9.9k
GraphQLとの向き合い方2022年版
quramy
49
14k
Side Projects
sachag
455
43k
Code Review Best Practice
trishagee
72
19k
The Art of Programming - Codeland 2020
erikaheidi
56
14k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.6k
Practical Orchestrator
shlominoach
190
11k
Testing 201, or: Great Expectations
jmmastey
45
7.7k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
285
14k
Measuring & Analyzing Core Web Vitals
bluesmoon
9
610
Transcript
‹#› Kosho Owa, Solutions Architect, Elastic April 20th, 2016 Visualizing
Your E-mail ʮϝʔϧ͚͍ͪ͠ΐ͏ʯΛՄࢹԽ͢Δ
ରσʔλ • ܯࢹிͷϝʔϧ͚͍ͪ͠ΐ͏(ొແྉ) http://www.keishicho.metro.tokyo.jp/about_mpd/joho/mail_info.html • ʮ൜ࡑൃੜใʯʮ൜ใʯΛϝʔϧ৴ • CC BY 2.1
JP Ͱఏڙ 2 Subject: ۄܯॺ(ࢠͲʢߦʣ) Body: 4݄16ʢʣɺޕޙ4࣌40͜Ζɺੈా୩۠Ԟ̍ஸͷ࿏্Ͱɺࣇಐ͕௨ߦதɺஉʹಥ ͖ඈ͞Ε·ͨ͠ɻʢ൜ਓʢஉʣͷಛʹ͍ͭͯɺ̑̌ࡀɺ170cm Ґɺதɺޱͻ ͛ɺ৭ͬΆ্͍ҥɺࠇ৭ͬΆ͍ζϘϯʣ ʲ߹ͤઌʳۄܯॺ 03-3705-0110ʢઢ2612ʣ
ํ • ϝʔϧΛIMAPͰऔಘ • ϑΟʔϧυΛߏԽ͢Δ • λΠϓΛߟྀͯ͠ΠϯσοΫε • analyzed, not_analyzedϑΟʔϧυͦΕͧΕΛ༻ͯ͠ՄࢹԽ͢Δ
3
Logstash Pipeline and Plugins ϓϥάΠϯՄೳͳΞʔΩςΫνϟʔͱɺ։ൃऀʹ༏͍͠ΤίγεςϜ 4 input {} filter {}
output {} beats, file, graphite, http, imap, kafka, rss, redis, stdin, sqlite, s3, syslog, zenoss and etc. csv, cloudwatch, email, elasticsearch, exec, file, graphite, http, kafka, mongodb, nagios, redis, s3, syslog, stdout, zabbix and etc.
Input Plugin - imap 5 input { imap { host
=> "imap.gmail.com" port => 993 user => "_IMAP_USER_" password => "_IMAP_PASSWORD_" folder => "_IMAP_FOLDER_" type => "_TYPE_" check_interval => 300 codec => plain { charset => "ISO-2022-JP" } } } • ϝʔϧຊจͷΤϯίʔυΛcodecͰࢦఆ͢Δ • ͋Β͔͡ΊIMAPͷfolderΛ͚ • ෳͷλΠϓϝʔϧΛॲཧ͢Δ߹ʹλά(tags)ΛՃ͢Δ https://www.elastic.co/guide/en/logstash/current/plugins-inputs-imap.html • : ίϛϡχςΟϓϥάΠϯ
Filter Plugin • ϝʔϧͷຊจ͔Βൈ͖ग़͢ϑΟʔϧυ: city, area, place • λΠτϧ͔Βൈ͖ग़͢ϑΟʔϧυ: police_station,
incident • λΠϜελϯϓͱͯ͠࠾༻: datetime 6 filter { grok { match => { "message" => "%{DATA:[@metadata][datetime]}͜Ζɺ%{NOTSPACE:city}(۠|ࢢ)% {NOTSPACE:area}(ͷ|ۙ)(%{NOTSPACE:place}|)Ͱɺ%{GREEDYDATA}" } } date { match => ["[@metadata][datetime]", "M݄dʢEʣɺaK࣌m"] locale => ja timezone => "Asia/Tokyo" } grok { match => { "subject" => "%{NOTSPACE:police_station}ܯॺ\(%{NOTSPACE:incident}\)" } } }
ೖྗσʔλ grok ग़ྗ Filter Plugin - grok • ύλʔϯʹϚονͨ͠จࣈྻΛϑΟʔϧυʹؔ࿈͚ɺඇߏσʔλΛߏԽ͢Δ https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
7 “subject” => “ۄܯॺ(ࢠͲʢߦʣ)” grok { match => { "subject" => "%{NOTSPACE:police_station}ܯॺ\(%{NOTSPACE:incident}\)" } } “police_station” => "ۄ" “incident" => "(ࢠͲʢߦʣ)"
ೖྗσʔλ date ग़ྗ Filter Plugin - date ϑΟʔϧυΛύʔε͠ɺLogstashͷΠϕϯτͱͯ͠༻ 8 "datetime"
=> “4݄16ʢʣɺޕલ7࣌40” "@timestamp" => "2016-04-16T07:40:00.000Z" • ͷऔಘʹࣦഊͨ͠߹ʹɺॲཧ͕࣌@timestampͱͯ͠࠾༻͞ΕΔ (tag_on_failure => true ݕ౼) https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html date { match => ["[@metadata][datetime]", "M݄dʢEʣɺaK࣌m"] locale => ja timezone => "Asia/Tokyo" }
Output Plugin - elasticsearch 9 output { stdout { codec
=> dots } elasticsearch { hosts => ["http://127.0.0.1:9200/"] index => "mail-%{+YYYY.MM}" } } • stdout { codec => dots } ͰɺҰ݅ॲཧ͝ͱʹυοτΛग़ྗ͢Δ • ΠϯσοΫε͕దͳαΠζʹͳΔΑ͏ɺΠϯσοΫε໊Λݕ౼͢Δ https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html
Logstash Tips • ग़ྗ࣌ʹύΠϓϥΠϯΛදࣔ • ϫʔΧʔΛదʹઃఆ͢Δ • ҟͳΔछྨͷσʔλɺLogstashͷೖྗલʹ͚͓ͯ͘ • grok
ϔϧύʔπʔϧΛ͏ http://grokdebug.herokuapp.com http://grokconstructor.appspot.com 10 output { stdout { codec => rubydebug } } $ logstash -w [NUMBER OF WORKERS] -f [PATH TO CONFIG]
Elasticsearch - Mapping • text (analyzed strings), keyword(not_analyzed strings)ϑΟʔϧυ5.0͔Βಋೖ •
textϑΟʔϧυͷanalyzerʹkuromojiΛࢦఆ͢Δ • terms aggregationΛߦ͏ͨΊʹɺmulti-fieldػೳΛͬͯkeywordϑΟʔϧυΛࢦఆ͢Δ 11 PUT /_template/mail-1 { "template": "mail-*", "mappings": { "_default_": { "properties": { "message": { "type": "text", "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }, "analyzer": "kuromoji" },... }}}}
Kibana - Visualize “Terms Aggregation” keywordϑΟʔϧυͰaggregation͢Δ 12
Kibana - Visualize “Filters Aggregation” analyzedϑΟʔϧυͰaggregation͢Δ 13
ؔ࿈ใ 14