Sotaro Karasawa
August 09, 2013
Design Patterns for Collecting and Analyzing Schemaless Log
Japanese title: スキーマレスなログデータの収集と集計のためのデザインパターン
Presented at http://www.zusaar.com/event/876003
Transcript
Design Patterns for Collecting and Aggregating Schemaless Log Data. Sotaro Karasawa (@sotarok, http://facebook.com/sotarok), Crocos, Inc.
#ds2013. Or: Treasure Data hyper-utilization techniques
About me: Sotaro Karasawa, @sotarok, d.hatena.ne.jp/sotarok. Crocos Inc. PHP, Red Bull.
The company started with … employees (… developers at the time). Service launched in …; TD introduced in ….
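What I want to talk about today: how to collect application logs; what a schemaless log is; basic log design; design patterns for log collection.
Today's goal: sharing and exchanging information. This is one example of how we do it. I am not here as a lecturer to tell you how it should be done; I hope topics like this can turn into discussion from here on.
This is mainly about web applications, although web applications themselves are diversifying. I may touch on some of that in the design patterns later.
Q: Do you collect logs? Q: Do you use fluentd? Q: Do you use TD (Treasure Data)?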
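What kind of logs are you collecting?
Web server logs
When people say logs, they usually mean web server logs. Even the Treasure Data tutorial uses Apache logs: http://docs.treasure-data.com/articles/quickstart
But what we really want to know is:
Which user? In what environment? From where? When did they do what? Which button did they click or tap?
Application logs
Which user? → user registration data. In what environment, and from where? → UA, GEO. When did they do what? → URI, action.
How to collect application logs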
What is a schemaless log?
What is a schemaless log? A log without a schema.
Log schemas so far: for example, TSV.
Columns by index: time, status, uri, user_id. The index-to-column mapping is the schema.
    # Positional parsing: the meaning of each column depends on its index.
    with open('app.log', 'r') as f:
        for line in f:
            columns = line.rstrip("\n").split("\t")
            time = columns[0]
            ...
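Problems: fields are hard to understand; schema changes are hard; accidents happen when analysts and collectors understand the data differently.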
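To make the last problem concrete, here is a minimal Python sketch of how positional parsing can go wrong when the collector changes the column layout. The helper `parse_tsv`, the inserted `method` column, and the sample values are assumptions for illustration and are not taken from the talk.

```python
def parse_tsv(line, schema):
    """Map tab-separated values onto column names purely by position."""
    return dict(zip(schema, line.rstrip("\n").split("\t")))


# The schema the analyst believes is in use.
analyst_schema = ["time", "status", "uri", "user_id"]

# Hypothetical change: the collector starts writing a "method" column before "uri".
new_line = "1373876885\t200\tGET\t/52495/facebook\tu42\n"

record = parse_tsv(new_line, analyst_schema)

# No error is raised; the values are simply attached to the wrong names.
print(record["uri"])      # the HTTP method, not the URI
print(record["user_id"])  # the URI, not the user id
```

Nothing fails loudly here, which is why a mismatch between the collector's and the analyst's idea of the schema can go unnoticed.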
TD logs (or rather, fluentd logs) are JSON:
    {
      "time": 1373876885,
      "status": 200,
      "request_uri": "/52495/facebook",
      "session_id": "kn6avn2fuh21r25a65mgm3rjh3",
      "fb_id": "7c40c5dd2e55cde37a8c40ed80e1",
      ...
    }
Easy to understand. Fields can be added. The data volume does grow, though.
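A small Python sketch of why named fields make it safe to add items: a reader that looks fields up by name keeps working when a new key appears. The `device` field and its value are hypothetical additions used only for this example.

```python
import json

# A record in the original shape, and a later record with an extra field.
old_line = '{"time": 1373876885, "status": 200, "request_uri": "/52495/facebook"}'
new_line = ('{"time": 1373876886, "status": 200, '
            '"request_uri": "/52495/facebook", "device": "mobile"}')


def read_status(line):
    """Existing consumer: looks up a field by name, ignores everything else."""
    return json.loads(line)["status"]


def read_device(line):
    """Newer consumer: the field may be absent on older records."""
    return json.loads(line).get("device")


for line in (old_line, new_line):
    print(read_status(line), read_device(line))
```

The cost is the one the slide mentions: every record repeats its key names, so the stored data is larger than the equivalent TSV.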
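How to collect application logs
Basic log design
Record logs so that one event becomes one record. (For a web application, that is one access.)
What is an event? For a web application: an access. For a native app: an event.
Decide on a basic schema
Even with schemaless logs, be clear about what kind of log you are handling. If the meaning differs from record to record, the log is meaningless.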
Decide on a basic schema: time, status, uri, ua, referrer. Using LTSV-style names may make them easier to understand.
Beyond what the web server log has: app, route, controller, process_time, device. These are things like the framework's routing name and controller name. (Even if the uri is noisy, you can aggregate by routing name.)
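Denormalize the attributes the application can know and include them in the record.
Denormalized record: session_id, user_id, gender, age, device
Why denormalize? So that aggregate functions can be applied without a JOIN.
By the way, hash values such as user_id and session_id before logging them.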
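The points above (a basic schema, application-side fields, denormalized user attributes, hashed identifiers) could be combined roughly as follows. This is a sketch under assumptions: the talk does not say which hash is used, so salted SHA-256 is a stand-in, and `SALT`, `hash_id`, `build_access_record`, and the shapes of `request` and `user` are hypothetical.

```python
import hashlib
import time

SALT = "change-me"  # hypothetical secret; keep it out of the logs


def hash_id(value):
    """One-way hash so raw identifiers never reach the log store."""
    return hashlib.sha256((SALT + str(value)).encode("utf-8")).hexdigest()


def build_access_record(request, user, route, process_time):
    """Build one denormalized record for one access (one event, one record)."""
    return {
        # basic schema
        "time": int(time.time()),
        "status": request["status"],
        "uri": request["uri"],
        "ua": request["ua"],
        "referrer": request["referrer"],
        # known only to the application
        "route": route,
        "process_time": process_time,
        "device": request["device"],
        # user attributes copied in so no JOIN is needed later
        "user_id": hash_id(user["id"]),
        "session_id": hash_id(request["session_id"]),
        "gender": user["gender"],
        "age": user["age"],
    }


request = {"status": 200, "uri": "/52495/facebook", "ua": "Mozilla/5.0",
           "referrer": "", "device": "mobile", "session_id": "abc123"}
user = {"id": 42, "gender": "f", "age": 29}
record = build_access_record(request, user, route="crocos_index", process_time=0.12)
```

Because the same input always hashes to the same value, records can still be grouped per user or per session without storing the raw identifier.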
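To summarize: record one event per record (so that aggregate functions can be applied); decide on a basic schema; denormalize the attributes the application can know and include them in the record.
Once you have come this far, analysis is already possible.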
Analysis example:
    SELECT AVG(v['process_time']) FROM access WHERE v['route'] = 'crocos_index'
Analysis example (good thing we denormalized!):
    SELECT v['gender'], COUNT(*) FROM access GROUP BY v['gender']
Analysis example, for investigating an error:
    SELECT v['route'], v['status'], v['ua'] FROM access WHERE v['user_id'] = 'xxx'
Note: related processing is omitted to keep the queries short. In practice you would also break the results down with GROUP BY or narrow them with a WHERE clause.
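For readers who want to try the same aggregations locally, the first two queries can be mirrored in plain Python over a list of denormalized records. The sample records, including the route name `shop_show`, are made up for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical denormalized records, one per access.
records = [
    {"route": "crocos_index", "process_time": 0.12, "gender": "f"},
    {"route": "crocos_index", "process_time": 0.20, "gender": "m"},
    {"route": "shop_show", "process_time": 0.50, "gender": "f"},
]

# SELECT AVG(v['process_time']) FROM access WHERE v['route'] = 'crocos_index'
avg_process_time = mean(
    r["process_time"] for r in records if r["route"] == "crocos_index"
)

# SELECT v['gender'], COUNT(*) FROM access GROUP BY v['gender']
count_by_gender = Counter(r["gender"] for r in records)

print(avg_process_time, dict(count_by_gender))
```

Both results come from a single pass over one collection, which is the practical benefit of having copied `gender` into each record instead of joining against a user table.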
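Thinking about design patterns for schemaless application logs
So, logs with a basic schema have started to accumulate.
From here on: depending on what you want to analyze, what should you put in the log? Let's consider it pattern by pattern.
This is where schemaless comes in.
Basic schema: time, status, uri, ua, referrer, plus whatever else. You can give specific records a special meaning, and do so without affecting other records.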
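Pattern: Transaction
We want to record things like the success of an action that has special meaning.
Transaction: uri and route tell you that a request arrived. Whether it actually succeeded is something only the application knows.
Transaction: key_action, key_attr_*
Transaction: key_action = present:entry:completed, in the form app:action:state. (This example means "registration completed".)
Transaction: key_attr_* holds any additional information related to the transaction. Its schema differs for each key_action.
Transaction example: key_action = shop:register:completed, key_attr_user_id = xxxxx, key_attr_ref = fb_share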
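One way the application might attach these fields to an ordinary access record is sketched below. The helper name `with_transaction` and the base record are hypothetical; only the `key_action` and `key_attr_*` naming comes from the slides.

```python
def with_transaction(record, app, action, state, **attrs):
    """Return a copy of an access record marked as a completed transaction.

    key_action follows the app:action:state convention; every extra keyword
    argument becomes a key_attr_* field.
    """
    out = dict(record)
    out["key_action"] = f"{app}:{action}:{state}"
    for name, value in attrs.items():
        out[f"key_attr_{name}"] = value
    return out


base = {"time": 1373876885, "status": 200, "uri": "/register"}
tx = with_transaction(base, "shop", "register", "completed",
                      user_id="xxxxx", ref="fb_share")
print(tx["key_action"], tx["key_attr_ref"])
```

Records without a transaction carry none of these keys, so existing queries over the basic schema are unaffected.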
Transaction analysis example:
    SELECT v['key_attr_ref'], COUNT(*) FROM access WHERE v['key_action'] = '...' GROUP BY v['key_attr_ref']
Transaction analysis: data I have been looking at a lot lately ... which initiative worked best?
Pattern: Event
We want to know when events occur that do not depend on an access.
Examples: events fired from JavaScript; a modal being displayed; shares to Twitter or Facebook; native apps.
Event: tag = app:action:location & some attributes
Event example: tag = shop:tweet:shop_item, item_id = 1234, tweet_id = xxxxx
Event analysis example:
    SELECT v['item_id'], COUNT(*) FROM events WHERE v['tag'] = 'shop:tweet:shop_item' GROUP BY v['item_id']
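As a rough local equivalent of this query, the sketch below counts events per item in a stream where different tags carry different attributes. The sample events, the `shop:modal:shop_top` tag, and the helper `count_by` are assumptions for illustration.

```python
from collections import Counter

# Hypothetical event stream: records with different tags have different fields.
events = [
    {"tag": "shop:tweet:shop_item", "item_id": 1234, "tweet_id": "t1"},
    {"tag": "shop:tweet:shop_item", "item_id": 1234, "tweet_id": "t2"},
    {"tag": "shop:tweet:shop_item", "item_id": 5678, "tweet_id": "t3"},
    {"tag": "shop:modal:shop_top", "modal": "signup"},
]


def count_by(events, tag, field):
    """Count values of `field` among events carrying exactly this tag."""
    return Counter(e[field] for e in events if e.get("tag") == tag)


tweets_per_item = count_by(events, "shop:tweet:shop_item", "item_id")
print(dict(tweets_per_item))
```

Filtering on the tag first is what makes it safe to read `item_id`: the modal event has no such field, and it is never touched.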
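In fact, the mechanism is the same as for transactions.
The most important thing when handling schemaless logs is to decide on the rules of interpretation.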
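One possible way to make such a rule explicit is to encode it in a small parser that both the collecting side and the analyzing side share. The sketch below does this for the app:action:location tag convention; the `Tag` type and `parse_tag` function are hypothetical names, not part of fluentd or Treasure Data.

```python
from typing import NamedTuple


class Tag(NamedTuple):
    app: str
    action: str
    location: str


def parse_tag(tag: str) -> Tag:
    """Interpret a tag as app:action:location, rejecting anything else."""
    parts = tag.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"tag must look like app:action:location, got {tag!r}")
    return Tag(*parts)


parsed = parse_tag("shop:tweet:shop_item")
print(parsed.app, parsed.action, parsed.location)
```

Running the same check where records are emitted catches malformed tags before they are stored, which matters because stored logs are hard to change afterwards.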
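I am probably out of time, so I will stop here.
What I wanted to share: in this kind of situation, collect the data this way and analyze it this way.
Points to watch out for
Even though the logs are schemaless, do the log design thoroughly up front. Once a log has been written it is hard to change, so check that no field you want to analyze is missing. Be careful about privacy.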