Introduction to Data Science for PHP Users
Sotaro Karasawa
September 14, 2013
PHP Conference 2013: "Introduction to Data Science for PHPers" (PHPerのためのデータサイエンス入門) #phpcon2013
Transcript
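Introduction to Data Science for PHPers. #phpcon2013, PHP Conference 2013. Crocos, Inc. Sotaro Karasawa @sotarok http://facebook.com/sotarok
About me: Sotaro Karasawa (@sotarok), d.hatena.ne.jp/sotarok, Crocos Inc. Keywords: PHP, Git, TD, Red Bull.
"Perfect PHP" (パーフェクトPHP, Gijutsu-Hyoron). Naturally, everyone here owns a copy, right!?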
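Data science
For details, see "Data Scientist Yosei Tokuhon" (データサイエンティスト養成読本, Gijutsu-Hyoron): http://www.amazon.co.jp/dp/…
The data science process: business understanding, data understanding, data extraction, data processing, modeling, effect verification, service implementation. (Source: Data Scientist Yosei Tokuhon, the chapter "The Data Science Process")
Data science means analyzing and modeling accumulated data to obtain the metrics that matter for running the business, and then repeating that cycle. There is a lot to do, and the range of knowledge required is wide.
So start from the bare minimum, from wherever it is easy to begin, and take the first step.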
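For PHPers and web applications, what is "data"? Databases and logs.
Today's talk is about logs.
How do we collect a large volume of application logs, and how do we aggregate them?
With that in mind, today's agenda: how log collection and analysis work; collecting logs from a PHP application; analysis.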
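How log collection and analysis work
Log collection and analysis are a never-ending source of worries. With large volumes of data: how to collect it, where to store it, how to retrieve it, how to aggregate it. The constraints are network bandwidth, disk capacity, a big data processing system, and processing time.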
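http://www.treasure-data.com
[Architecture diagram. Components shown: Web Server, Web Server, fluentd, TD (S3, Hadoop), Client, Hive, Result, MySQL etc.] Data accumulates on their side; when you submit a query, Hadoop starts up over there and returns the result.
When moving ahead with log analysis, the troublesome parts (collecting, storing, and processing the data) are handled by TD. That leaves you free to commit to the essential work: designing and implementing what data to collect and how to aggregate it.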
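How Crocos uses logs: ・application logs ・analysis based on Facebook attribute information ・executions and execution times of the main actions ・transactions, by attribute and by referral path ・event logs ・social shares ・Modal open/close, etc. ・and various others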
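Collecting logs from a PHP application
What kind of application logs? Basic log design.
What logs are you collecting?
Web server logs
When people say "logs", they usually mean web server logs. Even the Treasure Data tutorial uses Apache logs. http://docs.treasure-data.com/articles/quickstart
But what we really want to know is:
Which user? On what client, and from where? What did they do, and when? Which button did they click or tap?
Application logs
Which user? → user registration data. On what client, and from where? → UA, GEO. What did they do, and when? → URI, action.
How do we collect application logs?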
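Before that, a quick word on schemaless logs
What is a schemaless log? A log that has no schema.
Log schemas until now: for example, TSV.
Columns in order: time, status, uri, user_id, hoge. This fixed column order is the schema.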
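    // Read a TSV log where the meaning of each field depends on its position.
    foreach (file('app.log') as $line) {
        // Strip only the line ending so that empty leading or trailing columns survive.
        $column = explode("\t", rtrim($line, "\r\n"));
        $time = $column[0];
        $status = $column[1];
        ...
    }
Note: in practice nobody would bother doing this in PHP; you would use sed or awk.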
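Problems with this: the fields are hard to understand; changing the schema is hard; accidents happen when analysts and collectors understand the data differently.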
TD's logs (or rather, fluentd's) are JSON:
    {
      "time": 1373876885,
      "status": 200,
      "uri": "/52495/facebook",
      "session_id": "kn6avn2fuh21r25a65mgm3rjh3",
      "fb_id": "7c40c5dd2e55cde37a8c40ed80e1",
      ...
    }
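A minimal sketch of why keyed JSON records avoid the positional-column problem: each line decodes into an associative array, so a field that exists only in some records does not shift the others. The second record and its `ua` value are illustrative assumptions, not data from the slides.

```php
<?php
// Two log lines as fluentd-style JSON. The second carries an extra field.
$lines = [
    '{"time":1373876885,"status":200,"uri":"/52495/facebook"}',
    '{"time":1373876890,"status":404,"uri":"/missing","ua":"ExampleBrowser/1.0"}',
];

$statuses = [];
foreach ($lines as $line) {
    // Decode to an associative array keyed by field name.
    $record = json_decode($line, true);
    // Fields are looked up by name, so extra keys in some records are harmless.
    $statuses[] = $record['status'];
}

print_r($statuses);
```

With TSV, adding a column in the middle would silently change what `$column[1]` means; here the lookup by name keeps working.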
Posting logs
fluent PHP logger (fluent-logger-php):
    use Fluent\Logger\FluentLogger;
    $logger = new FluentLogger("localhost", "24224");
    $logger->post("debug.test", array("hello" => "world"));
https://github.com/fluent/fluent-logger-php
Basic log design
Record so that one access produces one record.
Hook into the response. Most frameworks have a hook point for the response event, right? In Symfony, that is onKernelResponse.
    # listener registration (service tag)
    tags:
        - { name: kernel.event_listener, event: kernel.response }

    public function onKernelResponse(FilterResponseEvent $event)
    {
        $request = $event->getRequest();
        $response = $event->getResponse();
        // build an array describing this access
        $data = $this->onAccess($request, $response);
        // log data
        $this->logger->post("access", $data);
    }
Note: the real implementation is set up so that multiple Listeners and Loggers can be registered.
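The slides do not show what `onAccess()` returns, so the following is only a sketch of one way to build a flat, one-record-per-access array. The function name `buildAccessRecord`, its parameters, and the choice of seconds for `process_time` are assumptions for illustration; it uses plain arrays instead of Symfony's request and response objects so that it runs on its own.

```php
<?php
/**
 * Hypothetical stand-in for the onAccess() step: turn request/response
 * details into one flat record per access.
 */
function buildAccessRecord(array $server, int $status, string $route, float $startedAt, float $finishedAt): array
{
    return [
        'time'         => (int) $finishedAt,
        'status'       => $status,
        'uri'          => $server['REQUEST_URI'] ?? '',
        'ua'           => $server['HTTP_USER_AGENT'] ?? '',
        'referrer'     => $server['HTTP_REFERER'] ?? '',
        'route'        => $route,
        // Elapsed time in seconds (assumed unit), rounded to milliseconds.
        'process_time' => round($finishedAt - $startedAt, 3),
    ];
}

$record = buildAccessRecord(
    ['REQUEST_URI' => '/52495/facebook', 'HTTP_USER_AGENT' => 'ExampleBrowser/1.0'],
    200,
    'crocos_index',
    1373876885.10,
    1373876885.35
);
print_r($record);
```

The resulting array is what would be handed to `$this->logger->post("access", $data)`.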
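Decide on a basic schema
Even if the log is schemaless, you still need to know what kind of log you are handling; it is useless if the meaning differs from record to record.
Basic schema: time, status, uri, ua, referrer. Matching LTSV-style names may make them easier to understand.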
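Go beyond what the web server log has: app, route, controller, process_time, device. That is, include things like the framework's routing name and controller name (even if the uri is noisy, you can still aggregate by routing name).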
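To illustrate the point about noisy URIs, here is a small sketch that counts the same accesses once by `uri` and once by `route`. The sample records, the `help` route, and the mapping of these URIs to `crocos_index` are made up for the example.

```php
<?php
// Access records: the URIs differ (IDs, query strings) but share one route name.
$records = [
    ['uri' => '/52495/facebook',        'route' => 'crocos_index'],
    ['uri' => '/52495/facebook?ref=ad', 'route' => 'crocos_index'],
    ['uri' => '/60001/facebook',        'route' => 'crocos_index'],
    ['uri' => '/help',                  'route' => 'help'],
];

// Counting by uri gives one bucket per distinct string.
$byUri = array_count_values(array_column($records, 'uri'));
// Counting by route collapses them into meaningful groups.
$byRoute = array_count_values(array_column($records, 'route'));

print_r($byRoute);
```

Grouping by `uri` yields four buckets of one access each, while grouping by `route` yields the two groups an analyst actually cares about.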
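Denormalize the attributes the application can know and include them in the record.
Denormalized record: session_id, user_id, gender, age, device.
Why denormalize? So that aggregate functions can be applied without a JOIN. Hadoop can do JOINs, but denormalizing removes steps, so queries are faster and simpler.
By the way, it is a good idea to hash user_id, session_id, and the like (out of consideration for privacy, just in case).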
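The slides only say to hash these identifiers; they do not name an algorithm. One possible approach, shown here as an assumption, is a keyed hash (HMAC-SHA256) so that the same user always maps to the same token while the raw ID stays out of the log. The secret value and the sample user ID are placeholders.

```php
<?php
// Hypothetical secret; in practice load it from configuration, not source code.
$secret = 'replace-with-a-private-key';

// Keyed hash: the same input always yields the same token, so grouping by
// user still works, but the raw identifier is not written to the log.
function hashIdentifier(string $value, string $secret): string
{
    return hash_hmac('sha256', $value, $secret);
}

$record = [
    'user_id'    => hashIdentifier('12345', $secret),
    'session_id' => hashIdentifier('kn6avn2fuh21r25a65mgm3rjh3', $secret),
];
print_r($record);
```

Because the mapping is deterministic, queries such as `WHERE v['user_id'] = ...` still work as long as the lookup value is hashed the same way.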
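To summarize: record one record per access; decide on a basic schema; denormalize the attributes the application can know and include them in the record.
Once you have come this far, analysis is already possible.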
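Analysis example:
    SELECT AVG(v['process_time']) FROM access WHERE v['route'] = 'crocos_index'
Analysis example:
    SELECT v['gender'], COUNT(*) FROM access GROUP BY v['gender']
Good thing we denormalized!
Analysis example, also useful for investigating errors:
    SELECT v['route'], v['status'], v['ua'] FROM access WHERE v['user_id'] = 'xxx'
Note: to keep these short, related processing has been omitted. In a real query you would also GROUP BY a further breakdown or narrow the rows with a WHERE clause.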
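An application of schemaless logs: transactions
So, logs with a basic schema have started to accumulate.
Next we want to record things that carry special meaning, such as the success of an action.
Transactions: uri and route tell you that a request arrived, but only the application knows whether it really succeeded.
This is where schemaless logging comes in.
Basic schema (time, status, uri, ua, referrer) plus an additional schema (whatever else you need).
You can give specific records a special meaning, and do so without affecting any other records.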
Transactions: key_action and key_attr_*
key_action, for example shop:buy:completed, has the form app:action:status. This example means "purchase completed".
key_attr_*: put any additional information about the transaction here. Its schema differs for each key_action.
Transaction example:
    key_action = shop:buy:completed
    key_attr_item_id = xxxxx
    key_attr_ref = fb_share
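As a sketch of how such a record might be assembled, the helper below merges a `key_action` and its `key_attr_*` fields into an ordinary access record. The helper name `withTransaction` and the base record's values are hypothetical; only the `key_action` / `key_attr_` naming comes from the slides.

```php
<?php
// Merge a transaction marker into a base access record.
// Attribute names get the "key_attr_" prefix used in the slides.
function withTransaction(array $base, string $keyAction, array $attrs): array
{
    $record = $base;
    $record['key_action'] = $keyAction;
    foreach ($attrs as $name => $value) {
        $record['key_attr_' . $name] = $value;
    }
    return $record;
}

// Hypothetical base record for the request that completed a purchase.
$base = ['time' => 1373876885, 'status' => 200, 'uri' => '/shop/buy', 'route' => 'shop_buy'];
$record = withTransaction($base, 'shop:buy:completed', ['item_id' => 'xxxxx', 'ref' => 'fb_share']);
print_r($record);
```

Records without a transaction simply omit these keys, which is what lets the extra fields exist without touching the basic schema.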
Transaction analysis example:
    SELECT item_id, ref, COUNT(*)
    FROM access
    WHERE key_action = 'shop:buy:completed'
    GROUP BY item_id, ref
Note: v[''] is omitted here for lack of space.
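For readers who want to see what the query above computes, here is a rough in-memory equivalent over a handful of made-up records. It is only an illustration of the filter, grouping, and count; the real aggregation in the talk runs as a Hive query on Treasure Data, and the item IDs and `mail` ref are invented.

```php
<?php
$records = [
    ['key_action' => 'shop:buy:completed', 'key_attr_item_id' => 'A', 'key_attr_ref' => 'fb_share'],
    ['key_action' => 'shop:buy:completed', 'key_attr_item_id' => 'A', 'key_attr_ref' => 'fb_share'],
    ['key_action' => 'shop:buy:completed', 'key_attr_item_id' => 'B', 'key_attr_ref' => 'mail'],
    ['status' => 200, 'uri' => '/help'], // ordinary access, no transaction keys
];

$counts = [];
foreach ($records as $r) {
    // WHERE key_action = 'shop:buy:completed'
    if (($r['key_action'] ?? null) !== 'shop:buy:completed') {
        continue;
    }
    // GROUP BY item_id, ref (tab-joined composite key)
    $group = $r['key_attr_item_id'] . "\t" . $r['key_attr_ref'];
    // COUNT(*)
    $counts[$group] = ($counts[$group] ?? 0) + 1;
}
print_r($counts);
```

The record without `key_action` is skipped by the filter, mirroring how ordinary access records do not interfere with transaction analysis.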
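Transaction analysis use case: record the access source for each campaign, then find the most effective campaign from the number of successful transactions.
NEXT STEP
From the aggregated results, move on to statistical analysis methods and modeling: computing the metrics that are critical to the business and establishing an improvement process.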
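Summary
Collecting and analyzing logs is hard. → Use Fluentd and Hadoop. → Use Treasure Data.
What kind of logs should you collect? → Denormalized logs with 1 access = 1 record. → Design the log format itself. → Make use of schemalessness.
Finally: We are hiring! Crocos is the only place where you can work with the authors of Perfect PHP, former PHP Conference committee members, and others.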