Makoto Tanji
April 24, 2019
Programming
機械学習とのつきあいかた / How to get involved with Machine Learning @ Wantedly
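Slides used for the 2019 new-graduate training.
They cover the topics below; overall, this is a "poem" (a personal, opinionated essay).
1. Machine learning at Wantedly
2. ML and the organization
3. ML and data
4. How to run ML projects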
Transcript
How to Get Involved with Machine Learning (機械学習とのつきあいかた)
@Wantedly New Grad Training 2019, April 24, 2019
Makoto Tanji (@tan-z-tan)

©2019 Wantedly, Inc.
Who am I? (Self-introduction)
Makoto Tanji
• ML Engineer at Wantedly, Inc.
• Wantedly Visit, 2015 - 2016
• Client Growth, Scout
• Wantedly People, 2016 - current
• Server side
• Machine Learning
• Data Analysis

What I will talk about today
Contents
1. Theme
2. Machine learning at Wantedly
3. Take-home messages
   1. ML and the organization
   2. ML and data
   3. How to run ML projects
4. Summary
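
Theme
Cutting-edge applications of machine learning that get covered in the media look glamorous, but running ML in production comes with difficulties: it often takes a large amount of data, accuracy may fall short, and stable operation is hard.
ML engineers at Wantedly work on these problems together with web engineers, data scientists, and the infrastructure team.
In the future, web engineers and infrastructure engineers will also be involved with machine learning in some form more often than they are today.
Let's look at machine learning that actually runs at Wantedly and think about how to get involved with it.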

Overall, this talk is a "poem" about machine learning (a personal, opinionated essay).

Machine learning use cases at Wantedly
Visit
• Job posting recommendation
• Push notification personalization
• Scout filter improvement
• ...
People
• OCR and classification of business card images
• People recommendation
• Article recommendation, push notification open prediction
• Extraction of various named entities
(Machine learning repositories are tagged "ML")
• https://github.com/wantedly?utf8=%E2%9C%93&q=ml&type=&language=
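
Why introduce machine learning?
Because there is value we want to deliver
• We want to deliver the experience of a scan becoming data within seconds
  • Copying business cards into your contacts one by one is painful → that is not work for a human. Take a photo and it becomes data right away
• We want to deliver the experience of finding work that makes your heart dance (ココロオドルシゴト)
  • "Isn't there a good job out there?" → we automatically recommend job postings you are likely to be interested in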

Example architectures of ML running in-house
• Pre-trained model API pattern
• Recommendation pattern
• Others
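
Pre-trained model API pattern
Provided as the API of a microservice server
1. Many examples: OCR (character recognition) API, text extraction API, identity matching API, and so on
2. Runs a model trained in advance and returns the result
3. e.g. many of the business card recognition APIs in People follow this pattern
Things to consider
• Has accuracy dropped without anyone noticing?
  • Collect metrics every day. Put this on the checklist when building a new API
• Cost
  • These services are often CPU-heavy; are they consuming a large amount of Kubernetes resources?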
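
The "collect metrics every day" point can be sketched as a small daily check. This is only an illustration under stated assumptions: the function names, the labelled evaluation set, and the 0.02 tolerance are hypothetical and do not come from the slides.

```python
from statistics import mean


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels of a fixed evaluation set."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def has_regressed(daily_history, today, tolerance=0.02):
    """True if today's accuracy is more than `tolerance` below the recent average.

    `tolerance` is a hypothetical threshold; tune it per API.
    """
    return today < mean(daily_history) - tolerance
```

Running such a check from a daily job and alerting when `has_regressed` returns True is one way to notice a silent drop; the right evaluation set and threshold depend on the API.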
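
Recommendation pattern
Compute recommendation scores in advance and serve them
1. When a user opens a page, we want to show recommended job postings instantly
2. Compute them beforehand from the user's past behavior
3. Store the "user × job posting" results in a datastore
Things to consider
• A datastore that can hold the "user × job posting" results
• Has the daily training job stopped?
• This gets especially hard to see when jobs run in multiple stages → a data pipeline problem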
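
A minimal sketch of this precompute-and-look-up pattern follows. It is an assumption-laden illustration, not Wantedly's implementation: the class name, the in-memory dictionary standing in for the datastore, and the one-day staleness limit are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


class PrecomputedRecommender:
    """Batch job writes top-k postings per user; the online path only reads them."""

    def __init__(self) -> None:
        # Hypothetical in-memory stand-in for the "user x job posting" datastore.
        self._store: Dict[str, List[str]] = {}
        self._last_run: Optional[datetime] = None

    def run_batch(self, scores: Dict[str, Dict[str, float]], now: datetime, top_k: int = 3) -> None:
        """Offline step: rank postings per user by score and keep the best top_k."""
        for user_id, posting_scores in scores.items():
            ranked = sorted(posting_scores, key=posting_scores.get, reverse=True)
            self._store[user_id] = ranked[:top_k]
        self._last_run = now

    def recommend(self, user_id: str) -> List[str]:
        """Online step: a plain lookup, no model inference at request time."""
        return list(self._store.get(user_id, []))

    def is_stale(self, now: datetime, max_age: timedelta = timedelta(days=1)) -> bool:
        """True if the batch job has not completed within max_age."""
        return self._last_run is None or now - self._last_run > max_age
```

The `is_stale` check corresponds to the question "has the daily job stopped?"; with multi-stage jobs, each stage would need its own freshness check.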
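
Others
Data normalization
1. Normalize miscellaneous information and decide things such as whether two records refer to the same entity
2. The work of building dictionaries
3. Users do not see it, but the system gets smarter internally
BigQuery ML
1. Used in internal logic
2. It really is in use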

Today's take-home messages
1. Organization
2. Data
3. How to run projects

Organization
"Organizational structure determines strategy." Igor Ansoff
(I just wanted to quote something that sounds cool)

Organization
At Wantedly, machine learning engineers are members of the product teams
(Diagram labels: ML engineers: 3; ML engineers: 2; Infrastructure, Visit, People)
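
Organization
Close to the user
1. Improvements we make reach users directly
Easy to make improvements that are valuable to the product
1. We share the direction the product is heading, so misunderstandings are unlikely
2. If necessary, we can change the product's business logic itself
New features get discussed naturally, not only improvements
1. As members of the development team, we can discuss where things should be heading, not only local improvements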
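
Organization
Other ways to organize
(Diagram labels: a product team of data scientists, engineers, designers, and managers; a company-wide ML team; a company-wide data analysis team; ML engineers; requests going out and results coming back)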
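
Organization
How do we work with other development teams?
1. Development style: define the API, then develop the web side and the ML side in parallel
   1. e.g. we want to return several articles to a user. The web side fetches the list of articles, then calls the ML API to decide the order
2. How we work together
   1. Web side: if only the API spec is agreed in advance, the web side can stub the API and develop without waiting for the ML side
   2. Infrastructure: before service-in, agree on load metrics and on whether the API is critical or is one that can, at worst, go down
3. Communication
   1. It helps to agree on the expected accuracy in advance
   2. A smart API is almost never built quickly. Share whether the requirement really needs 99.9% accuracy, and how results will be presented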
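
The stub-first style described above can be sketched as follows. The interface and class names are hypothetical; the only idea taken from the slide is that both sides agree on the API spec, the web side codes against a stub, and the ML implementation is swapped in later.

```python
from typing import Callable, Dict, List, Protocol, Tuple


class RankingApi(Protocol):
    """Agreed API spec (hypothetical): take article IDs, return them in display order."""

    def rank(self, user_id: str, article_ids: List[str]) -> List[str]: ...


class StubRankingApi:
    """Stub for the web side until the ML side is ready: keeps the input order."""

    def rank(self, user_id: str, article_ids: List[str]) -> List[str]:
        return list(article_ids)


class ScoreRankingApi:
    """Placeholder for the ML side; a score table stands in for a real model."""

    def __init__(self, scores: Dict[Tuple[str, str], float]) -> None:
        self._scores = scores

    def rank(self, user_id: str, article_ids: List[str]) -> List[str]:
        # Highest score first; unknown articles get 0.0 and keep their relative order.
        return sorted(article_ids, key=lambda a: self._scores.get((user_id, a), 0.0), reverse=True)


def build_feed(user_id: str, fetch_articles: Callable[[str], List[str]], api: RankingApi) -> List[str]:
    """Web side: fetch the article list, then ask the ranking API for the order."""
    return api.rank(user_id, fetch_articles(user_id))
```

Because `build_feed` depends only on the agreed `rank` signature, replacing `StubRankingApi` with the real client should not require changes on the web side.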
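
Organization (summary)
At Wantedly, ML engineers work as members of the product teams. The advantage is that they are close to users and can easily work in a product-driven way.
Within a product team, development goes smoothly when each side's scope and the expected accuracy are made clear in advance.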

We run a Deep Learning reading group every Wednesday @Wantedly

Data
An interesting analogy: "data is crude oil"
• It cannot be used as is; it needs refining
• Once processed, it serves many purposes
• The whole world is scrambling for it
Source: unknown
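
Interesting data at Wantedly
• User profiles: on the order of millions
• The network of user connections: edges on the order of hundreds of millions
• Scanned business cards: roughly on the scale of hundreds of millions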

Data lake
There is a mechanism for gathering the data we need into BigQuery
(Diagram label: Analytics)
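
Things to watch out for
• Real incidents around data
1. RPA usage produced a large volume of meaningless accesses, so apparent average usage went up. Logs without conversions increased, which shifted the data
2. A new feature release caused new values to appear in a table column. We detected it beforehand and fixed it before release, but had we not known, the model might have broken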
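
The second incident suggests a simple guard: compare the values seen in a column with the values the model was trained on. The sketch below assumes rows are plain dictionaries and that the set of known values is available; the function and column names are hypothetical.

```python
from typing import Dict, Iterable, List


def find_unseen_values(rows: Iterable[Dict[str, str]], column: str, known_values: Iterable[str]) -> List[str]:
    """Return the values in `column` that are not in the set the model was trained on."""
    seen = {row[column] for row in rows if column in row}
    return sorted(seen - set(known_values))
```

For example, if a release starts writing a new status value, the function reports it:

```python
rows = [{"status": "applied"}, {"status": "scouted"}, {"status": "bookmarked"}]
print(find_unseen_values(rows, "status", {"applied", "scouted"}))
```

A non-empty result is a signal to talk to the ML side before the release, not proof that the model will break.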

Things to watch out for
• Data privacy
1. Stay compliant
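
Data (summary)
At Wantedly, large-scale user-related data is now aggregated into BigQuery, so using the data has itself become easy. The next step is to go further and make the data easier to put to use.
Data and logs are not confined to the server implementation; they also affect metrics and the ML models that consume the data. Share server changes with the people around you so that they do not break ML models. (Ideally we would protect this systematically, for example with anomaly detection.)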

How to run projects
How do machine learning projects get started?
It depends on the case and is hard to generalize
I feel that small projects often start from casual conversation

Let's look for best practices out in the world
Machine Learning Best Practice
• Rule 1: Consider whether you can start without machine learning
• Rule 2: Design and build metrics (understand the current system first, then build the metrics)
• Rule 3: Once the logic has become complex, choose machine learning
ref: https://developers.google.com/machine-learning/guides/rules-of-ml/
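
How to run projects
Example: the "Do you know this person?" feature
Exploration phase
• Recommend people who share many acquaintances with the user
• Metrics = number of requests and approvals
Rule growth phase
• People who share many acquaintances
• People at the same company
• People who hold many of the same business cards
• People who were scanned at the same time and share a connection
• Combinations of these
• ...
Machine learning phase
• Deep Learning based recommendation
• + A/B testing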
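
The exploration and rule growth phases can be pictured as a hand-written scoring function. The sketch below is hypothetical: the `Person` fields, the two rules chosen, and the `same_company_bonus` weight are assumptions for illustration, not the rules actually used in the product.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Person:
    user_id: str
    company: str
    connections: Set[str] = field(default_factory=set)


def rule_score(viewer: Person, candidate: Person, same_company_bonus: float = 2.0) -> float:
    """Score = number of shared connections, plus a bonus for the same company."""
    score = float(len(viewer.connections & candidate.connections))
    if viewer.company == candidate.company:
        score += same_company_bonus  # hypothetical weight
    return score


def suggest(viewer: Person, candidates: List[Person], top_k: int = 3) -> List[str]:
    """Rank people the viewer is not yet connected to, best score first."""
    eligible = [
        c for c in candidates
        if c.user_id != viewer.user_id and c.user_id not in viewer.connections
    ]
    ranked = sorted(eligible, key=lambda c: rule_score(viewer, c), reverse=True)
    return [c.user_id for c in ranked[:top_k]]
```

Each new rule adds another term and another weight to tune by hand, which is the growing complexity that eventually motivates replacing the function with a learned model and comparing the two with an A/B test.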

How to run projects
When introducing machine learning, choose a problem that is sufficiently complex and where the current metric looks improvable
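
Summary
• We build machine learning products together with web engineers and the infrastructure team
• ML engineers sit inside the product teams, so it is an environment where we can discuss and develop together
• I hope you will take an interest in finding suitable problems and delivering value with ML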

References
• Rules of Machine Learning: Best Practices for ML Engineering
  https://developers.google.com/machine-learning/guides/rules-of-ml/
• The machine learning platform that supports ML product development at Wantedly / #rejectcon2018
  https://speakerdeck.com/south37/number-rejectcon2018
Photo Credit
• https://unsplash.com/photos/PMwu9gfCSbw