機械学習とのつきあいかた / How to get involved with Machine...
Makoto Tanji
April 24, 2019
Programming
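How to get involved with Machine Learning @ Wantedly (機械学習とのつきあいかた)
Material used for the 2019 new-graduate training.
It covers the topics below; on the whole it is a "poem" (personal musings).
1. Machine learning at Wantedly
2. ML and the organization
3. ML and data
4. How to run an ML project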
Transcript
How to get involved with Machine Learning (機械学習とのつきあいかた) @Wantedly New Grad Training 2019
April 24, 2019 - Makoto Tanji (@tan-z-tan)
©2019 Wantedly, Inc.

Who am I? (Self-introduction): Makoto Tanji
• ML Engineer at Wantedly, Inc.
• Wantedly Visit 2015 - 2016
• Client Growth, Scout
• Wantedly People 2016 - current
• Server side
• Machine Learning
• Data Analysis
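
What I'll talk about today

Contents
1. Theme
2. Machine learning at Wantedly
3. Take-home messages
   1. ML and the organization
   2. ML and data
   3. How to run an ML project
4. Summary

Theme
Machine learning applications built on the latest technology, the kind that make it into the media, look glamorous. In production, however, they bring real difficulties: they need huge amounts of data, they may not reach the required accuracy, and they are hard to operate stably.
Wantedly's ML engineers tackle these problems together with web engineers, data scientists, and the infrastructure team.
In the future, web engineers and infrastructure engineers will also be involved with machine learning in some form more often than they are today.
Let's think about how to get along with machine learning while looking at the machine learning that actually runs at Wantedly.

On the whole, this talk is a "poem" (personal musings) about machine learning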
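
Machine learning use cases at Wantedly (Machine Learnings working at Wantedly)
Visit
• Job posting recommendations
• Personalized push notifications
• Improved scout filters
• …
People
• OCR and classification of business card images
• People recommendations
• Article recommendations, push-notification open prediction
• Various kinds of named entity extraction
• (The machine learning repositories are tagged "ML")
• https://github.com/wantedly (repository search, q=ml)

Why introduce machine learning?
Because there is value we want to deliver
• We want to deliver the experience of going from scan to structured data in seconds
  • Copying business cards into your contacts one by one is painful → that is not a job for humans. Take a photo and it becomes data in an instant
• We want to deliver the experience of finding work that makes your heart dance (ココロオドルシゴト)
  • "Isn't there a good job out there?" → We automatically recommend job postings you are likely to be interested in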
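
Examples of ML architectures running at our company
• Pre-trained API pattern
• Recommendation pattern
• Others

Pre-trained API pattern
Provided as an API on a microservice server
1. There are many: an OCR (character recognition) API, APIs that extract information from text, identity-matching APIs, and so on
2. A model trained in advance is run and its result is returned
3. e.g. most of the business card recognition APIs in People follow this pattern
Things to think about
• Has accuracy dropped without anyone noticing?
  • Collect metrics every day. Put this on the checklist whenever a new API is created
• Cost
  • These APIs are often CPU-bound; are they consuming a large amount of Kubernetes resources?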
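
The "collect metrics every day" item can be sketched as a small check that scores the served model on a labeled sample and flags a drop. This is only an illustrative sketch: `toy_predict`, the sample data, the baseline, and the tolerance are assumptions made up for the example, not anything described in the slides.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass
class AccuracyReport:
    accuracy: float
    degraded: bool


def daily_accuracy_check(
    predict: Callable[[str], str],
    labeled_samples: Sequence[Tuple[str, str]],
    baseline: float,
    tolerance: float = 0.02,
) -> AccuracyReport:
    """Score the model on labeled samples; flag a drop below baseline - tolerance."""
    if not labeled_samples:
        raise ValueError("need at least one labeled sample")
    correct = sum(1 for text, label in labeled_samples if predict(text) == label)
    accuracy = correct / len(labeled_samples)
    return AccuracyReport(accuracy=accuracy, degraded=accuracy < baseline - tolerance)


def toy_predict(text: str) -> str:
    # Hypothetical stand-in for a pre-trained model behind the API.
    return "email" if "@" in text else "other"


# Hypothetical labeled sample collected for the daily check.
samples = [
    ("a@example.com", "email"),
    ("Tokyo", "other"),
    ("b(at)example.com", "email"),  # the toy model gets this one wrong
    ("03-0000-0000", "other"),
]

report = daily_accuracy_check(toy_predict, samples, baseline=0.9)
if report.degraded:
    print(f"accuracy dropped to {report.accuracy:.2f}")
```

In practice the report would go to whatever metrics system the team already uses; printing is just a placeholder.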
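
Recommendation pattern
Recommendation scores are computed in advance and then served
1. When a user opens a page, we want to show recommended job postings instantly
2. Compute them ahead of time from the user's past behavior
3. Save the user × job posting results to a data store
Things to think about
• A data store that can hold the "user × job posting" records
• Has the daily training job stopped?
  • This gets especially hard to see when jobs are chained in several stages → a data pipeline problem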
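
The precompute-then-look-up flow above can be sketched as follows. Everything here is an assumption for illustration: the tag-overlap score, the in-memory dict standing in for the data store, and the 36-hour staleness threshold are not from the slides.

```python
import time
from typing import Dict, List, Optional, Set

Store = Dict[str, dict]  # in-memory stand-in for the real data store


def batch_score(
    user_interests: Dict[str, Set[str]],
    posting_tags: Dict[str, Set[str]],
    top_k: int = 2,
    now: Optional[float] = None,
) -> Store:
    """Batch job: rank postings per user by tag overlap and keep the top_k."""
    computed_at = time.time() if now is None else now
    store: Store = {}
    for user_id, interests in user_interests.items():
        scored = [(len(interests & tags), pid) for pid, tags in posting_tags.items()]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, ties by ID
        store[user_id] = {
            "postings": [pid for _, pid in scored[:top_k]],
            "computed_at": computed_at,
        }
    return store


def recommend(store: Store, user_id: str) -> List[str]:
    """Request time: a plain lookup, no model inference."""
    entry = store.get(user_id)
    return list(entry["postings"]) if entry else []


def is_stale(store: Store, user_id: str, now: float, max_age_seconds: float = 36 * 3600) -> bool:
    """True when the batch job has not refreshed this user recently (or ever)."""
    entry = store.get(user_id)
    return entry is None or now - entry["computed_at"] > max_age_seconds


store = batch_score(
    {"u1": {"ml", "python"}},
    {"p1": {"ml", "python"}, "p2": {"ruby"}, "p3": {"ml"}},
    now=0.0,
)
print(recommend(store, "u1"))
```

Storing `computed_at` next to each result is one simple way to notice a stopped daily job from the serving side.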
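
Others
Data normalization
1. Normalizing miscellaneous information and judging whether records refer to the same thing
2. The work of building dictionaries
3. Invisible to users, but it makes things smarter internally
BigQuery ML
1. Used in internal logic
2. It is, in fact, in use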

Today's take-home messages
1. Organization
2. Data
3. How to run a project

Organization

"Organizational structure determines strategy" (Igor Ansoff)
(I just wanted to quote something cool-sounding)

Organization
At Wantedly, machine learning engineers sit inside the product teams
[Diagram labels: ML engineers (3), ML engineers (2), Infrastructure, Visit, People]
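
Organization
Close to the user
1. The improvements we make reach users directly
Easy to make improvements that are valuable to the product
1. Because we share the direction the product is heading, misunderstandings are less likely
2. If necessary, we can change the product's business logic itself
We naturally discuss new features, not just improvements
1. As members of the development team, we can go beyond reactive improvements and discuss where things "ought to be"

Organization
Other ways to organize
[Diagram labels: data scientist, engineer, designer, manager; company-wide ML team; product team; company-wide data analysis team; ML engineer; request; result]

Organization
How do we work with the other development teams?
1. Development style: define the API, then develop the web side and the ML side in parallel
   1. e.g. we want to return several articles to a user. The web side fetches the list of articles, then calls the ML API to decide the order
2. How each side is involved
   1. Web side: once the API spec is agreed in advance, the web side can stub the API and develop without waiting for the ML side.
   2. Infrastructure: before going into service, agree on the load metrics and on whether the API is one that may die in the worst case or a critical one
3. Communication
   1. It helps to agree on the expected accuracy in advance.
   2. A smart API almost never comes together quickly. Share whether the requirement really needs 99.9% accuracy, and how the results will be presented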
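
The "agree on the API spec, then stub it" workflow can be sketched like this. The `ArticleRanker` interface, its method name, and the fallback behavior are hypothetical; the slides only say that the web side fetches the article list and calls an ML API for the ordering.

```python
from typing import List, Protocol


class ArticleRanker(Protocol):
    """Hypothetical API spec agreed between the web side and the ML side."""

    def rank(self, user_id: str, article_ids: List[str]) -> List[str]:
        ...


class StubRanker:
    """Stub for web-side development: returns the articles in the given order."""

    def rank(self, user_id: str, article_ids: List[str]) -> List[str]:
        return list(article_ids)


class FailingRanker:
    """Simulates the ML API being down."""

    def rank(self, user_id: str, article_ids: List[str]) -> List[str]:
        raise RuntimeError("ML API unavailable")


def build_feed(ranker: ArticleRanker, user_id: str, article_ids: List[str]) -> List[str]:
    """Web side: order the fetched articles, falling back to the original order.

    The fallback treats ranking as a non-critical API that is allowed to fail.
    """
    try:
        ranked = ranker.rank(user_id, article_ids)
    except Exception:
        return list(article_ids)
    # Guard against the API dropping or inventing article IDs.
    if sorted(ranked) != sorted(article_ids):
        return list(article_ids)
    return ranked


print(build_feed(StubRanker(), "u1", ["a", "b", "c"]))
print(build_feed(FailingRanker(), "u1", ["a", "b", "c"]))
```

Because the web side depends only on the agreed interface, the real ML-backed ranker can replace `StubRanker` later without changing `build_feed`.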
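
Organization
At Wantedly, ML engineers work as members of the product teams. The benefit is that we stay close to users and can work in a product-driven way.
Within a product team, development proceeds without confusion when each side's scope and the expected accuracy are made clear in advance.

We run a Deep Learning reading group
Every Wednesday @Wantedly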
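
Data

Data
An interesting metaphor: "Data is crude oil"
• It cannot be used as is; it needs refining
• Once processed, it can be used for many purposes
• The whole world is competing for data
Source: unknown

Interesting data at Wantedly
• User profiles: on the order of millions
• Edges in the user connection network: on the order of hundreds of millions
• Business cards read: around hundreds of millions

Data lake
There is a mechanism for consolidating the data we need into BigQuery
[Diagram label: Analytics]

Things to watch out for
• Problems we actually had around data
  1. RPA usage generated a large volume of meaningless accesses, so apparent average usage went up. Logs that did not lead to conversions increased and the distribution shifted
  2. A new feature release started writing a new value into a table column. We detected it in advance and fixed it before the release; had we not known, the model might have broken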
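
The second incident (a new value appearing in a column) is the kind of thing a small pre-release check can catch. A minimal sketch, assuming a categorical column whose known values were recorded at training time; the column values and names below are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def find_unseen_values(known_values: Set[str], incoming: Iterable[str]) -> Dict[str, int]:
    """Count incoming values that the model never saw during training."""
    return dict(Counter(value for value in incoming if value not in known_values))


# Hypothetical vocabulary recorded when the model was trained.
known_platforms = {"ios", "android", "web"}
# Hypothetical rows produced by the new feature before its release.
incoming_rows = ["ios", "web", "desktop", "desktop", "android"]

unseen = find_unseen_values(known_platforms, incoming_rows)
if unseen:
    print(f"unseen values found, check the model before release: {unseen}")
```

A check like this only covers new categorical values; a shift in distribution such as the RPA incident would need a separate comparison of value frequencies over time.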
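
Things to watch out for
• Data privacy
  1. Stay compliant

Data
At Wantedly, large-scale user-related data is now consolidated in BigQuery, so using the data has itself become easy. The next step is to make that data easier to put to use.
Data does not end with the log server implementation; it affects metrics and the ML models that consume the data. Tell the people around you about server changes so that they do not break ML models. (Ideally we would guard against this systematically, for example with anomaly detection.)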
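
How to run a project

How do machine learning projects get started?

It depends on the case and is hard to generalize
Small projects, I feel, often start from casual conversation

Let's look for the world's best practices

Machine Learning Best Practice
• Rule 1: Consider whether you can start without machine learning
• Rule 2: Design and build metrics (understand the current system first, then build the metrics)
• Rule 3: Once the logic has become complicated, choose machine learning
ref: https://developers.google.com/machine-learning/guides/rules-of-ml/

How to run a project
Example: the "Do you know this person?" feature

How to run a project
Exploration phase
• Recommend people with many mutual acquaintances
• Metrics = requests and approvals
Rule-growth phase
• People with many mutual acquaintances
• People at the same company
• People who hold many of the same business cards
• People scanned at the same time who share connections
• Combinations of these
• ...
Machine learning phase
• Deep Learning-based recommendation
• + A/B testing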
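
The exploration and rule-growth phases can be sketched as a hand-written scoring function that a learned model would later replace. The `Person` type, the two rules chosen, and the weights are assumptions for illustration only; the slides list the rules but not how they were combined.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Person:
    id: str
    company: Optional[str]
    connections: FrozenSet[str]


def rule_score(user: Person, candidate: Person,
               w_common: float = 1.0, w_company: float = 2.0) -> float:
    """Hand-written rules: mutual acquaintances plus a same-company bonus.

    The weights are arbitrary placeholders, not tuned values.
    """
    common = len(user.connections & candidate.connections)
    same_company = user.company is not None and user.company == candidate.company
    return w_common * common + (w_company if same_company else 0.0)


def recommend_people(user: Person, candidates: Iterable[Person], top_k: int = 3) -> List[str]:
    """Rank people the user is not yet connected to by rule_score."""
    eligible = [c for c in candidates
                if c.id != user.id and c.id not in user.connections]
    ranked = sorted(eligible, key=lambda c: (-rule_score(user, c), c.id))
    return [c.id for c in ranked[:top_k]]


me = Person("a", "X", frozenset({"b", "c"}))
others = [
    Person("b", "X", frozenset({"a"})),       # already connected, excluded
    Person("d", "Y", frozenset({"b", "c"})),  # two mutual acquaintances
    Person("e", "X", frozenset({"b"})),       # one mutual acquaintance, same company
    Person("f", "Z", frozenset()),            # nothing in common
]
print(recommend_people(me, others, top_k=2))
```

Each new rule adds another term and another weight to tune by hand, which is the point at which Rule 3 above suggests switching to machine learning and validating the change with an A/B test.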
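
How to run a project
When introducing machine learning, choose a problem that is sufficiently complex and whose current metrics look improvable

Summary
• Machine learning products are built together with web engineers and the infrastructure team
• ML engineers sit inside the product teams, so we can discuss and develop together
• I'd like you to take an interest in finding the right problems and delivering value with ML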

References
• Rules of Machine Learning: Best Practices for ML Engineering
  https://developers.google.com/machine-learning/guides/rules-of-ml/
• Wantedly の機械学習プロダクト開発を支える機械学習基盤 / #rejectcon2018 (The machine learning platform that supports ML product development at Wantedly)
  https://speakerdeck.com/south37/number-rejectcon2018
Photo Credit
• https://unsplash.com/photos/PMwu9gfCSbw