
TDD applied Data Cleansing

dproject21
September 08, 2018


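Presented at the lightning talk session (LT祭り) of XP Matsuri 2018 (XP祭り2018).

What business do you have talking about "AI" without even doing TDD?
This talk was not about machine learning itself. It covered applying TDD to the cleansing of data that is used for machine learning.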



Transcript

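  1. Profile: Daiki Tanoguchi, @dproject21, mecompany Inc. "Golang Outstanding Data engineer" (GOD).

    Career timeline: tiny software house (VB.NET); SIer; web service companies, as "a wandering people" (C, Java); machine learning and participation in an IoT trial (Python); Golang, developing preprocessing libraries and more; mecompany Inc. Communities: Akiba.go, Code for Japan, Leader Juku, DevLOVE Pub, TOCfE Bootcamp, WACATE. Notes on the timeline: right after joining, the Lehman shock made it impossible to "go outside"; "bottom-tier subcontracting"; first time attending XP Matsuri.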
  2. Profile (same career timeline, repeated).

    Recent policy: "I want to erase all bad data before it is born. At the very least, I want to erase it before it is fed to AI/BI."
  3. mecompany Inc. company profile.

    Date shown: December 3, 2010. TEL/FAX: 03-6457-8579, 03-6667-4922. URLs: http://mecompany.me, http://scuel.me, http://scueldata.me, http://scuel4mr.me. Products named: SCUEL, SCUEL for MR. The other fields on the slide (address and company figures) are not recoverable. Footer: September 8, 2018, Profile.
  4. [Diagram: mecompany Inc. and the SCUEL Project, with source labels WEB and HP (websites). The remaining diagram text is not recoverable. Footer: September 8, 2018, Profile.]
  5. [Diagram: the same mecompany Inc. / SCUEL Project diagram with additional annotations. The annotation text is not recoverable. Footer: September 8, 2018, Profile.]
  6. Today's talk is…

    [Same mecompany Inc. / SCUEL Project diagram.] It is about data cleansing for BI.
  7. Today's talk is…

    [Same mecompany Inc. / SCUEL Project diagram.] It leans toward natural language processing.
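  8. Upcoming sessions: db tech showcase, session D…: https://www.db-tech-showcase.com/dbts/tokyo

    Code for Japan Summit (Niigata), session "Bad Open Data Memorial Temple" (バッドオープンデータ供養寺): https://summit2018.code4japan.org/session/210/ The full version of this talk will be presented there.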
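  9. Data development in the field (mecompany Inc., September 8, 2018).

    Data disclosed by public institutions: besides Excel and CSV, it comes as PDF, websites, and paper (physical). After you use OCR to fight the paper, you then have to fight the misrecognized data.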
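As a small illustration of what "fighting misrecognized data" can look like, the sketch below repairs characters that OCR often confuses with digits. It is not taken from the talk: the function name `FixNumericField` and the replacement table are assumptions made up for this example, and the function should only be applied to fields that are expected to contain digits (phone numbers, postal codes).

```go
package main

import (
	"fmt"
	"strings"
)

// ocrDigitFixes maps letters that OCR commonly confuses with digits.
// The table is illustrative (an assumption), not an exhaustive list.
var ocrDigitFixes = strings.NewReplacer(
	"O", "0",
	"o", "0",
	"l", "1",
	"I", "1",
	"S", "5",
	"B", "8",
)

// FixNumericField (hypothetical helper) replaces look-alike letters with
// digits. Apply it only to fields that should hold digits and separators,
// because it would corrupt ordinary text.
func FixNumericField(s string) string {
	return ocrDigitFixes.Replace(s)
}

func main() {
	fmt.Println(FixNumericField("O3-l234-567B"))
}
```

A table like this is easy to grow one test case at a time, which fits the TDD approach of the talk.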
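  10. Data development in the field (mecompany Inc., September 8, 2018).

    Some of the data disclosed by public institutions is "bad data":
    • Typos and omitted characters
    • Inconsistent notation
    • Variant character forms (itaiji)
    • Wrong postal codes
    • Phone number fields that, for some reason, contain an e-mail address
    • Addresses of the form "… City … Ward" where the ward is missing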
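Two of the bad-data categories above, inconsistent notation and malformed postal codes, can be sketched in Go using only the standard library. This is a minimal sketch under stated assumptions, not the speaker's implementation: `toHalfWidth` and `NormalizePostalCode` are hypothetical names, only the full-width ASCII range and the ideographic space are normalized, and the check is purely about format. A well-formed but incorrect postal code would need a lookup against reference data.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// toHalfWidth converts full-width ASCII variants (U+FF01..U+FF5E) to their
// half-width forms and the ideographic space (U+3000) to a plain space.
// Other dash-like characters (for example U+30FC or U+2212) are not handled.
func toHalfWidth(s string) string {
	return strings.Map(func(r rune) rune {
		switch {
		case r >= 0xFF01 && r <= 0xFF5E:
			return r - 0xFEE0
		case r == 0x3000:
			return ' '
		}
		return r
	}, s)
}

// postalPattern accepts 7 digits with an optional hyphen after the third.
var postalPattern = regexp.MustCompile(`^\d{3}-?\d{4}$`)

// NormalizePostalCode (hypothetical helper) returns the code as NNN-NNNN and
// true when the input has a valid format, or "" and false otherwise.
func NormalizePostalCode(s string) (string, bool) {
	h := strings.TrimSpace(toHalfWidth(s))
	h = strings.TrimSpace(strings.TrimPrefix(h, "〒"))
	if !postalPattern.MatchString(h) {
		return "", false
	}
	digits := strings.ReplaceAll(h, "-", "")
	return digits[:3] + "-" + digits[3:], true
}

func main() {
	fmt.Println(NormalizePostalCode("〒１２３－４５６７"))
	fmt.Println(NormalizePostalCode("123-456"))
}
```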
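  11. Add test cases that pair "messy data" with the "desired data" here, then keep turning the TDD cycle:

        package cleansing // package name is an assumption; the slide omits it

        import (
            "testing"

            "github.com/stretchr/testify/assert"
        )

        func TestCleansingExample(t *testing.T) {
            // Each case pairs a raw input with the cleansed value we want.
            testCases := []struct {
                desc string
                in   string
                want string
            }{
                {"well-formed data", "03-1234-5678", "03-1234-5678"},
                {"messy data with an e-mail address mixed in", "TEL:03-1234-9876 email:[email protected]", "03-1234-9876"},
            }
            for _, test := range testCases {
                // assert.Equal takes the expected value first, then the actual value.
                assert.Equal(t, test.want, Cleansing(test.in), test.desc)
            }
        }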
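The slide shows only the test; `Cleansing` itself is not defined in the transcript. A minimal sketch of an implementation that would satisfy the two test cases might look like the following. The regular expression is an assumption (hyphenated numbers starting with 0), and the e-mail address in `main` is a made-up placeholder. In a TDD cycle this function would be revised each time a new messy case is added to the test table.

```go
package main

import (
	"fmt"
	"regexp"
)

// phonePattern matches a hyphenated phone number such as 03-1234-5678:
// a leading 0 plus 1-4 digits, a 1-4 digit middle part, and 4 final digits.
// This pattern is an assumption for the sketch, not the speaker's rule.
var phonePattern = regexp.MustCompile(`0\d{1,4}-\d{1,4}-\d{4}`)

// Cleansing returns the first phone number found in s, or "" if none is found.
func Cleansing(s string) string {
	return phonePattern.FindString(s)
}

func main() {
	// The e-mail address below is a placeholder invented for this example.
	fmt.Println(Cleansing("TEL:03-1234-9876 email:user@example.com"))
}
```

Returning an empty string for inputs without a phone number is a design choice of this sketch; returning an error or a boolean flag would make the "nothing found" case more explicit.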
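  12. Data development in the field (mecompany Inc., September 8, 2018).

    Data targeted for preprocessing: more than … hundred million fields (hundreds of thousands of records × tens of thousands of fields). Domestic corporations: … ten-thousands of records. Databases purchased from other companies: hundreds of millions of records. Preprocessing is needed to integrate the data disclosed by public institutions and turn it into "usable" data.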
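  13. The concrete efforts will be covered at the Code for Japan Summit (Niigata) session "Bad Open Data Memorial Temple" (バッドオープンデータ供養寺):

    https://summit2018.code4japan.org/session/210/ Submissions of bad data that can be disclosed are welcome.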