
A message for everyone worn out by feature management in data analysis competitions

Takanobu Nozawa
November 05, 2019

Transcript

  1. A common pattern (ipynb)

    Slide footer: "Supporting every mom's first step"

    - Create features: col[y, A, B, C, D] → [y, A, B, C, D, E, F]

        # e.g.)
        import numpy as np

        train['A'] = train['A'].fillna(0)
        train['B'] = np.log1p(train['B'])
        train['E'] = train['A'] + train['B']
        df_group = train.groupby('D')['E'].mean()
        train['F'] = train['D'].map(df_group)

    ...
  3. A common pattern (ipynb)

    - Select only the feature columns to use

        # e.g.)
        feat_col = ['A', 'C', 'D', 'E', 'F', 'J']
        x_train = train[feat_col]
        y_train = train['y']

    - Train the model

        # e.g.)
        clf.fit(x_train, y_train)

    ...

    "Wait, what kind of feature was 'F' again?"
  6. A common pattern (ipynb)

        # e.g.)
        train['A'] = train['A'].fillna(0)
        train['B'] = np.log1p(train['B'])
        train['E'] = train['A'] + train['B']
        df_group = train.groupby('D')['E'].mean()
        train['F'] = train['D'].map(df_group)

        # e.g.)
        feat_col = ['A', 'C', 'D', 'E', 'F', 'J']
        x_train = train[feat_col]
        y_train = train['y']

        # e.g.)
        clf.fit(x_train, y_train)

    "Found it!" (near the top of the notebook)

    With only a few features this is still bearable, but as the number grows, having to work out (or hunt down) how each feature was computed becomes a real chore and takes time.
  10. Another common pattern (ipynb)

    Contents of the notebook:

        import numpy as np
        import pandas as pd
        ...
        submission.to_csv('submission.csv', index=False)

    The same computations have to be rerun again and again […] → wasted effort
  12. Agenda

    1. Self-introduction
    2. About feature management
        ‣ Manage features as pickle files, one per column
        ‣ Generate a memo file at the same time the features are generated
    3. About the training and inference pipeline
        ‣ Run everything from training through creating the submission file with a single command
        ‣ Save the features and model parameters used for training together with the logs
        ‣ Visualize feature contributions with shap to find what to focus on in the next training run

    This talk is about how borrowing the wisdom of those who came before me worked out really well (*).
    (*) Strictly my subjective opinion.
  14. About feature management

    Manage features as pickle files, "column by column".

        Survived  Pclass  Sex     Age  Embarked
        …         …       male    …    S
        …         …       male    …    C
        …         …       female  …    C
        …         …       male    …    C
        …         …       female  …    C
        …         …       female  …    S
        …         …       male    …    S

    Resulting files:

    - survived_train.pkl
    - pclass_train.pkl, pclass_test.pkl
    - sex_train.pkl, sex_test.pkl
    - age_train.pkl, age_test.pkl
    - embarked_train.pkl, embarked_test.pkl
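The column-by-column layout above could be produced along these lines. This is a minimal sketch rather than the speaker's code: the helper name `save_columns`, the lowercase `{column}_{split}.pkl` naming (inferred from the file names on the slide), and the toy data are all assumptions.

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_columns(df, split, out_dir):
    """Write every column of df to its own pickle, e.g. sex_train.pkl."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for col in df.columns:
        path = out_dir / f"{col.lower()}_{split}.pkl"
        # df[[col]] keeps a one-column DataFrame, so the pieces can be
        # stitched back together later with pd.concat(..., axis=1)
        df[[col]].to_pickle(path)
        paths.append(path)
    return paths


# Toy stand-in for the Titanic training data (values are made up)
train = pd.DataFrame({"Sex": ["male", "female"], "Embarked": ["S", "C"]})
feature_dir = Path(tempfile.mkdtemp())
saved = save_columns(train, "train", feature_dir)
saved_names = sorted(p.name for p in saved)
```

Keeping each file as a one-column DataFrame, rather than a Series, is a design choice made here so that loading a subset of features is a plain concatenation.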
  18. About feature management

    Just run hoge.py from the command line.

        class Pclass(Feature):
            def create_features(self):
                self.train['Pclass'] = train['Pclass']
                self.test['Pclass'] = test['Pclass']
                create_memo('Pclass', 'Ticket class: one of 1st, 2nd, 3rd')

        class Sex(Feature):
            def create_features(self):
                self.train['Sex'] = train['Sex']
                self.test['Sex'] = test['Sex']
                create_memo('Sex', 'Sex')

        class Age(Feature):
            def create_features(self):
                self.train['Age'] = train['Age']
                self.test['Age'] = test['Age']
                create_memo('Age', 'Age')

        class Age_mis_val_median(Feature):
            def create_features(self):
                self.train['Age_mis_val_median'] = train['Age'].fillna(train['Age'].median())
                self.test['Age_mis_val_median'] = test['Age'].fillna(test['Age'].median())
                create_memo('Age_mis_val_median', 'Age with missing values filled with the median')

    Running it produces each feature and the feature memo file.
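The slides subclass a `Feature` base class without showing it. The following is a hypothetical sketch of what such a base class might look like, not the speaker's implementation: the `run` method, the `name` attribute, and the toy data are assumptions. The snake-casing rule is chosen only because it maps `Family_Size` to `family__size`, which matches the feature names loaded later in the deck.

```python
import re
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path

import pandas as pd


class Feature(ABC):
    """Hypothetical base class: one subclass per feature, one pickle per split."""

    dir = "."  # output directory; the slides refer to Feature.dir

    def __init__(self):
        # Snake-case the class name: Age_mis_val_median -> age_mis_val_median
        snake = re.sub(r"([A-Z])", lambda m: "_" + m.group(1).lower(), type(self).__name__)
        self.name = snake.lstrip("_")
        self.train = pd.DataFrame()
        self.test = pd.DataFrame()

    @abstractmethod
    def create_features(self):
        """Fill self.train and self.test with the new column(s)."""

    def run(self):
        # Build the feature, then save the train and test parts separately
        self.create_features()
        self.train.to_pickle(Path(self.dir) / f"{self.name}_train.pkl")
        self.test.to_pickle(Path(self.dir) / f"{self.name}_test.pkl")
        return self


# Toy raw data standing in for the Titanic CSVs (values are made up)
train = pd.DataFrame({"Age": [22.0, None, 30.0]})
test = pd.DataFrame({"Age": [None, 40.0]})


class Age_mis_val_median(Feature):
    def create_features(self):
        self.train["Age_mis_val_median"] = train["Age"].fillna(train["Age"].median())
        self.test["Age_mis_val_median"] = test["Age"].fillna(test["Age"].median())


Feature.dir = tempfile.mkdtemp()
feature = Age_mis_val_median().run()
```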
  22. About feature management

        class Pclass(Feature):
            def create_features(self):
                self.train['Pclass'] = train['Pclass']
                self.test['Pclass'] = test['Pclass']
                create_memo('Pclass', 'Ticket class: one of 1st, 2nd, 3rd')

    Overview of what create_memo does:

        import csv
        import os

        # Create the feature memo CSV file
        def create_memo(col_name, desc):
            file_path = Feature.dir + '/_features_memo.csv'
            # Create an empty memo file on first use
            if not os.path.isfile(file_path):
                with open(file_path, 'w'):
                    pass

            with open(file_path, 'r+', newline='') as f:
                lines = f.readlines()
                lines = [line.strip() for line in lines]

                # Check whether the feature about to be written is already recorded
                col = [line for line in lines if line.split(',')[0] == col_name]
                if len(col) != 0:
                    return

                # readlines() left the file position at the end, so this appends
                writer = csv.writer(f)
                writer.writerow([col_name, desc])

    Saving the memo in CSV format makes it easy to view on GitHub (and of course it also displays cleanly in applications such as Excel or Numbers).
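Because the memo is a plain two-column CSV (feature name, then description), it is also easy to read back from a script. The sketch below is an illustration only: `load_memo` is a hypothetical helper that does not appear in the slides, and the sample rows are written by hand in the same shape that `create_memo` produces.

```python
import csv
import os
import tempfile


def load_memo(file_path):
    """Return {feature_name: description} from a two-column memo CSV."""
    memo = {}
    with open(file_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if len(row) >= 2:
                memo[row[0]] = row[1]
    return memo


# Build a small memo file shaped like the one create_memo writes
memo_path = os.path.join(tempfile.mkdtemp(), "_features_memo.csv")
with open(memo_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Pclass", "Ticket class: one of 1st, 2nd, 3rd"])
    writer.writerow(["Sex", "Sex"])

memo = load_memo(memo_path)
print(memo["Pclass"])
```

Reading with `csv.reader` rather than splitting on commas matters here, since a description may itself contain commas and is then quoted by `csv.writer`.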
  25. About feature management

    When creating a new feature, add the new feature-generation code to hoge.py:

        class Family_Size(Feature):
            def create_features(self):
                self.train['Family_Size'] = train['Parch'] + train['SibSp']
                self.test['Family_Size'] = test['Parch'] + test['SibSp']
                create_memo('Family_Size', 'Total number of family members')

    Then run:

        python hoge.py

    Only the new feature is generated.
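The slides do not show how hoge.py decides that only the new feature needs generating. One plausible approach, sketched here purely as an assumption, is to skip any feature whose pickle files already exist; `needs_generation` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def needs_generation(name, feature_dir):
    """True unless both <name>_train.pkl and <name>_test.pkl already exist."""
    feature_dir = Path(feature_dir)
    return not all(
        (feature_dir / f"{name}_{split}.pkl").exists() for split in ("train", "test")
    )


feature_dir = Path(tempfile.mkdtemp())
# Pretend an earlier run already produced the Pclass feature
for split in ("train", "test"):
    (feature_dir / f"pclass_{split}.pkl").touch()

names = ["pclass", "family__size"]
to_generate = [n for n in names if needs_generation(n, feature_dir)]
print(to_generate)
```

A check like this trades safety for speed: if the definition of an existing feature changes, its old pickles have to be deleted (or an overwrite option added) before the new version is produced.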
  27. About feature management

    When loading data, just specify the features you want and load them.

        # Specify the features
        features = [
            "age_mis_val_median",
            "family__size",
            "cabin",
            "fare_mis_val_median"
        ]
        df = [pd.read_pickle(FEATURE_DIR_NAME + f'{f}_train.pkl') for f in features]
        df = pd.concat(df, axis=1)

    What was good about this
  30. About the training and inference pipeline

    Running run.py trains the model, runs inference, and creates the submission file.

        # Specify the features
        features = [
            "age_mis_val_median",
            "family__size",
            "cabin",
            "fare_mis_val_median"
        ]

        run_name = 'lgb_1102'

        # Save the list of features used
        with open(LOG_DIR_NAME + run_name + "_features.txt", 'wt') as f:
            for ele in features:
                f.write(ele + '\n')

        params_lgb = {
            'boosting_type': 'gbdt',
            'objective': 'binary',
            'early_stopping_rounds': 20,
            'verbose': 10,
            'random_state': 99,
            'num_round': 100
        }

        # Save the parameters used
        with open(LOG_DIR_NAME + run_name + "_param.txt", 'wt') as f:
            for key, value in sorted(params_lgb.items()):
                f.write(f'{key}:{value}\n')

        runner = Runner(run_name, ModelLGB, features, params_lgb, n_fold, name_prefix)
        runner.run_train_cv()  # training
        runner.run_predict_cv()  # inference
        Submission.create_submission(run_name)  # create the submission file

    Files and models are saved with this run_name as their prefix. For example:

    - the list of features used
    - the hyperparameters used
    - the model for each fold
    - the inference results
    - the submission file
    - image files of the shap results, and so on
  36. About the training and inference pipeline

    Examples of generated files (split across folders as appropriate):

    - lgb_…_fold….model (model created in fold …)
    - lgb_…_fold….model (model created in fold …)
    - lgb_…_fold….model (model created in fold …)
    - lgb_…_pred.pkl (inference results on the test data)
    - lgb_…_…_submission.csv (inference results converted to a CSV that can be submitted to Kaggle)
    - lgb_…_…_features.txt (list of features used in this training run)
    - lgb_…_…_param.txt (hyperparameters used in this training run)
    - lgb_…_…_shap.png (visualization image computed with shap)
    - general.log (computation log file)
    - result.log (log file containing only the model scores)

    What was good about this
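The split between a detailed general.log and a scores-only result.log can be reproduced with the standard `logging` module. This is a minimal sketch under stated assumptions: the logger names, the message format, the `make_logger` helper, and the placeholder score are all illustrative and are not taken from the speaker's pipeline.

```python
import logging
import os
import tempfile


def make_logger(name, path):
    """Return a logger that appends timestamped lines to the given file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep messages out of the root logger
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


log_dir = tempfile.mkdtemp()
general = make_logger("general", os.path.join(log_dir, "general.log"))
result = make_logger("result", os.path.join(log_dir, "result.log"))

score = 0.5  # placeholder value for illustration, not a real result
general.info("lgb_1102 fold 0 - start training")
result.info("lgb_1102 cv score: %.4f", score)

for handler in general.handlers + result.handlers:
    handler.flush()
```

Keeping scores in their own file means runs can be compared at a glance without scrolling through the per-fold training output.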
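  37. Summary

    - Feature management is great!
        1. Consolidating feature generation into a single script file avoids running the same computation multiple times.
        2. Generating a memo alongside each feature means fewer moments of wondering "what was this feature again?"
        3. Managing features column by column made them easier to handle (although when there is a huge number of features, it may be better to manage them in groups of some size).
    - Pipelines are great!
        1. Building a pipeline makes for a fast PDCA cycle.
        2. Tracking the features and parameters used for training ensures reproducibility, and psychological safety too…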