A message to everyone worn out by feature management in data analysis competitions

Takanobu Nozawa
November 05, 2019

Transcript

  1. Common pattern (ipynb)

    Footer: "Supporting every step moms take"

    • Create features: columns [y, A, B, C, D] → [y, A, B, C, D, E, F]

        # e.g.
        train['A'] = train['A'].fillna(0)
        train['B'] = np.log1p(train['B'])
        train['E'] = train['A'] + train['B']
        df_group = train.groupby('D')['E'].mean()
        train['F'] = train['D'].map(df_group)

        ...
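To see the column growth concretely, here is a minimal sketch of the same steps on a toy frame. The data values are made up for illustration; only the five transformations come from the slide.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the competition's train set.
train = pd.DataFrame({
    "y": [0, 1, 0, 1],
    "A": [1.0, np.nan, 3.0, 4.0],
    "B": [0.0, 1.0, 2.0, 3.0],
    "C": [5, 6, 7, 8],
    "D": ["u", "u", "v", "v"],
})

train["A"] = train["A"].fillna(0)           # fill missing values with 0
train["B"] = np.log1p(train["B"])           # log(1 + x) transform
train["E"] = train["A"] + train["B"]        # sum of two columns
df_group = train.groupby("D")["E"].mean()   # mean of E per category of D
train["F"] = train["D"].map(df_group)       # broadcast the group mean back to rows

print(list(train.columns))
```

Even in this tiny case, 'F' depends on 'E', which depends on the already-modified 'A' and 'B', so reading 'F' later means retracing the whole chain.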
  3. Common pattern (ipynb)

    • Select only the feature columns to use

        # e.g.
        feat_col = ['A', 'C', 'D', 'E', 'F', 'J']
        x_train = train[feat_col]
        y_train = train['y']

    • Train the model

        # e.g.
        clf.fit(x_train, y_train)

        ...

    "Wait, what kind of feature was 'F' again?"
  6. Common pattern (ipynb)

        # e.g.
        train['A'] = train['A'].fillna(0)
        train['B'] = np.log1p(train['B'])
        train['E'] = train['A'] + train['B']
        df_group = train.groupby('D')['E'].mean()
        train['F'] = train['D'].map(df_group)

        # e.g.
        feat_col = ['A', 'C', 'D', 'E', 'F', 'J']
        x_train = train[feat_col]
        y_train = train['y']

        # e.g.
        clf.fit(x_train, y_train)

    Found it! (Near the top of the notebook.)

    With only a few features this is still bearable, but as they pile up, working out (or hunting down) how each feature was computed becomes a real chore, and it takes time.
  10. Another common pattern (ipynb)

    Contents of a notebook:

        import numpy as np
        import pandas as pd

        ...

        submission.to_csv('submission.csv', index=False)

    The same computation has to be run over and over → wasteful
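  12. Agenda

    • Self-introduction
    • Feature management
      ‣ Manage features as pickle files, one per column
      ‣ Generate a memo file at the same time as the features
    • Training and inference pipeline
      ‣ Run everything from training through creating the submit file with a single command
      ‣ Save the features and model parameters used for training together with the log
      ‣ Visualize feature contributions with shap to find what to focus on in the next training run

    I borrowed the wisdom of those who came before me and it worked out really well (*). That is what this talk is about.
    (*) Purely my subjective opinion.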
  14. Feature management

    Manage features as pickle files, "one per column".

        Survived  Pclass  Sex     Age  Embarked
        …         …       male    …    S
        …         …       male    …    C
        …         …       female  …    C
        …         …       male    …    C
        …         …       female  …    C
        …         …       female  …    S
        …         …       male    …    S

    • survived_train.pkl
    • pclass_train.pkl, pclass_test.pkl
    • sex_train.pkl, sex_test.pkl
    • age_train.pkl, age_test.pkl
    • embarked_train.pkl, embarked_test.pkl
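A minimal sketch of the one-file-per-column layout, assuming pandas pickles and the `<column>_<train|test>.pkl` naming listed on the slide. `save_columns` and the toy rows are hypothetical names and data for illustration, not the speaker's code.

```python
import tempfile
from pathlib import Path

import pandas as pd


def save_columns(df, split, feature_dir):
    """Write each column of df to <feature_dir>/<column lowercased>_<split>.pkl."""
    feature_dir = Path(feature_dir)
    feature_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for col in df.columns:
        path = feature_dir / f"{col.lower()}_{split}.pkl"
        df[[col]].to_pickle(path)  # keep it a one-column DataFrame
        paths.append(path)
    return paths


# Hypothetical toy rows; the target column exists only in train.
train = pd.DataFrame({"Survived": [0, 1], "Pclass": [3, 1], "Sex": ["male", "female"]})
test = pd.DataFrame({"Pclass": [2], "Sex": ["male"]})

feature_dir = Path(tempfile.mkdtemp())
save_columns(train, "train", feature_dir)
save_columns(test, "test", feature_dir)

saved = sorted(p.name for p in feature_dir.iterdir())
print(saved)
```

Storing one-column DataFrames (rather than bare Series) keeps the column name with the data, so a later `pd.concat(..., axis=1)` rebuilds a properly labeled frame.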
  18. Feature management

    Just run hoge.py from the command line.

        class Pclass(Feature):
            def create_features(self):
                self.train['Pclass'] = train['Pclass']
                self.test['Pclass'] = test['Pclass']
                create_memo('Pclass', 'Ticket class: 1st, 2nd, or 3rd (3 kinds)')

        class Sex(Feature):
            def create_features(self):
                self.train['Sex'] = train['Sex']
                self.test['Sex'] = test['Sex']
                create_memo('Sex', 'Sex')

        class Age(Feature):
            def create_features(self):
                self.train['Age'] = train['Age']
                self.test['Age'] = test['Age']
                create_memo('Age', 'Age')

        class Age_mis_val_median(Feature):
            def create_features(self):
                # train and test are each filled with their own median
                self.train['Age_mis_val_median'] = train['Age'].fillna(train['Age'].median())
                self.test['Age_mis_val_median'] = test['Age'].fillna(test['Age'].median())
                create_memo('Age_mis_val_median', 'Age with missing values filled with the median')

    Running it produces each feature and the feature memo file.
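The slides show only the subclasses, not the `Feature` base class itself. The sketch below is one way such a base class could look. Everything in it other than `Feature.dir`, `self.train`, `self.test`, and `create_features` is my assumption (the `run`/`save` methods, the lowercased file names), and the memo call is left out to keep it self-contained.

```python
import tempfile
from pathlib import Path

import pandas as pd


class Feature:
    """Hypothetical base class; the slides only show its subclasses."""

    dir = "."  # the slides reference Feature.dir; the rest is assumed

    def __init__(self):
        self.name = self.__class__.__name__.lower()
        self.train = pd.DataFrame()
        self.test = pd.DataFrame()
        self.train_path = Path(self.dir) / f"{self.name}_train.pkl"
        self.test_path = Path(self.dir) / f"{self.name}_test.pkl"

    def create_features(self):
        raise NotImplementedError

    def run(self):
        self.create_features()
        return self

    def save(self):
        self.train.to_pickle(self.train_path)
        self.test.to_pickle(self.test_path)


# Toy stand-ins for the module-level train/test frames used on the slides.
train = pd.DataFrame({"Age": [22.0, None, 30.0]})
test = pd.DataFrame({"Age": [None, 40.0]})


class Age_mis_val_median(Feature):
    def create_features(self):
        self.train["Age_mis_val_median"] = train["Age"].fillna(train["Age"].median())
        self.test["Age_mis_val_median"] = test["Age"].fillna(test["Age"].median())


Feature.dir = tempfile.mkdtemp()
Age_mis_val_median().run().save()

loaded = pd.read_pickle(Path(Feature.dir) / "age_mis_val_median_train.pkl")
print(loaded)
```

Deriving the file name from the class name means adding a feature is just adding a class, with no separate registry to keep in sync.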
  22. Feature management

        class Pclass(Feature):
            def create_features(self):
                self.train['Pclass'] = train['Pclass']
                self.test['Pclass'] = test['Pclass']
                create_memo('Pclass', 'Ticket class: 1st, 2nd, or 3rd (3 kinds)')

    Outline of what create_memo does:

        import csv
        import os

        # Create the feature memo CSV file
        def create_memo(col_name, desc):
            file_path = Feature.dir + '/_features_memo.csv'
            # Create an empty memo file on first use
            if not os.path.isfile(file_path):
                with open(file_path, 'w'):
                    pass

            # newline='' lets the csv module control line endings
            with open(file_path, 'r+', newline='') as f:
                lines = f.readlines()
                lines = [line.strip() for line in lines]

                # Check whether the feature being written is already recorded
                col = [line for line in lines if line.split(',')[0] == col_name]
                if len(col) != 0:
                    return

                # The file position is now at the end, so this appends a row
                writer = csv.writer(f)
                writer.writerow([col_name, desc])

    Saving the memo in CSV format makes it easy to view on GitHub (and of course it also displays cleanly in applications such as Excel and Numbers).
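The memo is also easy to read back from code. A small sketch, assuming the same two-column layout (feature name, description); `load_memo` and the sample rows are hypothetical and not from the slides.

```python
import csv
import tempfile
from pathlib import Path


def load_memo(file_path):
    """Return {feature name: description} from a two-column memo CSV."""
    with open(file_path, newline="", encoding="utf-8") as f:
        return {row[0]: row[1] for row in csv.reader(f) if len(row) >= 2}


# Hypothetical memo file with the same two-column layout as on the slide.
memo_path = Path(tempfile.mkdtemp()) / "_features_memo.csv"
with open(memo_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Pclass", "Ticket class: 1st, 2nd, or 3rd"])
    writer.writerow(["Sex", "Sex"])

memo = load_memo(memo_path)
print(memo["Pclass"])
```

Reading with `csv.reader` rather than splitting on commas matters here, because a description may itself contain commas and is then written as a quoted field.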
  25. Feature management

    To create a new feature, write its generation logic in hoge.py:

        class Family_Size(Feature):
            def create_features(self):
                self.train['Family_Size'] = train['Parch'] + train['SibSp']
                self.test['Family_Size'] = test['Parch'] + test['SibSp']
                create_memo('Family_Size', 'Total number of family members')

    Then run it:

        python hoge.py

    Only the new feature is generated.
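The slides do not show how already-generated features are skipped. One plausible mechanism is to check whether the output file exists before computing; the sketch below assumes that, using the standard library only. `generate_missing` and the builder functions are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def generate_missing(builders, feature_dir):
    """Run only the builders whose output file does not exist yet.

    builders maps a feature name to a zero-argument function that returns
    the feature values. Returns the names that were actually built.
    """
    feature_dir = Path(feature_dir)
    built = []
    for name, build in builders.items():
        path = feature_dir / f"{name}_train.pkl"
        if path.exists():
            continue  # already generated on an earlier run
        with open(path, "wb") as f:
            pickle.dump(build(), f)
        built.append(name)
    return built


feature_dir = tempfile.mkdtemp()
builders = {
    "pclass": lambda: [3, 1, 3],
    "sex": lambda: ["male", "female", "female"],
}
first = generate_missing(builders, feature_dir)

# Add one new feature and rerun: only the new one is computed.
builders["family_size"] = lambda: [1, 1, 0]
second = generate_missing(builders, feature_dir)
print(first, second)
```

The trade-off of an existence check is that changing a feature's logic requires deleting its file (or adding an overwrite option) before rerunning.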
  27. Feature management

    To read the data, just specify the features and load them.

        # Specify the features
        features = [
            "age_mis_val_median",
            "family__size",
            "cabin",
            "fare_mis_val_median"
        ]
        # Load one pickle per feature and join them column-wise
        df = [pd.read_pickle(FEATURE_DIR_NAME + f'{f}_train.pkl') for f in features]
        df = pd.concat(df, axis=1)

    What was good about this
  29. Training and inference pipeline

    Running run.py performs training and inference and creates the submit file.

        # Specify the features
        features = [
            "age_mis_val_median",
            "family__size",
            "cabin",
            "fare_mis_val_median"
        ]
        run_name = 'lgb_1102'

        # Save the list of features used
        with open(LOG_DIR_NAME + run_name + "_features.txt", 'wt') as f:
            for ele in features:
                f.write(ele + '\n')

        params_lgb = {
            'boosting_type': 'gbdt',
            'objective': 'binary',
            'early_stopping_rounds': 20,
            'verbose': 10,
            'random_state': 99,
            'num_round': 100
        }

        # Save the parameters used
        with open(LOG_DIR_NAME + run_name + "_param.txt", 'wt') as f:
            for key, value in sorted(params_lgb.items()):
                f.write(f'{key}:{value}\n')

        runner = Runner(run_name, ModelLGB, features, params_lgb, n_fold, name_prefix)
        runner.run_train_cv()  # training
        runner.run_predict_cv()  # inference
        Submission.create_submission(run_name)  # create the submit file

    Files and models are saved with this run_name as the prefix. For example:
    • the list of features used
    • the hyperparameters used
    • the model for each fold
    • the inference results
    • the submit file
    • the image file of the shap results, and so on
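If you also want to load a past run's settings back programmatically, a machine-readable file is convenient. The sketch below swaps in JSON in place of the two text files written on the slide; `save_run_config`, `load_run_config`, and the `_config.json` suffix are my own hypothetical names.

```python
import json
import tempfile
from pathlib import Path


def save_run_config(log_dir, run_name, features, params):
    """Write the feature list and parameters of one run to <run_name>_config.json."""
    path = Path(log_dir) / f"{run_name}_config.json"
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"features": features, "params": params}, f, indent=2, sort_keys=True)
    return path


def load_run_config(log_dir, run_name):
    """Read back what save_run_config wrote, e.g. to rerun an old experiment."""
    path = Path(log_dir) / f"{run_name}_config.json"
    with open(path, encoding="utf-8") as f:
        return json.load(f)


log_dir = tempfile.mkdtemp()
features = ["age_mis_val_median", "family__size", "cabin", "fare_mis_val_median"]
params_lgb = {"boosting_type": "gbdt", "objective": "binary", "random_state": 99}

save_run_config(log_dir, "lgb_1102", features, params_lgb)
restored = load_run_config(log_dir, "lgb_1102")
print(restored["features"])
```

Because the file name starts with the run name, it sorts next to the models and submissions of the same run.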
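  36. Training and inference pipeline

    Examples of generated files (split into folders as appropriate):
    • lgb_…_fold….model, one per fold (the model built on that fold)
    • lgb_…_pred.pkl (inference results on the test data)
    • lgb_…_…_submission.csv (the inference results converted to a CSV that can be submitted to Kaggle)
    • lgb_…_…_features.txt (the list of features used in this training run)
    • lgb_…_…_param.txt (the hyperparameters used in this training run)
    • lgb_…_…_shap.png (the visualization image computed with shap)
    • general.log (log file of the computation)
    • result.log (log file containing only the model scores)

    What was good about this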
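The split between general.log and result.log can be set up with Python's standard `logging` module. This is a sketch under my own assumptions: the logger names, the message format, and the score value are made up for illustration and are not taken from the talk.

```python
import logging
import tempfile
from pathlib import Path


def make_logger(name, path):
    """Return a logger that writes INFO and above to its own file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


log_dir = Path(tempfile.mkdtemp())
general = make_logger("general", log_dir / "general.log")
result = make_logger("result", log_dir / "result.log")

general.info("lgb_1102 - start training cv")
general.info("lgb_1102 - end training cv")
result.info("lgb_1102 score: 0.5")  # made-up score, for illustration only

for handler in general.handlers + result.handlers:
    handler.flush()

general_text = (log_dir / "general.log").read_text(encoding="utf-8")
result_text = (log_dir / "result.log").read_text(encoding="utf-8")
print(result_text)
```

Keeping scores in a separate file means comparing runs is a matter of reading one short log instead of searching through the full computation log.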
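  37. Summary

    • Feature management is great!
      ‣ Consolidating feature generation into a single script file avoids running the same computation multiple times.
      ‣ Generating the feature memo at the same time cuts down on how often you have to stop and think "what was this feature again?"
      ‣ Managing features column by column made them easier to handle (though when there are a huge number of features, managing them in reasonably sized groups may be better).
    • Pipelines are great!
      ‣ Building a pipeline makes for a fast PDCA cycle.
      ‣ Managing the features and parameters used for training also ensures reproducibility, and psychological safety too …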