timestamp, text token, etc.
• Information on the engaging user: engaging user ID, follower count, etc.
• Information on the engaged user: engaged user ID, follower count, etc.
• Engagement information (target): timestamps of the engagements
How the evaluation data is split
DATASET DESCRIPTION
Training Data (~120 million samples) | Testing Data (1 week) | Validation Data (1 week)
Stacking LightGBMs
Features
• Categorical Features
• Network Features
• Text Features
• Meta Features
• etc.
Training Process
• Bagging with negative under-sampling
• Stratified K-Folds over Retweet with Comment
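The "Stratified K-Folds over Retweet with Comment" item can be illustrated with a minimal sketch. It assumes that stratifying means each fold receives a similar share of the Retweet with Comment positives; the function name `stratified_fold_ids` and the toy labels are hypothetical and not taken from the slides.

```python
import random
from collections import defaultdict


def stratified_fold_ids(labels, n_splits=5, seed=0):
    """Assign a fold id to every sample so each fold has a similar label mix."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    fold_ids = [None] * len(labels)
    for indices in by_label.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across folds.
        for position, idx in enumerate(indices):
            fold_ids[idx] = position % n_splits
    return fold_ids


# Toy labels (assumption): 1 = Retweet with Comment happened, 0 = it did not.
rt_with_comment = [1] * 10 + [0] * 90
folds = stratified_fold_ids(rt_with_comment, n_splits=5)
positives_per_fold = [
    sum(1 for y, f in zip(rt_with_comment, folds) if f == k and y == 1)
    for k in range(5)
]
```

With 10 positives and 5 folds, every fold ends up with the same number of positives, which is the property the stratification is meant to guarantee.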
[Diagram: Target Independent Features and Target Dependent Features feed the first-stage Like, Reply, RT, and RT with Comment models; their Like, Reply, RT, and RT with Comment predictions become Meta Features for the second-stage Like, Reply, RT, and RT with Comment models.]
Build a model that predicts a single engagement, one for each engagement type (1st stage models).
1st Stage
MODEL ARCHITECTURE
2nd Stage
Build models whose input additionally includes the predictions of the 1st stage models, one for each engagement type (2nd stage models).
MODEL ARCHITECTURE
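As a rough illustration of the two-stage idea, the sketch below only shows how a second-stage input row could be assembled by appending first-stage predictions to the original features. The lambdas are hypothetical stand-ins for trained first-stage LightGBM models, and `build_second_stage_rows` is an illustrative name; how predictions for training rows are produced is not stated on the slide and is left out here.

```python
TARGETS = ["like", "reply", "rt", "rt_with_comment"]


def build_second_stage_rows(feature_rows, first_stage_models):
    """Append the prediction of every first-stage model to each feature row."""
    stacked = []
    for row in feature_rows:
        predictions = [first_stage_models[target](row) for target in TARGETS]
        stacked.append(list(row) + predictions)
    return stacked


# Hypothetical stand-ins for trained first-stage models:
# each maps a feature row to a probability-like score.
first_stage_models = {
    "like": lambda row: 0.5,
    "reply": lambda row: 0.25,
    "rt": lambda row: 0.125,
    "rt_with_comment": lambda row: 0.0625,
}

feature_rows = [[1.0, 0.0], [0.0, 3.0]]
second_stage_rows = build_second_stage_rows(feature_rows, first_stage_models)
```

Each second-stage model (again one per engagement type) would then be trained on rows of this widened shape.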
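language, tweet type
• Frequency Encoding & Target Encoding for high-cardinality categorical variables
• e.g. tweet ID, user ID
Features that treat a combination of categorical variables as a new category
• Can capture complex relationships between categorical variables
• e.g. Hashtag × engaging user ID
Categorical Features
FEATURES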
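The encodings named on this slide can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the `smoothing` parameter, the helper names, and the toy data are assumptions, and the target encoding here is computed in-sample, whereas a real pipeline would normally compute it on held-out folds to avoid leaking the target.

```python
from collections import Counter, defaultdict


def frequency_encode(values):
    """Replace each category with the number of times it occurs."""
    counts = Counter(values)
    return [counts[v] for v in values]


def target_encode(values, targets, smoothing=1.0):
    """Replace each category with its smoothed mean target.

    The smoothing toward the global mean is an assumption of this sketch.
    """
    prior = sum(targets) / len(targets)
    counts = Counter(values)
    sums = defaultdict(float)
    for v, y in zip(values, targets):
        sums[v] += y
    return [(sums[v] + smoothing * prior) / (counts[v] + smoothing) for v in values]


def combine(*columns):
    """Treat a combination of categorical columns as one new category."""
    return list(zip(*columns))


# Toy data (assumption).
engaging_user = ["u1", "u1", "u2", "u2"]
hashtag = ["#a", "#b", "#a", "#a"]
liked = [1, 0, 1, 1]

user_freq = frequency_encode(engaging_user)
user_te = target_encode(engaging_user, liked, smoothing=0.0)
pair_freq = frequency_encode(combine(hashtag, engaging_user))
```

The combined column (Hashtag × engaging user ID) is encoded exactly like any other categorical column once the pairs are formed.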
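other engagement information becomes important for training.
• Aggregate the engagement predictions by categories such as user ID and tweet ID.
• This captures, with high expressive power, how readily a user ID or tweet ID engages or is engaged with.
Meta Features
FEATURES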
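The aggregation step might look like the sketch below. The slide only says that predictions are aggregated per category; using the mean as the aggregate is an assumption, as are the function name `aggregate_predictions` and the toy values.

```python
from collections import defaultdict


def aggregate_predictions(keys, predictions):
    """Return, for each row, the mean prediction over all rows sharing its key."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for key, pred in zip(keys, predictions):
        totals[key] += pred
        counts[key] += 1
    return [totals[key] / counts[key] for key in keys]


# Toy data (assumption): first-stage Like predictions per row.
engaging_user = ["u1", "u1", "u2"]
like_pred = [0.25, 0.75, 0.125]

# Meta feature: average predicted Like probability of each engaging user.
user_mean_like = aggregate_predictions(engaging_user, like_pred)
```

The same function could be applied with tweet ID as the key, or with the predictions of the Reply, RT, and RT with Comment models.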
are built; Bagging is adopted.
• The following sampling method was adopted.
1. Apply Negative Under-Sampling to reduce the data size.
2. For engagements such as Like and Retweet the data size is still large, so apply Random Sampling to reduce it further to a specified size.
TRAINING PROCESS
Sampling Process
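The two sampling steps can be sketched as follows. The 1:1 positive-to-negative ratio, the `max_size` cap, and the use of one seed per bag are assumptions made for illustration; the slide does not give these details.

```python
import random


def sample_for_bag(labels, max_size, seed):
    """Pick row indices for one bag: under-sample negatives, then cap the size."""
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]

    # Step 1: negative under-sampling (a 1:1 ratio is an assumption).
    kept_negatives = rng.sample(negatives, min(len(negatives), len(positives)))
    indices = positives + kept_negatives

    # Step 2: if the result is still too large, randomly sample down to max_size.
    if len(indices) > max_size:
        indices = rng.sample(indices, max_size)
    return sorted(indices)


# Toy labels (assumption): 30 positives and 70 negatives.
labels = [1] * 30 + [0] * 70

# Bagging: each bag uses a different seed and would train its own model.
bags = [sample_for_bag(labels, max_size=40, seed=s) for s in range(3)]
uncapped = sample_for_bag(labels, max_size=100, seed=0)
```

Predictions from the models trained on the individual bags would then be combined, for example by averaging.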