Datasets
• ArXiv: 2001 & 2020
• Ciao: 2000 & 2011

Evaluation Metric
• Perplexity on T2 (lower is better)

Baselines
• BERT-base-uncased
• BERT(T1): fine-tuned on T1
• BERT(T2): fine-tuned on T2
• FT(model, template): proposed method

Hyperparameters
• weight decay = 0.01
• batch size = 4
• learning rate = 3×10⁻⁸
• k = {500, 1000, 2000, 5000, 10000}
• epochs = 20
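A minimal sketch of the fine-tuning setup, assuming the Hugging Face Transformers `Trainer` API; the dataset loading and the proposed FT(model, template) procedure are the paper's own and are not reproduced here. The output path is hypothetical.

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Hyperparameters as listed on the slide.
args = TrainingArguments(
    output_dir="bert-t1-finetuned",   # hypothetical output path
    weight_decay=0.01,
    per_device_train_batch_size=4,
    learning_rate=3e-8,               # 3x10^-8, as given
    num_train_epochs=20,
)

# `train_dataset` would be the T1 split of ArXiv or Ciao (not shown here):
# trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
# trainer.train()
```

For the evaluation metric, perplexity on T2 can be recovered from the model's average cross-entropy loss as `math.exp(eval_loss)`, so lower loss directly means lower (better) perplexity.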