multiple fine-tuned models improves accuracy without increasing inference time
[2212.04089] Editing Models with Task Arithmetic
[2306.01708] TIES-Merging: Resolving Interference When Merging Models
[2311.03099] Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
[2403.19522] Model Stock: All we need is just a few fine-tuned models
Approach 1: Weight-level model merging
Weights Leads to Wider Optima and Better Generalization
[2203.05482] Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
from Homologous Models as a Free Lunch
[2310.04799] Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages
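The weight-level approaches listed above share one basic idea: combine the parameters of models fine-tuned from the same base. Below is a minimal sketch of two of them, uniform weight averaging (in the spirit of Model soups) and adding scaled task vectors to a base model (in the spirit of Task Arithmetic). It assumes each model is a plain dictionary mapping parameter names to lists of floats; the `Weights` type, the function names, and the `scale` parameter are hypothetical and chosen for illustration, not taken from the cited papers' code.

```python
from typing import Dict, List

# Hypothetical flat "state dict": parameter name -> list of values.
Weights = Dict[str, List[float]]


def average_weights(models: List[Weights]) -> Weights:
    """Uniformly average each parameter across models with identical shapes."""
    n = len(models)
    return {
        name: [sum(vals) / n for vals in zip(*(m[name] for m in models))]
        for name in models[0]
    }


def task_vector(finetuned: Weights, base: Weights) -> Weights:
    """Task vector = fine-tuned weights minus base weights."""
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def apply_task_vectors(
    base: Weights, vectors: List[Weights], scale: float = 1.0
) -> Weights:
    """Add the scaled sum of task vectors to the base weights."""
    merged = {name: list(vals) for name, vals in base.items()}
    for vec in vectors:
        for name, delta in vec.items():
            merged[name] = [w + scale * d for w, d in zip(merged[name], delta)]
    return merged


if __name__ == "__main__":
    base = {"w": [1.0, 2.0]}
    ft_a = {"w": [2.0, 2.0]}
    ft_b = {"w": [1.0, 4.0]}

    soup = average_weights([ft_a, ft_b])
    vectors = [task_vector(ft_a, base), task_vector(ft_b, base)]
    merged = apply_task_vectors(base, vectors, scale=0.5)
    print(soup, merged)
```

With two fine-tuned models and `scale=0.5`, the task-vector merge coincides with the uniform average, which is a quick sanity check that the two views are consistent. Methods such as TIES-Merging add further steps to reduce interference between task vectors; those steps are not shown here.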
Characteristics (BCs) in each generation
Difference #2: Model merging as crossover
Illustration of model merging
Difference #3: SVD-based mutation
Illustration of SVD-based mutation