drummernet

Zhang Yixiao
January 17, 2020

Transcript

  1. Existing problems
     • Large-scale annotated data is lacking
       • Workaround 1: use synthetic data
         Mark Cartwright and Juan Pablo Bello. Increasing drum transcription vocabulary using data synthesis. Proc. of the 21st Int. Conference on Digital Audio Effects (DAFx-18). Aveiro, Portugal, 2018.
       • Workaround 2: use unlabeled data
         Chih-Wei Wu and Alexander Lerch. Automatic drum transcription using the student-teacher learning paradigm with unlabeled music data. In Proc. Int. Soc. Music Inf. Retrieval Conf., pages 613–620, 2017.
     • However, the models above are still supervised, combined with teacher-student learning
       Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531, 2015.
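  2. RNN / Sparsemax / Upsampler
     • RNN: three GRU layers
       • {time-axis, bidirectional, 100 channels}
       • {time-axis, unidirectional, 50 channels}
       • {instrument-axis, unidirectional, K}, where K is the number of drum instruments
     • Sparsemax: a "sparse version" of softmax whose outputs are allowed to be exactly 0
       • One sparsemax runs along the time axis over non-overlapping windows, and another runs along the instrument axis
       • The two are computed in parallel and their results are multiplied element-wise
     • Upsampler
       • Zero insertion restores the length from N/16 back to N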
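The sparsemax and zero-insertion steps on this slide can be sketched in NumPy as follows. This is a minimal illustration, not the DrummerNet implementation: sparsemax is computed with the usual sort-and-threshold projection onto the probability simplex, and the function names, the (K, T) array layout, and the window length are assumptions made for the example.

```python
import numpy as np

def sparsemax(z, axis=-1):
    """Project z onto the probability simplex along `axis`.
    Unlike softmax, some outputs can be exactly 0."""
    z = np.moveaxis(np.asarray(z, dtype=float), axis, -1)
    z_sorted = -np.sort(-z, axis=-1)                 # descending order
    k = np.arange(1, z.shape[-1] + 1)
    cumsum = np.cumsum(z_sorted, axis=-1)
    support = 1 + k * z_sorted > cumsum              # entries that stay non-zero
    k_z = support.sum(axis=-1, keepdims=True)        # support size
    tau = (np.take_along_axis(cumsum, k_z - 1, axis=-1) - 1) / k_z
    return np.moveaxis(np.maximum(z - tau, 0.0), -1, axis)

def parallel_sparsemax(x, window):
    """x has shape (K instruments, T frames); `window` is a hypothetical
    window length. One sparsemax acts on non-overlapping time windows,
    another acts across instruments; the two results are multiplied
    element-wise."""
    k, t = x.shape
    assert t % window == 0, "T must be a multiple of the window length"
    over_time = sparsemax(x.reshape(k, t // window, window), axis=-1).reshape(k, t)
    over_instruments = sparsemax(x, axis=0)
    return over_time * over_instruments

def zero_insert_upsample(x, factor=16):
    """Restore the time length from N/factor to N by inserting zeros."""
    k, t = x.shape
    out = np.zeros((k, t * factor), dtype=x.dtype)
    out[:, ::factor] = x                             # original frames, zeros in between
    return out

# Example: 3 instruments, 8 low-rate frames, upsampled by 16.
rng = np.random.default_rng(0)
activations = parallel_sparsemax(rng.normal(size=(3, 8)), window=4)
full_rate = zero_insert_upsample(activations, factor=16)
```

Because both sparsemax outputs are non-negative and sparse, their element-wise product is non-zero only where an onset is selected along both the time axis and the instrument axis.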
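  3. Ablation Study
     • Sparsemax
       • Softmax performs worse
       • Applying softmax sequentially is worse than applying it in parallel
       • Causes many false positives
     • CQT
       • Replacing it with a mel spectrogram or the short-time Fourier transform (STFT) degrades performance
     • Onset Enhancement
       • No significant improvement, but it helps the loss fall early in training
     • RNNs
       • Replacing them with three convolutional layers makes no significant difference. Is there little long-term dependency information?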