Basic information
- Title: Sequence-Level Knowledge Distillation
- Source type: paper
- Related topic notes: Knowledge Distillation, Sequence-level Distillation, Offline KD
TODO
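- Read the original paper and summarize how sequence-level distillation differs from word-level / logits distillation.
- Fill in how teacher-generated sequences serve as the student's supervised targets.
- Add the experimental setup and metrics used for neural machine translation, and what the results imply for transfer to post-training distillation.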