Project DeepSPIN presents, on June 20, 2022, from 10:30 am to 12:00 pm, the workshops "Improving Systematic Generalization of Sequence-to-Sequence Learning with Structural Biases" and "Decoding is deciding under uncertainty - the case of NMT", given respectively by the invited keynote speakers Ivan Titov (University of Edinburgh) and Wilker Aziz (University of Amsterdam).
10:30 | Ivan Titov (University of Edinburgh)
Title: Improving Systematic Generalization of Sequence-to-Sequence Learning with Structural Biases
Abstract: Despite success in many domains, sequence-to-sequence models struggle in settings where train and test examples are drawn from different distributions. In particular, in contrast to humans, they fail to generalize systematically, i.e., to interpret sentences representing novel combinations of concepts (e.g., text segments) seen in training. In this talk, I will discuss two main routes to improving systematic generalization. First, I will discuss the integration of structural biases into the model architecture. I will introduce a neural model that treats the 'translation' process as structured permutation and monotonic translation of the subsequences. Second, I will discuss the injection of structural biases into the learning objectives. We use a meta-learning-like objective that encourages the gradients of similarly structured examples (as determined by a similarity metric, e.g., a string or tree kernel) to be similar. This objective aims to inhibit memorization and to encourage the model to learn the tasks in a 'systematic fashion'. We will use semantic parsing and machine translation as applications for our methods.
Joint work with Bailin Wang, Hal Conklin, Mirella Lapata, and Kenny Smith.
Bio: Ivan Titov is a professor and chair for NLP at the University of Edinburgh, and a part-time faculty member at the University of Amsterdam. He is currently a visiting researcher at Google. He received his Ph.D. from the University of Geneva and also spent time at the University of Illinois at Urbana-Champaign and Saarland University. His current research focuses on natural language understanding, improving generalization across tasks and data distributions, and interpretability. He has been awarded an ERC Starting Grant and Dutch VICI and VIDI fellowships. Ivan co-directs the Edinburgh doctoral school in NLP (CDT in NLP) and directs the Edinburgh ELLIS unit. He has been a program chair for ICLR 2021 and CoNLL 2018, an action editor at TACL and JMLR, and a member of the advisory board of the European chapter of ACL. He is a Turing and ELLIS fellow and co-directs the ELLIS NLP program.
11:15 | Wilker Aziz (University of Amsterdam)
Title: Decoding is deciding under uncertainty - the case of NMT
Abstract: In neural machine translation (NMT), we search for the mode of the model distribution to form predictions. We do so mostly following the intuition that the most probable outcome ought to be an essential distribution summary. Despite our intuition, there’s plenty of evidence against the adequacy of the most probable translations in NMT. In this talk, I make a case to move away from mode-seeking search as a tool for decision-making and model criticism. I will highlight reasons concerning MT as a task, NMT as a probabilistic model, and MLE as a training algorithm. Finally, I’ll turn to statistical decision theory and motivate a different rule for making decisions, one which is familiar to statistical MT folks like those of my generation and earlier, as well as a modern approximation of it. I’ll close the talk with a discussion of the merits and limitations of this decision rule, and comments on opportunities moving forward with or without a mode-seeking search.