Project: DECOLLAGE - DEep COgnition Learning for LAnguage GEneration

Acronym: DECOLLAGE
Main Objective:
In recent years, transformer-based deep learning models such as BERT or GPT-3 have led to impressive results in many natural language
processing (NLP) tasks, exhibiting transfer and few-shot learning capabilities. However, despite faring well in benchmarks, current
deep learning models for NLP often fail badly in the wild: they generalize poorly out of domain, they do not exploit contextual
information, they are poorly calibrated, and their memory is not traceable. These limitations stem from their monolithic architectures,
which are good for perception, but unsuitable for tasks requiring higher-level cognition. In this project, I attack these fundamental
problems by bringing together tools and ideas from machine learning, sparse modeling, information theory, and cognitive science, in
an interdisciplinary approach. First, I will use uncertainty and quality estimates for utility-guided controlled generation, combining
this control mechanism with the efficient encoding of contextual information and integration of multiple modalities. Second, I will
develop sparse and structured memory models, together with attention-based descriptive representations, as a step towards conscious processing. Third,
I will build mathematical models for sparse communication (reconciling discrete and continuous domains), supporting end-to-end
differentiability and enabling a shared workspace where multiple modules and agents can communicate. I will apply the innovations
above to highly challenging language generation tasks, including machine translation, open dialogue, and story generation. To reinforce
interdisciplinarity and maximize technological impact, collaborations are planned with cognitive scientists and with a scale-up company
in the crowdsourced translation industry.
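As a concrete illustration of the sparse communication ingredient above (reconciling discrete and continuous domains while keeping end-to-end differentiability), the sketch below implements the sparsemax transformation of Martins and Astudillo (2016), which maps a vector of real scores to a probability distribution that can contain exact zeros. This is only a minimal NumPy sketch of one tool from the sparse-modeling toolbox, not the project's specific method; the function and variable names are placeholders.

    import numpy as np

    def sparsemax(z):
        # Sparsemax = Euclidean projection of the score vector z onto the
        # probability simplex; unlike softmax, the output can contain exact zeros.
        z = np.asarray(z, dtype=float)
        z_sorted = np.sort(z)[::-1]                  # scores in decreasing order
        cumsum = np.cumsum(z_sorted)                 # running sums of sorted scores
        k = np.arange(1, len(z) + 1)
        support = 1 + k * z_sorted > cumsum          # which sorted scores stay nonzero
        k_star = k[support][-1]                      # size of the support
        tau = (cumsum[support][-1] - 1.0) / k_star   # threshold subtracted from scores
        return np.maximum(z - tau, 0.0)

    scores = np.array([1.5, 1.0, 0.2, -0.5])
    print(sparsemax(scores))        # [0.75 0.25 0.   0.  ] -- exact zeros, unlike softmax
    print(sparsemax(scores).sum())  # 1.0 -- still a valid probability distribution

Because sparsemax is a projection onto the probability simplex, it remains differentiable almost everywhere and can be trained end-to-end like softmax, while its exact zeros give the discrete, selective behaviour that sparse attention and sparse communication rely on.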
Reference: ERC-2022-CoG 101088763
Funding: EU/Horizon Europe
Start Date: 01-08-2023
End Date: 31-07-2028
Team: André Filipe Torres Martins, Mario Alexandre Teles de Figueiredo, Patrick Santos Fernandes, Sophia Sklaviadis, Saúl José Rodrigues dos Santos
Groups: Pattern and Image Analysis – Lx
Partners: Instituto de Telecomunicações, Unbabel
Local Coordinator: André Filipe Torres Martins
...

Associated Publications
  • 3 Papers in Conferences
  • S. Santos, A. F. Farinhas, D. McNamee, A. Martins, Modern Hopfield Networks with Continuous-Time Memories, International Conference on Learning Representations (ICLR), Singapore, April 2025
  • G. Faria, S. A. Agrawal, A. F. Farinhas, R. Rei, J. G. de Souza, A. Martins, QUEST: Quality-Aware Metropolis-Hastings Sampling for Machine Translation, Advances in Neural Information Processing Systems (NeurIPS), Vancouver, Canada, December 2024
  • A. Martins, V. Niculae, D. McNamee, Sparse Modern Hopfield Networks, Advances in Neural Information Processing Systems (NeurIPS), New Orleans, United States, December 2023