This project is a collaboration between INESC-ID, FCT-UNL, Carnegie Mellon University, Priberam and SAPO. Its goal is to advance media monitoring technology for tracking the popularity or reputation of entities on the Web, with applications to tourism and context-aware recommendation. The scope of this grant is statistical natural language processing (NLP) and machine learning. The objectives are twofold:
1. Learning with Big Data: develop scalable, online and distributed learning algorithms that require only weak supervision signals and are fundamental to discovering hidden trends in data and modeling large-scale problems. Emphasis will be given to modern deep learning approaches with weak supervision.
2. Robust Natural Language Processing: develop groundbreaking NLP techniques that are fast enough for practical use with large volumes of text, robust to domain shift (e.g., from news domains to tourism and social media), and capable of performing fine-grained linguistic analysis. Emphasis will be given to the multilingual aspect (e.g., processing content in English, Portuguese, and Spanish) and to the treatment of rare and out-of-vocabulary words.
|Funding: FCT/CMU|
|Start Date: 01-03-2016|
|End Date: 01-02-2020|
|Team: André Filipe Torres Martins, Mariana Sá Correia Leite de Almeida, Mario Alexandre Teles de Figueiredo|
|Groups: Pattern and Image Analysis – Lx|
|Local Coordinator: André Filipe Torres Martins|