
Distributed and scalable platform for collaborative analysis of massive time series data sets

Duarte, E. ; Campus, D. ; Gomes, D.G.

Distributed and scalable platform for collaborative analysis of massive time series data sets, Proc. INSTICC International Conference on Data Science, Technology and Applications (DATA), Prague, Czech Republic, pp. 41-52, July 2019.

Digital Object Identifier: 10.5220/0007834700410052

Abstract
The recent expansion of metrification on a daily basis has led to the production of massive quantities of
data, which in many cases correspond to time series. To streamline the discovery and sharing of meaningful
information within time series, a multitude of analysis software tools have been developed. However, these tools
lack appropriate mechanisms to handle massive time series data sets and large quantities of simultaneous
requests, as well as suitable visual representations for annotated data. We propose a distributed, scalable,
secure and high-performance architecture that allows a group of researchers to curate a mutual knowledge base
deployed over a network and to annotate patterns while preventing data loss from overlapping contributions
or unsanctioned changes. Analysts can share annotation projects with peers over a reactive web interface with
a customizable workspace. Annotations can express meaning not only over a segment of time but also over a
subset of the series that coexist in the same segment. In order to reduce visual clutter and improve readability,
we propose a novel visual encoding where annotations are rendered as arcs traced only over the affected
curves. The performance of the prototype under different architectural approaches was benchmarked.
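
The annotation model described in the abstract, where an annotation applies to a segment of time and to a subset of the series present in that segment, can be illustrated with a minimal sketch. This is not the platform's implementation: the class name `Annotation`, its fields, and the methods `affects` and `overlaps` are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation: a time segment plus the subset of series it covers."""

    label: str
    start: float             # segment start (e.g. epoch seconds), inclusive
    end: float               # segment end, inclusive
    series: FrozenSet[str]   # names of the series the annotation applies to

    def __post_init__(self) -> None:
        if self.end < self.start:
            raise ValueError("end must not precede start")
        if not self.series:
            raise ValueError("an annotation must cover at least one series")

    def affects(self, series_name: str, t: float) -> bool:
        """True if time t on the given series falls under this annotation."""
        return series_name in self.series and self.start <= t <= self.end

    def overlaps(self, other: "Annotation") -> bool:
        """True if both annotations share a series and their segments intersect."""
        shares_series = bool(self.series & other.series)
        intersects = self.start <= other.end and other.start <= self.end
        return shares_series and intersects


# Two annotations on the same time range that touch different series.
spike = Annotation("spike", 10.0, 20.0, frozenset({"cpu", "mem"}))
drift = Annotation("drift", 15.0, 30.0, frozenset({"mem"}))
outage = Annotation("outage", 15.0, 30.0, frozenset({"disk"}))
```

In this sketch, `spike` and `drift` overlap because they share the `mem` series over an intersecting segment, while `spike` and `outage` do not, since they cover disjoint series. A renderer following the visual encoding in the abstract would, under these assumptions, trace an arc only over the curves for which `affects` holds.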