Extracting Information from Interval Data Using Symbolic Principal Component Analysis

de Oliveira, M.R.O.; Vilela, M.; Pacheco, A.; Valadas, R.; Salvador, P.

Austrian Journal of Statistics Vol. 46, Nº 3-4, pp. 79 - 87, April, 2017.

ISSN (print): 1026-597X

Scimago Journal Ranking: 0.23 (in 2017)

Digital Object Identifier: 10.17713/ajs.v46i3-4.673

We introduce generic definitions of symbolic variance and covariance for random interval-valued variables that lead to a unified and insightful interpretation of four known symbolic principal component estimation methods: CPCA, VPCA, CIPCA, and SymCovPCA. Moreover, we propose the use of truncated versions of symbolic principal components, which use a strict subset of the original symbolic variables, as a way to improve the interpretation of symbolic principal components. Furthermore, the analysis of a real dataset leads to a meaningful characterization of Internet traffic applications, while highlighting similarities between the symbolic principal component estimation methods considered in the paper.