Extracting Information from Interval Data Using Symbolic Principal Component Analysis
de Oliveira, M.R.O; Vilela, M.; Pacheco, A.
Austrian Journal of Statistics Vol. 46, Nº 3-4, pp. 79 - 87, April, 2017.
ISSN (print): 1026-597X
Digital Object Identifier: 10.17713/ajs.v46i3-4.673
We introduce generic definitions of symbolic variance and covariance for random interval-valued variables. These definitions lead to a unified and insightful interpretation of four known symbolic principal component estimation methods: CPCA, VPCA, CIPCA, and SymCovPCA. We also propose truncated versions of symbolic principal components, which use a strict subset of the original symbolic variables, as a way to improve their interpretation. Finally, the analysis of a real dataset leads to a meaningful characterization of Internet traffic applications and highlights similarities between the symbolic principal component estimation methods considered in the paper.
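As a rough illustration of the simplest of these methods, the sketch below applies ordinary PCA to interval midpoints, which is the idea behind the centers method (CPCA), and then truncates the first component to a subset of variables. It is a minimal sketch under stated assumptions, not the paper's estimators: the toy intervals are invented, the function names `centers_pca` and `truncate_loadings` are hypothetical, and the rule of keeping the largest-magnitude loadings is an assumed stand-in for whatever selection the authors use.

```python
import numpy as np


def centers_pca(lower, upper):
    """Classical PCA applied to interval midpoints (centers-method idea)."""
    centers = (lower + upper) / 2.0             # one midpoint per interval
    centered = centers - centers.mean(axis=0)   # remove column means
    cov = np.cov(centered, rowvar=False)        # sample covariance of the centers
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # reorder to descending variance
    return eigvals[order], eigvecs[:, order]


def truncate_loadings(loadings, keep):
    """Zero all but the `keep` largest-magnitude loadings (assumed rule)."""
    truncated = np.zeros_like(loadings)
    idx = np.argsort(np.abs(loadings))[-keep:]
    truncated[idx] = loadings[idx]
    return truncated


# Invented toy data: 4 observations, 3 interval-valued variables.
lower = np.array([
    [1.0, 10.0, 0.1],
    [2.0, 12.0, 0.3],
    [3.0, 15.0, 0.2],
    [4.0, 19.0, 0.4],
])
widths = np.array([
    [1.0, 2.0, 0.1],
    [0.5, 4.0, 0.2],
    [1.5, 2.0, 0.1],
    [1.0, 3.0, 0.3],
])
upper = lower + widths

eigvals, eigvecs = centers_pca(lower, upper)
first_pc = eigvecs[:, 0]
truncated_pc = truncate_loadings(first_pc, keep=2)

print("explained variance:", eigvals)
print("first component:", first_pc)
print("truncated first component:", truncated_pc)
```

Because this sketch uses only the midpoints, it ignores the interval widths entirely. The other methods named in the abstract account for the interval structure in different ways that the abstract does not detail.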