Efficient Supervised Relevance Criteria Tailored to Discrete Features
Figueiredo, M. A. T.
Efficient Supervised Relevance Criteria Tailored to Discrete Features, Proc Portuguese Conf. on Pattern Recognition - RecPad, Covilha, Portugal, Vol. --, pp. -- - --, October, 2014.
The benefits of feature discretization (FD) techniques for machine learning and pattern recognition tasks are well known. FD yields discrete-valued features that retain enough information for the learning task at hand while ignoring minor fluctuations that may be irrelevant to that task. As a consequence, we obtain compact data representations for learning purposes, yielding both better accuracy and lower training time compared with the original features. However, in many cases, mainly with medium- and high-dimensional (HD) data, the large number of features usually implies some redundancy among them. Thus, it is advantageous to apply feature selection (FS) techniques to the discrete features, keeping the most relevant ones, in order to improve the performance of machine learning tasks. In this paper, we propose relevance criteria for supervised FS techniques on discrete data, based on the histogram of the discrete feature. The experimental results on HD data show that the proposed criteria can achieve better accuracy than widely used relevance criteria, such as mutual information and the Fisher ratio.
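For readers unfamiliar with the two baseline criteria named above, the following is a minimal sketch of how mutual information and the Fisher ratio might be computed for discrete features and used to rank them. It does not reproduce the histogram-based criteria proposed in the paper, which the abstract does not specify. The function names (`mutual_information`, `fisher_ratio`, `rank_features`) are hypothetical, the Fisher ratio is given in one common two-class form, and NumPy is assumed.

```python
import numpy as np


def mutual_information(feature, labels):
    """Empirical mutual information I(X; Y), in bits, between a
    discrete feature and the class labels (illustrative baseline)."""
    feature = np.asarray(feature).ravel()
    labels = np.asarray(labels).ravel()
    x_vals, x_idx = np.unique(feature, return_inverse=True)
    y_vals, y_idx = np.unique(labels, return_inverse=True)
    # Joint histogram of (feature value, class label), normalized to sum to 1.
    joint = np.zeros((x_vals.size, y_vals.size))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of the feature
    py = joint.sum(axis=0, keepdims=True)  # marginal of the labels
    nz = joint > 0                         # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))


def fisher_ratio(feature, labels):
    """Two-class Fisher ratio: squared difference of the class means
    divided by the sum of the class variances (one common form)."""
    feature = np.asarray(feature, dtype=float).ravel()
    labels = np.asarray(labels).ravel()
    classes = np.unique(labels)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")
    a = feature[labels == classes[0]]
    b = feature[labels == classes[1]]
    gap = (a.mean() - b.mean()) ** 2
    spread = a.var() + b.var()
    if spread == 0.0:
        # Degenerate case: no within-class variation.
        return float("inf") if gap > 0 else 0.0
    return float(gap / spread)


def rank_features(X, y, criterion, k):
    """Return the column indices of the k highest-scoring features."""
    X = np.asarray(X)
    scores = np.array([criterion(X[:, j], y) for j in range(X.shape[1])])
    # Stable sort on negated scores keeps the original order among ties.
    return np.argsort(-scores, kind="stable")[:k]
```

A filter of this kind scores each feature independently of the others, so it measures relevance only; it does not by itself account for redundancy among the selected features.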