Unsupervised Joint Feature Discretization and Selection

Ferreira, A. ; Figueiredo, M. A. T.

Unsupervised Joint Feature Discretization and Selection, Proc Iberian Conf. on Pattern Recognition and Image Analysis, Las Palmas, Gran Canaria, Spain, Vol. --, pp. -- - --, June, 2011.

In many applications, we deal with high-dimensional datasets containing different types of data. For instance, text classification and information retrieval problems involve large collections of documents. Each text is usually represented by a bag-of-words or similar representation, with a large number of features (terms). Many of these features may be irrelevant (or even detrimental) for the learning tasks. This excessive number of features also increases the memory needed to represent and process these collections, which shows the need for adequate techniques for feature representation, reduction, and selection, both to improve classification accuracy and to reduce memory requirements.
In this paper, we propose a combined unsupervised feature discretization and feature selection technique. Experimental results on standard datasets show the efficiency of the proposed techniques as well as improvements over previous similar techniques.