Relevance and Mutual Information-Based Feature Discretization
Figueiredo, M. A. T.
Relevance and Mutual Information-Based Feature Discretization, Proc. International Conf. on Pattern Recognition Applications and Methods - ICPRAM, Barcelona, Spain, Vol. --, pp. -- - --, February, 2013.
In many learning problems, feature discretization (FD) techniques yield compact data representations, which often lead to shorter training time and higher classification accuracy. In this paper, we propose two new FD techniques. The first method is based on the classical Linde-Buzo-Gray quantization algorithm, guided by a relevance criterion; depending on the adopted measure of relevance, it can work in unsupervised, supervised, or semi-supervised scenarios. The second method is a supervised technique that maximizes the mutual information between each discrete feature and the class label. Experiments on standard benchmark datasets show that both methods scale to high-dimensional data and, in many cases, attain better accuracy than other FD approaches while using fewer discretization intervals.
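
To make the mutual information criterion concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes equal-width binning as a stand-in quantizer and picks, for a single feature, the smallest number of bits whose discretized version attains the highest empirical mutual information with the class label. The function names (`mutual_information`, `uniform_discretize`, `pick_bits_by_mi`) and the parameters `max_bits` and `tol` are hypothetical and introduced only for illustration.

```python
import numpy as np


def mutual_information(x_disc, y):
    """Empirical mutual information I(X; Y), in bits, of two discrete arrays."""
    _, x_idx = np.unique(np.ravel(x_disc), return_inverse=True)
    _, y_idx = np.unique(np.ravel(y), return_inverse=True)
    # Joint histogram of (feature bin, class label) pairs.
    joint = np.zeros((x_idx.max() + 1, y_idx.max() + 1))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of the feature
    py = joint.sum(axis=0, keepdims=True)  # marginal of the label
    nz = joint > 0  # skip empty cells, whose contribution is zero
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz])))


def uniform_discretize(x, n_bins):
    """Equal-width binning (an assumed stand-in quantizer); returns bin indices."""
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    return np.digitize(x, edges[1:-1])


def pick_bits_by_mi(x, y, max_bits=4, tol=1e-9):
    """Return (bits, mi): the fewest bits whose binning gives the highest MI."""
    best_bits, best_mi = 0, -1.0
    for bits in range(1, max_bits + 1):
        mi = mutual_information(uniform_discretize(x, 2 ** bits), y)
        if mi > best_mi + tol:  # accept more bits only if MI really improves
            best_bits, best_mi = bits, mi
    return best_bits, best_mi


# Toy data: one feature that separates two classes at 0.5.
x = np.array([0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
bits, mi = pick_bits_by_mi(x, y)
```

On this toy data a single bit (two intervals) already captures all the information the feature carries about the label, so finer binning is rejected. This mirrors the abstract's point about using fewer discretization intervals, under the simplifying assumptions stated above.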