Union k-Fold Feature Selection on Microarray Data

Ferreira, A. ; Figueiredo, M. A. T.

Union k-Fold Feature Selection on Microarray Data, Proc INSTICC International Conference on Data Science, Technology and Applications (DATA), Rome, Italy, pp. 540-547, July 2023.

Digital Object Identifier: 10.5220/0012135800003541


Cancer detection from microarray data is an important problem for machine learning techniques.
This type of data poses many challenges, chiefly because it usually has a large
number of features (genes) and a small number of instances (patients). Moreover, it is important to characterize
which genes are the most important for a given classification task, which makes the classification explainable.
In this paper, we propose a feature selection approach for microarray data that extends the
recently proposed k-fold feature selection algorithm. We propose taking the union of the feature subspaces
found independently by two feature selection filters, each of which has been proven to be adequate for this
type of data. The experimental results show that, in some cases, the union of the feature subsets found by
each filter produces better results than either filter used on its own, while yielding human-manageable subsets of features.
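
The union step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the two filters, so a variance filter and a Fisher-ratio filter are assumed here purely for illustration, and the k-fold procedure of the underlying algorithm is not reproduced. The function names and the parameter `m` (number of features kept per filter) are hypothetical.

```python
import numpy as np

def variance_scores(X):
    # Assumed filter 1 (unsupervised): per-feature variance.
    return X.var(axis=0)

def fisher_scores(X, y):
    # Assumed filter 2 (supervised): Fisher ratio, i.e. between-class
    # scatter divided by within-class scatter, computed per feature.
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += Xc.shape[0] * (Xc.mean(axis=0) - overall_mean) ** 2
        within += Xc.shape[0] * Xc.var(axis=0)
    # Small constant avoids division by zero for constant features.
    return between / (within + 1e-12)

def top_m(scores, m):
    # Indices of the m highest-scoring features, as a set.
    return set(np.argsort(scores)[::-1][:m].tolist())

def union_selection(X, y, m):
    # Run each filter independently, then take the union of the two subsets.
    selected_a = top_m(variance_scores(X), m)
    selected_b = top_m(fisher_scores(X, y), m)
    return sorted(selected_a | selected_b)
```

Because each filter contributes at most `m` features, the union holds between `m` and `2m` features, so the selected subset stays small even when the two filters disagree completely.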