A Step Towards the Explainability of Microarray Data for Cancer Diagnosis with Machine Learning Techniques
Nogueira, A.; Ferreira, A.; Figueiredo, M. A. T.
A Step Towards the Explainability of Microarray Data for Cancer Diagnosis with Machine Learning Techniques, Proc. International Conf. on Pattern Recognition Applications and Methods - ICPRAM, Conference Online, pp. 362-369, February, 2022.
Digital Object Identifier: 10.5220/0010980100003122
Abstract
Detecting diseases, such as cancer, from gene expression data has assumed great importance and is a
very active area of research. Today, many gene expression datasets are publicly available, which consist of
microarray data with information on the activation (or not) of thousands of genes, in sets of patients that have
(or not) a certain disease. These datasets consist of high-dimensional feature vectors (very large numbers of
genes), which raises difficulties for human analysis and interpretation with the goal of identifying the most
relevant genes for detecting the presence of a particular disease. In this paper, we propose to take a step towards
the explainability of these disease detection methods, by applying feature discretization and feature selection
techniques. We accurately classify microarray data while substantially reducing the number of genes and
identifying small subsets of relevant ones. These small subsets are easier for human experts to interpret,
potentially providing valuable information about which genes are involved in a given disease.
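
The abstract does not name the specific discretization or selection techniques, so the following is only a minimal sketch of the general idea (discretize each gene, then keep the genes most informative about the class label). It assumes equal-width binning and mutual-information ranking, which may differ from the methods used in the paper. The function names, the bin count, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def discretize(column, n_bins=3):
    """Equal-width binning of one gene's values into integer bin indices.

    Assumption: equal-width bins; the paper may use a different discretizer.
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant gene carries no information: put everything in bin 0.
        return [0] * len(column)
    width = (hi - lo) / n_bins
    # Clamp so that the maximum value falls into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in column]


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def select_genes(X, y, k=1, n_bins=3):
    """Rank genes by mutual information with the label and keep the top k.

    X is a list of samples (rows), each a list of expression values (genes).
    Returns the selected gene indices and the score of every gene.
    """
    columns = list(zip(*X))
    scores = [mutual_information(discretize(col, n_bins), y) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy data (made up): 4 patients, 3 genes; only gene 0 tracks the label.
X = [
    [0.1, 5.0, 2.0],
    [0.2, 5.1, 9.0],
    [0.9, 5.0, 2.1],
    [1.0, 5.1, 8.9],
]
y = [0, 0, 1, 1]

selected, scores = select_genes(X, y, k=1)
print("selected gene indices:", selected)
print("scores (bits):", scores)
```

On real microarray data the same pattern would reduce thousands of genes to a short ranked list that a human expert can inspect, and a classifier would then be trained on the selected genes only.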