University of Padua

Feature selection and molecular classification of cancer phenotypes: a comparative study

Cimetta, Elisa and Zanella, Luca and Bezzo, Fabrizio and Facco, Pierantonio (2022) Feature selection and molecular classification of cancer phenotypes: a comparative study. [Data Collection]

This is the latest version of this item.

Collection description

Classification of high-dimensional gene expression data is key to the development of effective diagnostic and prognostic tools. Feature selection involves finding the subset of features with the highest power in predicting class labels. We conducted a comparative study of different combinations of feature selectors (Chi-Squared, mRMR, Relief-F, Genetic Algorithms) and classification learning algorithms (Random Forests, PLS-DA, SVM, Regularized Logistic/Multinomial Regression, kNN) to identify those with the best predictive capacity. The performance of each combination is evaluated through an empirical study on three benchmark cancer-related microarray datasets. Our results first suggest that the quality of the data relevant to the target classes is key for the successful classification of cancer phenotypes. We also showed that, for a given classification learning algorithm and dataset, all filters perform similarly. Interestingly, filters achieve comparable or even better results than the GA-based wrappers, while also being easier to implement and faster. Taken together, our findings suggest that simple, well-established feature selectors in combination with optimized classifiers guarantee good performance, with no need for complicated and computationally demanding methodologies.
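
As a rough illustration of the kind of filter-plus-classifier combination compared in the study, the minimal sketch below pairs a chi-squared filter with a kNN classifier. It is not the authors' code. The median binarization of each feature, the synthetic toy data, the choice of 3 selected features and 3 neighbours, and the leave-one-out evaluation are all assumptions made for the example. The filter is refit inside every fold so that the held-out sample never influences which features are kept.

```python
import math
import random
from collections import Counter
from statistics import median


def chi2_score(values, labels):
    """Chi-squared statistic between a median-binarized feature and the labels."""
    cut = median(values)
    bins = [v > cut for v in values]
    n = len(values)
    bin_counts = Counter(bins)
    class_counts = Counter(labels)
    observed = Counter(zip(bins, labels))
    score = 0.0
    for b, nb in bin_counts.items():
        for c, nc in class_counts.items():
            expected = nb * nc / n
            score += (observed[(b, c)] - expected) ** 2 / expected
    return score


def select_top_k(X, y, k):
    """Indices of the k features with the highest chi-squared score."""
    n_features = len(X[0])
    scores = [chi2_score([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def project(row, keep):
    """Keep only the selected feature columns of one sample."""
    return [row[j] for j in keep]


def knn_predict(X_train, y_train, x, n_neighbors=3):
    """Majority vote among the nearest training samples (Euclidean distance)."""
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    votes = Counter(y_train[i] for i in order[:n_neighbors])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, k, n_neighbors=3):
    """Leave-one-out accuracy; the filter is refit on each training fold."""
    hits = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        keep = select_top_k(X_tr, y_tr, k)
        X_tr_small = [project(row, keep) for row in X_tr]
        pred = knn_predict(X_tr_small, y_tr, project(X[i], keep), n_neighbors)
        hits += pred == y[i]
    return hits / len(X)


# Synthetic toy data (NOT the study's datasets): 40 samples, 50 "genes",
# of which only the first 3 differ between the two classes.
rng = random.Random(0)
y = [i % 2 for i in range(40)]
X = [[rng.gauss(4.0 * label if j < 3 else 0.0, 1.0) for j in range(50)]
     for label in y]

selected = select_top_k(X, y, k=3)
accuracy = loo_accuracy(X, y, k=3)
print("selected features:", sorted(selected))
print("leave-one-out accuracy:", accuracy)
```

Swapping `knn_predict` for another classifier, or `chi2_score` for another scoring function, gives other filter-classifier pairs in the same spirit. A wrapper method such as a genetic algorithm would instead search over feature subsets using the classifier's own performance as the objective.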

DOI: 10.25430/researchdata.cab.unipd.it.00000679
Keywords: feature selection; classification; learning algorithm; cancer; gene expression
Subjects: Physical Sciences and Engineering > Computer Science and Informatics: Informatics and information systems, computer science, scientific computing, intelligent systems > Bioinformatics, biocomputing, and DNA and molecular computation
Physical Sciences and Engineering > Products and Processes Engineering: Product design, process design and control, construction methods, civil engineering, energy processes, material engineering > Chemical engineering, technical chemistry
Physical Sciences and Engineering > Products and Processes Engineering: Product design, process design and control, construction methods, civil engineering, energy processes, material engineering > Computational engineering
Department: Departments > Dipartimento di Ingegneria industriale (DII)
Depositing User: Elisa Cimetta
Date Deposited: 02 May 2023 06:29
Last Modified: 02 May 2023 06:29
Creators/Authors:
Creator | Email | ORCID
Cimetta, Elisa | elisa.cimetta@unipd.it | orcid.org/0000-0002-8632-3999
Zanella, Luca | luca.zanella.7@phd.unipd.it | UNSPECIFIED
Bezzo, Fabrizio | fabrizio.bezzo@unipd.it | orcid.org/0000-0003-1561-0584
Facco, Pierantonio | pierantonio.facco@unipd.it | orcid.org/0000-0001-8383-6783
Type of data: Image
Research funder: ERC
Research project title: MICRONEX
Grant number: ERC-StG UERI17
Collection period: From 2019 to 2022
Resource language: English
Metadata language: English
Publisher: Research Data Unipd
Date: 1 August 2022
Copyright holders: The Author
URI: https://researchdata.cab.unipd.it/id/eprint/874
