A highly accurate model for screening prostate cancer using propensity index panel of ten genes

S Jain, KPK Malhotra, S Patiyal, GPS Raghava
Journal of Computational Biology, 2023. liebertpub.com
Abstract
Prostate-specific antigen (PSA) is a key biomarker commonly used to screen patients for prostate cancer. A significant number of unnecessary biopsies are performed every year because of the poor accuracy of PSA-based biomarkers. In this study, we aim to identify alternative biomarkers based on gene expression that can be used to screen for prostate cancer with higher accuracy. Our proposed machine learning model was trained and then tested on gene expression profiles of 500 prostate cancer and 51 normal samples, split in a 70:30 ratio. Several feature selection techniques were used to identify potential biomarkers. The identified genes were then used to develop various machine learning models for distinguishing prostate cancer samples from healthy controls. Our logistic regression-based model achieved the highest area under the curve (AUC) of 0.91, with an accuracy of 82.42% on the validation dataset. We introduced a new approach called the propensity index, in which the expression of a gene is converted into a propensity. This propensity-based approach significantly improved the performance of the classification models, achieving an AUC of 0.99 with an accuracy of 96.36% on the validation dataset. We also identified and ranked biomarker genes that can be used to distinguish prostate cancer patients from healthy individuals with high accuracy. Single-gene biomarkers were observed to achieve an accuracy of only around 90%. The best performance in this study was achieved with a panel of 10 genes, using a random forest model built on the propensity index. With rapid advancement, we hope that our proposed gene panel will be implemented for identifying and screening prostate cancer, avoiding biopsy procedures.
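The evaluation protocol described in the abstract (a 70:30 split of 500 cancer and 51 normal samples, scored by AUC) can be illustrated with a minimal Python sketch. This is not the authors' code. The helper names `stratified_split` and `roc_auc`, the per-class (stratified) splitting, the random seeds, and the synthetic single-gene expression values are all assumptions made for illustration. The abstract does not define the propensity index, so no attempt is made to reproduce it here.

```python
import random


def stratified_split(labels, train_frac=0.7, seed=0):
    """Split sample indices into train/test, preserving the class ratio.

    Hypothetical helper: the abstract states a 70:30 ratio but does not
    say whether the split was stratified; that is assumed here.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Class sizes taken from the abstract: 500 cancer (1) and 51 normal (0).
labels = [1] * 500 + [0] * 51
train_idx, test_idx = stratified_split(labels)

# Synthetic expression values for one hypothetical gene (NOT real data):
# cancer samples are drawn with a slightly higher mean than normal samples.
rng = random.Random(1)
expression = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

# Score the held-out samples using the raw expression value as the score.
test_auc = roc_auc(
    [labels[i] for i in test_idx],
    [expression[i] for i in test_idx],
)
print(len(train_idx), len(test_idx), round(test_auc, 3))
```

Under these assumptions the split yields 386 training and 165 held-out samples. Because only about 15 normal samples land in the held-out set, an AUC estimated on it is sensitive to how the few normal samples are assigned, which is one reason to report AUC alongside accuracy for a class-imbalanced dataset like this one.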
1. BACKGROUND
Prostate adenocarcinoma (PRAD) is the second most prevalent cancer diagnosed in men worldwide (Rawla, 2019). With recent advancements in health care, patients with prostate cancer undergo clinical biopsy procedures for proper diagnosis and management of the disease. A better understanding of the molecular mechanisms responsible for the onset of prostate carcinogenesis would help in exploring novel therapeutic methods.
Mary Ann Liebert