Abstract:
Objective Relapse is responsible for the majority of breast cancer deaths. Molecular markers related to relapse are helpful for the diagnosis and treatment of breast cancer.
Methods Two breast cancer microarray datasets, GSE1456 and GSE2034, from the GEO database were analyzed with BRB-ArrayTools. From the differentially expressed genes related to relapse, genes significantly associated with survival were identified by univariate analysis and the Cox proportional hazards model. These genes were used as candidate genes to predict disease-specific survival in GSE1456. Leave-one-out cross-validation was used to compute the misclassification rate. Prediction performance was assessed with a receiver operating characteristic (ROC) curve.
Results Twenty-nine genes were used as a signature to predict disease-specific survival in breast cancer. The area under the ROC curve was 0.803. The cross-validation values for all 29 genes were higher than 96%. The method gave a satisfactory classification result. Gene annotation analysis showed that these genes were associated with cell cycle, cell proliferation, DNA repair, cell motility, and adhesion.
Conclusion The analysis of gene expression profiles may offer new insight into the pathogenesis of breast cancer and may aid molecular diagnosis and individualized therapy.
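
The validation steps named in the Methods (leave-one-out cross-validation to estimate the misclassification rate, then an ROC curve to assess prediction) can be illustrated with a minimal sketch. This is not the BRB-ArrayTools procedure: the abstract does not say which classifier was used, so a nearest-centroid rule is assumed here as a stand-in, and the expression values and labels are made up for illustration rather than taken from GSE1456. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive sample scores higher than a random negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid(rows: List[Sequence[float]]) -> List[float]:
    """Per-gene mean expression over a group of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loocv_nearest_centroid(
    X: List[Sequence[float]], y: Sequence[int]
) -> Tuple[float, List[float]]:
    """Leave-one-out cross-validation with an assumed nearest-centroid rule.

    Each sample is held out in turn, class centroids are computed from the
    remaining samples, and the held-out sample is scored. Returns the
    misclassification rate and the held-out scores (higher = more class 1).
    """
    scores: List[float] = []
    errors = 0
    for i in range(len(X)):
        train = [(x, lab) for j, (x, lab) in enumerate(zip(X, y)) if j != i]
        c0 = centroid([x for x, lab in train if lab == 0])
        c1 = centroid([x for x, lab in train if lab == 1])
        # Positive score means the held-out sample is closer to the class 1 centroid.
        score = sq_dist(X[i], c0) - sq_dist(X[i], c1)
        predicted = 1 if score > 0 else 0
        errors += int(predicted != y[i])
        scores.append(score)
    return errors / len(X), scores


# Made-up toy data: 6 samples x 2 genes; label 1 = relapse, 0 = no relapse.
X_toy = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.2], [3.0, 0.5], [3.2, 0.4], [2.8, 0.7]]
y_toy = [0, 0, 0, 1, 1, 1]

error_rate, held_out_scores = loocv_nearest_centroid(X_toy, y_toy)
auc = roc_auc(y_toy, held_out_scores)
```

Because the AUC is computed only from held-out scores, it is not inflated by scoring samples the classifier was trained on. The toy classes are deliberately well separated, so the numbers this sketch produces say nothing about the 0.803 reported in the Results.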