Abstract:
Objective To identify a molecular signature for early detection of breast cancer (BC) by analyzing gene expression profiles in the peripheral blood of BC patients and healthy controls. Methods The GSE11545 dataset from the GEO database was used as the training cohort. Genes differentially expressed between BC and healthy samples were identified with BRB-ArrayTools software. These genes were then used as candidate genes for class prediction in the validation cohort, GSE27562, with four methods: compound covariate predictor, diagonal linear discriminant analysis, 3-nearest neighbors, and support vector machine. Only genes that differed significantly between the classes at the 0.001 significance level were used for class prediction. Leave-one-out cross-validation was used to compute the misclassification rate. Prediction performance was assessed with the receiver operating characteristic (ROC) curve. Results Sixty-one differentially expressed genes were obtained from the training cohort. A 39-gene classifier was used to predict the validation cohort. Classification accuracy reached or exceeded 80% with all four methods. The area under the ROC curve was 0.925. The methods showed satisfactory classification results. Conclusion Microarray analysis is an effective method for screening gene signatures in the peripheral blood of BC patients. It may provide a new method for diagnosing breast cancer at an early stage.
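The evaluation scheme named in the Methods (leave-one-out cross-validation of a 3-nearest-neighbors classifier, summarized by misclassification rate and area under the ROC curve) can be illustrated with a minimal sketch. This is not the BRB-ArrayTools implementation and it does not use the GSE11545 or GSE27562 data: the expression values are synthetic, and every function name and parameter (`make_toy_data`, `shift`, `n_genes`, and so on) is a hypothetical choice made for illustration. The gene-selection step at the 0.001 significance level is omitted for brevity.

```python
import random


def knn_predict(train_x, train_y, query, k=3):
    """Return the fraction of the k nearest training samples labelled 1."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    return sum(y for _, y in dists[:k]) / k


def loocv_scores(xs, ys, k=3):
    """Leave-one-out CV: score each sample with a model fit on all the others."""
    scores = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        scores.append(knn_predict(train_x, train_y, xs[i], k))
    return scores


def misclassification_rate(scores, ys, threshold=0.5):
    """Fraction of samples whose thresholded score disagrees with the label."""
    errors = sum((s > threshold) != bool(y) for s, y in zip(scores, ys))
    return errors / len(ys)


def roc_auc(scores, ys):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, ys) if y == 1]
    neg = [s for s, y in zip(scores, ys) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_toy_data(n_per_class=20, n_genes=5, shift=1.5, seed=0):
    """Synthetic 'expression' values: class 1 is shifted upward in every gene."""
    rng = random.Random(seed)
    xs, ys = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            xs.append([rng.gauss(label * shift, 1.0) for _ in range(n_genes)])
            ys.append(label)
    return xs, ys


# Synthetic stand-in for a (samples x genes) matrix; 0 = healthy, 1 = BC.
xs, ys = make_toy_data()
scores = loocv_scores(xs, ys)
error_rate = misclassification_rate(scores, ys)
auc = roc_auc(scores, ys)
print(f"LOOCV misclassification rate: {error_rate:.3f}, AUC: {auc:.3f}")
```

With k = 3, each score is the share of the three nearest neighbors labelled BC, so thresholding at 0.5 is a majority vote. In a real analysis, any gene selection would normally be repeated inside each cross-validation fold so that the held-out sample does not influence which genes are chosen.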