
Single-Cell and Machine Learning-Based Identification of Epithelial Subsets and Prognostic Modeling in Triple-Negative Breast Cancer

  • Abstract:
    Objective To investigate the heterogeneity and key molecular features of epithelial cells in triple-negative breast cancer (TNBC), identify prognosis-related biomarkers, and build a robust survival prediction model.
    Methods Using TNBC single-cell transcriptomic data, epithelial cells were extracted, normalized, and clustered into subtypes, and their molecular features and functional differences were characterized. High-dimensional weighted gene co-expression network analysis (hdWGCNA) was used to construct co-expression modules in epithelial cells, and multiple machine learning algorithms were combined to select key prognostic genes and build a risk-score model. Model performance was evaluated with receiver operating characteristic (ROC) curves and Kaplan-Meier (K-M) survival analysis. The immune microenvironment features and potential drug-response differences of the high- and low-risk groups were then assessed, and PCR was used to validate the expression differences of the key genes between tumor and normal tissues.
    Results The composition and molecular features of TNBC epithelial subpopulations were characterized, and a TNBC-associated epithelial subset was identified. Combining hdWGCNA with machine learning yielded 10 key genes, and the prognostic model built on them separated patients into distinct survival-risk groups and showed good predictive performance in ROC and K-M analyses. Immune analysis showed that the high- and low-risk groups differed in the infiltration levels of 7 immune cell types and in immune function-related features. Drug-sensitivity analysis suggested potential differences in response to 8 drugs between the risk groups. PCR confirmed the differential expression of the 10 key genes between tumor and normal tissues.
    Conclusion This study reveals TNBC epithelial cell heterogeneity at single-cell resolution and establishes a 10-gene prognostic model that can be used for risk stratification of TNBC patients and for evaluating immune characteristics and potential treatment strategies.
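
The abstract does not give the model's formula, the gene names, or the cutoff used to define the risk groups. As a rough illustration only, the sketch below assumes a linear risk score (a weighted sum of signature-gene expression), a median split into high- and low-risk groups, a product-limit (Kaplan-Meier) survival estimate, and a rank-based ROC AUC. The gene names, coefficients, and all function names are hypothetical placeholders, not the authors' implementation.

```python
from typing import Dict, List, Sequence, Tuple


def risk_score(expression: Dict[str, float], coefficients: Dict[str, float]) -> float:
    """Linear risk score: sum of coefficient * expression over the signature genes.

    Assumption: the study's actual scoring formula is not stated in the abstract.
    """
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_median(scores: Sequence[float]) -> List[str]:
    """Label a sample 'high' if its score exceeds the cohort median, else 'low'.

    Assumption: a median cutoff; the abstract does not state the cutoff used.
    """
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    return ["high" if s > median else "low" for s in scores]


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit estimator: returns (event_time, survival_probability) pairs.

    events[i] is 1 for an observed death and 0 for a censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations that share the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical two-gene signature with made-up coefficients, for illustration only.
example_coefficients = {"GENE_A": 0.5, "GENE_B": -0.25}
example_score = risk_score({"GENE_A": 2.0, "GENE_B": 4.0}, example_coefficients)
```

In practice the K-M curves would be computed separately for the high- and low-risk groups and compared (for example with a log-rank test), and established survival-analysis packages would normally replace these hand-written helpers.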

     
