Identifying risk factors for poor prognosis and developing prognostic model in patients achieving pathological complete response after neoadjuvant therapy for breast cancer
Original Article

Xixi Lin1,2#, Shenkangle Wang1,2,3#, Ziyu Zhu1,2#, Zijie Guo1,2, Mingpeng Luo4, Qiong Ding5, Linbo Wang1,2, Jichun Zhou1,2

1Department of Surgical Oncology, Affiliated Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China; 2Biomedical Research Center and Key Laboratory of Biotherapy of Zhejiang Province, Hangzhou, China; 3Department of Radiation Oncology, Affiliated Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou, China; 4First Clinical Medical College of Zhejiang Chinese Medical University, Hangzhou, China; 5Department of General Surgery, Zhejiang Putuo Hospital, Zhoushan, China

Contributions: (I) Conception and design: X Lin, J Zhou, L Wang, Q Ding; (II) Administrative support: J Zhou, L Wang, Q Ding; (III) Provision of study materials or patients: None; (IV) Collection and assembly of data: X Lin, S Wang, Z Zhu, Z Guo, M Luo; (V) Data analysis and interpretation: X Lin, S Wang, Z Zhu, Z Guo, M Luo; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work.

Correspondence to: Jichun Zhou, MD, PhD; Linbo Wang, MD, PhD. Department of Surgical Oncology, Affiliated Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, No. 3 Eastern Qingchun Road, Hangzhou 310016, China; Biomedical Research Center and Key Laboratory of Biotherapy of Zhejiang Province, Hangzhou 310016, China. Email: jichun-zhou@zju.edu.cn; linbowang@zju.edu.cn. Qiong Ding, MD. Department of General Surgery, Zhejiang Putuo Hospital, No.19 Wenkang Road, Putuo district, Zhoushan 316000, China. Email: wyydljdq@163.com.

Background: A subset of breast cancer patients who achieved pathological complete response (pCR) after neoadjuvant therapy (NAT) still experience poor outcomes, including recurrence, metastasis, and death. This study aims to identify risk factors for adverse outcomes in pCR patients, construct predictive models, elucidate molecular subtype-specific prognostic determinants, and explore the peaks of death and progression events among different subtypes.

Methods: Female patients in the Surveillance, Epidemiology, and End Results (SEER) database who received NAT and achieved pCR were enrolled. Cox regression analyses were used to identify independent prognostic factors for overall survival (OS) and event-free survival (EFS), and nomograms and a random survival forest (RSF) machine learning model were developed to predict the prognosis of patients with pCR. Subgroup analysis was performed to clarify molecular subtype heterogeneity, and survival sequential analysis was conducted to identify the peaks of death and progression events.

Results: Analyses based on SEER data identified age, T stage, N stage, molecular subtype, histological tumor type, surgical approach, and histological grade as independent predictors of OS [concordance index (C-index) =0.723; 3-year area under the curve (AUC) =0.707], while EFS predictors included age, T stage, N stage, molecular subtype, histological tumor type, and grade (C-index =0.682; 3-year AUC =0.690). The C-indices of the OS and EFS nomograms were 0.723 (3-year AUC =0.711) and 0.682 (3-year AUC =0.691), respectively. The RSF model for mortality risk achieved a C-index of 0.721 (3-year AUC =0.73). Prognostic factors varied across molecular subtypes, though T/N stage was a common determinant. In survival sequential analysis, death events peaked at 36 months [triple-negative breast cancer (TNBC)], 114 months (Luminal), and 97 months [human epidermal growth factor receptor 2 (HER2)-positive subtype], while progression events peaked at 111 months (TNBC), 114 months (Luminal), and 84 months (HER2-positive subtype).

Conclusions: This study systematically revealed key clinicopathological factors influencing prognosis of pCR patients receiving NAT: tumor burden (T/N stage) emerged as a universal risk factor across molecular subtypes. Survival sequential analysis highlights subtype-specific surveillance priorities: intensified monitoring within 3 years for TNBC, focused follow-up at 7–8 years for HER2-positive subtype, and extended tracking for Luminal subtypes. Both nomograms and the RSF model demonstrated robust predictive performance, providing theoretical and practical tools for precision prognosis management in breast cancer.

Keywords: Breast cancer; neoadjuvant therapy (NAT); pathological complete response (pCR); Surveillance, Epidemiology, and End Results database (SEER database); nomogram


Submitted Apr 30, 2025. Accepted for publication Sep 04, 2025. Published online Nov 25, 2025.

doi: 10.21037/gs-2025-181


Highlight box

Key findings

• Tumor burden (T/N stage) emerged as a universal risk factor for prognosis in pathological complete response (pCR) patients across molecular subtypes.

What is known and what is new?

• Few studies have focused on identifying risk factors influencing the prognosis of pCR patients following neoadjuvant therapy. This study pioneers a comprehensive investigation into the prognostic factors for pCR patients within a large-scale cohort (n=8,459). We subsequently developed nomograms and machine learning algorithms to enhance prognostic prediction.

What is the implication, and what should change now?

• Our findings provide robust analytical evidence that tumor burden retains significant prognostic impact even in patients achieving pCR. Notably, prognostic heterogeneity was observed across molecular subtypes, with different predictive factors identified through subgroup analysis. Together, these findings suggest that strengthened systemic therapy should be considered for pCR patients at high risk of poor prognosis.


Introduction

Breast cancer is the most common malignant tumor among women globally, ranking first in both incidence and mortality among female cancers. According to World Health Organization (WHO) statistics for 2022, there were more than 2.3 million new cases of breast cancer, accounting for 11.6% of all cancer cases, and about 666,000 deaths, accounting for 6.9% of cancer-related deaths (1).

Neoadjuvant therapy (NAT) is an important therapeutic strategy for patients with locally advanced breast cancer or early-stage breast cancer with high-risk factors. It aims to reduce tumor burden and clinical stage through preoperative systemic therapy, thereby improving the feasibility of surgery and the rate of breast-conserving surgery, and it provides information on the sensitivity of the tumor to drugs. According to the 2024 National Comprehensive Cancer Network (NCCN) guidelines (2), NAT is mainly indicated for inoperable patients (inflammatory breast cancer; large tumor or cN2; cN3; cT4), with the aim of downstaging the disease to an operable state. NAT can also be chosen preoperatively for patients with triple-negative breast cancer (TNBC) or human epidermal growth factor receptor 2 (HER2)-positive breast cancer at stage cT2N0 and above or with clinically positive lymph nodes, and for patients with a large tumor relative to breast volume who have a strong desire to undergo breast-conserving surgery. In addition, NAT can help clinicians obtain information about tumor sensitivity to therapy (3), thus guiding subsequent postoperative adjuvant therapy (including chemotherapy, targeted therapy, etc.). In recent years, with the rapid development of targeted therapy and immunotherapy, the efficacy of NAT has improved significantly, especially for HER2-positive (3) and TNBC (4) patients.

Pathological complete response (pCR) is an important indicator of the efficacy of NAT. The definition of pCR has not been fully standardized across clinical trials and expert consensus statements, which may lead to heterogeneous study results and complicate clinical interpretation. Based on the post-NAT tumor-node-metastasis (ypTNM) system published by the American Joint Committee on Cancer (AJCC), the U.S. Food and Drug Administration (FDA) published the Collaborative Trials in Neoadjuvant Breast Cancer (CTNeoBC) analysis (5), which included 12 international NAT trials such as NSABP B-27. CTNeoBC found that patients with no residual lesions in both the breast and the regional lymph nodes (ypT0/Tis ypN0) had better survival than patients in whom only the breast lesions had disappeared (ypT0/Tis). Studies have shown that pCR is significantly associated with disease-free survival (DFS) and overall survival (OS) (5). However, pCR rates differ markedly among molecular subtypes: TNBC and HER2-positive breast cancers have higher pCR rates of about 40–60% and 30–50%, respectively, whereas luminal breast cancers have pCR rates of approximately 10–20% (6). Nevertheless, some pCR patients still experience recurrence, metastasis, and death. Studies have shown that about 10–20% of patients with pCR develop disease recurrence or metastasis within 10 years (7), suggesting that patients who achieve pCR after NAT may still have poor outcomes.

Currently, only a few studies have explored the risk factors for poor prognosis after pCR. Studies have shown that higher tumor burden (cT, cN, cStage), molecular subtype, HER2 status, residual in situ lesions, and the presence of circulating tumor DNA (ctDNA) may be associated with poor prognosis after pCR (8-11). In addition, the role of the immune microenvironment in the neoadjuvant treatment of breast cancer has received increasing attention. The presence of tumor-infiltrating lymphocytes (TILs) is closely associated with pCR rate and long-term prognosis (12). However, existing studies have been limited in scale; systematic multicenter analyses with larger cohorts, and predictive models built on them, are still needed. In addition, to our knowledge, no study has used survival sequential analysis to look for patterns in the timing of adverse prognostic events in pCR patients.

Therefore, an in-depth study of the risk factors for poor prognosis after pCR is of great significance for optimizing individualized treatment strategies and improving the long-term survival of patients. This retrospective cohort analysis aims to identify independent risk factors and establish prediction models for poor prognosis in pCR patients by combining regression analysis, survival analysis, nomograms, and a machine learning prediction model, and to examine heterogeneity across molecular subtypes and the timing of adverse prognostic events. The results are expected to provide clinicians with more accurate prognostic assessment tools and theoretical support for individualized treatment and follow-up strategies for high-risk patients. We present this article in accordance with the TRIPOD reporting checklist (available at https://gs.amegroups.com/article/view/10.21037/gs-2025-181/rc).


Methods

Study populations

This study used 12 databases in SEER*Stat version 8.4.4: the SEER Research Data (8 registries, 1975–2019), the SEER Research Data (12 registries, 1992–2019), the SEER Research Data (17 registries, 2000–2019), the SEER Research Data (8 registries, 1975–2020), the SEER Research Data (12 registries, 1992–2020), the SEER Research Data (17 registries, 2000–2020), the SEER Research Data (8 registries, 1975–2021), the SEER Research Data (12 registries, 1992–2021), the SEER Research Data (17 registries, 2000–2021), the SEER Research Plus Data (8 registries, 1975–2019), the SEER Research Plus Data (12 registries, 1992–2019), and the SEER Research Plus Data (17 registries, 2000–2019). From these 12 databases, we downloaded all records of patients with unilateral, pathologically confirmed female breast tumors with a known quadrant site who received NAT. After integrating this information, we excluded duplicate records and screened patients as shown in Figure 1: (I) Exclude all records of patients whose multiple records violated the quality rules: for the same patient ID, the order of age at diagnosis, year of diagnosis, and follow-up time must not conflict; the vital status across records must be consistent; and the pathological record must match the T record (in situ cancer should be recorded as Tis, and invasive cancer should not be recorded as Tis). (II) Exclude all records of patients whose first record was diagnosed as in situ disease. (III) Exclude all records of patients with a history of malignant tumors. (IV) According to the ICD-O-3 codes for breast cancer recorded in “Pathology of Breast Tumors” (published in 2008, edited by Li Fu and Xilin Fu), exclude all records of patients whose ICD-O-3 codes are not among the following: 8500/3, 8022/3, 8035/3, 8520/3, 8524/3, 8211/3, 8201/3, 8510/3, 8480/3, 8470/3, 8490/3, 8041/3, 8042/3, 8013/3, 8503/3, 8507/3, 8401/3, 8575/3, 8070/3, 8071/3, 8074/3, 8075/3, 8572/3, 8560. (V) Exclude all records of patients with stage IV or unknown stage (unknown T stage or N stage). (VI) Exclude all records of patients who died or experienced ipsilateral breast tumor recurrence (IBTR) or contralateral breast tumor recurrence within 6 months, and of patients with a follow-up time ≤6 months. (VII) Exclude all records of patients who did not undergo surgery after NAT. (VIII) Consistent with previous research (13), exclude all records of patients with ‘response to neoadjuvant therapy’ recorded as ‘unknown’; patients with records of ‘no response’ or ‘partial response’ were categorized as ‘non-pCR’. The final cohort comprised 8,459 breast cancer patients who achieved pCR after NAT. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Figure 1 Inclusion criteria of pCR patients in SEER database. ID, identification; n-pCR, non-pathological complete response; pCR, pathological complete response; QC, quality control; SEER, Surveillance, Epidemiology, and End Results; T, tumor; TNM, tumor-node-metastasis.

In the SEER-based cohort analysis of pCR patients, breast cancer patients were categorized by hormone receptor (HR) and HER2 expression into the Luminal subtype (HR-positive/HER2-negative), the HER2-positive subtype (HR-positive/HER2-positive or HR-negative/HER2-positive), and the TNBC subtype (HR-negative/HER2-negative). Progression events were defined as ipsilateral or contralateral breast tumor recurrence (identified by multiple records in the database), distant metastasis, and death. Follow-up time was defined using the database “Survival month” entry. Event-free survival time (EFST) was defined as the time from the first record to the occurrence of a progression event (in the case of ipsilateral or contralateral breast recurrence, EFST was the difference in follow-up time between the two records). Because the database does not record the time to metastasis and distant metastasis occurred in only a few patients, follow-up time was used as EFST for patients with metastasis.
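The EFST rule described above can be illustrated with a minimal Python sketch. This is not the authors' code (the study was run in R). The record layout and all field names (`first_survival_months`, `recurrence_survival_months`, `metastasis`, `dead`) are hypothetical, and the sketch assumes that "the difference in follow-up time between the two records" means the survival months on the first record minus those on the recurrence record.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PatientRecord:
    # All field names are hypothetical; they are not SEER variable names.
    first_survival_months: int                        # "Survival month" on the first record
    recurrence_survival_months: Optional[int] = None  # same entry on a later breast-tumor record
    metastasis: bool = False
    dead: bool = False


def event_free_survival(p: PatientRecord) -> Tuple[int, int]:
    """Return (EFST in months, event flag) under the rule described in the text."""
    if p.recurrence_survival_months is not None:
        # Recurrence: EFST is the gap between the follow-up times of the two records.
        return p.first_survival_months - p.recurrence_survival_months, 1
    if p.metastasis or p.dead:
        # No time to metastasis is recorded, so follow-up time stands in for EFST.
        return p.first_survival_months, 1
    # No progression event: censored at the end of follow-up.
    return p.first_survival_months, 0
```

A patient with 60 months of follow-up on the first record and 20 months on a recurrence record would therefore contribute an EFST of 40 months with an event, while a patient with no event is censored at the full follow-up time.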

Statistical analyses

Clinical characteristics description and univariate variables analysis

All analyses in the present study were performed in R. Clinical characteristics of the included patients were summarized with the ‘tableone’ package. Categorical variables were expressed as absolute frequencies and proportions and compared with the Chi-squared test. Continuous variables that met the normal distribution were expressed as mean ± standard deviation (SD) and compared with Student’s t-test; those that did not were expressed as median and interquartile range (IQR) and compared between groups with Wilcoxon’s rank-sum test.
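As a worked illustration of the Chi-squared comparison of categorical variables, the following Python sketch tests a 2×2 table built from the surgical approach counts in Table 1. It is a sketch, not the study's R code, and it assumes that a Yates continuity correction was applied; the article does not state this.

```python
import math


def chi2_2x2_yates(a: int, b: int, c: int, d: int):
    """Chi-squared test (1 df) with Yates correction for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    expected = [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]
    observed = [a, b, c, d]
    # Assumption: continuity correction of 0.5 on each |observed - expected|.
    stat = sum(max(abs(o - e) - 0.5, 0.0) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Surgical approach counts from Table 1 (rows: BCS, mastectomy;
# columns: 5-year events-free group, 5-year events group).
stat, p_value = chi2_2x2_yates(1145, 228, 1365, 330)
print(f"chi2 = {stat:.2f}, P = {p_value:.3f}")
```

Under this assumption the result is consistent with the P value of 0.046 reported for surgical approach in Table 1.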

Kaplan-Meier (KM) curve, univariate and multivariate Cox regression analysis

In this study, we used the ‘survival’ package for KM curve plotting and for univariate and multivariate Cox regression analysis. Variables that were statistically significant in the univariate Cox regression analysis were entered into the multivariate Cox regression analysis. Independent prognostic factors were then identified from the multivariate model through the likelihood ratio test (LRT) and used as variables in nomogram construction. Hazard ratios (HRs) with 95% confidence intervals (CIs) were derived from the Cox models. The HR represents the ratio of the hazard rates between two groups. For OS, the HR reflects the relative risk of death, while for EFS, it reflects the relative risk of the composite event (progression, relapse, or death). An HR >1.0 indicates a higher risk in the group of interest compared to the reference group, whereas an HR <1.0 indicates a lower risk.
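To make the relationship between a Cox coefficient and the reported HR concrete, the Python sketch below converts a coefficient and its standard error into an HR with a 95% CI, and back-calculates the standard error from a published interval. It assumes Wald-type intervals, which the article does not state explicitly, and the function names are illustrative only.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def hazard_ratio_ci(beta: float, se: float):
    """Convert a Cox coefficient and its standard error to (HR, lower, upper)."""
    return math.exp(beta), math.exp(beta - Z_95 * se), math.exp(beta + Z_95 * se)


def se_from_ci(lower: float, upper: float) -> float:
    """Back-calculate the standard error of log(HR) from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


# Round trip with the univariate OS estimate for N3 vs. N0 in Table 2: 5.92 (4.39, 7.97).
beta = math.log(5.92)
se = se_from_ci(4.39, 7.97)
hr, lower, upper = hazard_ratio_ci(beta, se)
print(f"HR = {hr:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```

Because the published values are rounded to two decimals, the round trip can only be expected to recover the reported interval approximately.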

Nomogram

In this study, we used the ‘regplot’ package for nomogram construction after dividing the cohort into training and test datasets at a 7:3 ratio. We described the sensitivity and specificity of the nomogram model with receiver operating characteristic (ROC) curves (‘timeROC’ package) in both the training and test datasets, and evaluated and validated its accuracy with calibration curves.
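The 7:3 split and the concordance index used to evaluate the nomograms can be sketched in plain Python as follows. This is a generic illustration of Harrell's C-index, not the implementation used by the R packages named above; the function names, the fixed seed, and the handling of tied risk scores (counted as half-concordant) are assumptions of this sketch.

```python
import random
from typing import List, Sequence, Tuple


def split_indices(n: int, train_fraction: float = 0.7, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly split row indices into training and test sets (7:3 by default)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_fraction * n))
    return idx[:cut], idx[cut:]


def harrell_c_index(times: Sequence[float], events: Sequence[int], risks: Sequence[float]) -> float:
    """Fraction of comparable pairs in which the higher risk score fails earlier."""
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a pair is comparable only if the earlier time is an observed event
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # assumption: ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the C-indices reported in this article should be read.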

Random survival forest (RSF)

In this study, RSF construction was performed using the ‘randomForestSRC’ package, and the cohort was divided into a training set and a test set at a 7:3 ratio. Because of the large span of follow-up time and the limited number of deaths, follow-up time was limited to ≤60 months in the RSF model, and samples were weighted by inverse frequency to improve model performance. Based on the risk score obtained from the RSF model, the cohort was categorized into high-risk and low-risk groups, and KM curves were plotted with the ‘survival’ package to compare survival between the two groups. We characterized the sensitivity and specificity of the model with ROC curves in both the training and test sets (‘survivalROC’ package).
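The preprocessing steps around the RSF model can be sketched in Python as shown below. The article does not give the exact procedure, so each function encodes one possible reading and is labeled as an assumption: administrative censoring at 60 months, case weights inversely proportional to the frequency of each event status, and a median split of the risk score. None of this is the randomForestSRC API.

```python
from collections import Counter
from statistics import median
from typing import List, Sequence, Tuple


def censor_at(times: Sequence[float], events: Sequence[int], horizon: float = 60) -> Tuple[List[float], List[int]]:
    """Assumption: follow-up beyond the horizon is truncated and treated as censored."""
    new_times, new_events = [], []
    for t, e in zip(times, events):
        if t > horizon:
            new_times.append(horizon)
            new_events.append(0)
        else:
            new_times.append(t)
            new_events.append(e)
    return new_times, new_events


def inverse_frequency_weights(events: Sequence[int]) -> List[float]:
    """Assumption: weight each sample by n / (number of classes * class count)."""
    counts = Counter(events)
    n, k = len(events), len(counts)
    return [n / (k * counts[e]) for e in events]


def split_by_median(risk_scores: Sequence[float]) -> List[str]:
    """Label samples above the median risk score as high risk, the rest as low risk."""
    cut = median(risk_scores)
    return ["high" if r > cut else "low" for r in risk_scores]
```

With this weighting the rarer status (here, death) receives the larger weight while the weights still sum to the sample size, which is one common way to counter class imbalance.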


Results

Descriptive and comparison analysis

This study included 8,459 pCR breast cancer patients following the inclusion criteria detailed in Figure 1. As shown in Figure 2A, pCR patients had significantly better OS than non-pathological complete response (n-pCR) patients (5-year OS 93.0% vs. 81.1%, 10-year OS 85.2% vs. 70.1%, P<0.0001). As shown in Figure 2B, pCR patients also had significantly better EFS than n-pCR patients (5-year EFS 92.8% vs. 83.2%, 10-year EFS 73.3% vs. 45.3%, P<0.0001).

Figure 2 Kaplan-Meier plots for patient outcomes. (A) OS for pCR patients and n-pCR patients (P<0.001). (B) EFS for pCR patients and n-pCR patients (P<0.001). EFS, event-free survival; n-pCR, non-pathological complete response; OS, overall survival; pCR, pathological complete response.
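The 5- and 10-year survival rates quoted above are read from Kaplan-Meier curves. As a minimal sketch of how such a rate is obtained, the Python code below implements the product-limit estimator and looks up the estimate at a given time. The study itself used the R ‘survival’ package; the function names here are illustrative and no study data are reproduced.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return [(event time, survival probability)] from the product-limit estimator."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)  # still under observation just before t
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def survival_at(curve: Sequence[Tuple[float, float]], t: float) -> float:
    """Survival estimate at time t (step function, 1.0 before the first event)."""
    estimate = 1.0
    for event_time, s in curve:
        if event_time <= t:
            estimate = s
        else:
            break
    return estimate
```

For a cohort with follow-up in months, `survival_at(curve, 60)` would give the estimated 5-year rate; censored patients leave the risk set without lowering the curve.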

This study categorized the cohort into a 5-year progression group and a 5-year progression-free group based on whether progression events occurred within 5 years. After excluding patients with less than 5 years of follow-up and no progression event, 3,068 patients were included, of whom 18.2% (n=558) progressed within 5 years. As shown in Figure 3, the 5-year progression group had worse OS than the 5-year progression-free group (P<0.0001). The clinical characteristics of the two groups are compared in Table 1: age, T stage, N stage, TNM staging, molecular subtype, histological type, surgical approach, grade, and marital status at diagnosis differed significantly between the two groups. Compared with the 5-year progression-free group, the 5-year progression group was older at diagnosis (54.76 vs. 51.52 years, P<0.001), had a higher proportion of advanced T stage, particularly T3 and T4 (P<0.001), and included more patients with N+ disease (N1/N2/N3, P<0.001). Additionally, the 5-year progression group had a higher proportion of stage III disease (35.7% vs. 20.9%, P<0.001), a lower proportion of the HER2-positive subtype (32.4% vs. 39.5%), and higher proportions of the Luminal and TNBC subtypes (31.2% vs. 28.4% and 32.3% vs. 28.6%, respectively). This group also had fewer ductal cancers and more lobular cancers (92.1% vs. 95.5% and 4.5% vs. 2.2%, respectively) and a greater proportion of grade G3 tumors (8.4% vs. 2.6%). More of these patients underwent mastectomy (59.1% vs. 54.4%) rather than breast-conserving surgery (40.9% vs. 45.6%), and more had unknown marital status (88.2% vs. 87.1%). There were no significant differences in race, tumor laterality, quadrant location, or receipt of radiotherapy (P>0.05).

Figure 3 Kaplan-Meier plots for OS of 5-year event-free patients and 5-year progressed patients. OS, overall survival.

Table 1

Clinical baseline characteristics of 5-year events-free group and 5-year events group of pCR patients

Variable Overall (n=3,068) 5-year events-free group (n=2,510) 5-year events group (n=558) P value
Age (years) 52.11±12.41 51.52±12.03 54.76±13.67 <0.001
Race 0.56
   Black 446 (14.54) 363 (14.5) 83 (14.9)
   Others 304 (9.91) 247 (9.8) 57 (10.2)
   Unknown 18 (0.59) 17 (0.7) 1 (0.2)
   White 2,300 (74.97) 1,883 (75.0) 417 (74.7)
Laterality 0.63
   Right 1,454 (47.39) 1,184 (47.2) 270 (48.4)
Quadrant 0.26
   Axillary 32 (1.04) 25 (1.0) 7 (1.3)
   CEN 131 (4.27) 100 (4.0) 31 (5.6)
   LI 206 (6.71) 164 (6.5) 42 (7.5)
   LO 252 (8.21) 213 (8.5) 39 (7.0)
   Overlap 779 (25.39) 643 (25.6) 136 (24.4)
   UI 371 (12.09) 314 (12.5) 57 (10.2)
   UO 1,297 (42.28) 1,051 (41.9) 246 (44.1)
T stage <0.001
   T0 4 (0.13) 3 (0.1) 1 (0.2)
   T1 797 (25.98) 679 (27.1) 118 (21.1)
   T2 1,595 (51.99) 1,315 (52.4) 280 (50.2)
   T3 417 (13.59) 338 (13.5) 79 (14.2)
   T4 255 (8.31) 175 (7.0) 80 (14.3)
N stage <0.001
   N0 1,510 (49.22) 1,302 (51.9) 208 (37.3)
   N1 1,205 (39.28) 980 (39.0) 225 (40.3)
   N2 193 (6.29) 139 (5.5) 54 (9.7)
   N3 160 (5.22) 89 (3.5) 71 (12.7)
TNM <0.001
   I 496 (16.17) 440 (17.5) 56 (10.0)
   II 1,848 (60.23) 1,545 (61.6) 303 (54.3)
   III 724 (23.60) 525 (20.9) 199 (35.7)
Biological subtypes 0.02
   HER2-positive 1,173 (38.23) 992 (39.5) 181 (32.4)
   Luminal 888 (28.94) 714 (28.4) 174 (31.2)
   TNBC 898 (29.27) 718 (28.6) 180 (32.3)
   Unknown 109 (3.55) 86 (3.4) 23 (4.1)
Histological type 0.006
   Ductal 2,910 (94.85) 2,396 (95.5) 514 (92.1)
   Lobular 80 (2.61) 55 (2.2) 25 (4.5)
   Mucinous 2 (0.07) 2 (0.1) 0 (0.0)
   Others 76 (2.48) 57 (2.3) 19 (3.4)
Radiation 0.14
   Yes 230 (7.50) 197 (7.8) 33 (5.9)
Surgical approach 0.046
   BCS 1,373 (44.75) 1,145 (45.6) 228 (40.9)
   Mastectomy 1,695 (55.25) 1,365 (54.4) 330 (59.1)
Grade <0.001
   G1 99 (3.23) 88 (3.5) 11 (2.0)
   G2 188 (6.13) 164 (6.5) 24 (4.3)
   G3 113 (3.68) 66 (2.6) 47 (8.4)
   Unknown 2,668 (86.96) 2,192 (87.3) 476 (85.3)
Marital status 0.043
   Divorced 35 (1.14) 31 (1.2) 4 (0.7)
   Married 234 (7.63) 201 (8.0) 33 (5.9)
   Separated 7 (0.23) 6 (0.2) 1 (0.2)
   Single 60 (1.96) 50 (2.0) 10 (1.8)
   Unknown 2,678 (87.29) 2,186 (87.1) 492 (88.2)
   Unmarried 1 (0.03) 1 (0.0) 0 (0.0)
   Widow 53 (1.73) 35 (1.4) 18 (3.2)

Data are presented as mean ± standard deviation or n (%). BCS, breast conserving surgery; CEN, central; HER2, human epidermal growth factor receptor 2; LI, lower inner; LO, lower outer; pCR, pathological complete response; TNBC, triple-negative breast cancer; TNM, tumor-node-metastasis; UI, upper inner; UO, upper outer.

Cox regression analysis and nomogram

In the univariate Cox analysis, age, T stage, N stage, TNM staging, molecular subtype, surgical approach, histological type, and grade were associated with both OS and EFS, whereas quadrant and marital status at diagnosis were associated only with EFS (Table 2). Multivariate Cox analysis identified age, T stage, N stage, molecular subtype, surgical approach, histological type, and grade as independent risk factors for OS, and age, T stage, N stage, molecular subtype, histological type, and grade as independent risk factors for EFS in pCR patients (Figure 4, Table S1).

Table 2

Univariate Cox analysis for OS and EFS of pCR patients

Variable OS EFS
HR (95% CI) P HR (95% CI) P
Age 1.02 (1.01, 1.03) <0.001 1.02 (1.01, 1.02) <0.001
Race >0.05 >0.05
   Black Ref Ref
   Others/unknown 0.75 (0.51, 1.1) 0.85 (0.61, 1.19)
   White 0.92 (0.71, 1.19) 0.91 (0.72, 1.15)
Laterality >0.05 >0.05
   Left Ref Ref
   Right 0.96 (0.8, 1.16) 1.02 (0.86, 1.2)
Quadrant >0.05 <0.05
   Axillary Ref Ref
   CEN 1.17 (0.45, 3.04) 0.99 (0.43, 2.24)
   LI 0.95 (0.37, 2.44) 0.86 (0.39, 1.92)
   LO 0.61 (0.24, 1.57) 0.56 (0.25, 1.25)
   Overlap 0.83 (0.34, 2.02) 0.68 (0.32, 1.46)
   UI 0.62 (0.25, 1.56) 0.56 (0.25, 1.22)
   UO 0.88 (0.36, 2.15) 0.76 (0.36, 1.61)
T stage <0.001 <0.001
   T0/T1 Ref Ref
   T2 1.2 (0.94, 1.53) 1.2 (0.97, 1.49)
   T3 1.42 (1.04, 1.94) 1.34 (1.01, 1.78)
   T4 3.09 (2.28, 4.18) 2.57 (1.93, 3.41)
N stage <0.001 <0.001
   N0 Ref Ref
   N1 1.92 (1.56, 2.38) 1.55 (1.28, 1.87)
   N2 3.12 (2.24, 4.34) 2.64 (1.96, 3.56)
   N3 5.92 (4.39, 7.97) 5.02 (3.84, 6.58)
TNM <0.001 <0.001
   I Ref Ref
   II 1.52 (1.1, 2.1) 1.53 (1.15, 2.04)
   III 3.41 (2.45, 4.75) 3.02 (2.25, 4.06)
Biological subtype <0.001 <0.001
   HER2-positive Ref Ref
   Luminal 1.75 (1.39, 2.2) 1.48 (1.21, 1.83)
   TNBC 1.67 (1.32, 2.1) 1.48 (1.2, 1.81)
   Unknown 1.61 (0.96, 2.7) 1.74 (1.13, 2.69)
Histological type <0.001 <0.001
   Ductal Ref Ref
   Lobular 2.26 (1.46, 3.5) 2.2 (1.47, 3.29)
   Others 1.76 (1.09, 2.86) 1.61 (1.02, 2.54)
Radiation >0.05 >0.05
   No/unknown Ref Ref
   Yes 0.91 (0.64, 1.3) 0.73 (0.51, 1.04)
Surgical approach <0.001 <0.01
   BCS Ref Ref
   Mastectomy 1.47 (1.22, 1.78) 1.29 (1.09, 1.53)
Grade <0.001 <0.001
   G1 Ref Ref
   G2 1.52 (0.67, 3.43) 1.25 (0.61, 2.55)
   G3 3.93 (1.84, 8.39) 2.95 (1.53, 5.72)
   Unknown 2.04 (1.01, 4.1) 1.84 (1.01, 3.35)
Marital status >0.05 <0.05
   Married Ref Ref
   Single (never married) 1.48 (0.72, 3.04) 1.3 (0.64, 2.63)
   Single (married before)/common-law relationship 2.08 (1.2, 3.62) 1.91 (1.12, 3.25)
   Unknown 1.22 (0.83, 1.77) 1.33 (0.94, 1.9)

BCS, breast conserving surgery; CEN, central; CI, confidence interval; EFS, event-free survival; HER2, human epidermal growth factor receptor 2; HR, hazard ratio; LI, lower inner; LO, lower outer; OS, overall survival; pCR, pathological complete response; Ref, reference; TNBC, triple-negative breast cancer; TNM, tumor-node-metastasis; UI, upper inner; UO, upper outer.

Figure 4 Forest plot of multivariate Cox analysis of pCR patients. (A) Forest plot of multivariate Cox analysis for OS of pCR patients. (B) Forest plot of multivariate Cox analysis for EFS of pCR patients. *, P<0.05; **, P<0.01; ***, P<0.001. AIC, Akaike Information Criterion; EFS, event-free survival; N, node; OS, overall survival; pCR, pathological complete response; T, tumor.

Based on the independent risk factors obtained from multivariate Cox analysis, this study established nomograms to predict the 3-, 5- and 10-year mortality risk (Figure 5) and progression risk (Figure 6) of pCR patients. The concordance indices (C-indices) of the mortality risk nomogram (Figure S1) and the progression risk nomogram (Figure S2) were 0.723 and 0.682, with 3-year areas under the curve (AUCs) in the test dataset of 0.707 and 0.691, respectively. The predictive efficacy of the nomograms was also validated within each molecular subtype: the C-indices of the mortality risk nomogram for the Luminal, HER2-positive, and TNBC subtypes were 0.71, 0.699, and 0.704, and those of the progression risk nomogram were 0.685, 0.645, and 0.676, respectively. The mortality risk nomogram had the best 3-year AUC in the Luminal subtype (3-year AUC =0.734), followed by the HER2-positive subtype (3-year AUC =0.709) and the TNBC subtype (3-year AUC =0.696). For the progression risk nomogram, the Luminal subtype had the best 3-year AUC of 0.715, followed by TNBC (3-year AUC =0.68) and the HER2-positive subtype (3-year AUC =0.644) (Figure 7).

Figure 5 Nomogram for predicting mortality risk of pCR patients at 3, 5, and 10 years. *, P<0.05; **, P<0.01; ***, P<0.001. HER2, human epidermal growth factor receptor 2; N, node; pCR, pathological complete response; T, tumor; TNBC, triple-negative breast cancer.
Figure 6 Nomogram for predicting progression risk of pCR patients at 3, 5, and 10 years. *, P<0.05; **, P<0.01; ***, P<0.001. HER2, human epidermal growth factor receptor 2; N, node; pCR, pathological complete response; T, tumor; TNBC, triple-negative breast cancer.
Figure 7 ROC curves of nomogram in different molecular subtypes. (A) 3-year ROC curves for OS nomogram. (B) 5-year ROC curves for OS nomogram. (C) 3-year ROC curves for EFS nomogram. (D) 5-year ROC curves for EFS nomogram. AUC, area under the curve; EFS, event-free survival; HER2, human epidermal growth factor receptor 2; OS, overall survival; ROC, receiver operating characteristic; TNBC, triple-negative breast cancer.

RSF model

This study constructed an RSF model containing 13 variables, with at least 10 samples retained at each node and 4 variables randomly selected at each node for partitioning. Once the number of decision trees reached 250, the error rate stabilized, and the model C-index was 0.721 (Figure 8A), which suggests that the model has good predictive ability. The variables contributing to the model were, in order of importance, N stage, age, molecular subtype, TNM staging, race, T stage, marital status at diagnosis, radiotherapy, surgical approach, grade, histological type, and quadrant, while tumor laterality made no contribution to model construction (Figure 8B). The RSF model performed best in HER2-positive breast cancer, with a C-index of 0.752, followed by TNBC with a C-index of 0.685 and the Luminal subtype with a C-index of 0.630 (Figure 8C). The 3-year AUC of the RSF model was 0.73 and the 5-year AUC was 0.60, which suggests that the RSF model has good predictive ability, especially for mortality risk within 3 years (Figure 8D). Patients were grouped into high- and low-risk groups based on the median RSF risk score (1.95); the high-risk group had significantly worse OS than the low-risk group (P<0.0001), suggesting that risk stratification based on the RSF model can effectively identify high-risk patients (Figure 8E).

Figure 8 RSF model of overall survival of pCR patients. (A) Error rate learning curves for RSF model based on variation in the number of decision trees. (B) Kaplan-Meier curves for OS of high- and low-risk pCR patients based on RSF model. (C) C-index of RSF models in different molecular subtypes. (D) 3- and 5-year ROC curves for the RSF model. (E) Importance of variables in RSF models. C-index, Concordance index; HER2, human epidermal growth factor receptor 2; pCR, pathological complete response; ROC, receiver operating characteristic; RSF, random survival forest; T, tumor; TNBC, triple-negative breast cancer.

Subgroup analysis of different subtypes

As shown in Figure 9A,9B, KM curves by molecular subtype showed that HER2-positive breast cancer had the best OS and EFS, followed by TNBC, with the Luminal subtype the worst (P<0.001). The 5-year OS rates for the HER2-positive, Luminal, and TNBC subtypes were 95.1%, 91.7%, and 91.1%, respectively, and the 10-year OS rates were 88.9%, 78.8%, and 88.3%, respectively. The 5-year EFS rates for the HER2-positive, Luminal, and TNBC subtypes were 93.2%, 90.7%, and 90.0%, respectively, and the 10-year EFS rates were 86.6%, 77.3%, and 84.0%, respectively.

Figure 9 Kaplan-Meier curves and survival sequence analysis of different subtypes of pCR patients. (A) OS Kaplan-Meier curves of different subtypes of pCR patients. (B) EFS Kaplan-Meier curves of different subtypes of pCR patients. (C) Survival sequence analysis of death in different subtypes of pCR patients. (D) Survival sequence analysis of progression events in different subtypes of pCR patients. EFS, event-free survival; HER2, human epidermal growth factor receptor 2; OS, overall survival; pCR, pathological complete response; TNBC, triple-negative breast cancer.

In this study, we performed survival sequence analysis to investigate the timing of the peaks of death and progression events in pCR patients of different molecular subtypes. The results suggest that the death peak occurs earlier in TNBC than in Luminal and HER2-positive breast cancer: the earliest peak occurred at 36 months in TNBC patients, at 97 months in the HER2-positive subtype, and at 114 months in the Luminal subtype. For progression events, the peak occurred earliest in HER2-positive breast cancer, at 84 months, and the peaks for the TNBC and Luminal subtypes were close together, at 111 and 114 months, respectively (Figure 9C,9D). Based on univariate and multivariate Cox analyses, this study identified independent risk factors for each subtype. For the Luminal subtype, T stage, N stage and grade were independent risk factors for both OS and EFS, whereas age influenced only OS. For HER2-positive breast cancer, age, T stage and N stage independently influenced OS, whereas age and N stage had a significant impact on EFS. For TNBC, the independent risk factors for OS were age, T stage, N stage and marital status, and those for EFS were age, T stage, N stage, quadrant and grade (Tables S2-S4).
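The text does not spell out how the event peaks were computed. One simple way to locate such a peak, shown here purely as an assumption and not as the authors' method, is to estimate a discrete hazard for each follow-up interval (events in the interval divided by patients still under follow-up at its start) and take the interval with the highest value. The names `interval_hazards` and `peak_interval`, the 12-month interval width, and the toy data are all hypothetical.

```python
from typing import List, Sequence, Tuple


def interval_hazards(
    times: Sequence[float], events: Sequence[int], width: float
) -> List[Tuple[float, float]]:
    """Discrete hazard per follow-up interval.

    For each interval [start, start + width), the hazard is the number of
    events in the interval divided by the number of patients still under
    follow-up at the start of the interval. No censoring adjustment is made.
    """
    n_bins = int(max(times) // width) + 1
    hazards: List[Tuple[float, float]] = []
    for k in range(n_bins):
        start = k * width
        end = start + width
        at_risk = sum(1 for t in times if t >= start)
        n_events = sum(
            1 for t, e in zip(times, events) if e and start <= t < end
        )
        hazards.append((start, n_events / at_risk if at_risk else 0.0))
    return hazards


def peak_interval(hazards: Sequence[Tuple[float, float]]) -> float:
    """Start time of the interval with the highest hazard (first one if tied)."""
    return max(hazards, key=lambda item: item[1])[0]


# Toy cohort (hypothetical values): follow-up in months, 1 = event observed.
seq_times = [8, 14, 30, 33, 35, 40, 62, 75, 100, 118]
seq_events = [0, 1, 1, 1, 1, 0, 0, 1, 0, 0]

yearly_hazards = interval_hazards(seq_times, seq_events, width=12)
peak_start_month = peak_interval(yearly_hazards)
```

In this toy cohort the highest hazard falls in the interval starting at month 24. Late intervals contain few patients at risk, so a single event there can produce a large hazard; this is one reason to compare the height of a late peak with the rest of the curve before interpreting it.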


Discussion

Most studies have focused on exploring risk factors affecting the achievement of pCR and regard pCR as an effective surrogate endpoint, whereas fewer studies have examined pCR patients with poor prognosis. This emphasis on the prognostic value of pCR is further supported by real-world evidence from multinational cohorts, such as the recent study by Antonini et al. in a Brazilian multicenter cohort (14), which confirmed a significant correlation between pCR and improved OS, consistent with findings from major clinical trials and registry analyses.

To draw attention to the heterogeneity within this generally favorable group, this study enrolled patients registered in the SEER database to analyze the risk factors for poor prognosis specifically among pCR patients and, building on published studies, constructed nomogram and RSF models to predict poor prognosis by combining traditional statistical modeling with machine learning methods. Compared with the published studies on pCR patients based on the SEER database (13), the present study included data from 12 databases in the SEER database, and the number of enrolled cases reached 8,459 after applying more stringent inclusion criteria (particularly regarding cases that did not pass quality control) to ensure the authenticity of the data.

Consistent with other published results and the overarching consensus that pCR portends improved survival, this study suggested that both T stage and N stage significantly influenced OS and EFS in pCR patients, in the Cox regression models, in the RSF model, and in the subgroup analyses of the different molecular subtypes. According to these results, tumor burden can be regarded as a potential driver of tumor recurrence and metastasis, even in patients achieving pCR. Of particular interest is the impact of regional lymph node status on survival: N3 patients exhibited a significantly increased risk of death and of progression events even after achieving pCR, which is consistent with published findings. Among patients who reached pCR, a study based on the SEER database demonstrated that patients with invasive ductal carcinoma staged as cN3 had a much higher risk of death than patients staged as cN0 (13). Several studies have confirmed that cN+ status is associated with a worse prognosis in patients with pCR (8,9,15,16). These findings underscore that while pCR is a strong positive marker overall, as validated in international real-world settings like the Brazilian cohort, significant risk stratification remains necessary within this population. In addition, because a uniform definition of pCR is lacking, the CTNeoBC study compared patients’ prognosis under different definitions of pCR after NAT: patients with ypT0/Tis ypN0 had a significantly better prognosis than patients achieving ypT0/Tis (5). In summary, if pCR is defined as ypT0/Tis, pN stage has a significant impact on the prognosis of pCR patients, whereas if it is defined as ypT0/Tis ypN0, cN stage has a significant impact on prognosis. Moreover, patients with a higher clinical tumor (cT) stage had a higher probability of having circulating tumor cells (CTCs) detected before NAT (17).
Hence, the worse prognosis of pCR patients with a higher tumor burden is possibly associated with micrometastases in the circulatory system. More advanced cT and clinical nodal (cN) stages were identified as independent risk factors for adverse prognosis across all molecular subtypes of breast cancer; these findings underscore the significant prognostic value of a higher initial tumor burden in predicting clinical outcomes, even when accounting for inter-subtype heterogeneity. Furthermore, age and histologic grade independently influenced the prognosis of pCR patients, with higher histologic grade associated with worse prognosis. This is consistent with published findings that higher grade is associated with a higher risk of recurrence, even when pCR is achieved after NAT (18). In addition, patients with lobular cancer had a significantly higher risk of death and of progression events than patients with ductal cancer, and this adverse prognostic effect was especially pronounced in patients with TNBC, suggesting that additional attention should be paid to the prognosis of pCR patients with lobular histology, especially those with TNBC.

Although mastectomy is considered a radical breast surgery, this study demonstrated that pCR patients who received mastectomy had a higher mortality risk, consistent with published findings in patients with pCR based on the SEER database (13). With regard to radical surgical treatment for breast cancer, studies conducted in the last 10 years have suggested that patients who underwent breast-conserving surgery combined with postoperative radiotherapy had improved OS compared with those who underwent mastectomy (19,20). Patients undergoing mastectomy usually have larger tumors and a higher probability of lymph node metastases at baseline than patients undergoing breast-conserving surgery. To balance these baseline differences, one study evaluated 1,641 patients who received neoadjuvant chemotherapy using propensity score matching, and the results still suggested that patients who underwent breast-conserving surgery combined with radiotherapy had a better prognosis than those who underwent mastectomy (21). Taking the results of this study together with the latest relevant studies, breast-conserving surgery appears to be more beneficial for survival in patients who reach pCR after neoadjuvant chemotherapy. In addition, some patients may choose contralateral breast resection to prevent tumor recurrence in the contralateral breast and its effect on long-term survival. Although information on contralateral breast resection is not available in the SEER database, one study has suggested that contralateral breast resection does not improve long-term prognosis (22).
Whether mastectomy or adjuvant therapy is necessary to reduce the risk of recurrence and metastasis in pCR patients with high-risk factors remains a topic of concern and is still inconclusive for clinicians, especially with regard to adjuvant therapy. A study from Harbin Medical University (9) concluded that HER2-positive breast cancer with higher cStage, extensive lymph node involvement and T4 clinicopathologic features carries an increased probability of poor prognosis; for this group of patients, systemic therapy (such as intensified treatment including trastuzumab and tyrosine kinase inhibitors) is recommended even if pCR is reached, and for the HR-positive subtype with severe lymph node involvement, early intensive treatment and continuous endocrine therapy are recommended. In contrast, another study concluded that adjuvant therapy after achieving pCR did not improve prognosis (23), but that study did not screen for the high-risk group. It is currently unknown whether strict screening of patients with high-risk factors for adjuvant therapy can further improve the outcomes of high-risk patients. Therefore, the results of this study provide a further theoretical basis for the future selection of surgical approach and adjuvant therapy, and for prospective clinical trials, in patients after NAT.

The OS nomogram model constructed in this study showed good predictive efficacy in the overall population (C-index =0.723), especially in the Luminal subtype (3-year AUC =0.734), supporting its clinical translational potential. Both the OS and EFS nomogram models showed higher predictive ability in the Luminal subtype than in the other subtypes, suggesting that a linear prediction model is more suitable for the Luminal subtype. The better predictive performance of the OS nomogram compared with the EFS nomogram is related to the limitations of the SEER database: EFS events include recurrence, metastasis, and death, but a large number of metastatic events are missing from the SEER database, which affects the predictive performance of the EFS nomogram. The good predictive efficacy of the RSF model in the overall population (C-index =0.721), and especially its performance in the HER2-positive subtype (C-index =0.752), suggests that nonlinear algorithms may be better suited to capturing the complex prognostic patterns of HER2-positive patients in the era of targeted therapies. More suitable predictive models for the TNBC subtype still need to be constructed.

To contextualize our model within the existing landscape of predictive tools in breast cancer, we consider the recent systematic review by Antonini et al. (24), which summarized nomograms developed specifically for predicting pCR. While the focus of that review was on predicting the likelihood of achieving pCR after NAT, our model addresses a distinct yet equally critical clinical question: identifying which patients, despite achieving a pCR, remain at high risk for adverse long-term outcomes such as recurrence or death. This fundamental difference in prediction objective (pCR attainment versus post-pCR survival stratification) means that the predictor variables incorporated into our model naturally differ from those commonly emphasized in pCR-predicting nomograms. Methodologically, our approach incorporates both traditional Cox regression and machine learning techniques (RSF), which may offer advantages in capturing complex, non-linear relationships within the data, particularly for heterogeneous subtypes like HER2+ and TNBC, as suggested by our model’s performance. Although direct comparison of C-index or AUC values with the nomograms summarized by Antonini et al. (24) is not appropriate because the predicted endpoints differ (pCR vs. OS/EFS), the discriminative ability of our model appears competitive and addresses a later phase in the patient care continuum. Future research could explore integrating factors predictive of both pCR and post-pCR survival to build comprehensive models guiding therapy from diagnosis through long-term follow-up.

This study explored the temporal pattern of progression and death events and the heterogeneity among molecular subtypes in pCR breast cancer patients after NAT, with the aim of providing a preliminary theoretical basis for the follow-up strategy of pCR patients with different molecular subtypes. The findings reinforce that even within a group characterized by a favorable response marker like pCR, which is consistently associated with improved survival across diverse populations (14), intricate patterns of risk persist and require nuanced management strategies. TNBC is a distinct molecular subtype characterized by heightened invasiveness and early disease progression, with comparative analyses revealing an 8.3-fold increased risk of breast cancer-specific mortality and a 6.1-fold elevated overall mortality risk at 2-year follow-up when contrasted with hormone receptor-positive/HER2-negative subtypes (25), and TNBC carries a significantly higher risk of death than non-TNBC breast cancer regardless of stage (26). Although pCR is regarded as a good prognostic indicator, the 36-month peak of early death among TNBC patients in the present cohort is consistent with the results of existing studies and may be closely related to the rapid metastasis-death process of TNBC: genomic instability of TNBC drives the explosive growth of cryptic micrometastases at 2–3 years postoperatively (27), and the lack of targeted therapeutics drastically shortens survival once metastasis has occurred (median OS is only 12–18 months) (28).
The first peak of recurrence in patients with TNBC occurs within 3 years (29,30), whereas the peak of TNBC progression in this study occurred at 111 months. This seems inconsistent with the earlier recurrence and metastasis reported in the published articles; in fact, however, the corresponding rate in that month was not significantly different from that at the other time points, and the actual curve for the TNBC subtype in the peak-timing analysis of progression events was relatively flat. The simultaneous occurrence of the peaks of progression events and death events in Luminal patients at month 114 supports the “dormant escape” hypothesis: for the Luminal subtype, the tumor dormancy signature score is much higher than that of HR-negative breast cancers (31), and residual ER-positive disseminated tumor cells may remain quiescent for many years until acquiring endocrine therapy-resistant mutations (32), ultimately leading to explosive metastatic growth and rapid death. In HER2-positive breast cancer patients in this study, the peak of death events appeared at 97 months and the peak of progression events at 84 months; the progression peak did not differ much from the other small peaks of the curve, whereas the death peak was markedly higher than the other peaks. A previous study suggests that the 5-year survival rate of HER2-positive patients after reaching pCR can reach 95% (33), and targeted therapy significantly improves the prognosis and prolongs the survival of patients with HER2-positive breast cancer (34); therefore, HER2-positive patients have a better prognosis after reaching pCR following NAT, but attention should still be paid to recurrence and death events in the long term, at 7–8 years.
The results of the survival sequence analysis of adverse events suggest that early warning strategies need to be tailored to clinical subtype: metastasis screening (e.g., dynamic monitoring of circulating tumor DNA) should be focused on years 2–3 for TNBC, whereas lifelong prevention and control of late recurrence is needed for the Luminal subtype, and tumor screening at 7–8 years is needed for HER2-positive breast cancers.

This study has the following limitations: (I) lack of detailed therapeutic information: As a registry-based study, the SEER database lacks critical details on systemic therapy, which is a well-known limitation. This omission could potentially confound our prognostic models and affect the accuracy of risk stratification, particularly for specific subtypes. For HER2-positive patients, the absence of data on the use and duration of HER2-targeted therapies (e.g., trastuzumab, pertuzumab) is a major constraint. Since these therapies profoundly improve outcomes, our model might misclassify risk for patients who did not receive adequate targeted treatment, underestimating the true benefit of modern regimens. Similarly, for Luminal subtypes, the lack of information on endocrine therapy (e.g., agent, duration, adherence) and CDK4/6 inhibitor use means our model cannot account for the significant impact of these adjuvant treatments on long-term survival. Consequently, the model’s predictions for these subtypes should be interpreted as reflecting baseline tumor biology and clinical presentation, independent of subsequent therapeutic interventions. Future models integrating genomic data with detailed treatment records are needed to provide a more comprehensive assessment of prognosis; and (II) lack of external validation: Our nomogram and RSF models were developed and validated internally. However, the absence of external validation in an independent cohort remains a significant limitation, as it is essential for assessing the generalizability and transportability of any predictive model. To address this in future work, we propose several viable pathways for validation. First, our models could be tested in large, multinational real-world breast cancer cohorts, such as the Brazilian cohort described by Antonini et al., which would assess performance in a distinct healthcare setting. 
Second, validation within trials with long-term follow-up data (e.g., NSABP, NeoALTTO, or I-SPY2) would provide a robust framework for assessing model utility in a clinical trial context. Finally, prospective validation in newly diagnosed patients undergoing neoadjuvant chemotherapy would be the ultimate test of its clinical applicability. We encourage researchers with access to such datasets to independently validate our models.


Conclusions

In this study, we explored the independent factors influencing death and progression events among pCR patients in the SEER database, constructed nomogram and RSF models to predict prognosis, and performed stratified analyses among patients with different molecular subtypes. Age, T stage, N stage, molecular subtype, histologic classification and histologic grade were independent risk factors for both OS and EFS in patients with pCR. Among them, N stage had the most significant prognostic impact, with N3 patients having 5.35 times the risk of death and 4.68 times the risk of progression events of N0 patients. Among the molecular subtypes, the HER2-positive type had the best prognosis, followed by TNBC, and the Luminal subtype had the worst prognosis. Subgroup analysis by molecular subtype suggested that prognostic drivers differed among subtypes: among Luminal patients, those with higher histologic grades had significantly worse prognosis (OS, EFS), whereas among TNBC patients, those with lobular type had worse EFS, and both T stage and N stage affected the prognosis of patients with each molecular subtype. In addition, the present study analyzed the differences in survival and the temporal sequence of event peaks for the different molecular subtypes, suggesting that the peak of death events occurred at 36 months for TNBC, at 97 months for the HER2-positive type, and at 114 months for the Luminal type. For progression events, the curves of TNBC and HER2-positive breast cancers were relatively flat, and the peak for the Luminal subtype appeared at 114 months, coinciding with its peak of death events.

The nomogram model constructed based on multivariable Cox regression showed good predictive efficacy in the overall population (OS C-index =0.723, EFS C-index =0.682), and especially strong predictive ability for 3-year prognosis (OS 3-year AUC =0.707, EFS 3-year AUC =0.69). The nomogram model had the best predictive performance for OS and EFS in patients with the Luminal subtype (OS 3-year AUC =0.734, EFS 3-year AUC =0.715). The random survival forest (RSF) model showed efficacy similar to that of the nomogram in OS prediction (C-index =0.721), but superior predictive ability for patients with the HER2-positive subtype (C-index =0.752).

This study provides a reliable tool for individualized prognostic assessment of pCR patients through traditional statistical modeling and machine learning methods, and reveals the differences in subtype-specific risk factors and the characteristics of peak event timing, which provides a theoretical basis for the development of subsequent precision treatment and follow-up strategies.


Acknowledgments

We gratefully acknowledge the Surveillance, Epidemiology, and End Results (SEER) program (www.seer.cancer.gov) for providing open-access data resources. The interpretation and reporting of these data are the sole responsibility of the authors.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://gs.amegroups.com/article/view/10.21037/gs-2025-181/rc

Peer Review File: Available at https://gs.amegroups.com/article/view/10.21037/gs-2025-181/prf

Funding: This work was supported by the National Natural Science Foundation of China (Nos. 81672729, 81972597, 81602471, and 81972453), Zhejiang Provincial Natural Science Foundation of China (grant Nos. LY19H160059, LR22H160011, and LY19H160055), Zhejiang Provincial Medical and Health Science and Technology (Youth Talent Program) (grant No. 2021RC016).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://gs.amegroups.com/article/view/10.21037/gs-2025-181/coif). All authors report that this work was supported by the National Natural Science Foundation of China (Nos. 81672729, 81972597, 81602471, and 81972453), Zhejiang Provincial Natural Science Foundation of China (grant Nos. LY19H160059, LR22H160011, and LY19H160055), Zhejiang Provincial Medical and Health Science and Technology (Youth Talent Program) (grant No. 2021RC016). The authors have no other conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Bray F, Laversanne M, Sung H, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2024;74:229-63. [Crossref] [PubMed]
  2. Gradishar WJ, Moran MS, Abraham J, et al. Breast Cancer, Version 3.2024, NCCN Clinical Practice Guidelines in Oncology. J Natl Compr Canc Netw 2024;22:331-57. [Crossref] [PubMed]
  3. The Society of Breast Cancer China Anti-Cancer Association, Breast Oncology Group of the Oncology Branch of the Chinese Medical Association. Guidelines for breast cancer diagnosis and treatment by China Anti-cancer Association (2024 edition). Zhongguo aizheng zazhi 2023;33:1092–187.
  4. Schmid P, Cortes J, Pusztai L, et al. Pembrolizumab for Early Triple-Negative Breast Cancer. N Engl J Med 2020;382:810-21. [Crossref] [PubMed]
  5. Cortazar P, Zhang L, Untch M, et al. Pathological complete response and long-term clinical benefit in breast cancer: the CTNeoBC pooled analysis. Lancet 2014;384:164-72. [Crossref] [PubMed]
  6. von Minckwitz G, Untch M, Blohmer JU, et al. Definition and impact of pathologic complete response on prognosis after neoadjuvant chemotherapy in various intrinsic breast cancer subtypes. J Clin Oncol 2012;30:1796-804. [Crossref] [PubMed]
  7. Symmans WF, Wei C, Gould R, et al. Long-Term Prognostic Risk After Neoadjuvant Chemotherapy Associated With Residual Cancer Burden and Breast Cancer Subtype. J Clin Oncol 2017;35:1049-60. [Crossref] [PubMed]
  8. Huober J, van Mackelenbergh M, Schneeweiss A, et al. Identifying breast cancer patients at risk of relapse despite pathological complete response after neoadjuvant therapy. NPJ Breast Cancer 2023;9:23. [Crossref] [PubMed]
  9. Huang Z, Jin S, Zeng M, et al. Clinical and Therapeutic Factors Vary by Prognosis in Patients with Pathological Complete Response After Neoadjuvant Therapy for Breast Cancer. Cancer Manag Res 2021;13:9235-46. [Crossref] [PubMed]
  10. Fayanju OM, Nwaogu I, Jeffe DB, et al. Pathological complete response in breast cancer patients following neoadjuvant chemotherapy at a Comprehensive Cancer Center: The natural history of an elusive prognosticator. Mol Clin Oncol 2015;3:775-80. [Crossref] [PubMed]
  11. Papakonstantinou A, Gonzalez NS, Pimentel I, et al. Prognostic value of ctDNA detection in patients with early breast cancer undergoing neoadjuvant therapy: A systematic review and meta-analysis. Cancer Treat Rev 2022;104:102362. [Crossref] [PubMed]
  12. Loi S, Drubay D, Adams S, et al. Tumor-Infiltrating Lymphocytes and Prognosis: A Pooled Individual Patient Analysis of Early-Stage Triple-Negative Breast Cancers. J Clin Oncol 2019;37:559-69. [Crossref] [PubMed]
  13. Xiao C, Guo Y, Xu Y, et al. Clinicopathological characteristics and survival analysis of different molecular subtypes of breast invasive ductal carcinoma achieving pathological complete response through neoadjuvant chemotherapy. World J Surg Oncol 2024;22:250. [Crossref] [PubMed]
  14. Antonini M, Mattar A, Bauk Richter FG, et al. Real-world evidence of neoadjuvant chemotherapy for breast cancer treatment in a Brazilian multicenter cohort: Correlation of pathological complete response with overall survival. Breast 2023;72:103577. [Crossref] [PubMed]
  15. Tanioka M, Shimizu C, Yonemori K, et al. Predictors of recurrence in breast cancer patients with a pathologic complete response after neoadjuvant chemotherapy. Br J Cancer 2010;103:297-302. [Crossref] [PubMed]
  16. Xie LY, Wang K, Chen HL, et al. Markers Associated With Tumor Recurrence in Patients With Breast Cancer Achieving a Pathologic Complete Response After Neoadjuvant Chemotherapy. Front Oncol 2022;12:860475. [Crossref] [PubMed]
  17. Bidard FC, Michiels S, Riethdorf S, et al. Circulating Tumor Cells in Breast Cancer Patients Treated by Neoadjuvant Chemotherapy: A Meta-analysis. J Natl Cancer Inst 2018;110:560-7. [Crossref] [PubMed]
  18. Early Breast Cancer Trialists' Collaborative Group (EBCTCG). Long-term outcomes for neoadjuvant versus adjuvant chemotherapy in early breast cancer: meta-analysis of individual patient data from ten randomised trials. Lancet Oncol 2018;19:27-39. [Crossref] [PubMed]
  19. van Maaren MC, de Munck L, de Bock GH, et al. 10 year survival after breast-conserving surgery plus radiotherapy compared with mastectomy in early breast cancer in the Netherlands: a population-based study. Lancet Oncol 2016;17:1158-70. [Crossref] [PubMed]
  20. De la Cruz Ku G, Karamchandani M, Chambergo-Michilot D, et al. Does Breast-Conserving Surgery with Radiotherapy have a Better Survival than Mastectomy? A Meta-Analysis of More than 1,500,000 Patients. Ann Surg Oncol 2022;29:6163-88. [Crossref] [PubMed]
  21. Gwark S, Kim HJ, Kim J, et al. Survival After Breast-Conserving Surgery Compared with that After Mastectomy in Breast Cancer Patients Receiving Neoadjuvant Chemotherapy. Ann Surg Oncol 2023;30:2845-53. [Crossref] [PubMed]
  22. Giannakeas V, Lim DW, Narod SA. Bilateral Mastectomy and Breast Cancer Mortality. JAMA Oncol 2024;10:1228-36. [Crossref] [PubMed]
  23. Spring LM, Fell G, Arfe A, et al. Pathologic Complete Response after Neoadjuvant Chemotherapy and Impact on Breast Cancer Recurrence and Survival: A Comprehensive Meta-analysis. Clin Cancer Res 2020;26:2838-48. [Crossref] [PubMed]
  24. Antonini M, Pannain GD, Mattar A, et al. Systematic Review of Nomograms Used for Predicting Pathological Complete Response in Early Breast Cancer. Curr Oncol 2023;30:9168-80. [Crossref] [PubMed]
  25. Lin NU, Vanderplas A, Hughes ME, et al. Clinicopathological Features, Patterns of Recurrence, and Survival Among Women With Triple-Negative Breast Cancer in the National Comprehensive Cancer Network. Cancer 2012;118:5463-72. [Crossref] [PubMed]
  26. Li X, Yang J, Peng L, et al. Triple-negative breast cancer has worse overall survival and cause-specific survival than non-triple-negative breast cancer. Breast Cancer Res Treat 2017;161:279-87. [Crossref] [PubMed]
  27. Yates LR, Knappskog S, Wedge D, et al. Genomic Evolution of Breast Cancer Metastasis and Relapse. Cancer Cell 2017;32:169-184.e7. [Crossref] [PubMed]
  28. Venkatesan P. Trastuzumab emtansine for HER2-positive breast cancer. Lancet Oncol 2016;17:e528. [Crossref] [PubMed]
  29. Kumar P, Aggarwal R. An overview of triple-negative breast cancer. Arch Gynecol Obstet 2016;293:247-69. [Crossref] [PubMed]
  30. Dent R, Trudeau M, Pritchard KI, et al. Triple-negative breast cancer: clinical features and patterns of recurrence. Clin Cancer Res 2007;13:4429-34. [Crossref] [PubMed]
  31. Kim RS, Avivar-Valderas A, Estrada Y, et al. Dormancy signatures and metastasis in estrogen receptor positive and negative breast cancer. PLoS One 2012;7:e35569. [Crossref] [PubMed]
  32. Kingston B, Cutts RJ, Bye H, et al. Genomic profile of advanced breast cancer in circulating tumour DNA. Nat Commun 2021;12:2423. [Crossref] [PubMed]
  33. Rastogi P, Tang G, Hassan S, et al. Long-term outcomes of dual vs single HER2-directed neoadjuvant therapy in NSABP B-41. Breast Cancer Res Treat 2023;199:243-52. [Crossref] [PubMed]
  34. Cameron D, Piccart-Gebhart MJ, Gelber RD, et al. 11 years' follow-up of trastuzumab after adjuvant chemotherapy in HER2-positive early breast cancer: final analysis of the HERceptin Adjuvant (HERA) trial. Lancet 2017;389:1195-205. [Crossref] [PubMed]
Cite this article as: Lin X, Wang S, Zhu Z, Guo Z, Luo M, Ding Q, Wang L, Zhou J. Identifying risk factors for poor prognosis and developing prognostic model in patients achieving pathological complete response after neoadjuvant therapy for breast cancer. Gland Surg 2025;14(11):2159-2178. doi: 10.21037/gs-2025-181
