Development and validation of a nomogram to assess recurrence risk in young patients with breast cancer based on preoperative serum tumor markers
Original Article

Xicheng Du#, Zhiqiang Tang, Ye Shen, Guangjun Zhang#

Department of General Surgery, Fengxian District Central Hospital, Shanghai, China

Contributions: (I) Conception and design:; (II) Administrative support: ; (III) Provision of study materials or patients: (IV) Collection and assembly of data: ; (V) Data analysis and interpretation:; (VI) Manuscript writing: All authors; (VII) Final approval of manuscript: All authors.

#These authors contributed equally to this work.

Correspondence to: Guangjun Zhang. Department of General Surgery, Fengxian District Central Hospital, Shanghai 201499, China. Email: zhgj73@126.com.

Background: Postoperative recurrence is a primary risk following breast cancer surgery. Compared to older patients, younger breast cancer patients typically experience a poorer prognosis. This study aimed to develop a predictive model based on two preoperative serological markers, carbohydrate antigen 125 (CA125) and carbohydrate antigen 153 (CA153), together with postoperative treatment modalities, to evaluate 3- and 5-year recurrence-free survival (RFS) in young breast cancer patients aged under 40 years.

Methods: This retrospective study enrolled 521 patients who underwent breast cancer surgery at Fengxian District Central Hospital between June 2015 and April 2020. Based on the inclusion and exclusion criteria, 411 patients aged under 40 were ultimately included and divided into a training cohort and a validation cohort. Clinical characteristics were evaluated, including age, surgical procedure, tumor histological type, history of radiotherapy and chemotherapy, lymphovascular invasion (LVI), neural invasion (NI), tumor-node-metastasis (TNM) stage, molecular subtype, and preoperative serum tumor markers such as CA125 and CA153. Univariate analysis and Cox proportional hazards regression were used to select variables and develop a nomogram based on the training cohort. Survival analysis and plotting were performed using Kaplan-Meier curves and the log-rank test. The reliability of the nomogram was assessed using the concordance index (C-index), calibration plots, and clinical decision curve analysis (DCA).

Results: Using the random grouping function in SPSS, the patients were divided into a training cohort (282/411) and a validation cohort (129/411) at a 7:3 ratio. The nomogram prediction model incorporated four risk factors: high expression of CA125 or CA153, receipt of radiotherapy, and LVI. In the training and validation cohorts, the area under the curve (AUC) of the nomogram for predicting 3-year RFS was 0.854 and 0.793, respectively, while the AUC for 5-year RFS was 0.85 and 0.801, respectively. Calibration curves demonstrated that the predicted probabilities of the model were in good agreement with the actual observations. Furthermore, DCA indicated that the nomogram model provided a superior net clinical benefit within the threshold probability range of 10–75%.

Conclusions: We developed a survival prognostic model for young breast cancer patients based on preoperative serum tumor markers and postoperative treatment. The results confirmed that the combination of CA125 and CA153 is of great significance in predicting RFS in young women with breast cancer.

Keywords: Breast cancer; recurrence; serological test; prognostic model; predictive value


Submitted Jan 07, 2026. Accepted for publication Apr 28, 2026. Published online May 15, 2026.

doi: 10.21037/gs-2026-1-0017


Highlight box

Key findings

• A prognostic nomogram established by incorporating four accessible variables [high expression of carbohydrate antigen 125 (CA125) and carbohydrate antigen 153 (CA153), lymphovascular invasion (LVI), and history of radiotherapy] is used to predict postoperative recurrence-free survival (RFS) in young patients with breast cancer.

• Predicting RFS in young breast cancer patients using a prognostic model based on preoperative serological markers is feasible, demonstrating good discrimination and calibration.

What is known and what is new?

• According to the National Comprehensive Cancer Network (NCCN) guidelines, postoperative surveillance for breast cancer primarily relies on routine imaging. Serum biomarkers such as CA153 and CA125 are traditionally reserved mainly for monitoring metastatic disease.

• A cost-effective and practical recurrence risk nomogram for young breast cancer patients was constructed, which verified the feasibility of utilizing preoperative serum indicators in predicting RFS.

What is the implication, and what should change now?

• This nomogram can be used as a risk stratification tool to guide postoperative risk assessment and corresponding treatment decisions in young patients with breast cancer. For patients at a high risk of recurrence, more intensive systemic therapy should be considered.


Introduction

Globally, breast cancer has become the most prevalent malignant tumor in women, and its incidence is closely related to age (1). Notably, breast cancer subtypes and mortality risk differ worldwide (2). In recent years, the trend of breast cancer affecting younger women has become increasingly evident, attracting widespread international attention (3). Internationally, young breast cancer patients are defined as those diagnosed at an age of <40 years. Compared to older breast cancer patients, young patients often have a poorer prognosis (4,5). Breast cancer in young women not only carries a worse prognosis but is also more likely to present as a highly aggressive subtype, such as triple-negative breast cancer (2,6). According to the 4th European Society for Medical Oncology (ESMO) International Consensus Guidelines, breast cancer patients under the age of 40 have a higher proportion of triple-negative breast cancer, less favorable subtypes, high expression of the human epidermal growth factor receptor 2 (HER2) gene, and stronger lymph node invasiveness (7). Newly diagnosed young patients reportedly account for 7% of all new breast cancer cases annually (8,9). Compared to older breast cancer patients, younger patients have a stronger desire for a good quality of life (1,10). Therefore, it is necessary to establish relevant predictive models for personalized risk assessment.

For decades, a large body of clinical practice has demonstrated the important role of carbohydrate antigen 125 (CA125) in the diagnosis and prognosis of ovarian cancer (11). As a key regulator, CA125 plays an important role not only in the multicellular survival pathway of ovarian cancer but also in the pathogenesis of breast cancer (12). Studies have shown that CA125 and carbohydrate antigen 153 (CA153) are closely related to tumor prognosis and have good predictive ability in the intraocular metastasis pathway of lung cancer (13,14). These two indicators also play an important role in predicting distant metastasis (DM) of breast cancer (15,16). Mammography is the preferred method for breast cancer screening, but its sensitivity is limited in dense breast tissue, and variability in interpreting mammograms can cause considerable confusion for clinicians. In contrast, serological examinations can be widely carried out in hospitals of different levels (17). Recent reports have also shown that CA125 and CA153 play an important predictive role in metastatic breast cancer (18).

At present, relevant models have been used for the prognostic analysis of breast cancer, but too many indicators were included in the process of establishing the models, and no relevant validation cohort was established to test the predictive performance of the models. At the same time, the steps of model establishment, calibration, and clinical application analysis are incomplete (16,19). A study reported a support vector machine (SVM) model established with CA125 and CA153 as target predictors to predict postoperative recurrence and metastasis of breast cancer (20). However, this model only analyzed the predictive ability of a single indicator, while ignoring the combined predictive ability of the two indicators.

Current National Comprehensive Cancer Network (NCCN) guidelines rely primarily on routine imaging for postoperative surveillance, yet detecting asymptomatic recurrence remains a clinical challenge (21). Serum biomarkers such as CA153 and CA125 are traditionally reserved for metastatic monitoring, but they offer distinct advantages in cost-effectiveness and ease of longitudinal tracking (22,23). There is growing interest in whether combining these markers into prognostic models can provide early warnings for high-risk patients and ultimately improve clinical outcomes.

Nomograms are intuitive visualization tools that estimate the probability of clinical events for individual patients. By incorporating relevant potential biomarkers, nomograms enhance the specificity of predictive models, thereby facilitating precision and personalized medicine. They have been widely applied across various malignancies and proven to be reliable tools for cancer prognosis. Their accuracy in predicting postoperative survival for gynecological and breast cancers has also been well established (24-26). Currently, however, there is a lack of prognostic models for young breast cancer patients based on CA125 and CA153. Therefore, this study aims to develop a nomogram model that enables clinicians to intuitively assess the 3-year and 5-year recurrence-free survival (RFS) of young patients following breast cancer surgery. We present this article in accordance with the TRIPOD reporting checklist (available at https://gs.amegroups.com/article/view/10.21037/gs-2026-1-0017/rc).


Methods

Study subjects

This study retrospectively analyzed 411 breast cancer patients who underwent radical mastectomy in Fengxian District Central Hospital from June 2015 to April 2020. Postoperative follow-up was conducted at 2-month intervals and ceased upon disease recurrence. The study’s total follow-up duration extended up to 60 months after surgery. The baseline information collected for the enrolled patients included demographic information; surgical method; neoadjuvant chemotherapy; postoperative radiotherapy, chemotherapy, and endocrine therapy; tumor size; lymph node metastasis; clinical stage; lymphovascular invasion; nerve invasion; serum tumor markers (CA125, CA153); estrogen receptor status; progesterone receptor status; and HER2 status. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Shanghai Fengxian District Central Hospital (the South Campus of Shanghai Sixth People’s Hospital) (No. 2025-KY-106-02). Informed consent was obtained from all the patients. Inclusion criteria: (I) age at diagnosis <40 years; (II) no preoperative metastasis and no bilateral breast cancer; (III) complete clinicopathological data, including immunohistochemistry results; (IV) patients with HER2 (2+) underwent additional fluorescence in situ hybridization (FISH) testing. Exclusion criteria: (I) incomplete or missing clinicopathological data; (II) lost to follow-up for more than 2 months; (III) presence of DM.

Diagnostic criteria and grouping

In this study, all patients underwent surgical resection, and the final diagnosis was based on the postoperative paraffin pathological results provided by the Department of Pathology, which served as the gold standard. Based on the pathological results, breast lesions were divided into breast carcinoma in situ and invasive breast cancer. Carcinoma in situ includes ductal carcinoma in situ, lobular carcinoma in situ, and Paget’s disease of the nipple. Invasive cancer includes invasive ductal carcinoma, invasive lobular carcinoma, micropapillary carcinoma, mucinous carcinoma, and other special types of cancer. All patients who met the study criteria were divided into a training cohort and a validation cohort at a 7:3 ratio using the random grouping function in SPSS. Ultimately, 411 patients who met the study design criteria were randomly divided into a training cohort of 282 patients and a validation cohort of 129 patients. The flowchart for the inclusion and exclusion of study subjects is shown in Figure 1.

Figure 1 Flow chart of patient selection.

Statistical analysis

Locoregional recurrence refers to the invasion of tissues surrounding the breast (e.g., chest wall, ipsilateral axillary lymph nodes, ipsilateral breast tissue, ipsilateral supraclavicular lymph nodes) by the tumor, excluding the primary lesion. DM refers to the phenomenon in which tumor cells escape from the primary lesion or regional lymph nodes, migrate, and colonize distant sites or organs to form secondary tumors. The primary endpoint was recurrence-free survival (RFS), defined as the time from surgery to the first occurrence of any disease recurrence, including loco-regional recurrence (LRR) and DM.

Based on relevant literature, the cutoff values were set at 35 for CA125 and 25 for CA153, so that each marker was dichotomized as low (CA125 <35; CA153 <25) or high (CA125 ≥35; CA153 ≥25) (12,13,15,16). A new dummy variable, Score-New, was created: Y denotes simultaneous high expression of both markers, and X denotes all remaining combinations.
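The dichotomization described above can be sketched in a few lines of Python. This is only an illustration: the function and class names are hypothetical, the example marker values are invented, and the treatment of a value exactly at the cutoff as "high" follows the ≥ sign used in the text.

```python
from typing import NamedTuple

# Cutoffs quoted in the text; the measurement unit is not stated there.
CA125_CUTOFF = 35.0
CA153_CUTOFF = 25.0


class MarkerStatus(NamedTuple):
    ca125_high: bool
    ca153_high: bool
    score_new: str  # "Y" when both markers are high, otherwise "X"


def classify_markers(ca125: float, ca153: float) -> MarkerStatus:
    """Dichotomize each marker at its cutoff and derive Score-New."""
    ca125_high = ca125 >= CA125_CUTOFF
    ca153_high = ca153 >= CA153_CUTOFF
    score_new = "Y" if (ca125_high and ca153_high) else "X"
    return MarkerStatus(ca125_high, ca153_high, score_new)


# Hypothetical values, for illustration only.
print(classify_markers(40.2, 26.0).score_new)  # both markers high
print(classify_markers(40.2, 12.5).score_new)  # only CA125 high
```

Keeping the two single-marker flags alongside Score-New mirrors the analysis, in which the individual categories and the combined variable were each tested as covariates.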

Differences between groups were analyzed using Pearson’s chi-squared test and Fisher’s exact test. Continuous variables were tested for normality and homogeneity of variance and then compared using the t-test or the Wilcoxon rank-sum test, as appropriate. Cox regression analysis was employed to identify independent risk factors for the model. Based on the results of the univariable Cox regression analysis, factors with P<0.05 were selected for further multivariable Cox regression analysis. Subsequently, variables that remained significant in the multivariable analysis, along with clinical stage and lymphovascular invasion (LVI) (both of which have been established in previous studies as closely associated with RFS), were incorporated into the least absolute shrinkage and selection operator (LASSO) regression analysis. This approach was used to determine the final independent risk factors associated with RFS in young breast cancer patients. The discriminative ability and calibration of the nomogram were evaluated using the concordance index (C-index), the area under the receiver operating characteristic (ROC) curve (AUC), and bootstrap calibration curves. Internal validation was performed using the bootstrapping method (500 resamples) for both the training and validation cohorts. Calibration curves were generated to evaluate the agreement between the predicted probabilities and the actual observations. The Prognostic Index (PI) was calculated based on the basic form of the Cox proportional hazards model, and ROC curves were constructed using the Kaplan-Meier method. The formula for the PI is as follows:

$$\mathrm{PI} = \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_m X_m$$

Here, β_i is the regression coefficient of the i-th covariate and X_i is the value of that covariate (i = 1, ..., m). Kaplan-Meier curves were utilized to estimate RFS rates, with differences analyzed via the log-rank test. The clinical utility of the model and the net benefit to patients were assessed through decision curve analysis (DCA).
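As a minimal sketch of how the PI is evaluated for one patient, the snippet below sums coefficient-weighted covariates. The coefficient values are an assumption made for illustration: they are the LASSO estimates quoted in the Figure 2 caption, since the coefficients of the final Cox model are not reported in this section. The covariate names and the example patient are hypothetical.

```python
from typing import Mapping

# Illustrative coefficients: the LASSO estimates quoted in the Figure 2 caption.
# They are placeholders, not the fitted coefficients of the final nomogram.
BETAS = {
    "ca125_high": 1.51,
    "ca153_high": 1.60,
    "radiotherapy": 0.87,
    "lvi": 0.15,
}


def prognostic_index(covariates: Mapping[str, int],
                     betas: Mapping[str, float] = BETAS) -> float:
    """Return PI = sum of beta_i * x_i over the covariates named in `betas`."""
    return sum(beta * covariates[name] for name, beta in betas.items())


# Hypothetical patient: CA125 high, CA153 low, radiotherapy given, LVI present.
patient = {"ca125_high": 1, "ca153_high": 0, "radiotherapy": 1, "lvi": 1}
print(round(prognostic_index(patient), 2))
```

With these placeholder coefficients the example patient has a PI of about 2.53 (1.51 + 0.87 + 0.15); under a Cox model, a higher PI corresponds to a higher relative hazard of recurrence.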

ROC curves were generated using the “survivalROC” R package. The nomogram was constructed using the “nomogramEx” package, while calibration plots were produced using the “rms” package to conduct 500 bootstrap repetitions for model validation in both the training and validation cohorts. Decision curves were plotted using the “dcurves” package. All statistical analyses were performed using IBM SPSS Statistics 26.0 and RStudio. A two-sided P value <0.05 was considered statistically significant.


Results

Analysis of clinical characteristics

The median age of the patients in this study was 37 years (range, 22–40 years). All patients were diagnosed with stage T1–3, N0–3, M0 invasive breast cancer. Regarding surgical procedures, 42 patients underwent breast-conserving surgery (BCS) combined with sentinel lymph node biopsy (SLNB), while 369 patients underwent total mastectomy. In the training cohort, 204 non-recurrent patients (86.08%) and 44 recurrent patients (97.78%) received postoperative chemotherapy, while 52 non-recurrent patients (21.94%) and 26 recurrent patients (57.78%) received postoperative radiotherapy. In the validation cohort, 101 non-recurrent patients (88.6%) and all 15 recurrent patients (100%) received postoperative chemotherapy, while 34 non-recurrent patients (29.82%) and 8 recurrent patients (53.33%) received postoperative radiotherapy. Among the 364 patients (88.56%) who received postoperative chemotherapy, those with estrogen/progesterone receptor-positive and human epidermal growth factor receptor 2-low expression (ER/PR+, HER2-low) received doxorubicin, cyclophosphamide, and nab-paclitaxel (AC-T) or docetaxel and cyclophosphamide (TC) regimens. Patients with HER2-positive status received targeted therapy with trastuzumab or pertuzumab. Detailed clinicopathological characteristics are presented in Table 1. Follow-up was censored upon the occurrence of a recurrence event.

Table 1

Baseline characteristics in the training cohort and validation cohort

Characteristics Training cohort (n=282) Validation cohort (n=129)
Non-recurrence (N=237) Recurrence (N=45) P value Non-recurrence (N=114) Recurrence (N=15) P value
Age 41.00 (37.00–43.00) 39.00 (35.00–43.00) 0.22 40.00 (36.00–43.00) 39.00 (34.00–40.00) 0.09
CA153-category 0.001 0.001
   <25 235 (99.16) 27 (60.00) 112 (98.25) 10 (66.67)
   ≥25 2 (0.84) 18 (40.00)   2 (1.75) 5 (33.33)
Chemotherapy 0.03 0.17
   No 33 (13.92) 1 (2.22)   13 (11.40) 0
   Yes 204 (86.08) 44 (97.78)   101 (88.60) 15 (100.00)
Radiotherapy 0.001 0.07
   No 185 (78.06) 19 (42.22)   80 (70.18) 7 (46.67)
   Yes 52 (21.94) 26 (57.78)   34 (29.82) 8 (53.33)
Endocrinotherapy 0.16 0.057
   No 42 (17.72) 12 (26.67) 16 (14.04) 5 (33.33)
   Yes 195 (82.28) 33 (73.33) 98 (85.96) 10 (66.67)
NACT 0.16 0.002
   No 204 (86.08) 35 (77.78) 98 (85.96) 8 (53.33)
   Yes 33 (13.92) 10 (22.22) 16 (14.04) 7 (46.67)
Operation 0.47 0.14
   BCS 24 (10.13) 3 (6.67) 15 (13.16) 0
   Mastectomy 213 (89.87) 42 (93.33) 99 (86.84) 15 (100.00)
Tumor type 0.91 0.6
   Ductal 170 (71.73) 34 (75.56) 80 (70.18) 13 (86.67)
   Lobular 13 (5.49) 1 (2.22) 2 (1.75) 0
   Micropapillary 13 (5.49) 2 (4.44) 7 (6.14) 0
   Mucinous 9 (3.80) 2 (4.44) 8 (7.02) 0
   Other type 32 (13.50) 6 (13.33) 17 (14.91) 2 (13.33)
NI 0.35 0.02
   No 216 (91.14) 39 (86.67) 101 (88.60) 10 (66.67)
   Yes 21 (8.86) 6 (13.33) 13 (11.40) 5 (33.33)
LVI 0.002 0.009
   No 163 (68.78) 20 (44.44) 71 (62.28) 4 (26.67)
   Yes 74 (31.22) 25 (55.56) 43 (37.72) 11 (73.33)
T-stage 0.03 0.07
   T1 129 (54.43) 15 (33.33) 61 (53.51) 5 (33.33)
   T2 90 (37.97) 25 (55.56) 47 (41.23) 7 (46.67)
   T3 18 (7.59) 5 (11.11) 6 (5.26) 3 (20.00)
pN-stage 0.001 0.02
   N0 143 (60.34) 10 (22.22) 59 (51.75) 3 (20.00)
   N1 55 (23.21) 17 (37.78) 36 (31.58) 5 (33.33)
   N2 26 (10.97) 9 (20.00) 11 (9.65) 5 (33.33)
   N3 13 (5.49) 9 (20.00) 8 (7.02) 2 (13.33)
TNM-stage 0.001 0.02
   I 85 (35.86) 4 (8.89) 34 (29.82) 2 (14.29)
   II 109 (45.99) 22 (48.89) 60 (52.63) 5 (35.71)
   III 43 (18.14) 19 (42.22) 20 (17.54) 7 (50.00)
LN operation 0.07 0.39
   SLND 48 (20.25) 4 (8.89) 17 (14.91) 1 (6.67)
   ALND 189 (79.75) 41 (91.11) 97 (85.09) 14 (93.33)
CA125-category 0.001 0.001
   <35 229 (96.62) 26 (57.78) 106 (92.98) 10 (66.67)
   ≥35 8 (3.38) 19 (42.22) 8 (7.02) 5 (33.33)
ER 0.03 0.12
   Positive 189 (79.75) 29 (64.44) 95 (83.33) 10 (66.67)
   Negative 48 (20.25) 16 (35.56) 19 (16.67) 5 (33.33)
PR 0.49 0.053
   Positive 180 (75.95) 32 (71.11) 93 (81.58) 9 (60.00)
   Negative 57 (24.05) 13 (28.89) 21 (18.42) 6 (40.00)
HER2 0.46 0.04
   Positive 61 (25.74) 14 (31.11) 31 (27.19) 8 (53.33)
   Negative 176 (74.26) 31 (68.89) 83 (72.81) 7 (46.67)
KI67-category 0.68 0.001
   ≤15% 73 (30.80) 12 (26.67) 43 (37.72) 0
   15–30% 71 (29.96) 12 (26.67) 37 (32.46) 3 (20.00)
   >30% 93 (39.24) 21 (46.67) 34 (29.82) 12 (80.00)
Subtypes 0.68 0.03
   Luminal A 64 (27.00) 10 (22.22) 43 (37.72) 0
   Luminal B 86 (36.29) 14 (31.11) 30 (26.32) 4 (26.67)
   Luminal-HER2 45 (18.99) 9 (20.00) 25 (21.93) 6 (40.00)
   HER2-positive 16 (6.75) 5 (11.11) 6 (5.26) 2 (13.33)
   Triple-negative 26 (10.97) 7 (15.56) 10 (8.77) 3 (20.00)
Score-New 0.73 0.71
   X 236 (99.58) 35 (77.78) 114 (100) 12 (80.00)
   Y 1 (0.42) 10 (22.22) 0 3 (20.00)

Data are presented as n (%) or median (interquartile range). ALND, axillary lymph node dissection; BCS, breast-conserving surgery; CA125, carbohydrate antigen 125; CA153, carbohydrate antigen 153; ER, estrogen receptor; HER2, human epidermal growth factor receptor 2; LN, lymph node; LVI, lymphovascular invasion; NACT, neoadjuvant chemotherapy; NI, nerve invasion; PR, progesterone receptor; Score-New, a new covariate indicating whether CA125 and CA153 were both highly expressed; SLND, sentinel lymph node dissection; TNM, tumor-node-metastasis; X, the two markers were not both highly expressed; Y, both CA125 and CA153 were highly expressed.

Univariate and multivariate Cox analysis and nomogram constructions

Model variables were screened in the training cohort. Univariate Cox regression analysis showed that CA125, CA153, postoperative radiotherapy, lymphovascular invasion (LVI), tumor size, lymph node stage, clinical stage, estrogen receptor expression, and the co-expression of CA125 and CA153 were all correlated with recurrence (P<0.05). Indicators with statistical significance in univariate analysis were incorporated into multivariate Cox regression. Multivariate analysis showed that categorical CA125, categorical CA153, and postoperative radiotherapy were independent risk factors (Table 2). Because clinical stage and LVI also affect tumor recurrence, these five variables were entered into LASSO regression to screen predictive variables and avoid losing important indicators. Following LASSO regression, the five clinicopathological variables were reduced to four: categorical CA125, categorical CA153, LVI, and postoperative radiotherapy (Figure 2).

Table 2

Cox regression analysis of the risk factors for recurrence in the training cohort

Factors Univariate analysis Multivariate analysis
HR 95% CI P HR 95% CI P
Age 1 0.92–1.10 0.87
CA153-category
   <25 1 1
   ≥25 15.5 8.41–28.68 0.001 11.69 4.78–28.56 0.001
Chemotherapy
   No 1
   Yes 6.54 0.90–47.48 0.06
Radiotherapy
   No 1 1
   Yes 4.22 2.33–7.62 0.001 2.40 1.13–5.11 0.02
Endocrinotherapy
   No 1
   Yes 0.60 0.30–1.16 0.13
NACT
   No 1
   Yes 1.67 0.82–3.36 0.16
Operation
   BCS 1
   Mastectomy 1.55 0.48–4.99 0.47
Tumor type
   Ductal 1
   Lobular 0.4 0.06–2.93 0.37
   Micropapillary 0.8 0.19–3.33 0.76
   Mucinous 1.10 0.27–4.59 0.89
   Other type 0.96 0.40–2.29 0.93
NI
   No 1
   Yes 1.57 0.67–3.71 0.30
LVI
   No 1 1
   Yes 2.58 1.43–4.64 0.002 1.46 0.70–3.06 0.32
T-stage
   T1 1 1
   T2 2.17 1.14–4.11 0.02 1.48 0.68–3.21 0.33
   T3 2.26 0.82–6.22 0.11 2.09 0.59–7.35 0.25
pN-stage
   N0 1 1
   N1 3.92 1.79–8.55 0.001 2.99 0.92–9.67 0.07
   N2 4.55 1.85–11.20 0.001 1.24 0.08–20.43 0.88
   N3 8.32 3.38–20.49 0.001 1.82 0.10–33.89 0.69
TNM-stage
   I 1 1
   II 3.93 1.36–11.41 0.01 0.62 0.14–2.83 0.54
   III 8.16 2.78–24.01 0.001 1.03 0.06–18.17 0.99
LN operation
   Sentinel 1
   Axillary 2.37 0.85–6.62 0.10
CA125-category
   <35 1 1
   ≥35 11.01 6.10–20.19 0.001 10.16 4.43–23.32 0.001
ER
   Positive 1 1
   Negative 2.14 1.16–3.93 0.02 1.65 0.81–3.34 0.18
PR
   Positive 1
   Negative 1.3 0.68–2.48 0.43
HER2
   Positive 1
   Negative 0.75 0.40–1.41 0.37
KI67-category
   ≤15% 1
   15–30% 1.06 0.48–2.36 0.88
   >30% 1.4 0.69–2.85 0.35
Subtypes
   Luminal A 1
   Luminal B 1.08 0.48–2.43 0.85
   Luminal-HER2 1.36 0.54–3.29 0.53
   HER2-positive 2.06 0.71–6.04 0.19
   Triple-negative 1.73 0.66–4.53 0.29
Score-New
   X 1 1
   Y 15.67 7.60–32.34 0.001 0.15 0.04–0.56 0.16

BCS, breast-conserving surgery; CA125, carbohydrate antigen 125; CA153, carbohydrate antigen 153; CI, confidence interval; ER, estrogen receptor; HER2, human epidermal growth factor receptor 2; HR, hazard ratio; LN, lymph node; LVI, lymphovascular invasion; NACT, neoadjuvant chemotherapy; NI, nerve invasion; PR, progesterone receptor; Score-New, a new covariate indicating whether CA125 and CA153 were both highly expressed; TNM, tumor-node-metastasis; X, the two markers were not both highly expressed; Y, both CA125 and CA153 were highly expressed.

Figure 2 (A) In the LASSO model, the tuning parameter (λ) was selected using 10-fold cross-validation based on the 1-SE of the minimum criteria (the 1-SE criteria). Feature selection was guided by the λ value that yielded the minimum average binomial deviance. Dotted vertical lines indicate the optimal values determined by both the minimum criteria and the 1-SE criteria. An optimal λ of 0.02 was chosen. (B) LASSO coefficient profiles of the five clinicopathological features. The vertical line marks the value selected through 10-fold cross-validation, at which the model retained five non-zero coefficients: CA125 category, CA153 category, radiotherapy, LVI, and TNM stage, with β coefficients of 1.51, 1.6, 0.87, 0.15, and 0.03, respectively. Because the contribution of TNM stage was minimal and could bias the nomogram, the remaining four variables were chosen for the final model. CA125, carbohydrate antigen 125; CA153, carbohydrate antigen 153; LASSO, least absolute shrinkage and selection operator; LVI, lymphovascular invasion; SE, standard error; TNM, tumor-node-metastasis.

Nomograms for predicting 3- and 5-year RFS were constructed based on CA125 status, CA153 status, LVI, and postoperative radiotherapy (Figure 3). In the nomogram, the length of the line corresponding to each variable reflects its relative contribution to the prognosis of breast cancer. The findings indicate that high CA125 expression, high CA153 expression, the receipt of postoperative radiotherapy, and the presence of LVI are the four most significant predictors of recurrence.

Figure 3 Nomogram for predicting 3- and 5-year RFS in patients with the indicated prognostic factors. The total score is calculated by summing the individual scores of all predictors. The predicted probability of RFS is then obtained by projecting the total score onto the bottom scale. CA125, carbohydrate antigen 125; CA153, carbohydrate antigen 153; LVI, lymphovascular invasion; RFS, recurrence-free survival.

Model performance and clinical application

Our nomogram was validated in both the training and validation cohorts. The calibration curves showed excellent agreement between the actual survival probability and the predicted survival probability in both cohorts (Figure 4). The AUC was used to assess discriminatory power. For predicting the 3-year RFS rate, the AUC was 0.854 in the training cohort and 0.793 in the validation cohort. For the 5-year RFS rate, the AUC was 0.85 in the training cohort and 0.801 in the validation cohort (Figure 5). Our findings indicate that the nomogram can effectively predict RFS in patients. During the overall 5-year follow-up, the C-index was 0.837 [95% confidence interval (CI), 0.774–0.900] for the training cohort and 0.783 (95% CI, 0.665–0.900) for the validation cohort, indicating that the model has significant discriminative ability.

Figure 4 Calibration curves for 3- and 5-year RFS. (A) 3-year RFS in the training cohort; (B) 3-year RFS in the validation cohort; (C) 5-year RFS in the training cohort; (D) 5-year RFS in the validation cohort. RFS, recurrence-free survival.
Figure 5 ROC curves for 3- and 5-year RFS prediction in young breast cancer. (A) 3-year RFS (training cohort); (B) 3-year RFS (validation cohort); (C) 5-year RFS (training cohort); (D) 5-year RFS (validation cohort). AUC, area under the curve; RFS, recurrence-free survival; ROC, receiver operating characteristic.
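For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is a simplified illustration, not the routine used in the study: pairs with tied follow-up times are ignored, and the follow-up data in the example are invented.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float],
                    events: Sequence[int],
                    risk: Sequence[float]) -> float:
    """Harrell's C for right-censored data.

    A pair (i, j) is comparable when subject i has the shorter follow-up time
    and experienced the event. The pair is concordant when subject i also has
    the higher risk score; tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if times[i] < times[j] and events[i] == 1:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: months of follow-up, event flag (1 = recurrence), risk score.
example_times = [10, 20, 30, 40]
example_events = [1, 1, 0, 0]
example_risk = [2.5, 1.0, 1.5, 0.2]
print(harrell_c_index(example_times, example_events, example_risk))
```

In this toy example five pairs are comparable and four are concordant, so C = 0.8. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.837 and 0.783 should be read.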

The clinical utility of the model was evaluated using DCA. The nomogram in the training cohort demonstrated a higher net benefit within the threshold probability range of 10% to 75% (Figure 6).

Figure 6 DCA of the nomogram for 3- and 5-year survival prediction. DCA, decision curve analysis.
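The net benefit plotted in a decision curve can be sketched as follows. This is a simplified illustration for a binary outcome (recurrence by a fixed time point, with no censoring); a DCA for survival data additionally adjusts for censoring. Function names and the example predictions are hypothetical.

```python
from typing import Sequence


def net_benefit(predicted_risk: Sequence[float],
                outcome: Sequence[int],
                threshold: float) -> float:
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at threshold probability pt."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcome)
    tp = sum(1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcome: Sequence[int], threshold: float) -> float:
    """Reference strategy: classify every patient as high risk."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical predicted recurrence risks and observed outcomes (1 = recurrence).
risks = [0.9, 0.8, 0.3, 0.2, 0.1]
observed = [1, 0, 1, 0, 0]
for pt in (0.25, 0.50):
    print(pt, net_benefit(risks, observed, pt), net_benefit_treat_all(observed, pt))
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (the treat-none reference); the 10% to 75% range reported above is the span of thresholds over which that held for the nomogram.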

Survival analysis

We calculated the PI and used it as a continuous variable to identify ROC curve cut-off values (Figure 5). In the training cohort, the 3-year cut-off was 0.875 (associated with a 5.4% recurrence rate), and the 5-year cut-off was 1.25 (associated with a 13.4% recurrence rate); values above these thresholds represent high recurrence risk. Kaplan-Meier curves were then plotted (Figure 7). Log-rank tests showed significant differences between the high- and low-risk groups, with P=0.001 at both 3 and 5 years in the training cohort and in the validation cohort. The difference between the low-risk and high-risk groups was therefore significant in both cohorts.

Figure 7 Kaplan-Meier curves for RFS: (A) 3-year RFS, training cohort; (B) 3-year RFS, validation cohort; (C) 5-year RFS, training cohort; (D) 5-year RFS, validation cohort. RFS, recurrence-free survival.
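The risk grouping and survival estimation described above can be sketched as follows. This is an illustration under stated assumptions: the 5-year cut-off of 1.25 is taken from the text, while the PI values and follow-up data are invented, and the product-limit estimator is written out by hand rather than taken from the software used in the study.

```python
from typing import List, Sequence, Tuple


def split_by_cutoff(pi_values: Sequence[float], cutoff: float) -> List[str]:
    """Label a patient high-risk when the PI exceeds the cut-off."""
    return ["high" if pi > cutoff else "low" for pi in pi_values]


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, estimated RFS probability) at each distinct event time."""
    curve = []
    survival = 1.0
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        recurred = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - recurred / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical cohort: PI, follow-up in months, event flag (1 = recurrence).
pi_values = [2.53, 0.15, 1.51, 0.0, 2.38, 0.87]
times = [14, 60, 30, 60, 22, 60]
events = [1, 0, 1, 0, 1, 0]

groups = split_by_cutoff(pi_values, cutoff=1.25)
for label in ("low", "high"):
    t = [ti for ti, g in zip(times, groups) if g == label]
    e = [ei for ei, g in zip(events, groups) if g == label]
    print(label, kaplan_meier(t, e))
```

A group with no recurrences yields an empty list, meaning its estimated RFS stays at 1.0 throughout follow-up. Comparing the two resulting curves is what the log-rank test formalizes.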

Discussion

This study proposes a comprehensive risk assessment model integrating preoperative serum tumor markers (CA125 and CA153) with subsequent anticipated treatment pathways to predict the risk of postoperative recurrence in young breast cancer patients. This represents a transition from qualitative description to quantitative analysis with reliable predictive value. During the validation process, the model demonstrated significant discriminatory power and a wide range of clinical application thresholds. Beyond the routine imaging surveillance recommended by NCCN guidelines, our developed nomogram serves as a quantitative tool. By combining baseline biological status with the expected treatment course, it assists clinicians in forecasting long-term recurrence risks and facilitates individualized risk stratification for high-risk populations.

Li et al. constructed a nomogram to predict the impact of serum tumor markers on overall survival (OS) and disease-free survival (DFS) in 576 young breast cancer patients through a cohort analysis (27). Unlike the present study, the nomogram by Li et al. has not yet been internally or externally validated, and it does not include histopathological indicators (such as nerve invasion and LVI), which can lead to the loss of highly relevant variables. Furthermore, Li L et al. established an SVM model based on the serum tumor markers CA125 and CA153 to predict the risk of recurrence in 168 patients with invasive breast cancer, but it only conducted analysis and prediction based on a single variable (28). The advantage of our model is that it follows a complete construction process for the prediction model and has a wider range of clinical application thresholds.

Wei et al. reported on a study of recurrence risk in postoperative radiotherapy and endocrine therapy for breast cancer, determining that postoperative radiotherapy significantly improved long-term survival in patients intolerant to endocrine therapy (29). Zhao et al. continued their research on postmastectomy radiotherapy (PMRT) for breast cancer, investigating the role of postoperative radiotherapy in patients with T1-T2 stage breast cancer and 1–3 positive axillary lymph nodes. Their work further demonstrated that PMRT reduces the probability of locoregional recurrence in these patients (30). Ronsini et al. reported on a semiquantitative evaluation of lymphovascular space invasion in patients with early-stage cervical cancer, showing that lymphovascular space invasion may shorten the DFS of patients with early-stage cervical cancer and increase the risk of lymph node metastasis and DM (31). The present study confirmed that the risk factors identified in these studies remain closely related to the recurrence risk of our patients. In related research and guidelines, postoperative radiotherapy has been widely recognized as a remedial measure after BCS (30,32,33). The advantage of our study is that we used these indicators with strong predictive ability as model variables. In addition, these recognized and routinely monitored clinical indicators have a higher degree of acceptance and adoption.

The findings of this study indicate that CA125 and CA153 are strong predictors of recurrence in young patients with breast cancer, as evidenced by the LASSO regression analysis (34), in which they accounted for the largest proportions of the coefficients. Consequently, the combined detection of these two markers is of considerable clinical importance. Previous research has established CA125 as a specific tumor marker for ovarian cancer, while CA153, a glycoprotein of the mucin-1 (MUC-1) family, is overexpressed in various malignancies and has been identified as a biomarker for numerous tumors (10,12,16). The clinical value of this combination lies not only in improving on the sensitivity and specificity of the individual markers through biological complementarity, which provides a more reliable early warning of recurrence risk, but also in laying a foundation for the construction of efficient risk prediction models. Future research should integrate additional multidimensional indicators to achieve dynamic monitoring and early warning of recurrence risk.
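To make the notion of a coefficient proportion concrete, the following minimal Python sketch computes each variable's share of the total absolute coefficient magnitude. The function name `coefficient_shares` and all coefficient values are hypothetical and chosen only for illustration; they are not the values fitted in this study, and the exact definition of the proportion used in our analysis is assumed here to be the absolute coefficient divided by the sum of absolute coefficients.

```python
def coefficient_shares(coefs):
    """Return each variable's share of the summed absolute coefficients."""
    total = sum(abs(b) for b in coefs.values())
    if total == 0:
        raise ValueError("all coefficients are zero; shares are undefined")
    return {name: abs(b) / total for name, b in coefs.items()}


# Hypothetical penalized coefficients, NOT the values fitted in this study.
# A coefficient of exactly 0.0 represents a variable dropped by the penalty.
example_coefs = {
    "CA125": 0.40,
    "CA153": 0.35,
    "LVI": 0.15,
    "radiotherapy": -0.10,
    "age": 0.0,
}

shares = coefficient_shares(example_coefs)
ranked = sorted(shares, key=shares.get, reverse=True)
for name in ranked:
    print(f"{name}: {shares[name]:.2f}")
```

Such shares are only comparable across variables when the predictors have been put on a common scale (for example, standardized) before fitting; otherwise the magnitude of a coefficient also reflects the unit of measurement.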

This study has several limitations regarding the development of the predictive model. First, because the study population was derived from a single center, the cohort is relatively small and the model lacks external validation. Consequently, model validation was conducted internally by splitting the study cohort proportionally. Although a single-center design can reduce heterogeneity in treatment protocols and clinical indicators, the absence of external validation may limit the generalizability of the model in broader clinical practice. Second, the endpoint of this study was any recurrence event rather than recurrence at a specific site. Previous studies have shown that the site of recurrence after breast cancer surgery is closely related to long-term survival, and the risk of recurrence also changes over time. For studying the impact of recurrence at a specific site on long-term survival, competing-risks regression has been increasingly used (35-37). This method treats recurrences at other sites as competing events rather than ignoring them. When applied to OS and DFS, it likewise accounts for other outcome events and can therefore yield more accurate predictions; multivariable Cox regression that does not model competing events tends to overestimate the probability of the target event. Third, our research variables are limited to clinicopathological indicators. In recent years, research on genetic testing and diagnosis in breast cancer has progressed rapidly, including studies of the prognostic role of breast cancer susceptibility gene (BRCA) 1/2 in triple-negative breast cancer, prognostic analysis based on the 21-gene assay, and the extensive application of the Nottingham Prognostic Index (NPI) in clinical prediction models for breast cancer (38-40).
In addition, imaging-based studies of breast cancer have been reported, such as a study of the probability of malignancy in Breast Imaging Reporting and Data System (BI-RADS) ultrasonography category 4A lesions and a study predicting sentinel lymph node metastasis from breast magnetic resonance imaging (MRI) findings (41,42). In future work, we will study the impact of radiomics and genomics parameters on the prognosis of breast cancer, which may inform the development of different forms of prognostic models and help guide clinical practice.
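The competing-risk issue raised among the limitations above can be illustrated with a small, self-contained Python sketch. It compares the Aalen-Johansen cumulative incidence estimate for one event type with the naive 1 − Kaplan-Meier estimate that treats competing events as censored. The toy follow-up records, the event coding (0 = censored, 1 = event of interest, 2 = competing event), and the function names are assumptions made for illustration only; they are not data or code from this study, and a real analysis would use an established statistical package.

```python
def cumulative_incidence(records, cause):
    """Aalen-Johansen cumulative incidence for `cause`.

    records: list of (time, status); status 0 = censored,
    any positive integer = an event of that type.
    """
    event_times = sorted({t for t, s in records if s != 0})
    surv = 1.0  # probability of being free of ANY event just before t
    cif = 0.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for u, _ in records if u >= t)
        d_cause = sum(1 for u, s in records if u == t and s == cause)
        d_any = sum(1 for u, s in records if u == t and s != 0)
        cif += surv * d_cause / at_risk
        surv *= 1.0 - d_any / at_risk
        curve.append((t, cif))
    return curve


def one_minus_km(records, cause):
    """Naive 1 - Kaplan-Meier: competing events are treated as censored."""
    event_times = sorted({t for t, s in records if s == cause})
    surv = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for u, _ in records if u >= t)
        d_cause = sum(1 for u, s in records if u == t and s == cause)
        surv *= 1.0 - d_cause / at_risk
        curve.append((t, 1.0 - surv))
    return curve


# Toy data (hypothetical): (follow-up time, status)
toy = [(1, 1), (2, 2), (3, 1), (4, 0), (5, 2), (6, 1)]

aj = cumulative_incidence(toy, cause=1)
naive = one_minus_km(toy, cause=1)
print("Aalen-Johansen:", [(t, round(p, 3)) for t, p in aj])
print("1 - KM (naive):", [(t, round(p, 3)) for t, p in naive])
```

In this toy example the naive estimate is never below the Aalen-Johansen estimate at any shared time point, which is the sense in which ignoring competing events overstates the probability of the target event.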


Conclusions

In summary, we developed a comprehensive recurrence risk assessment model for young breast cancer patients by integrating preoperative baseline characteristics with anticipated treatment pathways. Our findings confirm that the model incorporating preoperative serum tumor markers CA125 and CA153 achieves the highest predictive performance. Moreover, these markers can be measured non-invasively and are easily accessible. This model enables clinicians to make a preliminary estimate of the probability of recurrence in high-risk populations and to anticipate long-term risk, which is of clinical value for guiding postoperative treatment in young patients with breast cancer.
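As a rough illustration of how a Cox-based nomogram turns a patient's covariates into a recurrence probability, the sketch below applies the standard Cox relation S(t | x) = S0(t)^exp(β·x). The variable names, coefficients, and baseline RFS values are hypothetical placeholders, not the fitted values of our model, and the function `rfs_probability` is introduced here only for illustration.

```python
import math


def rfs_probability(baseline_survival, coefs, patient):
    """Predicted RFS at a fixed horizon under a Cox model.

    baseline_survival: S0(t), the RFS of a reference patient at that horizon.
    coefs / patient: dicts keyed by variable name (log hazard ratios / values).
    """
    linear_predictor = sum(coefs[name] * patient[name] for name in coefs)
    return baseline_survival ** math.exp(linear_predictor)


# Hypothetical log hazard ratios and baseline RFS, NOT this study's estimates.
coefs = {"CA125_high": 0.6, "CA153_high": 0.5, "radiotherapy": -0.4}
baseline = {"3-year": 0.90, "5-year": 0.82}

patient = {"CA125_high": 1, "CA153_high": 1, "radiotherapy": 1}
for horizon, s0 in baseline.items():
    rfs = rfs_probability(s0, coefs, patient)
    print(f"{horizon}: RFS {rfs:.3f}, recurrence risk {1 - rfs:.3f}")
```

A nomogram presents the same calculation graphically: each variable contributes points proportional to its coefficient, and the total points map to the predicted 3- and 5-year RFS.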


Acknowledgments

None.


Footnote

Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://gs.amegroups.com/article/view/10.21037/gs-2026-1-0017/rc

Data Sharing Statement: Available at https://gs.amegroups.com/article/view/10.21037/gs-2026-1-0017/dss

Peer Review File: Available at https://gs.amegroups.com/article/view/10.21037/gs-2026-1-0017/prf

Funding: This study was supported by the Key Medical Disciplines of Shanghai Municipal Health Commission (No. 2024ZDXK0045).

Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://gs.amegroups.com/article/view/10.21037/gs-2026-1-0017/coif). The authors have no conflicts of interest to declare.

Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of Shanghai Fengxian District Central Hospital (the South Campus of Shanghai Sixth People’s Hospital) (No. 2025-KY-106-02). Informed consent was obtained from all patients.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.


References

  1. Porcu E, Cillo GM, Cipriani L, et al. Impact of BRCA1 and BRCA2 mutations on ovarian reserve and fertility preservation outcomes in young women with breast cancer. J Assist Reprod Genet 2020;37:709-15. [Crossref] [PubMed]
  2. Hu X, Myers KS, Oluyemi ET, et al. Presentation and characteristics of breast cancer in young women under age 40. Breast Cancer Res Treat 2021;186:209-17. [Crossref] [PubMed]
  3. Sopik V. International variation in breast cancer incidence and mortality in young women. Breast Cancer Res Treat 2021;186:497-507. [Crossref] [PubMed]
  4. Poorvu PD, Gelber SI, Rosenberg SM, et al. Prognostic Impact of the 21-Gene Recurrence Score Assay Among Young Women With Node-Negative and Node-Positive ER-Positive/HER2-Negative Breast Cancer. J Clin Oncol 2020;38:725-33. [Crossref] [PubMed]
  5. Walsh SM, Zabor EC, Flynn J, et al. Breast cancer in young black women. Br J Surg 2020;107:677-86. [Crossref] [PubMed]
  6. Vila MM, Barco Berron SD, Gil-Gil M, et al. Psychosocial aspects and life project disruption in young women diagnosed with metastatic hormone-sensitive HER2-negative breast cancer. Breast 2020;53:44-50. [Crossref] [PubMed]
  7. Paluch-Shimon S, Cardoso F, Partridge AH, et al. ESO-ESMO 4th International Consensus Guidelines for Breast Cancer in Young Women (BCY4). Ann Oncol 2020;31:674-96. [Crossref] [PubMed]
  8. Anders CK, Johnson R, Litton J, et al. Breast cancer before age 40 years. Semin Oncol 2009;36:237-49. [Crossref] [PubMed]
  9. Merlo DF, Ceppi M, Filiberti R, et al. Breast cancer incidence trends in European women aged 20-39 years at diagnosis. Breast Cancer Res Treat 2012;134:363-70. [Crossref] [PubMed]
  10. Rosenberg SM, Dominici LS, Gelber S, et al. Association of Breast Cancer Surgery With Quality of Life and Psychosocial Well-being in Young Breast Cancer Survivors. JAMA Surg 2020;155:1035-42. [Crossref] [PubMed]
  11. Cramer DW, O’Rourke DJ, Vitonis AF, et al. CA125 immune complexes in ovarian cancer patients with low CA125 concentrations. Clin Chem 2010;56:1889-92. [Crossref] [PubMed]
  12. Li J, Liu L, Feng Z, et al. Tumor markers CA15-3, CA125, CEA and breast cancer survival by molecular subtype: a cohort study. Breast Cancer 2020;27:621-30. [Crossref] [PubMed]
  13. Lin Q, Chen XY, Liu WF, et al. Diagnostic value of CA-153 and CYFRA 21-1 in predicting intraocular metastasis in patients with metastatic lung cancer. Cancer Med 2020;9:1279-86. [Crossref] [PubMed]
  14. Ge QM, Zou YT, Shi WQ, et al. Ocular Metastasis in Elderly Lung Cancer Patients: Potential Risk Factors of CA-125, CA-153 and TPSA. Cancer Manag Res 2020;12:1801-8. [Crossref] [PubMed]
  15. Zhang J, Wei Q, Dong D, et al. The role of TPS, CA125, CA15-3 and CEA in prediction of distant metastasis of breast cancer. Clin Chim Acta 2021;523:19-25. [Crossref] [PubMed]
  16. Li Y, Zhang J, Wang B, et al. A nomogram based on clinicopathological features and serological indicators predicting breast pathologic complete response of neoadjuvant chemotherapy in breast cancer. Sci Rep 2021;11:11348. [Crossref] [PubMed]
  17. Chen S, Zeng J, Gong K, et al. A novel four-serum marker model for early detection and therapeutic monitoring of breast cancer. Sci Rep 2025;16:614. [Crossref] [PubMed]
  18. Giridhar KV, Sinnwell JP, Slettedahl SW, et al. Plasma assay of methylated DNA markers detects recurrent metastatic breast cancer. NPJ Breast Cancer 2025;12:2. [Crossref] [PubMed]
  19. Li X, Dai D, Chen B, et al. Prognostic Values Of Preoperative Serum CEA And CA125 Levels And Nomograms For Young Breast Cancer Patients. Onco Targets Ther 2019;12:8789-800. [Crossref] [PubMed]
  20. Li L, Gao Q, Xu G, et al. Postoperative recurrence analysis of breast cancer patients based on clinical serum markers using discriminant methods. Cancer Biomark 2017;19:403-9. [Crossref] [PubMed]
  21. Ajani JA, D’Amico TA, Bentrem DJ, et al. Gastric Cancer, Version 2.2025, NCCN Clinical Practice Guidelines In Oncology. J Natl Compr Canc Netw 2025;23:169-91. [Crossref] [PubMed]
  22. Liu J, Peng X, Yang Y, et al. The value of hsa_circ_0058514 in plasma extracellular vesicles for breast cancer. Front Oncol 2022;12:995196. [Crossref] [PubMed]
  23. Han T, Deng S, Xia D, et al. Clinical value of cystatin S in patients with colorectal cancer chemotherapy. Front Oncol 2025;15:1640646. [Crossref] [PubMed]
  24. Wang L, Wei S, Zhou B, et al. A nomogram model to predict the venous thromboembolism risk after surgery in patients with gynecological tumors. Thromb Res 2021;202:52-8. Erratum in: Thromb Res 2022;210:4-5.
  25. Qi Y, Wu S, Tao L, et al. Development of Nomograms for Predicting Lymph Node Metastasis and Distant Metastasis in Newly Diagnosed T1-2 Non-Small Cell Lung Cancer: A Population-Based Analysis. Front Oncol 2021;11:683282. [Crossref] [PubMed]
  26. Teng L, Zhang Z, Du J, et al. Nomogram construction and survival analysis in T3N0M0 breast cancer: a SEER population-based analysis. Sci Rep 2025;15:25194. [Crossref] [PubMed]
  27. Li X, Dai D, Chen B, et al. Prognostic Values Of Preoperative Serum CEA And CA125 Levels And Nomograms For Young Breast Cancer Patients. Onco Targets Ther 2019;12:8789-800. [Crossref] [PubMed]
  28. Li L, Gao Q, Xu G, et al. Postoperative recurrence analysis of breast cancer patients based on clinical serum markers using discriminant methods. Cancer Biomark 2017;19:403-9. [Crossref] [PubMed]
  29. Wei M, Wang X, Zimmerman DN, et al. Endocrine therapy and radiotherapy use among older women with hormone receptor-positive, clinically node-negative breast cancer. Breast Cancer Res Treat 2021;187:287-94. [Crossref] [PubMed]
  30. Zhao JM, An Q, Sun CN, et al. Prognostic factors for breast cancer patients with T1-2 tumors and 1-3 positive lymph nodes and the role of postmastectomy radiotherapy in these patients. Breast Cancer 2021;28:298-306. [Crossref] [PubMed]
  31. Ronsini C, Anchora LP, Restaino S, et al. The role of semiquantitative evaluation of lympho-vascular space invasion in early stage cervical cancer patients. Gynecol Oncol 2021;162:299-307. [Crossref] [PubMed]
  32. Denduluri N, Somerfield MR, Chavez-MacGregor M, et al. Selection of Optimal Adjuvant Chemotherapy and Targeted Therapy for Early Breast Cancer: ASCO Guideline Update. J Clin Oncol 2021;39:685-93. [Crossref] [PubMed]
  33. Korde LA, Somerfield MR, Carey LA, et al. Neoadjuvant Chemotherapy, Endocrine Therapy, and Targeted Therapy for Breast Cancer: ASCO Guideline. J Clin Oncol 2021;39:1485-505. [Crossref] [PubMed]
  34. Qin JN, Dai WB, Zhang WH, et al. Identification of optimal biomarkers associated with distant metastasis in breast cancer using Boruta and Lasso machine learning algorithms. BMC Cancer 2025;25:1311. [Crossref] [PubMed]
  35. Kennecke H, Yerushalmi R, Woods R, et al. Metastatic behavior of breast cancer subtypes. J Clin Oncol 2010;28:3271-7. [Crossref] [PubMed]
  36. Kimbung S, Loman N, Hedenfalk I. Clinical and molecular complexity of breast cancer metastases. Semin Cancer Biol 2015;35:85-95. [Crossref] [PubMed]
  37. Kast K, Link T, Friedrich K, et al. Impact of breast cancer subtypes and patterns of metastasis on outcome. Breast Cancer Res Treat 2015;150:621-9. [Crossref] [PubMed]
  38. Sparano JA, Crager MR, Tang G, et al. Development and Validation of a Tool Integrating the 21-Gene Recurrence Score and Clinical-Pathological Features to Individualize Prognosis and Prediction of Chemotherapy Benefit in Early Breast Cancer. J Clin Oncol 2021;39:557-64. [Crossref] [PubMed]
  39. Bernstein-Molho R, Laitman Y, Galper S, et al. Locoregional Treatments and Ipsilateral Breast Cancer Recurrence Rates in BRCA1/2 Mutation Carriers. Int J Radiat Oncol Biol Phys 2021;109:1332-40. [Crossref] [PubMed]
  40. Elwood JM, Tawfiq E. Development and validation of a new predictive model for breast cancer survival in New Zealand and comparison to the Nottingham prognostic index. BMC Cancer 2018;18:897. [Crossref] [PubMed]
  41. Yang Y, Hu Y, Shen S, et al. A new nomogram for predicting the malignant diagnosis of Breast Imaging Reporting and Data System (BI-RADS) ultrasonography category 4A lesions in women with dense breast tissue in the diagnostic setting. Quant Imaging Med Surg 2021;11:3005-17. [Crossref] [PubMed]
  42. Zhang X, Yang Z, Cui W, et al. Preoperative prediction of axillary sentinel lymph node burden with multiparametric MRI-based radiomics nomogram in early-stage breast cancer. Eur Radiol 2021;31:5924-39. [Crossref] [PubMed]
Cite this article as: Du X, Tang Z, Shen Y, Zhang G. Development and validation of a nomogram to assess recurrence risk in young patients with breast cancer based on preoperative serum tumor markers. Gland Surg 2026;15(6):166. doi: 10.21037/gs-2026-1-0017
