A practical ultrasound-based risk stratification system for differentiating follicular thyroid carcinoma and adenoma
Highlight box
Key findings
• The ultrasound-based risk stratification system (F-Score) demonstrated good predictive performance both in the training set and the validation set, with an area under the curve of 0.877.
What is known and what is new?
• The preoperative diagnosis of follicular thyroid carcinoma (FTC) remains a major challenge, and the commonly used ultrasound-based risk stratification systems for thyroid nodules exhibit unsatisfactory predictive efficacy for FTC.
• The current study has identified margin and halo sign as the most critical ultrasound features for distinguishing FTC from follicular thyroid adenoma (FTA), followed by echogenicity, calcification, and echotexture. The F-Score can effectively differentiate between FTC and FTA.
What is the implication, and what should change now?
• The application of the F-Score will enhance clinicians’ ability to identify FTC and optimize the management of patients with follicular neoplasms.
• A prospective multicenter study is needed to validate the diagnostic efficacy of the F-Score in the future.
Introduction
The prevalence of thyroid carcinoma remains high worldwide (1). Follicular thyroid carcinoma (FTC) is the second most common thyroid malignancy, accounting for approximately 10% of all thyroid carcinomas (2). Compared to the more common papillary thyroid carcinoma (PTC), FTC is relatively more aggressive, often invading blood vessels and/or the capsule, and has a higher incidence of bone and lung metastases (3). FTC has a higher mortality rate, and the 5- and 10-year survival rates of patients with distant metastasis are significantly reduced (4,5). Therefore, early diagnosis and appropriate treatment are of great significance for patients with FTC.
However, the preoperative diagnosis of FTC remains a major challenge in clinical practice, complicating the formulation of management strategies for follicular thyroid disease. The biggest challenge in diagnosing FTC lies in differentiating it from follicular thyroid adenoma (FTA), and the primary distinction between the two is that FTC exhibits invasive biological behavior. Currently, the diagnosis of FTC relies on postoperative histopathology to identify vascular/capsular invasion, while preoperative diagnostic techniques have limited efficacy in identifying FTC. Although fine-needle aspiration biopsy (FNAB) remains the gold standard for the preoperative evaluation of thyroid nodules (6), its utility in distinguishing FTC from FTA is constrained by their cytomorphological similarities. In terms of molecular markers, although some genetic mutations such as RAS, TP53, PAX8-PPARγ, and TERT promoter are potentially associated with an increased risk of FTC (7,8), there is still significant overlap between FTC and FTA due to their common tissue origin and genetic background. Therefore, no efficient molecular markers are available so far for the preoperative diagnosis of FTC (9).
Ultrasound remains an important imaging modality for thyroid nodule assessment, and the application of ultrasound-based risk stratification systems (RSSs) has further improved its usefulness. However, FTC often appears as moderate/low-risk on RSSs, exhibiting significant overlap with FTA and complicating their differentiation. Studies have shown that the commonly used RSSs for thyroid nodules have relatively low diagnostic efficacy in differentiating follicular neoplasms (10). Encouragingly, a few researchers have recently attempted to develop ultrasound-based prediction models specifically for follicular tumors or FTC (11-13). Among them, Li et al. (13) established the Follicular-Thyroid Imaging Reporting and Data System (F-TIRADS) for differentiating FTC from FTA based on a training cohort of 703 cases. In the validation set, the area under the receiver operating characteristic (ROC) curve of F-TIRADS was 0.81 [95% confidence interval (CI): 0.71–0.86], showing good diagnostic value and promising application prospects for differentiating follicular tumors. However, there is significant heterogeneity among studies, whether in terms of the study population or the indicators evaluated, which will inevitably affect the generalizability of these prediction models. Therefore, the purposes of this study were to analyze the differences in ultrasound characteristics between FTC and FTA, establish an ultrasound-based RSS for follicular tumors, and compare it with the commonly used TIRADS and F-TIRADS in differentiating follicular neoplasms. We present this article in accordance with the TRIPOD reporting checklist (available at https://gs.amegroups.com/article/view/10.21037/gs-2025-225/rc).
Methods
Study population
This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of the Affiliated Hospital of Integrated Traditional Chinese and Western Medicine, Nanjing University of Chinese Medicine (No. 2024-LWKY-044) as the lead center, with other centers providing retrospective anonymized data. The committee waived the requirement for informed consent due to the retrospective nature of the study. We reviewed the data of patients who underwent thyroid surgery at the Affiliated Hospital of Integrated Traditional Chinese and Western Medicine, Nanjing University of Chinese Medicine, Northern Jiangsu People’s Hospital Affiliated with Yangzhou University, and Affiliated People’s Hospital of Jiangsu University from January 2017 to December 2024. Demographic and clinical information, ultrasound images, and postoperative pathological examination results were collected from the patients’ electronic medical records. The inclusion criteria were: (I) patients who underwent surgical treatment with a definitive postoperative histopathological diagnosis of FTA or FTC; (II) availability of preoperative ultrasound imaging data. The exclusion criteria were as follows: (I) unclear correspondence between ultrasound images and pathological results; (II) unsatisfactory ultrasound images, insufficient to determine the sonographic characteristics of the nodules; (III) patients who underwent a second surgery for recurrent FTC without preoperative clinical and ultrasound information from the initial surgery; (IV) cases with an indeterminate postoperative histopathological diagnosis. Finally, a total of 448 patients were included in the study. The subjects were randomly divided into a training set (n=349) and a validation set (n=99) at a ratio of approximately 8:2 (Figure 1).
Extraction of ultrasound characteristics
The ultrasound images of the participants’ thyroid nodules were retrieved from the Picture Archiving and Communication System. The ultrasound images were obtained with the following machines: Hi Vision Preirus (Hitachi Healthcare, Tokyo, Japan), Aloka ProSound F75 (Hitachi Healthcare), EPIQ7 (Philips Healthcare, Amsterdam, The Netherlands), IU22 (Philips Healthcare), LOGIQ E9 (GE HealthCare, Chicago, USA), Acuson S3000 (Siemens Healthineers, Erlangen, Germany), Acuson Sequoia (Siemens Healthineers), and Resona R9 (Mindray Medical, Shenzhen, China). After receiving specialized training on ultrasonic characteristics, two experienced radiologists (with 10 and 8 years of experience in thyroid ultrasonography, respectively) jointly conducted blinded evaluations of all cases and reached a consensus through discussion. When a patient had more than one thyroid follicular tumor, the largest nodule with a definitive pathological diagnosis was selected for inclusion in the study.
The ultrasound characteristics of all thyroid nodules were evaluated mainly according to the 2017 Thyroid Ultrasound Reporting Lexicon issued by the American College of Radiology (ACR) (14) and the 2023 International Expert Consensus on US Lexicon for Thyroid Nodules (15). The following ultrasound characteristics were evaluated for each nodule: maximal diameter, composition (solid, predominately solid, predominately cystic), echogenicity (hyperechoic, isoechoic, hypoechoic, or markedly hypoechoic), shape (wider-than-tall, taller-than-wide), margin (smooth, ill-defined, irregular), calcification/echogenic foci (absent, microcalcifications, macrocalcifications, peripheral/rim calcifications), halo (uniform, uneven, or absent), echotexture (homogeneous, heterogeneous), trabecular formation (the spoke wheel-like hypoechoic bands within the nodules, absent or present), nodule-in-nodule sign (multiple nodule-like solid masses within the nodules, absent or present). All target nodules were risk-stratified according to ACR TI-RADS (16) and Chinese-TIRADS (C-TIRADS) (17).
Statistical analysis
Statistical analyses were conducted using SPSS statistical software (version 22.0, IBM, USA). Continuous variables were expressed as mean ± standard deviation (SD), and differences between the two groups were analyzed using the two-sample Student’s t-test or the Mann-Whitney test. Categorical variables were presented as frequencies, and differences between the two groups were analyzed using the two-sided Chi-squared (χ2) test or Fisher’s exact test. Multivariate logistic regression analysis was employed to identify independent predictors of FTC among the relevant factors. Odds ratios (ORs) with 95% CIs were calculated, and scores were assigned to the independent predictive factors to establish a prediction model for FTC. ROC curves were plotted to assess the performance of the prediction model in the training cohort and to validate it in the validation cohort. A P value less than 0.05 was regarded as statistically significant.
Results
Patient characteristics
A total of 448 patients (448 nodules) were included in this study, among whom 112 were male (25.0%) and 336 were female (75.0%), with an average age of 47.15±14.90 years. According to the postoperative histopathological results, there were 270 cases of FTA (60.3%) and 178 cases of FTC (39.7%). The proportion of males in the FTC group was higher than that in the FTA group (32.6% vs. 20.0%, P=0.003), and there was no significant difference in the average age between the two groups (45.32±16.86 vs. 48.19±13.60 years, P=0.08).
A total of 349 patients were randomly assigned to the training cohort, and 99 were assigned to the validation cohort. There were no significant differences in the proportion of FTC (39.3% vs. 41.4%, P=0.70) and the maximum diameter of the lesions (3.68±1.61 vs. 3.74±1.81 cm, P=0.76) between the training cohort and the validation cohort.
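The two group comparisons above can be approximately re-derived from the reported figures. The following is a minimal sketch, assuming a Pearson chi-squared test without continuity correction (the text does not state which variant was used) and assuming counts back-calculated from the reported percentages: 58 of 178 FTC patients and 54 of 270 FTA patients male, and 41 FTC cases among the 99 validation patients. The function name `chi2_2x2` is illustrative only.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Sex by diagnosis (assumed counts): FTC male/female, then FTA male/female.
chi2_sex, p_sex = chi2_2x2(58, 120, 54, 216)

# Diagnosis by cohort (validation counts assumed): training FTC/FTA, then validation FTC/FTA.
chi2_ftc, p_ftc = chi2_2x2(137, 212, 41, 58)

print(f"Sex: P={p_sex:.3f}; FTC proportion by cohort: P={p_ftc:.2f}")
```

Under these assumptions the two P values round to the reported 0.003 and 0.70, which suggests, but does not confirm, that an uncorrected test was used.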
Differences in ultrasonic characteristics between FTA and FTC
Table 1 shows the differences in ultrasonic characteristics between FTA and FTC in the training set. There was no significant difference in the maximum diameter between FTC and FTA (P=0.09). In FTC, the prevalence of ultrasound features such as solid structure, hypo-/markedly hypoechoic echogenicity, ill-defined or irregular margin, any type of calcification, uneven or absent halo, heterogeneous echotexture, and trabecular formation was significantly higher than in FTA (all P values <0.01). The taller-than-wide shape and nodule-in-nodule sign were more common in FTC than in FTA, but the differences were not statistically significant (both P values >0.05). According to ACR TIRADS, FTC mostly presented as TR 4 (56.9%), followed by TR 5 (24.1%), while FTA mostly presented as TR 3 (42.5%), followed by TR 4 (34.9%). According to C-TIRADS, FTC mostly presented as C-TR 4A and 4B (35.0% and 47.4%, respectively), while most FTAs were C-TR 4A (63.7%), followed by C-TR 3 (22.6%).
Table 1
| Ultrasonic characteristics | FTA (n=212) | FTC (n=137) | P value |
|---|---|---|---|
| Maximal diameter (cm) | 3.65±1.57 | 3.85±1.65 | 0.09 |
| Composition | | | 0.004 |
| Predominately cystic | 13 (6.1) | 0 (0.0) | |
| Predominately solid | 37 (17.5) | 17 (12.4) | |
| Solid | 162 (76.4) | 120 (87.6) | |
| Echogenicity | | | <0.001 |
| Isoechoic/hyperechoic | 139 (65.6) | 33 (24.1) | |
| Hypoechoic | 72 (34.0) | 95 (69.3) | |
| Markedly hypoechoic | 1 (0.5) | 9 (6.6) | |
| Margin | | | <0.001 |
| Smooth | 186 (87.7) | 47 (34.3) | |
| Ill-defined | 12 (5.7) | 32 (23.4) | |
| Irregular | 14 (6.6) | 58 (42.3) | |
| Shape | | | 0.09 |
| Wider-than-tall | 208 (98.1) | 130 (94.9) | |
| Taller-than-wide | 4 (1.9) | 7 (5.1) | |
| Calcification | | | <0.001 |
| Absent | 196 (92.5) | 94 (68.6) | |
| Rim calcification | 8 (3.8) | 19 (13.9) | |
| Macrocalcification | 4 (1.9) | 10 (7.3) | |
| Microcalcification | 4 (1.9) | 14 (10.2) | |
| Halo | | | <0.001 |
| Uniform | 154 (72.6) | 18 (13.1) | |
| Uneven | 25 (11.8) | 65 (47.4) | |
| Absent | 33 (15.6) | 54 (39.4) | |
| Echotexture | | | <0.001 |
| Homogeneous | 90 (42.5) | 22 (16.1) | |
| Heterogeneous | 122 (57.5) | 115 (83.9) | |
| Trabecular formation | | | 0.001 |
| Absent | 208 (98.1) | 124 (90.5) | |
| Present | 4 (1.9) | 13 (9.5) | |
| Nodule-in-nodule sign | | | 0.17 |
| Absent | 207 (97.6) | 130 (94.9) | |
| Present | 5 (2.4) | 7 (5.1) | |
| ACR TIRADS | | | <0.001 |
| TR 2 | 42 (19.8) | 5 (3.6) | |
| TR 3 | 90 (42.5) | 21 (15.3) | |
| TR 4 | 74 (34.9) | 78 (56.9) | |
| TR 5 | 6 (2.8) | 33 (24.1) | |
| C-TIRADS | | | <0.001 |
| C-TR 3 | 48 (22.6) | 5 (3.6) | |
| C-TR 4A | 135 (63.7) | 48 (35.0) | |
| C-TR 4B | 25 (11.8) | 65 (47.4) | |
| C-TR 4C | 4 (1.9) | 18 (13.1) | |
| C-TR 5 | 0 (0.0) | 1 (0.7) | |
Data are presented as n (%) or mean ± standard deviation. ACR, American College of Radiology; C-TIRADS, Chinese TIRADS; FTA, follicular thyroid adenoma; FTC, follicular thyroid carcinoma; TIRADS, Thyroid Imaging Reporting and Data System.
Establishment of the RSS for follicular thyroid neoplasm
In the training set, univariate logistic regression analysis showed that solid structure, hypo/markedly hypoechoic echogenicity, calcification (any type), ill-defined/irregular margin, uneven/absent halo, heterogeneous echotexture, and trabecular formation were significantly correlated with FTC (all P values <0.05). In the multivariate logistic regression analysis, hypo/markedly hypoechoic echogenicity, ill-defined/irregular margin, calcification, uneven/absent halo, and heterogeneous echotexture were independent risk factors for FTC (all P values <0.05, Table 2). Based on the results of the multivariate analysis, the independent risk factors for FTC were selected as the indicators of the prediction model and assigned scores. Specifically, hypo/markedly hypoechoic echogenicity was assigned 1 point, calcification (any type) 1 point, ill-defined/irregular margin 2 points, uneven/absent halo 2 points, and heterogeneous echotexture 1 point. The sum of the scores of the above indicators was the predictive score for follicular tumors (F-Score, Figure 2), which therefore ranges from 0 to 7. The relationship between the F-Score and malignant risk stratification is shown in Table 3.
Table 2
| Potential predictors | Univariate OR (95% CI) | Univariate P | Multivariate OR (95% CI) | Multivariate P |
|---|---|---|---|---|
| Composition (solid) | 2.179 (1.197–3.965) | 0.01 | – | – |
| Echogenicity (hypo/markedly hypoechoic) | 6.001 (3.701–9.730) | <0.001 | 2.073 (1.108–3.880) | 0.02 |
| Margin (ill-defined/irregular) | 13.699 (7.974–23.534) | <0.001 | 4.063 (2.119–7.790) | <0.001 |
| Shape (taller-than-wide) | 2.800 (0.804–9.752) | 0.11 | – | – |
| Calcification (any type) | 5.604 (3.001–10.463) | <0.001 | 2.232 (1.036–4.813) | 0.04 |
| Halo (uneven/absent) | 17.554 (9.824–31.364) | <0.001 | 5.187 (2.576–10.444) | <0.001 |
| Heterogeneous echotexture | 3.856 (2.267–6.559) | <0.001 | 2.058 (1.036–4.087) | 0.04 |
| Trabecular formation | 5.452 (1.739–17.089) | 0.004 | – | – |
| Nodule-in-nodule sign | 2.229 (0.693–7.171) | 0.18 | – | – |
CI, confidence interval; OR, odds ratio.
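For a single binary predictor, the univariate logistic regression OR equals the cross-product ratio of the corresponding 2×2 table, so the univariate column can be checked against the Table 1 counts. The following is a minimal sketch, assuming a Wald (Woolf) interval on the log-odds scale with z = 1.96; the function name `odds_ratio_ci` is illustrative and is not part of the authors’ SPSS workflow. The multivariate ORs cannot be reproduced this way because they require patient-level data.

```python
import math


def odds_ratio_ci(case_pos, case_neg, ctrl_pos, ctrl_neg, z=1.96):
    """Crude odds ratio with a Wald (Woolf) confidence interval from 2x2 counts."""
    odds_ratio = (case_pos * ctrl_neg) / (case_neg * ctrl_pos)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se = math.sqrt(1 / case_pos + 1 / case_neg + 1 / ctrl_pos + 1 / ctrl_neg)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Training-set counts from Table 1:
# (FTC with feature, FTC without, FTA with feature, FTA without).
table1_counts = {
    "Echogenicity (hypo/markedly hypoechoic)": (95 + 9, 33, 72 + 1, 139),
    "Halo (uneven/absent)": (65 + 54, 18, 25 + 33, 154),
    "Heterogeneous echotexture": (115, 22, 122, 90),
}

for name, counts in table1_counts.items():
    odds_ratio, lower, upper = odds_ratio_ci(*counts)
    print(f"{name}: OR {odds_ratio:.3f} (95% CI {lower:.3f}-{upper:.3f})")
```

For these three predictors the sketch agrees with the univariate values in Table 2 to within rounding.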
Table 3
| F-Score | Risk of malignancy (%) | 95% CI | Risk stratification |
|---|---|---|---|
| 0–1 | 4.3 | 1.1–7.4 | Low |
| 2–3 | 32.6 | 22.5–42.7 | Low to medium |
| 4–5 | 55.1 | 44.5–65.6 | Medium to high |
| 6 | 80.5 | 71.5–89.6 | High |
| 7 | 97.0 | 90.8–100.0 | Extremely high |
CI, confidence interval; F-Score, predictive score for follicular tumors.
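The scoring rule and the strata in Table 3 can be written compactly. The following is a minimal sketch of the point assignment described above; the class and function names (`NoduleFeatures`, `f_score`, `risk_stratum`) are hypothetical and chosen for illustration only, and the sketch is not a validated clinical tool.

```python
from dataclasses import dataclass


@dataclass
class NoduleFeatures:
    hypoechoic: bool        # hypoechoic or markedly hypoechoic
    calcification: bool     # any type of calcification
    abnormal_margin: bool   # ill-defined or irregular margin
    abnormal_halo: bool     # uneven or absent halo
    heterogeneous: bool     # heterogeneous echotexture


def f_score(nodule: NoduleFeatures) -> int:
    """Sum the points assigned to the five independent predictors (range 0-7)."""
    return (
        1 * nodule.hypoechoic
        + 1 * nodule.calcification
        + 2 * nodule.abnormal_margin
        + 2 * nodule.abnormal_halo
        + 1 * nodule.heterogeneous
    )


def risk_stratum(score: int) -> str:
    """Map an F-Score to the risk stratification listed in Table 3."""
    if not 0 <= score <= 7:
        raise ValueError("F-Score must be between 0 and 7")
    if score <= 1:
        return "Low"
    if score <= 3:
        return "Low to medium"
    if score <= 5:
        return "Medium to high"
    if score == 6:
        return "High"
    return "Extremely high"


# Example: hypoechoic nodule with an irregular margin and heterogeneous echotexture.
example = NoduleFeatures(True, False, True, False, True)
print(f_score(example), risk_stratum(f_score(example)))
```

In this example the nodule scores 1 + 2 + 1 = 4 points, which falls in the medium-to-high stratum.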
Evaluation and validation of the diagnostic efficiency of the prediction model
In the training set, the ROC curve analysis revealed that the area under the curve (AUC) of the F-Score was 0.878 (95% CI: 0.842–0.914) (Figure 3A). When using a threshold of ≥4 points to diagnose FTC, the sensitivity was 81.8%, the specificity was 79.2%, the positive predictive value (PPV) was 71.8%, the negative predictive value (NPV) was 87.0%, and the accuracy was 80.2%. In the validation set, the AUC of the F-Score was 0.871 (0.789–0.930) (Figure 3B). When diagnosing FTC with an F-Score of ≥4, the sensitivity, specificity, PPV, NPV, and accuracy were 75.6%, 79.3%, 72.1%, 82.1%, and 79.8%, respectively. Additionally, in the validation cohort, 40 patients underwent preoperative FNA, among which 21 cases were cytopathologically diagnosed as “follicular neoplasm”, including 10 FTA and 11 FTC cases. In these cases, the F-Score successfully identified 8 FTA and 9 FTC cases, with an accuracy of 81.0%.
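The training-set figures at the ≥4-point threshold are mutually consistent with a single 2×2 table. The following is a minimal sketch, assuming cell counts that are not reported in the text but are back-calculated from the stated percentages and the group sizes in Table 1 (137 FTC, 212 FTA): 112 true positives, 25 false negatives, 44 false positives, and 168 true negatives. The function name `diagnostic_metrics` is illustrative.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Assumed (back-calculated) training-set counts at an F-Score threshold of >=4.
training = diagnostic_metrics(tp=112, fn=25, fp=44, tn=168)
for name, value in training.items():
    print(f"{name}: {100 * value:.1f}%")

# FNA "follicular neoplasm" subgroup from the text: 8 of 10 FTA and 9 of 11 FTC identified.
fna_accuracy = (8 + 9) / (10 + 11)
print(f"FNA subgroup accuracy: {100 * fna_accuracy:.1f}%")
```

With these assumed counts, the five training-set measures and the subgroup accuracy round to the percentages given in the text.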
ROC curve analysis was used to compare the predictive value of the F-Score, ACR TI-RADS, C-TIRADS, and F-TIRADS for FTC. The results showed that, in both the training set and the validation set, the AUC of the F-Score was significantly higher than that of ACR TI-RADS and C-TIRADS, with all P values <0.01 (Table 4). In the training set, the diagnostic efficacy of the F-Score was similar to that of F-TIRADS [AUC: 0.878 (0.842–0.914) vs. 0.839 (0.796–0.876), P=0.054]. However, in the validation set, the AUC of the F-Score [0.877 (0.812–0.943)] was significantly higher than that of F-TIRADS [0.810 (0.718–0.881)], with P=0.03.
Table 4
| Variables | Cutoff | Sensitivity (%) | Specificity (%) | Accuracy (%) | AUC (95% CI) | P (AUC comparison) |
|---|---|---|---|---|---|---|
| Training set | ||||||
| F-Score | ≥4 points | 81.8 | 79.2 | 80.2 | 0.878 (0.842–0.914) | Ref |
| ACR TIRADS | ≥ TR 4 | 81.0 | 62.3 | 69.6 | 0.758 (0.707–0.809) | <0.001 |
| C-TIRADS | ≥ C-TR 4B | 61.3 | 86.3 | 76.5 | 0.770 (0.719–0.821) | <0.001 |
| F-TIRADS | ≥5 points | 80.3 | 75.9 | 77.7 | 0.839 (0.796–0.876) | 0.054 |
| Validation set | ||||||
| F-Score | ≥4 points | 78.0 | 79.3 | 78.8 | 0.877 (0.812–0.943) | Ref |
| ACR TIRADS | ≥ TR 4 | 82.9 | 58.6 | 68.7 | 0.762 (0.666–0.842) | 0.001 |
| C-TIRADS | ≥ C-TR 4B | 61.0 | 82.8 | 73.7 | 0.766 (0.670–0.845) | 0.001 |
| F-TIRADS | ≥5 points | 78.0 | 74.1 | 75.8 | 0.810 (0.718–0.881) | 0.03 |
ACR, American College of Radiology; AUC, area under the curve; CI, confidence interval; C-TIRADS, Chinese TIRADS; F-Score, predictive score for follicular tumors; F-TIRADS, follicular TIRADS; FTC, follicular thyroid carcinoma; Ref, reference; TIRADS, Thyroid Imaging Reporting and Data System.
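For an ordinal system such as ACR TI-RADS, the empirical AUC equals the probability that a randomly chosen FTC falls in a higher category than a randomly chosen FTA, with ties counted as one half. The following is a minimal sketch using the training-set category counts from Table 1; it assumes this empirical (Mann-Whitney) definition, which the text does not state explicitly, and the function names are illustrative.

```python
def auc_from_category_counts(neg_counts, pos_counts):
    """Empirical AUC from per-category counts, ordered from lowest to highest risk."""
    concordant = 0.0
    neg_below = 0
    for neg, pos in zip(neg_counts, pos_counts):
        # Each positive beats every negative in a lower category and ties within its own.
        concordant += pos * (neg_below + 0.5 * neg)
        neg_below += neg
    return concordant / (sum(neg_counts) * sum(pos_counts))


def sens_spec_at_cutoff(neg_counts, pos_counts, cutoff_index):
    """Sensitivity and specificity when categories at or above cutoff_index are called positive."""
    sensitivity = sum(pos_counts[cutoff_index:]) / sum(pos_counts)
    specificity = sum(neg_counts[:cutoff_index]) / sum(neg_counts)
    return sensitivity, specificity


# ACR TI-RADS counts from Table 1 (training set), ordered TR 2, TR 3, TR 4, TR 5.
fta_counts = [42, 90, 74, 6]
ftc_counts = [5, 21, 78, 33]

auc = auc_from_category_counts(fta_counts, ftc_counts)
sensitivity, specificity = sens_spec_at_cutoff(fta_counts, ftc_counts, cutoff_index=2)  # >= TR 4
print(f"AUC={auc:.3f}, sensitivity={100 * sensitivity:.1f}%, specificity={100 * specificity:.1f}%")
```

Applied to these counts, the sketch gives values that round to the training-set ACR TI-RADS row of Table 4 (AUC 0.758, sensitivity 81.0%, specificity 62.3%).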
Discussion
This study demonstrated significant differences in ultrasonic features between FTC and FTA. Hypo/markedly hypoechoic echogenicity, calcifications (any type), ill-defined/irregular margins, uneven/absent halo, and heterogeneous echotexture were identified as independent risk factors for FTC. The F-Score developed based on these predictors can effectively distinguish FTC from FTA. Its efficacy was superior to that of ACR TI-RADS and C-TIRADS; compared with F-TIRADS, it was similar in the training set (P=0.054) and higher in the validation set (P=0.03). The application of the F-Score will enhance clinicians’ ability to identify FTC and optimize the management of patients with follicular neoplasms.
The current study demonstrated that larger nodule size and solid composition were not reliable indicators for differentiating FTC from FTA. Our study included all patients with postoperative pathological confirmation of FTA or FTC, covering tumors of various sizes. The results demonstrated that there was no significant difference in the maximum diameter between FTC and FTA (3.85±1.65 vs. 3.65±1.57 cm), which is consistent with the findings of Zhang et al. (11). In this study, both FTAs and FTCs were solid or predominantly solid, but the proportion of FTAs with cystic degeneration was significantly higher than that of FTCs (23.6% vs. 12.4%), which was consistent with previous studies (12,13,18,19). However, in the multivariate analysis, solid composition was not an independent risk factor for FTC. We believe that for both FTA and FTC, the occurrence of cystic degeneration tends to increase as the tumor volume enlarges. In our study, there was no significant difference in size between FTC and FTA, which may partially explain this finding.
Most studies have suggested that FTA often presents as isoechoic, while FTC mostly presents as hypoechoic or markedly hypoechoic (11,13). Our study also supports this view and demonstrates that hypo/markedly hypoechoic can serve as an independent predictor for FTC (OR =2.073, 95% CI: 1.108–3.880). This difference may be related to the structure and growth pattern of the tumor, with normal/macro follicular growth patterns being more common in FTA, and solid/trabecular and micro follicular growth patterns being more prevalent in FTC (18). In comparison, FTC has an increased number of follicular epithelial cells and a reduced colloidal component, which makes it less likely to form reflective interfaces, resulting in decreased echogenicity on ultrasound.
Calcification is an important indicator for differentiating benign and malignant thyroid tumors. Microcalcification is a highly specific indicator of PTC, while the value of macrocalcification and rim calcification in the diagnosis of thyroid nodules remains controversial. Our results indicated that the proportion of FTC with calcifications was higher than that of FTA (31.4% vs. 7.5%). The incidences of microcalcifications, macrocalcifications, and rim calcifications in FTC were each around 10% (ranging from 7.3% to 13.9%), whereas the incidences of the various types of calcification in FTA were each approximately 2–4% (ranging from 1.9% to 3.8%). Consistent with most previous studies (11,12,19), we believe that any type of calcification is associated with an increased risk of malignancy in follicular tumors and can serve as an independent predictor of FTC. The calcifications in FTC are likely to be secondary to tissue necrosis, hemorrhage, or both, and thus often present as macrocalcifications. Although rim calcifications are generally associated with benignity (20), their malignant predictive value cannot be ignored in follicular neoplasms (11,13). Some researchers have proposed that the calcifications at the edge of FTC may be due to degenerative changes caused by the repeated growth and necrosis of the tumor edge over a long period, and that their presence is related to the duration and distant metastasis (21). This viewpoint still needs validation in future studies.
In this study, we confirmed that the echotexture of the tumor was a significant indicator for differentiating follicular neoplasms, despite previous controversies surrounding its predictive value for malignancy. Kuo et al. suggested that over 90% of follicular tumors, regardless of whether they are FTA or FTC, exhibit heterogeneous echotexture (22). Li et al. proposed that FTCs were more likely to exhibit heterogeneous echotexture (67.9%), while FTAs were more likely to be homogeneous (58.5%) (18). Our results indicated that both FTCs and FTAs predominantly showed heterogeneous echotexture, but the proportion of heterogeneous echotexture was significantly higher in FTCs than in FTAs (83.9% vs. 57.5%). In multivariate analysis, heterogeneous echotexture was an independent risk factor for FTC (OR =2.058, 95% CI: 1.036–4.087). Furthermore, recent studies have found that some unique echotextures might help predict FTC. For example, some researchers have proposed the concepts of “trabecular formation” and “nodule-in-nodule” signs (22). The nodule-in-nodule sign refers to multiple nodule-like changes within a thyroid lesion, which might be related to the multinodular growth of the tumor (21). Trabecular formation is defined as the appearance of spoke-wheel-like or reticular scar-like echoes inside the nodule, which may be related to excessive fibrosis in FTC (23). Several studies have suggested that the “trabecular formation” or “nodule-in-nodule” sign may serve as independent predictors of FTC (11). Our study revealed that the prevalence of the nodule-in-nodule sign and trabecular formation was relatively low, though both were more common in FTC than in FTA (5.1% vs. 2.4% and 9.5% vs. 1.9%, respectively). Only trabecular formation showed a statistically significant difference between the two groups; however, multivariate regression analysis indicated that it was not an independent risk factor for FTC.
Further analysis suggested that the occurrence of these two features might be associated with tumor size. When the maximum diameter of FTC exceeded 3 cm, the prevalence of both the nodule-in-nodule sign and trabecular formation increased (6.7% vs. 3.4%, P=0.50; 14.3% vs. 3.4%, P=0.04, respectively). Therefore, we speculate that our results may be partially explained by the smaller FTC size distribution in our cohort compared with prior studies. Moreover, we propose that both features represent specific manifestations of heterogeneous echotexture, making the latter a more representative predictor of FTC. We found that FTC often shows an alternating distribution of isoechoic and hypoechoic areas, or mildly hypoechoic and markedly hypoechoic areas (Figure 2C), or irregular hypoechoic areas interspersed in isoechoic areas (similar to trabecular formation, Figure 2G). We speculate that the rapid and unbalanced proliferation of cancer cells, as well as the occurrence of internal hemorrhage, necrosis, and fibrosis in the tumor, may be the reasons for heterogeneous echotexture (18). The correlation between this ultrasonic characteristic and clinical features still needs further research.
The characteristics of the nodule marginal areas have consistently served as crucial diagnostic criteria for differentiating thyroid lesions. Recently, Yang et al. (24) established a preoperative ultrasound-based prediction system for FTC using deep learning techniques. The heatmaps showed that the relatively distorted textural features in the marginal areas are the primary basis for diagnosing FTC. Nodule marginal features primarily comprise the margin and halo sign on ultrasound. Previous studies have shown that irregular margins, including spiculations and lobulations, and a thickened/absent halo are associated with an increased risk of FTC (11,13,19,22,25). In addition, Li et al. found that blurred margins are also an independent risk factor for FTC (13). In this study, 87.7% of FTA cases exhibited smooth margins, while 23.4% and 42.3% of FTC cases showed ill-defined and irregular margins, respectively. In addition, 72.6% of FTA cases showed a uniform halo, while FTC cases predominantly exhibited an uneven halo (47.4%) or no halo (39.4%). In the multivariate analysis, both ill-defined/irregular margin and uneven/absent halo were identified as independent predictors of FTC. The uneven halo of FTC may be due to local invasion or breakthrough of the capsule by cancer cells, causing significant hyperplasia of fibrous tissue and resulting in irregular thickening of the capsule. When cancer cells continue to invade the surrounding normal thyroid tissue, an irregular edge can be observed. The absence of a halo in FTC may be related to the fact that FTC is often hypoechoic, making it difficult to distinguish the hypoechoic halo from the tumor mass. Unlike the extensive irregular margins observed in PTC, we noticed that FTC lesions often present with local protrusions or lobulations (Figure 2E), and their halo also frequently shows localized thickening or absence.
Therefore, for the ultrasonic diagnosis of FTC, a more comprehensive and detailed assessment of the marginal area is required.
Previous studies have shown that the commonly used ultrasound-based RSSs for thyroid nodules, which are more applicable to PTC, exhibit unsatisfactory predictive efficacy for FTC (10,26). Therefore, some researchers have attempted to develop RSSs for follicular tumors (11-13), with reported AUC values ranging from 0.73 to 0.948 for FTC differentiation. However, these studies vary considerably in the selection of study populations and the definition of ultrasound characteristics. Our study included all patients with histologically confirmed FTA or FTC, regardless of tumor size, and employed standardized and recognized ultrasound terminology. The results obtained may therefore be more widely applicable. Our cohort also demonstrated that both ACR TIRADS and C-TIRADS had limited efficacy in differentiating follicular neoplasms, with the former exhibiting relatively low specificity and the latter demonstrating suboptimal sensitivity. Encouragingly, the F-Score constructed in this study performed well in differentiating FTC from FTA, with an AUC of 0.877 in the validation set, which is better than that of ACR TI-RADS (0.762) and C-TIRADS (0.766). F-TIRADS also performed well in our cohort (AUC 0.810–0.839), although its AUC was slightly lower than that of the F-Score. Moreover, the F-Score is more concise than F-TIRADS, which might be advantageous for clinical application.
This study still has some limitations. First, due to the retrospective nature of this study, all the ultrasound images are static, which may lead to slight deviations in the interpretation of ultrasonic characteristics. Second, this study did not incorporate the color Doppler ultrasound characteristics of tumors, which we will further explore in subsequent research. Additionally, this RSS lacks prospective external validation. In the future, prospective, large-scale studies are still needed to further explore the ultrasound, clinical, and molecular characteristics of follicular tumors, as well as to validate and refine this prediction model.
Conclusions
In summary, this study identified margin and halo sign as the most critical ultrasound features for distinguishing FTC from FTA, followed by echogenicity, calcification, and echotexture. The ultrasound-based F-Score can efficiently differentiate FTC from FTA, which will benefit the precise preoperative diagnosis of FTC and help optimize treatment strategies for patients with follicular tumors.
Acknowledgments
We sincerely appreciate the guidance provided by Dr. Junfang Ge from the Department of Pathology for this study.
Footnote
Reporting Checklist: The authors have completed the TRIPOD reporting checklist. Available at https://gs.amegroups.com/article/view/10.21037/gs-2025-225/rc
Data Sharing Statement: Available at https://gs.amegroups.com/article/view/10.21037/gs-2025-225/dss
Peer Review File: Available at https://gs.amegroups.com/article/view/10.21037/gs-2025-225/prf
Funding: This study was supported by
Conflicts of Interest: All authors have completed the ICMJE uniform disclosure form (available at https://gs.amegroups.com/article/view/10.21037/gs-2025-225/coif). The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. This study was conducted in accordance with the Declaration of Helsinki and its subsequent amendments. The study was approved by the Ethics Committee of the Affiliated Hospital of Integrated Traditional Chinese and Western Medicine, Nanjing University of Chinese Medicine (No. 2024-LWKY-044) as the lead center, with other centers providing retrospective anonymized data. The committee waived the requirement for informed consent due to the retrospective nature of the study.
Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
References
- Han B, Zheng R, Zeng H, et al. Cancer incidence and mortality in China, 2022. J Natl Cancer Cent 2024;4:47-53.
- Pereira M, Williams VL, Hallanger Johnson J, et al. Thyroid Cancer Incidence Trends in the United States: Association with Changes in Professional Guideline Recommendations. Thyroid 2020;30:1132-40.
- Sugino K, Ito K, Nagahama M, et al. Prognosis and prognostic factors for distant metastases and tumor mortality in follicular thyroid carcinoma. Thyroid 2011;21:751-7.
- Thyroid Cancer Survival Rates, by Type and Stage. American Cancer Society. 2023. Available online: https://www.cancer.org/cancer/types/thyroid-cancer/detection-diagnosis-staging/survival-rates.html.
- Mishra A, Kumar C, Chand G, et al. Long-Term Outcome of Follicular Thyroid Carcinoma in Patients Undergoing Surgical Intervention for Skeletal Metastases. World J Surg 2016;40:562-9.
- Ali SZ, Baloch ZW, Cochand-Priollet B, et al. The 2023 Bethesda System for Reporting Thyroid Cytopathology. Thyroid 2023;33:1039-44.
- Ohori NP, Nishino M. Follicular Neoplasm of Thyroid Revisited: Current Differential Diagnosis and the Impact of Molecular Testing. Adv Anat Pathol 2023;30:11-23.
- Borowczyk M, Szczepanek-Parulska E, Dębicki S, et al. Differences in Mutational Profile between Follicular Thyroid Carcinoma and Follicular Thyroid Adenoma Identified Using Next Generation Sequencing. Int J Mol Sci 2019;20:3126.
- Borowczyk M, Dobosz P, Szczepanek-Parulska E, et al. Follicular Thyroid Adenoma and Follicular Thyroid Carcinoma-A Common or Distinct Background? Loss of Heterozygosity in Comprehensive Microarray Study. Cancers (Basel) 2023;15:638.
- Lin Y, Lai S, Wang P, et al. Performance of current ultrasound-based malignancy risk stratification systems for thyroid nodules in patients with follicular neoplasms. Eur Radiol 2022;32:3617-30.
- Zhang F, Mei F, Chen W, et al. Role of Ultrasound and Ultrasound-Based Prediction Model in Differentiating Follicular Thyroid Carcinoma From Follicular Thyroid Adenoma. J Ultrasound Med 2024;43:1389-99.
- Yu Q, Liu K, Xie C, et al. Development and validation of a preoperative prediction model for follicular thyroid carcinoma. Clin Endocrinol (Oxf) 2019;91:348-55.
- Li J, Li C, Zhou X, et al. US Risk Stratification System for Follicular Thyroid Neoplasms. Radiology 2023;309:e230949.
- Grant EG, Tessler FN, Hoang JK, et al. Thyroid Ultrasound Reporting Lexicon: White Paper of the ACR Thyroid Imaging, Reporting and Data System (TIRADS) Committee. J Am Coll Radiol 2015;12:1272-9.
- Durante C, Hegedüs L, Na DG, et al. International Expert Consensus on US Lexicon for Thyroid Nodules. Radiology 2023;309:e231481.
- Tessler FN, Middleton WD, Grant EG, et al. ACR Thyroid Imaging, Reporting and Data System (TI-RADS): White Paper of the ACR TI-RADS Committee. J Am Coll Radiol 2017;14:587-95.
- Zhou J, Yin L, Wei X, et al. 2020 Chinese guidelines for ultrasound malignancy risk stratification of thyroid nodules: the C-TIRADS. Endocrine 2020;70:256-79.
- Li W, Song Q, Lan Y, et al. The Value of Sonography in Distinguishing Follicular Thyroid Carcinoma from Adenoma. Cancer Manag Res 2021;13:3991-4002.
- Borowczyk M, Woliński K, Więckowska B, et al. Sonographic Features Differentiating Follicular Thyroid Cancer from Follicular Adenoma-A Meta-Analysis. Cancers (Basel) 2021;13:938.
- Yoon DY, Lee JW, Chang SK, et al. Peripheral calcification in thyroid nodules: ultrasonographic features and prediction of malignancy. J Ultrasound Med 2007;26:1349-55; quiz 1356-7.
- Kim H, Shin JH, Hahn SY, et al. Prediction of follicular thyroid carcinoma associated with distant metastasis in the preoperative and postoperative model. Head Neck 2019;41:2507-13.
- Kuo TC, Wu MH, Chen KY, et al. Ultrasonographic features for differentiating follicular thyroid carcinoma and follicular adenoma. Asian J Surg 2020;43:339-46.
- Cracolici V, Kadri S, Ritterhouse LL, et al. Clinicopathologic and Molecular Features of Metastatic Follicular Thyroid Carcinoma in Patients Presenting With a Thyroid Nodule Versus a Distant Metastasis. Am J Surg Pathol 2019;43:514-22.
- Yang Z, Yao S, Heng Y, et al. Automated diagnosis and management of follicular thyroid nodules based on the devised small-dataset interpretable foreground optimization network deep learning: a multicenter diagnostic study. Int J Surg 2023;109:2732-41.
- Liu BJ, Liu YY, Wan J, et al. New Thyroid Imaging Reporting and Data System (TIRADS) Based on Ultrasonography Features for Follicular Thyroid Neoplasms: A Multicenter Study. Ultrasound Med Biol 2025;51:1343-51.
- Yang J, Sun Y, Li X, et al. Diagnostic performance of six ultrasound-based risk stratification systems in thyroid follicular neoplasm: A retrospective multi-center study. Front Oncol 2022;12:1013410.


