Conventional and machine learning-based risk scores for patients with early-stage hepatocellular carcinoma

Chun-Ting Ho; Elise Chia-Hui Tan; Pei-Chang Lee; Chi-Jen Chu; Yi-Hsiang Huang; Teh-Ia Huo; Yu-Hui Su; Ming-Chih Hou; Jaw-Ching Wu; Chien-Wei Su

doi:10.3350/cmh.2024.0103



Original Article

Clin Mol Hepatol. 2024; 30(3): 406-420.

Published online: April 11, 2024


Chun-Ting Ho¹, Elise Chia-Hui Tan², Pei-Chang Lee¹,³, Chi-Jen Chu¹,³, Yi-Hsiang Huang³,⁴, Teh-Ia Huo⁵, Yu-Hui Su⁶, Ming-Chih Hou¹,³, Jaw-Ching Wu⁴, Chien-Wei Su¹,³,⁴,⁷

Corresponding author: Chien-Wei Su, Division of General Medicine, Department of Medicine, Taipei Veterans General Hospital, No. 201, Sec. 2, Shih-Pai Rd., Peitou District, Taipei 11217, Taiwan
Tel: +886-2-28712121 ext. 3352, Fax: +886-2-28739318, E-mail: cwsu2@vghtpe.gov.tw

Elise Chia-Hui Tan, Department of Health Service Administration, College of Public Health, China Medical University, 406040 No. 100, Section 1, Economic and Trade Road, Beitun District, Taichung City, Taiwan
Tel: +886-422053366 ext. 6321, Fax: +886-422031108, E-mail: elisetam.g@gmail.com

Editor: Grace Wong, Chinese University of Hong Kong, Hong Kong

Received February 9, 2024 Revised April 10, 2024 Accepted April 10, 2024

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Background/Aims

The performance of machine learning (ML) in predicting the outcomes of patients with hepatocellular carcinoma (HCC) remains uncertain. We aimed to develop risk scores using conventional methods and ML to categorize early-stage HCC patients into distinct prognostic groups.

Methods

The study retrospectively enrolled 1,411 consecutive treatment-naïve patients with the Barcelona Clinic Liver Cancer (BCLC) stage 0 to A HCC from 2012 to 2021. The patients were randomly divided into a training cohort (n=988) and validation cohort (n=423). Two risk scores (CATS-IF and CATS-INF) were developed to predict overall survival (OS) in the training cohort using the conventional methods (Cox proportional hazards model) and ML-based methods (LASSO Cox regression), respectively. They were then validated and compared in the validation cohort.

Results

In the training cohort, factors for the CATS-IF score were selected by the conventional method, including age, curative treatment, single large HCC, serum creatinine and alpha-fetoprotein levels, fibrosis-4 score, lymphocyte-to-monocyte ratio, and albumin-bilirubin grade. The CATS-INF score, determined by ML-based methods, included the above factors and two additional ones (aspartate aminotransferase and prognostic nutritional index). In the validation cohort, both the CATS-IF score and the CATS-INF score outperformed other modern prognostic scores in predicting OS, with the CATS-INF score having the lowest Akaike information criterion value. Calibration plots exhibited good correlation between predicted and observed outcomes for both scores.

Conclusions

Both the conventional Cox-based CATS-IF score and ML-based CATS-INF score effectively stratified patients with early-stage HCC into distinct prognostic groups, with the CATS-INF score showing slightly superior performance.

Keywords: Fibrosis; Hepatocellular carcinoma; Inflammation; Machine learning; Prognosis

Study Highlights

• Question: Does an ML-based approach exhibit superior performance in predicting the outcomes of patients with early-stage HCC compared to the conventional Cox proportional hazards model?

• Findings: In this cohort study of 1,411 patients with early-stage HCC, both the Cox-based CATS-IF score and ML-based CATS-INF score demonstrated superior predictive performance for the overall survival compared to other prognostic scores. Notably, the CATS-INF score exhibited the lowest Akaike information criterion value.

• Meaning: These findings suggest that the ML-based CATS-INF score could stratify patients with early-stage HCC into distinct prognostic groups.


INTRODUCTION

Primary liver cancer is the seventh most frequently occurring cancer and the third leading cause of cancer mortality in the world [1]. Hepatocellular carcinoma (HCC) accounts for approximately 75–90% of primary liver cancers [2]. HCC typically develops over an extended period of time, often in the setting of advanced chronic liver diseases (ACLDs), such as chronic hepatitis B virus (HBV) infection, hepatitis C virus (HCV) infection, alcoholism, and metabolic dysfunction-associated steatotic liver disease [3].

The outcomes of HCC patients have improved through advances in the surveillance of high-risk patients, antiviral therapy for chronic HBV or HCV infection, surgical and local regional therapy, and the introduction of systemic therapy [4-6]. One survey from the United States showed that the 5-year overall survival (OS) rates of patients with HCC increased from 3% between 1975 and 1977 to 18% between 2004 and 2010 [7]. Nevertheless, there is still room to improve.

The Barcelona Clinic Liver Cancer (BCLC) clinical algorithm is the most commonly applied system for staging and patient stratification and plays an important role in clinical decision-making for the management of HCC [8]. The BCLC algorithm mainly focuses on tumor burden, liver functional reserve, performance status, vascular invasion, and extrahepatic spread. Early-stage HCC (BCLC stage 0-A) is defined by the presence of either a single tumor irrespective of size or 2 to 3 tumors less than 3 cm in size, well-preserved liver function, good performance status, and a lack of vascular invasion, extrahepatic metastasis, and cancer-related symptoms. Patients with BCLC stage 0-A HCC usually have better outcomes compared to those with intermediate or advanced stage HCC, and the median OS time is more than 5 years [8,9]. However, the clinical manifestations and outcomes of patients with stage 0-A HCC are not the same [10,11]. To improve the prognoses of early-stage HCC patients, it is warranted to identify the risk factors associated with the outcomes and to adopt individual strategies for treatment and follow-up.

Machine learning (ML) has been described as a “marriage between mathematics and computer science” and has been emerging as a promising method for biomarker selection and prognostic models [12]. Prognostic models based on ML have been developed for several different diseases, such as acute stroke, depression, and malignancy [13-15]. However, ML-based models to predict the outcomes of patients with early-stage HCC have not been widely studied. Some discrepancies between conventional and ML-based methods have also been reported in several studies [16]. Therefore, we aimed to develop risk scores using ML and conventional methods to stratify patients with early-stage HCC into distinct prognostic groups.

MATERIALS AND METHODS

Data source and study population

This retrospective cohort study analyzed patient-level data from the HCC registration system [17] at Taipei Veterans General Hospital (TVGH), a major medical center in northern Taiwan with 3,160 beds and approximately 8,000 outpatient visits per day. This registration system prospectively collects comprehensive data on demographics, etiology, baseline laboratory information, tumor factors, treatments, and outcomes for newly diagnosed HCC patients and has been used in previous studies [9,17-19]. The diagnosis of HCC in the registration system was based on the diagnostic criteria of the American Association for the Study of Liver Diseases (AASLD) [20]. The study involved 3,832 consecutive treatment-naïve HCC patients from 2012 to 2021 (Fig. 1). Patients were excluded if they had advanced HCC stages (BCLC stage B to D), were missing initial serum biomarker data or diagnostic imaging, or were lost to follow-up after diagnosis. A total of 1,411 patients with BCLC stage 0-A HCC were enrolled. The index date was defined as the date on which HCC was diagnosed. The study randomly assigned 70% of these patients to the training cohort and the remaining 30% to the validation cohort.

Curative treatments included liver transplantation, surgical resection, and local ablation therapy. Trans-arterial chemoembolization (TACE), molecular target therapy, immune checkpoint inhibitors, radiotherapy, chemotherapy, and best supportive treatment were considered non-curative treatment modalities. Patients were followed until death, loss to follow-up, or the end of the study (June 30, 2022). This study adhered to the Declaration of Helsinki and received approval from the Institutional Review Board (IRB) of Taipei Veterans General Hospital, Taiwan (IRB number: 2022-07-007BC). Informed consent was waived by the IRB since this was a retrospective observational cohort study, and patient information was de-identified before the study commenced.

Outcome measurement

The vital status of each patient was collected from the electronic health record and linked to the registry data. The primary outcome was OS, which was calculated from the index date to the date of death or the last date of follow-up.

Construction and validation of prediction model and risk score

To construct a novel prediction model, the study initially utilized univariate Cox proportional hazard models to identify significant variables for OS. These variables were then integrated into a multivariable Cox model, refined through stepwise selection based on the Akaike Information Criterion (AIC), and verified by residual analysis. The detailed methods for model construction and validation are provided in Supplementary Material.

For each prognostic factor, a Cox-based risk score was calculated and standardized on a 0–100 scale. The cumulative risk score for each patient, derived from the sum of individual factor scores, was segmented into high, medium, and low-risk categories using the 33rd and 66th percentiles for effective risk stratification.
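The 0–100 standardization can be illustrated with the β-coefficients listed in Table 3. The sketch below assumes that the points for each factor equal its β-coefficient divided by the largest β-coefficient, multiplied by 100 and rounded. This rule is inferred from the tabulated values rather than stated in the Methods, only five CATS-IF factors are included, and all function and variable names are illustrative.

```python
# Beta-coefficients of five CATS-IF factors, taken from Table 3.
# The remaining factors of the score are omitted from this sketch.
CATS_IF_BETAS = {
    "age_gt_65": 0.419,
    "no_curative_treatment": 0.592,
    "slhcc": 0.788,
    "creatinine_gt_1_2": 0.359,
    "fib4_gt_3_25": 0.559,
}


def scale_to_points(betas):
    """Assumed rule: points = round(100 * beta / largest beta)."""
    largest = max(betas.values())
    return {name: round(100 * b / largest) for name, b in betas.items()}


def total_score(points, patient):
    """Sum the points of every factor the patient fulfils (1 = yes, 0 = no)."""
    return sum(points[name] for name, present in patient.items() if present)


points = scale_to_points(CATS_IF_BETAS)

# Hypothetical patient: older than 65 years with FIB-4 > 3.25, no other factor.
example_patient = {
    "age_gt_65": 1,
    "no_curative_treatment": 0,
    "slhcc": 0,
    "creatinine_gt_1_2": 0,
    "fib4_gt_3_25": 1,
}
partial_score = total_score(points, example_patient)
```

Under this assumption the computed points reproduce the Score column of Table 3 for the five listed factors (53, 75, 100, 46, and 71), and the hypothetical patient receives 53 + 71 = 124 points from these factors alone.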

We employed the least absolute shrinkage and selection operator (LASSO) method to overcome the common challenges of multicollinearity and overfitting in complex models. This led to the creation of an ML-based risk score utilizing the variables retained in the LASSO-based Cox model. The effectiveness of this ML-based risk score, compared to a standard Cox regression model, was then evaluated.
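To make the role of the LASSO penalty concrete, the following sketch evaluates a Cox partial log-likelihood and the corresponding L1-penalized objective on a toy data set. It is not the analysis code of this study, which used the "glmnet" package in R. It assumes no tied event times, scales the likelihood term by 1/n (software packages differ in this scaling), and uses hypothetical function names and data.

```python
import math


def cox_partial_loglik(beta, covariates, times, events):
    """Cox partial log-likelihood, assuming no tied event times."""
    # Linear predictor (x_i . beta) for each patient.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if d_i != 1:
            continue  # censored patients contribute only through risk sets
        # Risk set: every patient still under observation at time t_i.
        denom = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        loglik += eta[i] - math.log(denom)
    return loglik


def lasso_objective(beta, covariates, times, events, lam):
    """Assumed form of the minimized quantity: -loglik / n + lam * sum(|beta_j|)."""
    n = len(times)
    penalty = lam * sum(abs(b) for b in beta)
    return -cox_partial_loglik(beta, covariates, times, events) / n + penalty


# Toy data (hypothetical): one binary covariate, four patients.
X = [[1], [0], [1], [0]]
follow_up = [1, 2, 3, 4]
death = [1, 1, 0, 1]

null_loglik = cox_partial_loglik([0.0], X, follow_up, death)
penalized = lasso_objective([0.5], X, follow_up, death, lam=0.1)
unpenalized = lasso_objective([0.5], X, follow_up, death, lam=0.0)
```

Because the penalty grows with the absolute size of each coefficient, increasing `lam` pushes weakly informative coefficients to exactly zero, which is how LASSO performs variable selection.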

Validations of Cox-based and ML-based risk scores were conducted in training and validation cohorts. This encompassed a three-step process: assessing predictive performance for OS through comparisons of homogeneity, AIC, and AUROC; generating calibration plots for predicted versus observed survival; and implementing time-dependent ROC curve analysis to evaluate prognostic performance at 1, 2, 3, and 5 years post-diagnosis, acknowledging the dynamic nature of disease status and survival time.

Statistical analysis

The baseline characteristics, including demographics, treatments, tumor factors, viral hepatitis status, and HCC-related biomarkers, were collected. The ROCs and the Youden index were utilized to determine the optimal cutoff values for non-invasive serum marker scores in predicting the risk of mortality in patients with early-stage HCC.
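As an illustration of cutoff selection with the Youden index (J = sensitivity + specificity − 1), the sketch below scans every observed marker value as a candidate cutoff and keeps the one with the largest J. The data are hypothetical, the function name is illustrative, and the sketch assumes that higher marker values indicate higher risk; for markers where low values indicate risk (such as LMR or PNI), the comparison would be reversed.

```python
def youden_cutoff(values, died):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A patient is classified as high risk when value > cutoff.
    """
    n_pos = sum(died)
    n_neg = len(died) - n_pos
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, d in zip(values, died) if d == 1 and v > cutoff)
        true_neg = sum(1 for v, d in zip(values, died) if d == 0 and v <= cutoff)
        j = true_pos / n_pos + true_neg / n_neg - 1
        if j > best_j:  # ties keep the smallest cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical marker values and vital status (1 = died), not study data.
marker = [1.2, 1.8, 2.5, 3.0, 3.4, 4.1, 4.8, 5.5]
died = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(marker, died)
```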

Continuous variables were presented as the median and interquartile range (IQR) and compared using the Mann–Whitney U test. Categorical variables were expressed as frequencies and percentages and compared using the chi-squared test or Fisher’s exact test. Cumulative OS rates were estimated using the Kaplan-Meier method and compared using the Cox proportional hazards model. All statistical analyses were conducted using SPSS version 24.0 (IBM Corp., Armonk, NY, USA) and R software (version 4.2.3) (R Foundation for Statistical Computing, Vienna, Austria). SPSS was employed for the forward stepwise Cox regression, while R and the “glmnet” package were used for the LASSO Cox regression and subsequent plots, and the “timeROC” package was used to generate the time-dependent ROC curves. A two-tailed P-value of <0.05 was considered statistically significant.
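The Kaplan-Meier estimate of cumulative OS can be sketched in a few lines. The example below is a minimal illustration with hypothetical follow-up data and an illustrative function name; it is not the SPSS or R code used for the analyses.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times  : follow-up in months from the index date
    events : 1 = death observed, 0 = censored at last follow-up
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        if deaths == 0:
            continue  # censoring times do not change the estimate
        at_risk = sum(1 for ti in times if ti >= t)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up data for six patients, not study data.
km = kaplan_meier([6, 12, 12, 24, 36, 60], [1, 0, 1, 1, 0, 0])
```

In this toy example the estimated survival falls to 5/6 at 6 months, 2/3 at 12 months, and 4/9 at 24 months; the patients censored at 36 and 60 months leave the curve unchanged.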

RESULTS

Basic characteristics

Among the 1,411 patients with BCLC stage 0-A HCC, 1,276 underwent curative treatments: 830 had surgical resection, 5 underwent liver transplantation, 425 received radiofrequency ablation therapy, and 16 were treated with percutaneous ethanol injection therapy. Of the 135 patients who received non-curative therapy, 116 underwent TACE, 17 received radiotherapy, and 2 received best supportive care. The study divided these patients into two cohorts: 988 patients in the training cohort and 423 in the validation cohort. The clinical characteristics of patients are shown in Table 1. In both cohorts, males were the majority, and most of the etiologies were HBV or HCV-related HCC.

Biomarker selection

After a median follow-up of 38.0 months (IQR 18.0–57.0 months), 368 patients died, and the 5-year OS rate was 67.5%. All available clinical variables, including the clinicopathological features and serum biomarkers in Table 1, were subjected to stepwise Cox regression and LASSO Cox regression. As shown in Table 2, the multivariate analysis by conventional Cox regression showed that OS correlated with age, treatment modalities, single large (>5 cm) HCC (SLHCC), serum creatinine levels, fibrosis-4 (FIB-4), lymphocyte-to-monocyte ratio (LMR), albumin-bilirubin (ALBI) grade, and alpha-fetoprotein (AFP) levels. The risk factors associated with OS by the LASSO Cox regression included the above eight factors and two additional ones: aspartate aminotransferase (AST) and prognostic nutritional index (PNI) (Fig. 2).

Development of the risk scores

The β-coefficients in the Cox regression for each selected factor were simplified based on their ratios, resulting in a user-friendly and clinically applicable score named the CATS-IF score (abbreviated from the contributing factors, as outlined in Table 3). Simultaneously, variables demonstrating significance in the ML-based LASSO Cox regression were identified to develop our ML-based risk score. The β-coefficients in the multivariate Cox regression for each factor selected by the LASSO Cox regression were similarly simplified based on their ratios. The coefficients of these two models were compared, and the results showed high consistency (Supplementary Fig. 1). Both the Cox-based CATS-IF score and ML-based CATS-INF score exhibited good predictive capability for OS in the training cohort, with respective AUC values of 0.723 and 0.729. To stratify patients into risk groups, we sorted them based on their CATS-IF score and CATS-INF score, establishing cutoffs for high, intermediate, and low-risk groups at the 33rd and 67th percentiles of the patients’ scores. The Kaplan–Meier plots depicting the survival outcomes in the training cohort are presented in Figure 3A and Figure 3B for the CATS-IF score and CATS-INF score, respectively.
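Once a total score has been computed, assignment to a risk group is a simple lookup. The sketch below uses the training-cohort cutoffs given in the caption of Figure 3 (CATS-IF: 0–102, 103–211, 212+; CATS-INF: 0–114, 115–223, 224+). The function and constant names are illustrative and not part of the published score.

```python
# Upper bounds of the low- and intermediate-risk groups (Figure 3A and 3B).
CATS_IF_CUTOFFS = (102, 211)
CATS_INF_CUTOFFS = (114, 223)


def risk_group(score, cutoffs):
    """Map a total score to 'low', 'intermediate', or 'high' risk."""
    low_max, intermediate_max = cutoffs
    if score < 0:
        raise ValueError("score must be non-negative")
    if score <= low_max:
        return "low"
    if score <= intermediate_max:
        return "intermediate"
    return "high"


if_groups = [risk_group(s, CATS_IF_CUTOFFS) for s in (0, 102, 103, 211, 212)]
inf_groups = [risk_group(s, CATS_INF_CUTOFFS) for s in (114, 115, 224)]
```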

Calibration, validation and performance of the Cox-based CATS-IF and ML-based CATS-INF score

To evaluate the predictive efficacy of both the Cox-based CATS-IF score and ML-based CATS-INF score for OS, we calculated these scores in the validation cohort and stratified patients into high, intermediate, and low-risk groups. In the validation cohort, both the CATS-IF score and CATS-INF score exhibited good predictive capability for OS (AUC=0.695 and 0.707, respectively) and accurately stratified patients into low, intermediate, and high-risk groups. The 5-year OS rates in the groups stratified by CATS-IF were 81.8%, 62.8%, and 43.3%, while those in the groups stratified by CATS-INF were 83.5%, 62.0%, and 42.9%, respectively (Fig. 3C and 3D, P<0.001).

We subsequently generated calibration plots by plotting observed and predicted probabilities, stratified by deciles of the predicted probability from both the CATS-IF score and CATS-INF score (Supplementary Fig. 2A and 2B). The calibration plots matched well with the ideal 45-degree line and showed good correlation between predicted and observed outcomes. We then compared the predictive ability of the two scores with that of other modern prognostic scores for HCC. As shown in Table 4, the CATS-INF and CATS-IF scores had better predictive capability than the ALBI score, AST-to-platelet ratio index (APRI), LMR, PNI, FIB-4, and the model for end-stage liver disease (MELD) score. Notably, the ML-based CATS-INF score exhibited the lowest AIC value.

The time-dependent ROC curves were generated for the Cox-based CATS-IF and ML-based CATS-INF scores (Fig. 4). While the two scores had similar performance in predicting OS within the first year after diagnosis (AUROC 0.672 vs. 0.672), the ML-based CATS-INF score exhibited enhanced predictive accuracy during the second, third, and fifth years post-diagnosis compared to the conventional Cox-based CATS-IF score (AUROC 0.722 vs. 0.712, 0.712 vs. 0.702, and 0.704 vs. 0.690, respectively).

DISCUSSION

In this study, we showed that age, SLHCC, serum creatinine and AFP levels, FIB-4, LMR, ALBI grade, and treatment modalities are crucial risk factors for determining OS in patients with HCC in BCLC stage 0 or A. Both the CATS-INF and CATS-IF scores had excellent ability in prognosis prediction and risk stratification of the patients. Both scores were confirmed in the validation cohort, and their calibration plots matched the ideal line well. Moreover, they outperformed currently available prognostic scores. Hence, the novel CATS-INF and CATS-IF scores could provide individual risk identification for patients with early-stage HCC and could benefit patients with HCC by allowing for more individualized medical services and treatment plans.

The factors that determine the prognoses of patients with HCC include tumor factors, field factors in the background liver (including the grade of inflammation and steatosis and the stage of fibrosis), and treatment factors [21,22]. For patients with early-stage HCC, the impact of tumor factors might be less critical, while the field factors play a more important role in determining the outcomes of patients [22-24].

Studies have shown that systemic inflammation can promote cancer growth, invasion, and metastasis in patients with malignant tumors [25,26]. Leukocytes play an important role in the immune response. Lymphocytes participate in cytotoxic cell death and inhibition of tumor-cell proliferation and migration [27,28]. Conversely, monocytes can promote tumor progression and metastasis [29]. As a result, the LMR has been used as a serum biomarker to predict the prognosis of several cancers and has shown good predictive capability of prognosis for patients with HCC in previous studies [30,31]. Our research revealed that low LMR is related to poorer OS in patients with early-stage HCC, indicating that inflammation was an important factor in the outcomes of patients.

Besides inflammation status, fibrosis also plays a crucial role in hepatic carcinogenesis as well as deteriorating liver function [32]. The FIB-4 score combines standard biochemical values (platelets, alanine aminotransferase, and AST) and age and has been recognized as an accurate, convenient, and non-invasive serum marker for evaluating the status of liver fibrosis for patients with ACLD [33]. We demonstrated that patients with high FIB-4 scores had poor OS compared to their counterparts, suggesting that fibrosis was also an important prognostic factor [32,34,35].

Moreover, the significance of nutritional status in predicting the outcomes of patients has been explored in diverse malignancies, including HCC [11,36,37]. It was postulated that a better nutritional function indicated a better body reserve against disease burden and could promote the immune response to combat malignancies [38,39]. By incorporating the PNI, a well-validated marker for assessing a patient’s nutritional status, into the risk scores, we could gain additional insights into their nutritional well-being. This inclusion enabled a more comprehensive understanding of a patient’s prognosis, considering the role of nutritional status as a contributing factor to OS.

The ALBI score has been widely validated as a reliable tool for evaluating liver functional reserve, as well as predicting the prognosis of patients with HCC [11,19,40,41]. Our study also confirmed that the ALBI grade was an independent factor in the outcomes of patients with early-stage HCC. Taken together, our results revealed the importance of inflammatory status, fibrosis status, and liver functional reserve in the survival of patients with early-stage HCC. The prognostic model that we developed based on the results could serve as a convenient, accessible, and economical way to predict patients’ outcomes and facilitate management of patients individually.
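The serum markers discussed above can all be derived from routine laboratory values. The sketch below uses the commonly published definitions of LMR, FIB-4, PNI, and the ALBI score, which are not restated in this article, so the exact formulas and unit conversions applied in the study are an assumption here. The dichotomization follows the cutoffs listed in Table 2, and all function names are illustrative.

```python
import math


def lmr(lymphocytes, monocytes):
    """Lymphocyte-to-monocyte ratio (both counts in the same unit)."""
    return lymphocytes / monocytes


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 = age x AST / (platelets [10^9/L] x sqrt(ALT))."""
    return age_years * ast_u_l / (platelets_10e9_l * math.sqrt(alt_u_l))


def pni(albumin_g_dl, lymphocytes_per_mm3):
    """PNI = 10 x albumin (g/dL) + 0.005 x lymphocyte count (/mm^3)."""
    return 10 * albumin_g_dl + 0.005 * lymphocytes_per_mm3


def albi_score(bilirubin_mg_dl, albumin_g_dl):
    """ALBI = 0.66 x log10(bilirubin, umol/L) - 0.085 x albumin (g/L)."""
    bilirubin_umol_l = bilirubin_mg_dl * 17.1  # mg/dL -> umol/L
    albumin_g_l = albumin_g_dl * 10            # g/dL -> g/L
    return 0.66 * math.log10(bilirubin_umol_l) - 0.085 * albumin_g_l


def albi_grade(score):
    """Grade 1: <= -2.60; grade 2: > -2.60 to <= -1.39; grade 3: > -1.39."""
    if score <= -2.60:
        return 1
    return 2 if score <= -1.39 else 3


def risk_flags(fib4_value, lmr_value, pni_value, albi_grade_value):
    """Dichotomize the markers with the cutoffs used in Table 2."""
    return {
        "fib4_gt_3_25": fib4_value > 3.25,
        "lmr_lt_3_62": lmr_value < 3.62,
        "pni_lt_45": pni_value < 45,
        "albi_grade_2_or_3": albi_grade_value >= 2,
    }


# Hypothetical patient, not study data.
flags = risk_flags(
    fib4(70, 50, 25, 100),
    lmr(1500, 500),
    pni(4.0, 1500),
    albi_grade(albi_score(1.0, 3.5)),
)
```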

The categorization of SLHCC as BCLC stage A or B has been controversial in some studies, especially in Asia [42-44]. In the 2022 updated BCLC strategy for HCC, SLHCC was classified as BCLC stage A [8]. However, our previous study showed that patients who had SLHCC had a 5-year OS rate of 42.6%, which differed significantly from those of patients with BCLC stage A (57.0%) and stage B (27.3%) HCC [9]. Therefore, we proposed that SLHCC might be a distinctive stage between A and B [9]. In the current study, SLHCC was an independent risk factor associated with poorer OS among patients with BCLC stage 0-A HCC and was a component of the CATS-IF score and CATS-INF score. Consequently, it is suggested that patients with SLHCC should be closely followed up or given adjuvant systemic therapy because they have a higher risk of mortality. More prospective studies are warranted to elucidate this issue.

For patients with HCC in BCLC stage 0-A, curative treatment modalities are recommended as front-line therapy [8]. In our study, around 90% of patients received curative treatments. Moreover, non-curative treatment was an independent risk factor associated with poorer OS. This indicates that curative treatment modalities should be performed for patients with early-stage HCC if there are no contraindications.

There were some disparities in the significance of our variables when comparing the conventional Cox regression model with the ML-based LASSO Cox regression model. Notably, serum AST levels and PNI were independent risk factors for OS in the ML model but not in the conventional model [45]. Cox proportional hazards regression has traditionally been a stalwart in survival analysis due to its accuracy in identifying factors influencing survival. However, its precision can be compromised when variables interact with each other [46]. In our study, both the conventional Cox regression model and ML-based methods demonstrated superior performance in predicting the prognoses of early-stage HCC patients compared to other current prognostic scores. It is noteworthy that the ML-based CATS-INF score, which had the lowest AIC value, particularly excelled in long-term prognostication (Fig. 4). The results highlighted the impact of nutritional status on the outcomes of patients with early-stage HCC. Furthermore, they validated the potential of ML-based methods, such as LASSO Cox regression, as valuable complements to traditional analytic approaches and promising tools for future model development.

Despite the good performance of the CATS-IF score and CATS-INF score, there are still several limitations that need to be addressed. First, this study was a single-center retrospective cohort study, and further validation with prospective cohorts will be needed to confirm the predictive capability for prognosis. Second, the model was developed based on a limited database, so further research on possible prognostic factors and biomarkers will be required to predict the outcome of patients with HCC. Third, most of the patients in our study cohort had viral HCC, and further study is needed to determine whether etiologies of HCC interfere with the result. Fourth, microvascular invasion (MVI) is a critical factor to determine OS and recurrence for patients with early-stage HCC who underwent surgical resection [47,48]. However, it is important to note that the diagnosis of MVI relies on pathological examination. In our study, approximately 40% of the patients received non-surgical treatments, which limited the availability of detailed pathological specimens necessary for determining MVI status. Consequently, we were unable to adequately incorporate MVI into our analysis. Fifth, in our study, the treatment modality emerged as a significant parameter in determining the OS of patients with early-stage HCC. The results were consistent with previous studies [9,10,49]. However, the proportion of patients receiving non-curative treatment modalities was relatively small, potentially impacting the significance of our analysis. Lastly, some of the biomarkers involved in this study still lack universal acknowledgement of cutoff values and clinical applicability. Hence, further studies with larger scale and more detailed information are needed.

The CATS-IF score developed by conventional Cox regression and the CATS-INF score developed by ML-based methods both showed excellent prognostic ability in early HCC patients, while the ML-based CATS-INF score showed slightly superior performance, especially in the long-term follow-up.

ACKNOWLEDGMENTS

This work was supported by grants from the National Science and Technology Council of Taiwan (MOST 111-2314-B-075-056, NSTC 112-2314-B-075-043-MY2), Taipei Veterans General Hospital (V112C-039, Center of Excellence for Cancer Research MOHW112-TDU-B-221-124007, and Big Data Center), and the Y.L. Lin Hung Tai Education Foundation. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

A part of this study was presented as a poster at the annual meeting of the American Association for the Study of Liver Diseases, Boston, USA, November 10-14, 2023. It was identified as a “Poster of Distinction.”

Writing assistance: American Manuscript Editors.

FOOTNOTES

Authors’ contribution

Tan ECH and Su CW had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Concept and design: Tan ECH, Lee PC, Su CW. Acquisition, analysis, or interpretation of data: All authors. Drafting of the manuscript: Ho CT, Tan ECH, Su CW. Critical review of the manuscript for important intellectual content: Chu CJ, Huang YH, Huo TI, Hou MC, Wu JC. Statistical analysis: Ho CT, Tan ECH, Su YH. Administrative, technical, or material support: Ni. Supervision: Tan ECH, Su CW.

All authors approved the final version of the article, including the authorship list.

Conflicts of Interest

There are no potential conflicts of financial and non-financial interests in the study. Chien-Wei Su: Speakers’ bureau: Gilead Sciences, Bristol-Myers Squibb, AbbVie, Bayer, and Roche. Advisory arrangements: Gilead Sciences. Grants: Bristol-Myers Squibb and Eiger.

SUPPLEMENTAL MATERIAL

Supplementary material is available at Clinical and Molecular Hepatology website (http://www.e-cmh.org).


Supplementary Figure 1.

Calibration plot of coefficients of ML-based score and coefficients of Cox-based score.


Supplementary Figure 2.

(A) Calibration plot of the predicted probability by the conventional Cox-based CATS-IF score and the observed probability of overall survival (OS). (B) Calibration plot of the predicted probability by the ML-based CATS-INF and the observed probability of OS. Both curves matched the ideal 45-degree line well and showed good correlation between predicted and observed outcomes.


Figure 1.

Study flow chart. HCC, hepatocellular carcinoma; TPEVGH, Taipei Veterans General Hospital; BCLC, Barcelona Clinic Liver Cancer classification.

Figure 2.

(A) LASSO coefficient profiles of the 17 variables (1: age, 2: sex, 3: curative treatment or not, 4: multiple tumors or not, 5: size >3, 6: single large HCC, 7: platelet<100,000, 8: albumin<3.5, 9: creatinine>1.2, 10: bilirubin>1, 11: ALT>40, 12: AST>45, 13: FIB-4>3.25, 14: LMR<3.62, 15: PNI<45, 16: ALBI grade 2 or 3, 17: AFP>20). (B) 10 risk factors selected using LASSO Cox regression analysis. The two vertical dotted lines were drawn at the optimal scores according to minimum criteria and 1-s.e. LASSO, least absolute shrinkage and selection operator; HCC, hepatocellular carcinoma; ALT, alanine aminotransferase; AST, aspartate aminotransferase; FIB-4, fibrosis-4 index; LMR, lymphocyte-to-monocyte ratio; PNI, prognostic nutritional index; ALBI, albumin-bilirubin; AFP, alpha fetoprotein.

Figure 3.

(A) Kaplan–Meier survival analysis of training cohort according to the conventional Cox-based CATS-IF score (low risk: 0–102, intermediate risk: 103–211, high risk: 212+). (B) Kaplan–Meier survival analysis of training cohort according to the ML-based CATS-INF score (low risk: 0–114, intermediate risk: 115–223, high risk: 224+). (C) Kaplan–Meier survival analysis of the validation cohort according to the conventional Cox-based CATS-IF score (low risk: 0–102, intermediate risk: 103–211, high risk: 212+). (D) Kaplan–Meier survival analysis of the validation cohort according to the ML-based CATS-INF score (low risk: 0–102, intermediate risk: 103–211, high risk: 212+). ML, machine learning.

Figure 4.

(A) Time-dependent ROC curve of Cox-based CATS-IF score. (B) Time-dependent ROC curve of ML-based CATS-INF score. (C) Consecutive comparison of time-dependent area under the ROC curve (AUROC) of the Cox-based CATS-IF score and ML-based CATS-INF score. Both scores showed good predictability and were equally competitive, while ML-based CATS-INF score had slightly greater AUC in the long term (≥2 years). ML, machine learning.

Table 1.

Baseline characteristics of the training and validation cohorts

Characteristic	Training cohort (n=988)	Validation cohort (n=423)	P-value
Age	66.0 (59.0–74.0)	68.0 (59.0–75.0)	0.676
Sex, male (%)	680 (68.8)	313 (74.0)	0.051
Curative Tx, yes (%)	891 (90.2)	385 (91.0)	0.762
Tumor number, single (%)	889 (90.0)	370 (87.5)	0.164
Tumor size, >3 cm (%)	392 (39.7)	167 (39.5)	0.945
HBsAg, + (%)	508 (59.4)	215 (59.7)	0.921
Anti-HCV, + (%)	263 (32.0)	111 (32.0)	0.991
SLHCC, yes/no (%)	155/833 (15.7)	64/359 (15.1)	0.791
Platelet (/mm³)	152,000 (105,000–198,000)	148,000 (106,000–198,000)	0.914
Albumin (g/dL)	4.0 (3.7–4.3)	4.0 (3.7–4.2)	0.690
Creatinine (mg/dL)	0.89 (0.76–1.09)	0.90 (0.76–1.14)	0.700
Total bilirubin (mg/dL)	0.71 (0.50–1.00)	0.70 (0.51–0.99)	0.331
ALT (U/L)	32.0 (21.0–50.0)	31.0 (20.0–48.0)	0.653
AST (U/L)	33.0 (24.0–50.0)	34.0 (24.0–54.0)	0.703
LMR	3.19 (2.40–4.15)	3.08 (2.27–3.98)	0.033
PNI	47.9 (43.8–51.4)	47.1 (43.2–51.2)	0.117
FIB-4	2.79 (1.75–4.64)	2.91 (1.79–4.80)	0.253
ALBI, 1/2 or 3 (%)	564/424 (57.1/42.9)	247/176 (58.4/41.6)	0.649
AFP (ng/mL)	10.1 (3.5–81.8)	9.68 (3.44–74.6)	0.532

Continuous variables are expressed as the median with the 25th and 75th percentiles.

HBsAg, hepatitis B surface antigen; HCV, hepatitis C virus; SLHCC, single large hepatocellular carcinoma; ALT, alanine aminotransferase; AST, aspartate aminotransferase; LMR, lymphocyte-to-monocyte ratio; PNI, prognostic nutritional index; FIB-4, fibrosis-4 index; ALBI, albumin–bilirubin; AFP, alpha fetoprotein.

Table 2.

Univariate and multivariate Cox regression and LASSO Cox regression of selected biomarkers to predict overall survival in the training cohort

Variable	Univariate HR (95% CI)	Univariate P-value	Multivariate HR (95% CI)	Multivariate P-value	LASSO Cox regression estimate
Age
>65	2.001 (1.538–2.604)	<0.001	1.532 (1.148–2.043)	0.004	0.13
≤65	1.00		1.00
Sex
Male	1.065 (0.814–1.392)	0.647
Female	1.00
Curative Tx
Yes	1.00		1.00
No	3.223 (2.365–4.390)	<0.001	1.823 (1.303–2.550)	<0.001	1.63
Tumor No.
Single	1.00
Multiple	1.284 (0.890–1.852)	0.182
Max. Size (cm)
>3	1.304 (1.015–1.676)	0.038	0.983 (0.711–1.361)	0.919
≤3	1.00		1.00
SLHCC
Yes	1.842 (1.359–2.497)	<0.001	2.291 (1.538–3.413)	<0.001	0.23
No	1.00		1.00
Platelet (/μL)
<100,000	1.841 (1.409–2.405)	<0.001	1.047 (0.743–1.475)	0.793
≥100,000	1.00		1.00
ALB (g/dL)
<3.5	2.330 (1.714–3.166)	<0.001	1.204 (0.832–1.744)	0.325
≥3.5	1.00		1.00
Creatinine (mg/dL)
>1.2	1.824 (1.372–2.425)	<0.001	1.437 (1.062–1.945)	0.019	0.02
≤1.2	1.00		1.00
Total bilirubin (mg/dL)
>1	1.309 (0.995–1.723)	0.054
≤1	1.00
ALT (IU/L)
>40	1.493 (1.162–1.919)	0.002	1.030 (0.740–1.434)	0.859
≤40	1.00		1.00
AST (IU/L)
>45	2.346 (1.828–3.010)	<0.001	1.210 (0.834–1.730)	0.325	0.16
≤45	1.00		1.00
FIB-4
>3.25	2.624 (2.032–3.389)	<0.001	1.685 (1.178–2.410)	0.004	0.30
≤3.25	1.00		1.00
LMR
<3.62	1.890 (1.432–2.494)	<0.001	1.446 (1.066–1.960)	0.018	0.038
≥3.62	1.00		1.00
PNI
<45	2.632 (2.049–3.383)	<0.001	1.102 (0.766–1.586)	0.600	0.24
≥45	1.00		1.00
ALBI
1	1.00		1.00
2 or 3	2.459 (1.902–3.178)	<0.001	1.548 (1.103–2.174)	0.012	0.16
AFP (ng/mL)
>20	1.780 (1.387–2.284)	<0.001	1.578 (1.206–2.064)	0.001	0.09
≤20	1.00		1.00

Table 3.

Parameters, coefficients, and formulas of the conventional Cox-based CATS-IF score and the ML-based CATS-INF score from the training cohort

Conventional Cox-Based CATS-IF score

Criterion (1 if fulfilled, 0 otherwise)	β-coefficient	Score
Age >65 years	0.419	53
No curative treatment received	0.592	75
SLHCC	0.788	100
Serum creatinine >1.2 mg/dL	0.359	46
FIB-4 >3.25	0.559	71
LMR <3.62	0.350	44
ALBI grade 2 or 3	0.468	59
AFP >20 ng/mL	0.442	56

SLHCC, single large hepatocellular carcinoma; FIB-4, fibrosis-4 index; LMR, lymphocyte-to-monocyte ratio; PNI, prognostic nutritional index; ALBI, albumin–bilirubin; AFP, alpha fetoprotein.

Formula: CATS-IF score=53*(Age >65 years)+75*(No curative treatment received)+100*SLHCC+46*(Serum creatinine >1.2 mg/dL)+71*(FIB-4 >3.25)+44*(LMR <3.62)+59*(ALBI grade 2 or 3)+56*(AFP >20 ng/mL)
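A minimal sketch of the CATS-IF formula as a calculator is given below. The point values are copied from Table 3 and the risk-group cut-offs (0–102, 103–211, 212 or above) from the Kaplan–Meier figure legend; the dictionary keys and function names are hypothetical and chosen only for this example.

```python
# Points per criterion, copied from Table 3 (conventional Cox-based CATS-IF).
CATS_IF_POINTS = {
    "age_over_65": 53,
    "no_curative_treatment": 75,
    "slhcc": 100,
    "creatinine_over_1_2": 46,
    "fib4_over_3_25": 71,
    "lmr_under_3_62": 44,
    "albi_grade_2_or_3": 59,
    "afp_over_20": 56,
}


def cats_if_score(flags):
    """Sum the points of every criterion flagged True; missing flags count as 0."""
    unknown = set(flags) - set(CATS_IF_POINTS)
    if unknown:
        raise KeyError(f"unknown criteria: {sorted(unknown)}")
    return sum(pts for name, pts in CATS_IF_POINTS.items() if flags.get(name, False))


def cats_if_risk_group(score):
    """Map a CATS-IF score to the risk groups given in the figure legend."""
    if score <= 102:
        return "low"
    if score <= 211:
        return "intermediate"
    return "high"


if __name__ == "__main__":
    # Hypothetical patient: older than 65 years with AFP >20 ng/mL, no other criteria.
    patient = {"age_over_65": True, "afp_over_20": True}
    score = cats_if_score(patient)
    print(score, cats_if_risk_group(score))
```

Because every criterion is binary, the score ranges from 0 (no criterion met) to 504 (all eight met).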

ML-based CATS-INF score

Criterion (1 if fulfilled, 0 otherwise)	β-coefficient	Score
Age >65 years	0.419	53
No curative treatment received	0.592	75
SLHCC	0.788	100
Serum creatinine >1.2 mg/dL	0.359	46
AST >45 IU/L	0.216	27
FIB-4 >3.25	0.559	71
LMR <3.62	0.350	44
PNI <45	0.147	19
ALBI grade 2 or 3	0.468	59
AFP >20 ng/mL	0.442	56

SLHCC, single large hepatocellular carcinoma; AST, aspartate aminotransferase; FIB-4, fibrosis-4 index; LMR, lymphocyte-to-monocyte ratio; PNI, prognostic nutritional index; ALBI, albumin–bilirubin; AFP, alpha fetoprotein.

Formula: CATS-INF score=53*(Age >65 years)+75*(No curative treatment received)+100*SLHCC+46*(Serum creatinine >1.2 mg/dL)+27*(AST >45 IU/L)+71*(FIB-4 >3.25)+44*(LMR <3.62)+19*(PNI <45)+59*(ALBI grade 2 or 3)+56*(AFP >20 ng/mL)
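The point values in Table 3 are consistent with rescaling each β-coefficient so that the largest coefficient (SLHCC, 0.788) corresponds to 100 points and then rounding to the nearest integer. This rule is inferred from the tabulated numbers and is not stated in the table itself, so the sketch below should be read as a consistency check rather than as the authors' procedure; the variable and function names are illustrative.

```python
# Beta-coefficients and published points for the CATS-INF score, from Table 3.
BETAS = {
    "Age >65 years": 0.419,
    "No curative treatment received": 0.592,
    "SLHCC": 0.788,
    "Serum creatinine >1.2 mg/dL": 0.359,
    "AST >45 IU/L": 0.216,
    "FIB-4 >3.25": 0.559,
    "LMR <3.62": 0.350,
    "PNI <45": 0.147,
    "ALBI grade 2 or 3": 0.468,
    "AFP >20 ng/mL": 0.442,
}
PUBLISHED_POINTS = {
    "Age >65 years": 53,
    "No curative treatment received": 75,
    "SLHCC": 100,
    "Serum creatinine >1.2 mg/dL": 46,
    "AST >45 IU/L": 27,
    "FIB-4 >3.25": 71,
    "LMR <3.62": 44,
    "PNI <45": 19,
    "ALBI grade 2 or 3": 59,
    "AFP >20 ng/mL": 56,
}


def betas_to_points(betas):
    """Assumed rule: points = round(100 * beta / largest beta)."""
    beta_max = max(betas.values())
    return {name: round(100 * beta / beta_max) for name, beta in betas.items()}


if __name__ == "__main__":
    derived = betas_to_points(BETAS)
    for name, points in derived.items():
        print(f"{name}: derived {points}, published {PUBLISHED_POINTS[name]}")
```

Under this assumed rule, each criterion's points express its log-hazard contribution relative to that of SLHCC.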

Table 4.

Comparison of CATS-IF, CATS-INF, and modern non-invasive scores for predicting overall survival in the validation cohort

Score	Homogeneity	AUC	AIC	P-value
CATS-INF	47.656	0.707	1259.825	<0.001
CATS-IF	43.398	0.695	1264.421	<0.001
PNI	33.202	0.672	1275.337	<0.001
ALBI score	30.364	0.670	1278.611	<0.001
LMR	11.850	0.586	1292.475	0.001
MELD score	15.430	0.654	1296.598	<0.001
APRI	0.069	0.583	1307.147	0.793
FIB-4 score	0.089	0.604	1307.182	0.680

PNI, prognostic nutritional index; ALBI, albumin–bilirubin; LMR, lymphocyte-to-monocyte ratio; MELD, model for end-stage liver disease; AST, aspartate aminotransferase; APRI, AST to platelet ratio index; FIB-4, fibrosis-4 index.
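To make the ordering in Table 4 explicit, the following sketch ranks the scores by AIC (lower values indicate a better trade-off between fit and complexity) and reports each score's AIC difference from the best-ranked score. The numbers are copied from Table 4; the variable names are illustrative.

```python
# (score name, homogeneity, AUC, AIC) copied from Table 4.
TABLE_4 = [
    ("CATS-INF", 47.656, 0.707, 1259.825),
    ("CATS-IF", 43.398, 0.695, 1264.421),
    ("PNI", 33.202, 0.672, 1275.337),
    ("ALBI score", 30.364, 0.670, 1278.611),
    ("LMR", 11.850, 0.586, 1292.475),
    ("MELD score", 15.430, 0.654, 1296.598),
    ("APRI", 0.069, 0.583, 1307.147),
    ("FIB-4 score", 0.089, 0.604, 1307.182),
]

# Rank by AIC (ascending) and compute the difference from the best score.
ranked_by_aic = sorted(TABLE_4, key=lambda row: row[3])
best_aic = ranked_by_aic[0][3]
delta_aic = {name: round(aic - best_aic, 3) for name, _, _, aic in ranked_by_aic}

# The score with the highest AUC, for comparison with the AIC ranking.
best_by_auc = max(TABLE_4, key=lambda row: row[2])[0]

if __name__ == "__main__":
    for name, diff in delta_aic.items():
        print(f"{name}: delta AIC = {diff}")
    print("Highest AUC:", best_by_auc)
```

On these figures, CATS-INF ranks first on both AIC and AUC, with CATS-IF second on both.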

Abbreviations

ML, machine learning; HCC, hepatocellular carcinoma; BCLC, Barcelona Clinic Liver Cancer; OS, overall survival; ACLDs, advanced chronic liver diseases; HBV, hepatitis B virus; HCV, hepatitis C virus; MASLD, metabolic dysfunction-associated steatotic liver disease; AASLD, American Association for the Study of Liver Diseases; TACE, transarterial chemoembolization; FIB-4, fibrosis-4; LMR, lymphocyte-to-monocyte ratio; ALBI, albumin-bilirubin; AFP, alpha-fetoprotein; AST, aspartate aminotransferase; PNI, prognostic nutritional index; APRI, AST-to-platelet ratio index; MELD, model for end-stage liver disease; MVI, microvascular invasion.

REFERENCES

1. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2018;68:394-424.

2. Singal AG, Lampertico P, Nahon P. Epidemiology and surveillance for hepatocellular carcinoma: New trends. J Hepatol 2020;72:250-261.

3. Rinella ME, Lazarus JV, Ratziu V, Francque SM, Sanyal AJ, Kanwal F, et al. A multisociety Delphi consensus statement on new fatty liver disease nomenclature. Hepatology 2023;78:1966-1986.

4. Ho SY, Liu PH, Hsu CY, Hsia CY, Huang YH, Lei HJ, et al. Evolution of etiology, presentation, management and prognostic tool in hepatocellular carcinoma. Sci Rep 2020;10:3925.

5. Hui VW, Chan SL, Wong VW, Liang LY, Yip TC, Lai JC, et al. Increasing antiviral treatment uptake improves survival in patients with HBV-related HCC. JHEP Rep 2020;2:100152.

6. Dang H, Yeo YH, Yasuda S, Huang CF, Iio E, Landis C, et al. Cure with interferon-free direct-acting antiviral is associated with increased survival in patients with hepatitis C virus-related hepatocellular carcinoma from both East and West. Hepatology 2020;71:1910-1922.

7. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2015. CA Cancer J Clin 2015;65:5-29.

8. Reig M, Forner A, Rimola J, Ferrer-Fàbrega J, Burrel M, Garcia-Criado Á, et al. BCLC strategy for prognosis prediction and treatment recommendation: The 2022 update. J Hepatol 2022;76:681-693.

9. Fang KC, Kao WY, Su CW, Chen PC, Lee PC, Huang YH, et al. The prognosis of single large hepatocellular carcinoma was distinct from Barcelona clinic liver cancer stage A or B: The role of albumin-bilirubin grade. Liver Cancer 2018;7:335-358.

10. Roayaie S, Jibara G, Tabrizian P, Park JW, Yang J, Yan L, et al. The role of hepatic resection in the treatment of hepatocellular cancer. Hepatology 2015;62:440-451.

11. Chang CY, Wei CY, Chen PH, Hou MC, Chao Y, Chau GY, et al. The role of albumin-bilirubin grade in determining the outcomes of patients with very early-stage hepatocellular carcinoma. J Chin Med Assoc 2021;84:136-143.

12. Deo RC. Machine learning in medicine. Circulation 2015;132:1920-1930.

13. Campagnini S, Arienti C, Patrini M, Liuzzi P, Mannini A, Carrozza MC. Machine learning methods for functional recovery prediction and prognosis in post-stroke rehabilitation: a systematic review. J Neuroeng Rehabil 2022;19:54.

14. Chen W, Zhou C, Yan Z, Chen H, Lin K, Zheng Z, et al. Using machine learning techniques predicts prognosis of patients with Ewing sarcoma. J Orthop Res 2021;39:2519-2527.

15. Qu Y, Lin Z, Yang Z, Lin H, Huang X, Gu L. Machine learning models for prognosis prediction in endodontic microsurgery. J Dent 2022;118:103947.

16. Zhao Z, Liu H, Zhou X, Fang D, Ou X, Ye J, et al. Necroptosis-related lncRNAs: Predicting prognosis and the distinction between the cold and hot tumors in gastric cancer. J Oncol 2021;2021:6718443.

17. Hsieh WY, Chen PH, Lin IY, Su CW, Chao Y, Huo TI, et al. The impact of esophagogastric varices on the prognosis of patients with hepatocellular carcinoma. Sci Rep 2017;7:42577.

18. Su CW, Fang KC, Lee RC, Liu CA, Chen PH, Lee PC, et al. Association between esophagogastric varices in hepatocellular carcinoma and poor prognosis after transarterial chemoembolization: A propensity score matching analysis. J Formos Med Assoc 2020;119:610-620.

19. Liao CY, Lee CY, Wei CY, Chao Y, Huang YH, Hou MC, et al. Differential prognoses among male and female patients with hepatocellular carcinoma. J Chin Med Assoc 2022;85:554-565.

20. Bruix J, Sherman M; American Association for the Study of Liver Diseases. Management of hepatocellular carcinoma: an update. Hepatology 2011;53:1020-1022.

21. Teng W, Liu YC, Jeng WJ, Su CW. Tertiary prevention of HCC in chronic hepatitis B or C infected patients. Cancers (Basel) 2021;13:1729.

22. Su CW, Chau GY, Hung HH, Yeh YC, Lei HJ, Hsia CY, et al. Impact of steatosis on prognosis of patients with early-stage hepatocellular carcinoma after hepatic resection. Ann Surg Oncol 2015;22:2253-2261.

23. Lee PC, Chiou YY, Chiu NC, Chen PH, Liu CA, Kao WY, et al. Liver stiffness measured by acoustic radiation force impulse elastography predicted prognoses of hepatocellular carcinoma after radiofrequency ablation. Sci Rep 2020;10:2006.

24. Kao WY, Su CW, Chiou YY, Chiu NC, Liu CA, Fang KC, et al. Hepatocellular carcinoma: Nomograms based on the albumin-bilirubin grade to assess the outcomes of radiofrequency ablation. Radiology 2017;285:670-680.

25. Mantovani A, Allavena P, Sica A, Balkwill F. Cancer-related inflammation. Nature 2008;454:436-444.

26. Yang YM, Kim SY, Seki E. Inflammation and liver cancer: Molecular mechanisms and therapeutic targets. Semin Liver Dis 2019;39:26-42.

27. Wu SJ, Lin YX, Ye H, Li FY, Xiong XZ, Cheng NS. Lymphocyte to monocyte ratio and prognostic nutritional index predict survival outcomes of hepatitis B virus-associated hepatocellular carcinoma patients after curative hepatectomy. J Surg Oncol 2016;114:202-210.

28. Yang YT, Jiang JH, Yang HJ, Wu ZJ, Xiao ZM, Xiang BD. The lymphocyte-to-monocyte ratio is a superior predictor of overall survival compared to established biomarkers in HCC patients undergoing liver resection. Sci Rep 2018;8:2535.

29. Olingy CE, Dinh HQ, Hedrick CC. Monocyte heterogeneity and functions in cancer. J Leukoc Biol 2019;106:309-322.

30. Lin S, Lin Y, Fang Y, Mo Z, Hong X, Ji C, et al. Clinicopathological and prognostic value of preoperative lymphocyte to monocyte ratio for hepatocellular carcinoma following curative resection: A meta-analysis including 4,092 patients. Medicine (Baltimore) 2021;100:e24153.

31. Mano Y, Yoshizumi T, Yugawa K, Ohira M, Motomura T, Toshima T, et al. Lymphocyte-to-monocyte ratio is a predictor of survival after liver transplantation for hepatocellular carcinoma. Liver Transpl 2018;24:1603-1611.

32. Sakurai T, Kudo M. Molecular link between liver fibrosis and hepatocellular carcinoma. Liver Cancer 2013;2:365-366.

33. Sterling RK, Lissen E, Clumeck N, Sola R, Correa MC, Montaner J, et al. Development of a simple noninvasive index to predict significant fibrosis in patients with HIV/HCV coinfection. Hepatology 2006;43:1317-1325.

34. Kawaguchi T, Ide T, Amano K, Arinaga-Hino T, Kuwahara R, Sano T, et al. Enhanced liver fibrosis score as a predictive marker for hepatocellular carcinoma development after hepatitis C virus eradication. Mol Clin Oncol 2021;15:215.

35. Tamaki N, Kurosaki M, Yasui Y, Mori N, Tsuji K, Hasebe C, et al. Change in fibrosis 4 index as predictor of high risk of incident hepatocellular carcinoma after eradication of hepatitis C virus. Clin Infect Dis 2021;73:e3349-e3354.

36. Okadome K, Baba Y, Yagi T, Kiyozumi Y, Ishimoto T, Iwatsuki M, et al. Prognostic nutritional index, tumor-infiltrating lymphocytes, and prognosis in patients with esophageal cancer. Ann Surg 2020;271:693-700.

37. Lei K, Deng ZF, Wang JG, You K, Xu J, Liu ZJ. PNI-based nomograms to predict tumor progression and survival for patients with unresectable hepatocellular carcinoma undergoing transcatheter arterial chemoembolization. J Clin Med 2023;12:486.

38. Chen W, Zhang M, Chen C, Pang X. Prognostic nutritional index and neutrophil/lymphocyte ratio can serve as independent predictors of the prognosis of hepatocellular carcinoma patients receiving targeted therapy. J Oncol 2022;2022:1389049.

39. Wang D, Hu X, Xiao L, Long G, Yao L, Wang Z, et al. Prognostic nutritional index and systemic immune-inflammation index predict the prognosis of patients with HCC. J Gastrointest Surg 2021;25:421-427.

40. Johnson PJ, Berhane S, Kagebayashi C, Satomura S, Teng M, Reeves HL, et al. Assessment of liver function in patients with hepatocellular carcinoma: a new evidence-based approach-the ALBI grade. J Clin Oncol 2015;33:550-558.

41. Chan AW, Kumada T, Toyoda H, Tada T, Chong CC, Mo FK, et al. Integration of albumin-bilirubin (ALBI) score into Barcelona Clinic Liver Cancer (BCLC) system for hepatocellular carcinoma. J Gastroenterol Hepatol 2016;31:1300-1306.

42. Jung YK, Jung CH, Seo YS, Kim JH, Kim TH, Yoo YJ, et al. BCLC stage B is a better designation for single large hepatocellular carcinoma than BCLC stage A. J Gastroenterol Hepatol 2016;31:467-474.

43. Cho Y, Sinn DH, Yu SJ, Gwak GY, Kim JH, Yoo YJ, et al. Survival analysis of single large (>5 cm) hepatocellular carcinoma patients: BCLC A versus B. PLoS One 2016;11:e0165722.

44. Zhong JH, Pan LH, Wang YY, Cucchetti A, Yang T, You XM, et al. Optimizing stage of single large hepatocellular carcinoma: A study with subgroup analysis by tumor diameter. Medicine (Baltimore) 2017;96:e6608.

45. Pinato DJ, North BV, Sharma R. A novel, externally validated inflammation-based prognostic algorithm in hepatocellular carcinoma: the prognostic nutritional index (PNI). Br J Cancer 2012;106:1439-1445.

46. Zhang Z, Reinikainen J, Adeleke KA, Pieterse ME, Groothuis-Oudshoorn CGM. Time-varying covariates and coefficients in Cox regression models. Ann Transl Med 2018;6:121.

47. Chan AWH, Zhong J, Berhane S, Toyoda H, Cucchetti A, Shi K, et al. Development of pre and post-operative models to predict early recurrence of hepatocellular carcinoma after surgical resection. J Hepatol 2018;69:1284-1293.

48. Su CW, Lei HJ, Chau GY, Hung HH, Wu JC, Hsia CY, et al. The effect of age on the long-term prognosis of patients with hepatocellular carcinoma after resection surgery: a propensity score matching analysis. Arch Surg 2012;147:137-144.

49. Fu CC, Wei CY, Chu CJ, Lee PC, Huo TI, Huang YH, et al. The outcomes and prognostic factors of patients with hepatocellular carcinoma and normal serum alpha fetoprotein levels. J Formos Med Assoc 2023;122:593-602.