INTRODUCTION
Metabolic dysfunction-associated steatotic liver disease (MASLD) has emerged as a significant global health issue, paralleling the increased incidences of obesity, diabetes, and metabolic syndrome. Recent epidemiological studies suggest that MASLD affects approximately 30% of the global population, with different characteristics according to ethnicity and region [1-3]. The increasing burden of MASLD is a major concern due to its potential to progress to more severe liver diseases, including cirrhosis and hepatocellular carcinoma (HCC). Therefore, identifying high-risk patients within the MASLD population is critical for implementing early interventions and preventing disease progression.
In clinical practice, the identification of high-risk patients with MASLD is sometimes challenging. Although liver biopsy remains the gold standard of diagnosis, it is invasive, subject to inter-observer variability, and associated with potential complications [4]. Consequently, noninvasive tests (NITs) have gained attention as potential tools for assessing liver fibrosis and disease severity [5,6]. These tests range from simple biochemical markers and scoring systems such as the fibrosis-4 (FIB-4) index and non-alcoholic fatty liver disease fibrosis score to advanced imaging techniques such as vibration-controlled transient elastography (VCTE) and magnetic resonance elastography (MRE) [7]. However, the clinical application of these tests is limited due to variations in their accuracy and the lack of consensus on their optimal use. Moreover, clinicians remain uncertain about which NITs are most appropriate for different clinical scenarios, leading to inconsistent patient management strategies.
To address these challenges, the Korean Association for the Study of the Liver (KASL) recently published a clinical practice guideline on the use of NITs in the assessment of MASLD [8]. This guideline aims to provide a standardized approach for clinicians, enhancing the accuracy of risk stratification and improving patient outcomes. The KASL guideline suggests a stepwise evaluation using NITs, starting with the FIB-4 index and followed by VCTE for those identified as at risk for advanced fibrosis. If the FIB-4 index is greater than 1.3, then VCTE can be performed. When the VCTE result is greater than 8 kPa, additional NITs, such as MRE, enhanced liver fibrosis (ELF), and the Agile score, should be considered. However, MRE and ELF are not easy to perform in clinical practice due to their high cost or issues with availability.
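The two thresholds described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the guideline's official implementation: the function names `fib4_index` and `kasl_next_step` are hypothetical, the FIB-4 formula is the standard published one (age × AST / (platelet count × √ALT)), and only the two cut-offs stated in the text (FIB-4 > 1.3 and VCTE > 8 kPa) are encoded.

```python
import math
from typing import Optional


def fib4_index(age_years: float, ast_iu_l: float, alt_iu_l: float,
               platelets_10e9_l: float) -> float:
    """Standard FIB-4 index: age x AST / (platelets [10^9/L] x sqrt(ALT))."""
    return (age_years * ast_iu_l) / (platelets_10e9_l * math.sqrt(alt_iu_l))


def kasl_next_step(fib4: float, vcte_kpa: Optional[float] = None) -> str:
    """Return the next step suggested by the two thresholds quoted in the text.

    Hypothetical helper: it encodes only FIB-4 > 1.3 and VCTE > 8 kPa.
    """
    if fib4 <= 1.3:
        return "no VCTE indicated by FIB-4"
    if vcte_kpa is None:
        return "perform VCTE"
    if vcte_kpa > 8.0:
        return "consider additional NITs (MRE, ELF, Agile score)"
    return "no additional NITs indicated by VCTE"


# Hypothetical patient: 60 years, AST 40 IU/L, ALT 25 IU/L, platelets 200 x 10^9/L.
example_fib4 = fib4_index(60, 40, 25, 200)
example_step = kasl_next_step(example_fib4, vcte_kpa=9.5)
```

For this hypothetical patient the FIB-4 index exceeds 1.3 and the VCTE result exceeds 8 kPa, so the sketch returns the suggestion to consider additional NITs.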
Therefore, we evaluated the performance of the KASL NIT-based two-step approach (FIB-4 index and subsequent VCTE) in identifying high-risk patients with MASLD and predicting liver-related events (LREs). In addition, the performance of the KASL two-step approach was compared with those of other algorithms.
RESULTS
Baseline characteristics
After excluding 1,147 patients according to exclusion criteria, 8,131 patients with MASLD who had available FIB-4 and VCTE results were included in the analysis (Fig. 1). The baseline characteristics of the study population are summarized in Table 1. The mean age was 51.4 years, with females accounting for 59.4% of the cohort. The median body mass index was 26.5 kg/m². The proportions of patients with diabetes and hypertension were 33.9% and 34.1%, respectively. The median AST, ALT, platelet count, and fasting glucose were 29.0 IU/L, 34.0 IU/L, 241×10³/μl, and 104.1 mg/dL, respectively. The median FIB-4 score, LS by VCTE, Agile 3+, Agile 4, and FAST score were 1.10, 5.5 kPa, 0.15, 0.01, and 0.23, respectively. The median CAP value by VCTE was 300 dB/m.
Application of the KASL two-step approach
As described in Figure 2, 67.6% (n=5,493), 17.7% (n=1,443), 5.7% (n=460) and 9.0% (n=735) of patients were classified as low-, intermediate-low-, intermediate-high-, and high-risk, respectively. Patients in the high-risk group tended to be older (mean 64.6 years vs. 47.7–58.1 years), were more often female (57.7% vs. 37.6–49.6%), and had higher AST (median 56.0 IU/L vs. 26.0–50.0 IU/L) and fasting glucose (median 109.2 mg/dL vs. 102.2–111.2 mg/dL) compared with the other lower-risk groups (Table 2).
Fibrotic burden showed a progressive increase across the intermediate-high and high-risk groups when assessed using the FIB-4 index, Agile 3+, and Agile 4 scores, as well as LS values, with significant differences between the risk groups (P<0.001 for all). However, the low- and intermediate-low-risk groups had similar levels of fibrotic burden. Specifically, the FIB-4 index medians were 0.86, 1.67, 1.90, and 3.74; Agile 3+ scores were 0.09, 0.23, 0.67, and 0.77; and Agile 4 scores were 0.01, 0.02, 0.16, and 0.26 across the risk groups. For LS values, the medians were 5.3 kPa, 5.1 kPa, 10.8 kPa, and 9.1 kPa, respectively. The FAST score and CAP score also demonstrated significant differences between the risk groups, with FAST score medians of 0.18, 0.23, 0.62, and 0.58, and CAP scores of 303 dB/m, 289 dB/m, 314 dB/m, and 291 dB/m. However, no consistent or sequential increase was observed across all the risk groups.
Clinical outcomes
During the follow-up period (median 46.6 months [IQR 24.0–67.1]), 86 (1.1%) patients developed LREs, including 39 (0.5%) patients with hepatic decompensation and 47 (0.6%) with HCC. Patients who developed LREs (n=86) were older (60.8 years vs. 51.4 years) and had higher AST levels (39.5 IU/L vs. 29.0 IU/L), lower platelet counts (159×10³/μl vs. 242×10³/μl), higher FIB-4 indices (2.97 vs. 1.09), higher LS values (16.5 kPa vs. 5.5 kPa), lower CAP scores (285 dB/m vs. 300 dB/m), and higher Agile 3+ (0.91 vs. 0.15), Agile 4 (0.49 vs. 0.01), and FAST (0.64 vs. 0.23) scores compared with those without LREs (all P<0.05, Supplementary Table 1).
According to the KASL two-step approach, the 3-, 5- and 7-year cumulative incidences of LREs were 0.07% (95% confidence interval [CI] 0.02–0.20), 0.35% (95% CI 0.18–0.64), and 0.96% (95% CI 0.53–1.64) in the low-risk group, 0.10% (95% CI 0.01–0.57), 0.26% (95% CI 0.05–0.92), and 0.5% (95% CI 0.13–1.43) in the intermediate-low risk group, 0.28% (95% CI 0.03–1.48), 1.94% (95% CI 0.71–4.32), and 5.51% (95% CI 2.51–10.2) in the intermediate-high risk group, and 1.51% (95% CI 0.71–2.85), 5.46% (95% CI 3.43–8.15) and 12.40% (95% CI 8.38–17.3) in the high-risk group (Figs. 2, 3A).
The 3-, 5- and 7-year cumulative incidences of HCC were 0.03% (95% CI 0.00–0.15), 0.22% (95% CI 0.09–0.47) and 0.84% (95% CI 0.42–1.51) in the low-risk group, 0.00%, 0.16% (95% CI 0.02–0.86), and 0.39% (95% CI 0.08–1.36) in the intermediate-low risk group, 0.28% (95% CI 0.03–1.48), 1.18% (95% CI 0.31–3.32), and 2.12% (95% CI 0.61–5.46) in the intermediate-high risk group, and 0.76% (95% CI 0.26–1.85), 3.26% (95% CI 1.74–5.53) and 6.11% (95% CI 3.48–9.76) in the high-risk group, respectively (Figs. 2, 3B).
The 3-, 5- and 7-year cumulative incidences of DCC were 0.04% (95% CI 0.01–0.15), 0.13% (95% CI 0.04–0.34) and 0.13% (95% CI 0.04–0.34) in the low-risk group, 0.10% (95% CI 0.01–0.57), 0.10% (95% CI 0.01–0.57), and 0.10% (95% CI 0.01–0.57) in the intermediate-low risk group, 0.00%, 0.77% (95% CI 0.15–2.58), and 4.38% (95% CI 1.69–9.09) in the intermediate-high risk group, and 0.96% (95% CI 0.37–2.13), 3.13% (95% CI 1.67–5.32) and 8.01% (95% CI 4.75–12.30) in the high-risk group, respectively (Figs. 2, 3C).
Prognostic performance of KASL approach
At 3 years, the sensitivity and NPV of the low-risk group in identifying patients unlikely to develop LREs were 75.6% (95% CI 49.4–91.6) and 99.9% (95% CI 99.8–100), respectively (Table 3). The specificity of the high-risk group was 100%. The overall accuracy of the KASL two-step approach ranged from 67.7% (95% CI 66.6–68.7) to 99.8% (95% CI 99.7–99.9). Compared with other algorithms such as AGA, FIB-4 index–Agile 3+, FIB-4 index–Agile 4, or FIB-4 index–FAST, the KASL two-step approach was non-inferior in ruling out or predicting LRE development among patients with MASLD.
At 5 years, the sensitivity and NPV of the low-risk group in ruling out LREs were 70.2% (95% CI 54.3–82.3) and 99.8% (95% CI 99.7–99.9), respectively. The specificity of the high-risk group was 100%, and the overall accuracy of the high-risk group was 99.5% (95% CI 99.4–99.7). The robust prognostic performance of the KASL two-step approach was still sustained over a 5-year period.
At 7 years, the sensitivity and NPV of the low-risk group were 65.7% (95% CI 53.2–76.4) and 99.6% (95% CI 99.4–99.8), respectively. The specificity of the high-risk group was 68.3% (95% CI 67.3–69.3), and the overall accuracy remained consistent over the extended follow-up period.
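The metrics reported above follow the standard 2×2 definitions, which can be sketched as below. The sketch assumes that "flagged" means classified above a chosen risk cut-off and that "event" means an LRE within the time horizon; the class name `TwoByTwo` and all counts are hypothetical and are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of a 2x2 table for a binary risk flag against an observed event."""
    tp: int  # flagged and developed the event
    fp: int  # flagged but remained event-free
    fn: int  # not flagged but developed the event
    tn: int  # not flagged and remained event-free

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)


# Hypothetical counts for 1,000 patients, 40 of whom develop the event.
table = TwoByTwo(tp=30, fp=300, fn=10, tn=660)
```

With a rare outcome, as in this hypothetical table, the NPV stays close to 100% even when sensitivity is moderate, which is why sensitivity and NPV are usually read together.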
Simplified KASL algorithm
Given the similarity in fibrotic and steatotic burdens between the low- and intermediate-low-risk groups, we reclassified patients into three simplified risk categories: low- (n=5,493) and intermediate-low-risk (n=1,443) as the low-risk group (n=6,936), intermediate-high-risk as the intermediate-risk group (n=460), and high-risk as the high-risk group (n=735) (Supplementary Table 2). Patients in the high-risk group tended to be older (mean 64.6 years vs. 49.8–55.2 years) and female predominant (57.7% vs. 38.2–49.6%) and had lower BMIs (mean 26.1 kg/m² vs. 26.5–27.8 kg/m²), higher AST levels (56.0 IU/L vs. 27.0–50.0 IU/L), lower platelet counts (164×10³/μl vs. 206–249×10³/μl), and higher FIB-4 indices (3.74 vs. 0.98–1.90), Agile 3+ scores (0.77 vs. 0.11–0.67), and Agile 4 scores (0.26 vs. 0.01–0.16) compared with the low- and intermediate-risk groups (all P<0.001). Although the distinction between referral and non-referral groups remains unchanged, the simplified classification system offers primary care physicians a more practical and efficient tool for risk stratification and patient management.
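The regrouping described above amounts to a simple merge of counts, sketched below with the group sizes reported in the text. The names `KASL_COUNTS`, `SIMPLIFIED_GROUP`, and `simplify` are hypothetical, and treating the intermediate- and high-risk groups together as the referral group is an assumption made for illustration.

```python
# Patient counts per KASL risk group, as reported in the text.
KASL_COUNTS = {
    "low": 5493,
    "intermediate-low": 1443,
    "intermediate-high": 460,
    "high": 735,
}

# Mapping from the four original groups to the three simplified ones.
SIMPLIFIED_GROUP = {
    "low": "low",
    "intermediate-low": "low",
    "intermediate-high": "intermediate",
    "high": "high",
}


def simplify(counts: dict[str, int]) -> dict[str, int]:
    """Sum patient counts over the simplified risk categories."""
    merged: dict[str, int] = {}
    for group, n in counts.items():
        target = SIMPLIFIED_GROUP[group]
        merged[target] = merged.get(target, 0) + n
    return merged


simplified = simplify(KASL_COUNTS)
total = sum(simplified.values())

# Assumption: the referral group is intermediate plus high risk.
referral_share = 100 * (simplified["intermediate"] + simplified["high"]) / total
```

Under that assumption the share works out to about 14.7% of the 8,131 patients, and it is the same before and after the merge because only the two non-referral groups are combined.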
Regarding the simplified KASL two-step approach, the sensitivity and NPV for the low-risk group were 68.0% and 99.9% at 3 years and 66.4% and 99.8% at 5 years, respectively (Supplementary Table 3). The specificity of the high-risk group was 100%, and the overall predictive accuracies for the high-risk group were 99.8% at 3 years and 99.5% at 5 years.
According to the simplified KASL approach, the 3-, 5- and 7-year cumulative incidences of LRE were 0.08% (95% CI 0.03–0.19), 0.33% (95% CI 0.18–0.57), and 0.85% (95% CI 0.50–1.39) in the low-risk group, 0.28% (95% CI 0.03–1.48), 1.94% (95% CI 0.71–4.32), and 5.51% (95% CI 2.51–10.20) in the intermediate risk group, and 1.51% (95% CI 0.71–2.85), 5.45% (95% CI 3.43–8.14), and 12.4% (95% CI 8.37–17.3) in the high-risk group (Supplementary Fig. 2). The incidence rates of HCC were 0.02% (95% CI 0–0.12) at 3 years, 0.21% (95% CI 0.09–0.42) at 5 years, and 0.73% (95% CI 0.39–1.26) at 7 years. The incidence rates of DCC were 0.06% (95% CI 0.02–0.16) at 3 years, 0.12% (95% CI 0.05–0.29) at 5 years, and 0.12% (95% CI 0.05–0.29) at 7 years (Supplementary Fig. 3).
Comparison of AUCs according to the different approaches
The KASL two-step approach showed an integrated time-dependent AUC of 0.801 (c-index 0.880) in predicting LREs (Supplementary Fig. 4A). The similar AGA approach using FIB-4 and an LS cut-off value of 10 kPa showed an integrated time-dependent AUC of 0.807 (c-index 0.920). Substituting the LS value with the Agile 3+ (AUC 0.815, c-index 0.901), Agile 4 (AUC 0.792, c-index 0.923), and FAST scores (AUC 0.776, c-index 0.885) as the second step showed good LRE prediction ability. Moreover, the simplified KASL approach (AUC 0.807) showed a similar AUC compared with different NIT approaches (AGA 0.807, FIB-4–Agile 3+ 0.815, FIB-4–Agile 4 0.792, FIB-4–FAST 0.776) (Supplementary Fig. 4B). The AUCs of FIB-4 alone and liver stiffness measurement alone were 0.796 and 0.835, respectively.
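For readers unfamiliar with the c-index, the sketch below shows Harrell's concordance index, one common estimator for right-censored data. The text does not state which estimator or software was used, so this is an illustration only; the function name `harrell_c_index` and the five-patient data set are hypothetical, and tied follow-up times are simply skipped for brevity.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float], events: Sequence[bool],
                    risk_scores: Sequence[float]) -> float:
    """Harrell's c-index for right-censored data.

    A pair (i, j) is comparable when patient i had an event and patient j
    was still under observation at that time (times[i] < times[j]).
    The pair is concordant when patient i also has the higher risk score;
    tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: follow-up in months, event flag, ordinal risk group.
times = [12, 24, 36, 48, 60]
events = [True, False, True, False, False]
risk = [3, 4, 2, 1, 2]
c_index = harrell_c_index(times, events, risk)  # 4.5 of 6 comparable pairs -> 0.75
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of event times by risk score. Because risk groups are ordinal, many pairs tie on the score, which is one reason a grouped algorithm can show a different c-index from a continuous score.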
DISCUSSION
In this study, which was based on a large and diverse spectrum of patients with MASLD, we evaluated the effectiveness of the KASL two-step approach in assessing LREs in patients with MASLD. Our results demonstrated that the KASL two-step approach, consisting of the FIB-4 index and VCTE, effectively stratified patients into risk categories with significant differences in the cumulative incidence of LREs, including DCC and HCC. In our cohort, a significant proportion of patients were categorized as low-risk (67.6%) or intermediate-low-risk (17.7%), with very low cumulative incidences of LREs over the 3- to 7-year follow-up period. Conversely, high-risk patients, who accounted for 9.0% of the cohort, had a notably higher risk of developing LREs. These findings are consistent with previous reports showing that higher FIB-4 scores and LS values by VCTE are associated with a greater risk of fibrosis progression and liver-related mortality among patients with MASLD.
Our current study has several clinical implications. First, our study builds upon these findings by independently validating the recently published KASL NIT guidelines in a Korean cohort. Given the growing emphasis on non-invasive fibrosis assessment, real-world validation is essential. The simplicity of this two-step approach based on FIB-4 index and subsequent use of VCTE makes it particularly appealing for clinical application. In 2021, the AGA recommended a two-step clinical care pathway for the assessment of MASLD [16]. The AGA two-step method classified low-, indeterminate-, and high-risk groups according to their FIB-4 index and subsequent LS values. We initially classified patients into low-, intermediate-low-, intermediate-high- and high-risk groups according to the KASL recommendation. Our study highlights the effectiveness of the KASL two-step approach in predicting LREs, with high-risk patients showing a higher incidence of LREs. Early identification of these patients allows for proactive management strategies, potentially improving patient outcomes and reducing the burden of advanced liver disease.
Second, in contrast to our expectation, there were no significant differences between low- and intermediate-low-risk groups. To simplify the algorithm, we also validated the simplified KASL approach with low-, intermediate-, and high-risk groups. We can confirm that LREs increased based on risk stratification by FIB-4 and LS values. However, this approach did not reflect the changes in fibrotic burden over time. Not only baseline fibrosis but also changes in fibrotic burden are important for predicting LRE development in patients with MASLD. A retrospective study showed that longitudinal changes in the FIB-4 score were associated with disease progression in patients with MASLD. High FIB-4 scores at baseline and 3 years were associated with >50-fold higher risk of HCC than persistently low FIB-4 values. To overcome the low sensitivity of the FIB-4, changes in the LS value can also be used to predict the prognosis of patients with MASLD. Changes in the LS value over 3 years also could predict clinical outcomes in patients with MASLD. LS value changes and the platelet count were independent factors in predicting the development of cirrhosis or HCC. However, the number of events was small (8.6%), and the 3-year follow-up was not sufficient to monitor disease progression in all MASLD fibrosis groups. In addition, the overall risk of LREs was low in our study, even among high-risk patients, when using non-invasive tests for hepatic steatosis. This contrasts with biopsy-based cohorts, which show a higher risk of LREs. However, VCTE-based cohorts better reflect routine clinical practice, capturing a broader patient population. Biopsy cohorts typically include patients with more advanced disease, leading to higher LRE risks. This underscores the value of non-invasive tests for managing MASLD in real-world settings.
Third, fibrotic burden, as assessed by LS value and FAST score, and steatotic burden, as evaluated by CAP score, showed differences between groups but did not show a sequential increase across the groups. This discrepancy may be due to the multifactorial nature of MASLD, where fibrosis progression is influenced by various factors, including metabolic comorbidities and inflammation, which may not always correlate with steatosis or CAP score [17,18]. Additionally, the relationship between fibrosis and steatosis is complex, as some patients with advanced fibrosis may have mild steatosis and vice versa. The relatively small number of high-risk patients in our cohort might have limited the ability to detect gradual increases in these parameters. Furthermore, fibrosis and steatosis may not follow a linear progression, and the baseline measurements used may not fully capture their dynamic nature over time. Longitudinal monitoring could provide a clearer understanding of how these factors evolve and interact in MASLD. Finally, treatment interventions or lifestyle factors during follow-up could have influenced the observed fibrotic and steatotic burden.
Fourth, the combination of FIB-4 and Agile score could be applied to classify the risk of patients with MASLD. The Agile score has shown strong power for predicting the development of LREs. A recent study showed that single or serial Agile scores are highly accurate in predicting LREs among patients with MASLD [9]. In particular, Agile 3+ and Agile 4 scores classified fewer patients between the low and high cutoffs than most other fibrosis scores. Agile scores also showed the highest discriminatory power in predicting LREs. The incidences of LREs were 0.6 per 1,000 person-years in patients with persistently low Agile 3+ scores and 30.1 per 1,000 person-years in patients with persistently high Agile 3+ scores. These data supported the importance of NIT assessment and serial monitoring using NITs. However, the complexity and intricate formula required for the calculation of the Agile score can pose challenges for everyday clinical use. Therefore, although the Agile score enhances risk stratification, its application might be limited due to these practical difficulties. Simplifying the formula or integrating it into user-friendly methods such as the KASL two-step method could improve its usability in routine practice.
Although complex algorithms like Agile and FAST were expected to offer more precise risk stratification than the FIB-4 and VCTE approach, our study found similar predictive performance between the two. For 7-year LRE prediction, the specificity of 68.4% indicated moderate success in avoiding unnecessary referrals, but 31.6% of low-risk patients were misclassified as high-risk. The sensitivity of 66% showed that while most LRE cases were identified, about one-third were missed. Our analysis of referral rates showed that the KASL approach led to a referral rate of 14.7%, which was comparable to FIB-4 Agile 3+ (12.5%) and FIB-4 Agile 4 (9.6%) but higher than FIB-4-FAST (3.1%). While lower referral rates may reduce the burden on healthcare systems, they must be interpreted alongside diagnostic accuracy to avoid missing at-risk individuals. The observed differences in referral rates highlight the need for further validation in diverse cohorts to determine the most clinically efficient strategy. Another important aspect of risk prediction models is the consideration of competing risks, particularly non-liver-related mortality. In our cohort, 14 patients (0.17%) died from non-liver-related causes during follow-up. Given that this represents a very small proportion of the study population, we did not perform a competing risk analysis, as its impact on our overall findings would likely be minimal. However, we acknowledge that competing risks may play a more significant role in longer-term follow-up studies or populations with higher comorbidity burdens.
Despite its clinical strengths, our study has several limitations. First, the observational nature of the study limits our ability to establish causality between the KASL two-step approach and the prevention of liver-related outcomes. While our results suggest that the approach is effective for risk stratification, prospective randomized controlled trials are needed to confirm its impact on patient outcomes. Second, this study was conducted at a single tertiary center, which may limit the external validity of our findings. Patients who underwent VCTE may represent a subset with greater access to specialized care, introducing potential selection bias. Additionally, factors such as obesity and concomitant liver diseases may influence the accuracy of VCTE-based LSMs, which should be considered when interpreting the results. Third, the discordance between FIB-4 and LSM, as well as the interpretation of changes in these measurements over time, presents an additional limitation. This discordance may affect the accuracy of risk stratification and needs further exploration. Furthermore, the limited number of LREs and short follow-up duration in some groups pose another limitation. While the relatively small number of events and varying follow-up duration may limit the ability to detect significant differences in risk stratification, our findings still suggest that the KASL two-step approach can effectively stratify patients and identify risk groups with meaningful clinical implications. Importantly, this study not only validates previously reported pathways in Eastern and Western cohorts but also serves as the first validation of the recently published KASL NIT guidelines. Given the significance of validating these guidelines in a Korean cohort, our findings provide crucial insights. Additionally, patients with severe obesity may be underrepresented in our study, which could limit the generalizability of our findings to this group.
Lastly, while we accounted for a broad range of confounding factors, residual confounding cannot be ruled out due to the observational design of the study. Despite the advantages of non-invasive tests, liver biopsy can still offer important insights into fibrotic changes and histological patterns such as MASH, which may further refine LRE risk stratification. This could be particularly useful in cases where non-invasive markers present conflicting or ambiguous results.
In conclusion, the KASL two-step approach provides an efficient and practical framework for risk stratification in patients with MASLD, facilitating the early detection of high-risk individuals and enabling timely interventions to optimize patient care.