Role of noninvasive tests in the prognostication of metabolic dysfunction-associated steatotic liver disease
Abstract
In managing metabolic dysfunction-associated steatotic liver disease, which affects over 30% of the general population, effective noninvasive biomarkers for assessing disease severity, monitoring disease progression, predicting the development of liver-related complications, and assessing treatment response are crucial. The advantage of simple fibrosis scores lies in their widespread accessibility through routinely performed blood tests and extensive validation in different clinical settings. They have shown reasonable accuracy in diagnosing advanced fibrosis and good performance in excluding the majority of patients with a low risk of liver-related complications. Among patients with elevated serum fibrosis scores, a more specific fibrosis and imaging biomarker has proved useful to accurately identify patients at risk of liver-related complications. Among specific fibrosis blood biomarkers, enhanced liver fibrosis is the most widely utilized and has been approved in the United States as a prognostic biomarker. For imaging biomarkers, the availability of vibration-controlled transient elastography has been largely improved over the past years, enabling the use of liver stiffness measurement (LSM) for accurate assessment of significant and advanced fibrosis, and cirrhosis. Combining LSM with other routinely available blood tests enhances the ability to diagnose at-risk metabolic dysfunction-associated steatohepatitis and predict liver-related complications, some reaching an accuracy comparable to that of liver biopsy. Magnetic resonance imaging-based modalities provide the most accurate quantification of liver fibrosis, though the current utilization is limited to research settings. Expanding their future use in clinical practice depends on factors such as cost and facility availability.
INTRODUCTION
Metabolic dysfunction-associated steatotic liver disease (MASLD), previously known as nonalcoholic fatty liver disease (NAFLD) or metabolic-associated fatty liver disease, affects over 30% of the general population and is emerging as a leading cause of cirrhosis and hepatocellular carcinoma (HCC) [1]. Its active form, metabolic dysfunction-associated steatohepatitis (MASH), is associated with accelerated fibrosis progression and a higher risk of liver-related complications [2]. Traditionally, the diagnosis of MASH and liver fibrosis required a liver biopsy. However, liver biopsy is an invasive procedure with a small risk of bleeding. Its accuracy in detecting MASH and fibrosis is also limited by sampling variability and intraobserver and interobserver variability [3]. Besides, although serial liver biopsies are often required to demonstrate MASH resolution and fibrosis improvement in clinical trials, this approach is unrealistic in real-world practice. It is thus important to develop noninvasive tests to assess the severity of MASLD.
To qualify as a surrogate endpoint, a biomarker has to demonstrate the ability to diagnose (i.e., MASLD, MASH or fibrosis), prognosticate (i.e., predicting future development of liver-related events [LREs] such as HCC, hepatic decompensation and liver-related death), monitor (i.e., detecting patients who have disease progression over time), or determine treatment response (i.e., detecting patients who improve over time after treatment) [4]. Emerging data support that some existing noninvasive tests may already fulfill some of these requirements. Among various biomarkers, noninvasive tests of fibrosis have been most extensively evaluated and adopted in clinical practice, especially as liver fibrosis has the strongest correlation with LREs among various histological parameters [5]. Some of these tests also demonstrate the ability to predict the presence of clinically significant portal hypertension (CSPH) and high-risk varices. In this article, we review the diagnosis, stratification and prognostication of MASLD by noninvasive tests and discuss the implications for clinical practice and future research. In particular, according to the Food and Drug Administration-National Institutes of Health Biomarker Working Group, a prognostic biomarker is “a biomarker used to identify the likelihood of a clinical event, disease recurrence, or progression in patients who have the disease or medical condition of interest” [4].
Notably, most studies on noninvasive tests were published in the NAFLD era, whereas the diagnosis of MASLD requires the presence of at least one cardiometabolic risk factor [6]. However, because the vast majority of patients with NAFLD would have cardiometabolic risk factors anyway, studies have consistently shown over 95% overlap between the NAFLD and MASLD populations [7]. Not surprisingly, the two populations showed similar disease characteristics and natural history [8-10]. For the purpose of this review, data from NAFLD are considered applicable to the MASLD population unless otherwise specified.
SIMPLE FIBROSIS SCORES
Examples of conventional simple fibrosis scores based on widely available serum parameters include the aspartate aminotransferase (AST)-to-platelet ratio index (APRI), Fibrosis-4 index (FIB-4), and NAFLD fibrosis score (NFS). All three scores have moderate-to-good diagnostic accuracy for liver fibrosis, particularly advanced fibrosis, in patients with MASLD, as measured by the area under the receiver-operating characteristic curve (AUROC) (Table 1) [4,11,12]. FIB-4 is the most validated and useful test among the three. It is recommended as the first-line assessment for liver fibrosis due to its superior diagnostic accuracy compared to other non-invasive tests [11,13,14]. All three panels have negative predictive values (NPV) exceeding 90%, thus effectively ruling out advanced fibrosis [11]. Consequently, they can be utilized as first-tier screening tools in low-risk populations to select patients who require further investigations. However, these scores are less accurate in detecting milder degrees of fibrosis (Table 1) [15,16]. Also, data suggested that APRI has inferior performance to FIB-4 and NFS in detecting advanced fibrosis [15].
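As an illustration of how two of these scores are derived from routine blood tests, the following minimal Python sketch computes FIB-4 and APRI from their commonly published formulas. The function names, the example patient, and the AST upper limit of normal of 40 U/L are assumptions made for illustration; they are not taken from this review, and no diagnostic cutoffs are applied here.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """FIB-4 = (age x AST) / (platelet count x sqrt(ALT)).

    Age in years, AST and ALT in U/L, platelet count in 10^9/L.
    """
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def apri(ast_u_l, ast_uln_u_l, platelets_10e9_l):
    """APRI = (AST / upper limit of normal for AST) / platelet count x 100.

    The upper limit of normal is laboratory-specific (assumption: 40 U/L below).
    """
    return (ast_u_l / ast_uln_u_l) / platelets_10e9_l * 100


# Hypothetical patient, for illustration only.
print(round(fib4(age_years=60, ast_u_l=50, alt_u_l=64, platelets_10e9_l=150), 2))
print(round(apri(ast_u_l=50, ast_uln_u_l=40, platelets_10e9_l=150), 2))
```

Because age enters the FIB-4 numerator directly, the same laboratory values yield a higher score in an older patient, which is one reason age-related caveats apply when interpreting the result.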
Prediction of cirrhotic complications and HCC
The three fibrosis scores showed consistently good prediction for LREs including cirrhotic complications and HCC among patients with MASLD (Table 1) [17]. Based on validated cutoffs of FIB-4, NFS and APRI, patients stratified into intermediate- or high-risk groups, compared with the low-risk group, had a significantly higher risk of LREs [18,19]. APRI has the lowest accuracy among the three scores in predicting LRE and overall mortality [17,20,21].
FIB-4 is the most studied and validated tool for risk stratification of patients with MASLD [17,22]. Despite evidence showing that vibration-controlled transient elastography (VCTE) and histological assessment outperformed FIB-4 in their accuracy of predicting LREs, FIB-4 remains an independent predictor of LREs, and the FIB4-VCTE stepwise algorithm accurately identifies individuals with MASLD who are at high risk of liver-related complications [22]. Moreover, an individual participant data meta-analysis suggests that FIB-4 and NFS perform similarly to liver biopsy in predicting LREs [23]. This is further supported by data from patients with MASH [24], indicating that these panels could be considered as alternatives to invasive methods for monitoring the risk of LREs.
Elevated FIB-4 also represents an important predictor of HCC in European and Asian MASLD cohorts [25,26]. A high NFS is also strongly associated with HCC development [23]. As such, the American Gastroenterological Association recommended that patients with non-invasive scores suggestive of cirrhosis should be screened for HCC [27].
In contrast, several studies suggest that FIB-4 might not predict overall and cardiovascular mortality [28-31]. This could be related to the study setting and patient composition.
The steatosis-associated fibrosis estimator (SAFE) score and LiverRisk score are examples of fibrosis scores that have been developed more recently, both utilizing simple serum biomarkers [32,33]. SAFE score was designed to aid in triaging patients at the lowest risk of clinically significant fibrosis (F0-1 vs. ≥F2) [34], while LiverRisk score was developed to predict individual levels of liver stiffness for the general population without chronic liver disease. Both scores outperformed conventional blood-based algorithms in detecting liver fibrosis and identifying high-risk individuals for liver-related outcomes [32,33].
Fibrosis stage is a strong predictor for the development of adverse liver outcomes in MASLD [35]. Advanced fibrosis as determined by simple fibrosis scores is associated with higher mortality [29]. Data suggest that patients who develop advanced fibrosis have greater changes in NFS and FIB-4 [36]. However, whether repeated measures of these panels over time would carry the prognostic importance for predicting adverse outcomes remains to be demonstrated.
Although the predictive ability of FIB-4, NFS, and APRI for fibrosis progression was found to be suboptimal with AUROC levels ranging from 0.65 to 0.73 [17], it is important to note that a significant association was found between a one-unit change in all these tests and the corresponding change in fibrosis stage [13], which underscores the relevance of minor score changes in these biomarkers and liver progression. However, Siddiqui et al. found a significant increase in all three scores only among individuals who showed progression in fibrosis, but not among those who experienced regression [13]. This suggests that improvement in FIB-4 can be primarily driven by a falsely high score at the first test and subsequent regression towards the mean [37]. These findings highlight the importance of exercising caution about false positives when interpreting changes in repeated testing to identify high-risk individuals.
A Swedish population-based cohort study found that a transition from the low- to the intermediate- or high-risk group on repeated FIB-4 testing within five years is associated with increased severity of liver disease [37]. This finding was supported by a recent UK population-based cohort study, in which changes in FIB-4 levels over an even shorter interval of 12 months predicted the risk of subsequent liver events [18]. Another large cohort study showed a strong association between longitudinal changes in FIB-4 and subsequent development of both HCC and LREs by a three-year landmark analysis [38]. Furthermore, changes in FIB-4 are found to be associated with cardiovascular events and all-cause mortality [18]. This is of particular significance given the high prevalence of metabolic diseases observed within the FIB-4 elevated group in the population without known liver disease [39], suggesting the longitudinal utility of these panels in both liver- and non-liver-related adverse outcomes. Additionally, a Phase 2b clinical trial found significant correlations between reductions in FIB-4, APRI, and NFS and improvement in fibrosis after treatment with obeticholic acid [40]. This correlation, along with the confirmed obeticholic acid-related benefits for fibrosis, indicates the utility of these measures in predicting treatment response in future clinical trials.
Utilization of serum biomarkers for fibrosis scoring has practical advantages of wide applicability and easy availability [41]. However, with modest diagnostic value (positive predictive value [PPV] around 70%), these tests are less accurate than patented markers such as VCTE and can thus lead to unnecessary referral to hepatology clinics [41,42]. More specific assessments with higher diagnostic accuracy are required to rule in positive cases more efficiently.
Furthermore, these tests utilize non-liver-specific biomarkers, and challenges are present with the reproducibility. The results are also influenced by spectrum bias, e.g., acute hepatitis can lead to false positive results of FIB-4 and APRI due to reactive surge in aminotransferase levels; both younger and older age (<35 y; >65 y) are suggested to limit the accuracy of FIB-4 and NFS [4,41]. The interpretation of these scores thus requires additional caution to avoid unnecessary examination and undue stress for healthy individuals and healthcare systems.
FIB-4 and APRI were derived from smaller cohorts of patients with chronic hepatitis C with a high prevalence of liver fibrosis [41]. NFS was derived from a population with MASLD to separate those with and without advanced fibrosis [43]. These scores were developed in populations with existing liver disease and are thus limited in their generalizability and role as screening tools. There is a clear need for the development of scoring systems in primary care settings for early risk detection and tailored interventions to reduce overall liver disease burden. Nevertheless, the continued usefulness and prognostic value of these scoring systems should not be overlooked.
SPECIFIC FIBROSIS BIOMARKERS
Specific fibrosis biomarkers for MASLD may serve as the first- or second-line detection methods. Examples include the enhanced liver fibrosis (ELF) score, PRO-C3, FibroTest, FibroMeter, and Mac2-binding protein (Table 2). These markers reflect the dynamic changes of inflammatory cytokines, fibrogenesis or fibrinolysis, vascular, and neuroendocrine signals in the body. Among them, ELF is more widely used and has been approved by the US Food and Drug Administration as a prognostic biomarker. ELF combines three serum biomarkers, namely hyaluronic acid, tissue inhibitor of metalloproteinase 1, and N-terminal propeptide of type III procollagen, providing a more reliable basis for assessing and predicting fibrosis and cirrhosis compared to individual indicators [44]. A meta-analysis indicated that when using a cut-off of 7.7, ELF has a high sensitivity of 0.93 and a specificity of 0.34 for excluding fibrosis. In populations with disease prevalence <50%, adopting a low cut-off value for excluding advanced fibrosis (F≥3) yields a sensitivity of 0.90, with an NPV ranging from 0.82 to 0.99 [45]. In a study conducted in Denmark, the false positive rate of ELF testing for screening the general population for liver disease was lower than that of FIB-4, making it an effective tool for screening high-risk populations for liver fibrosis and reducing the occurrence of ineffective referral pathways [46].
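The dependence of NPV on disease prevalence noted above follows from Bayes' rule. The sketch below is illustrative only: it takes the sensitivity (0.93) and specificity (0.34) quoted for the ELF cut-off of 7.7 and applies them to a few hypothetical prevalence values, which are assumptions chosen for illustration rather than data from any cited study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as quoted for the 7.7 cut-off;
# the prevalence values are hypothetical.
for prevalence in (0.05, 0.20, 0.40):
    ppv, npv = predictive_values(0.93, 0.34, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With sensitivity and specificity held fixed, NPV falls and PPV rises as prevalence increases, which is why a rule-out threshold performs best in low-prevalence settings.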
PRO-C3 has been validated as an independent factor associated with the fibrotic stage and related to the NAFLD activity score, with an AUROC of 0.81 for detecting significant fibrosis and an AUROC of 0.79 for advanced fibrosis [47]. In a cohort study involving patients with diabetes, PRO-C3 has been confirmed to identify rapid disease progression and patients who respond to antimetabolic treatments, and potentially accelerate the development and validation of antifibrotic medications [48]. Additionally, the ADAPT score, which was developed in the MASLD population, incorporating PRO-C3 together with platelet count, diabetes, and age, outperforms APRI and FIB-4 in identifying patients with advanced fibrosis, suggesting its potential as a routine blood biochemical marker for screening [49].
FibroTest utilizes age and gender information, α-2-macroglobulin, haptoglobin, apolipoprotein A1, GGT, cholinesterase, and total bilirubin to assess liver fibrosis. Table 2 reveals that the FibroTest applies a 90% sensitivity exclusion threshold for MASLD at a cut-off score of 0.3, and a 90% sensitivity inclusion criterion at a score of 0.7. Additionally, the FibroTest demonstrated a 72% sensitivity and an 85% specificity for accurately detecting the significant fibrosis with a cut-off range from 0.30 to 0.48 [50].
FibroMeter includes liver function indicators, coagulation indicators, urea, and hyaluronic acid. It has been used to evaluate liver fibrosis in different chronic liver diseases. A cross-sectional study involving 452 patients with MASLD demonstrated that FibroMeter exhibited comparable accuracy to liver stiffness measurement (LSM) in detecting fibrosis stage, with greater predictive ability in cohorts with advanced fibrosis [20]. Furthermore, FibroMeter has better discriminatory ability than other blood markers (Obuchowski index 0.798±0.016), with a cutoff of 0.31 recommended for patients with advanced fibrosis to achieve maximum sensitivity and specificity.
Mac2-binding protein is a secretory protein composed of a carbohydrate-rich N-terminal and a galectin-like domain. Elevated levels of Wisteria floribunda agglutinin-positive Mac-2 binding protein in the blood reflect the progression of fibrosis. Moreover, serum Mac-2 binding protein glycosylation isomer (M2BPGi) has been clinically validated as a glyco-biomarker for predicting cirrhosis in patients with MASLD [51,52]. Additionally, in a Japanese cohort, high levels of M2BPGi have been found to be independently associated with an increased incidence of HCC [53]. Finally, although created in patients with chronic hepatitis C, Hepascore and MMP/MP3 have also been identified as specific biomarkers with predictive value for the degree of liver fibrosis, but further evidence related to MASLD is still needed.
In a retrospective cohort study recruiting 453 patients with chronic hepatitis B in Hong Kong, a two-step scoring system combining ELF score (cutoff 9.8) and LSM (cutoff 20 kPa) effectively predicted HCC, with a sensitivity of 86.7% and NPV of 95.3% [54]. This model allows for HCC management based on different ELF risk levels, although further data for MASLD cohorts are needed. Another prospective cohort study in Denmark involving 462 patients followed for an average of 4.1 years confirmed the predictive role of ELF and FibroTest for LREs, with C-indexes of 0.859 and 0.794, respectively. The time-dependent AUROC for the prediction of 5-year LRE reached 0.87, higher than FIB-4 (0.83) and NFS (0.78) [55]. Additionally, advanced fibrosis increases the burden of LREs, with serological markers showing high predictive value in MASLD patient populations, particularly for advanced fibrosis stages. Moreover, this study suggests that an ELF score with a cut-off of 9.1 for advanced fibrosis achieves better diagnostic accuracy than the manufacturer-recommended value of 7.7 [56]. Another meta-analysis mentioned that FibroMeter testing showed acceptable sensitivity (83.1%) and specificity (84.4%) in detecting late-stage fibrosis in MASLD patients. When FibroMeter was integrated with VCTE, the sensitivity increased to 83.5%, while specificity reached 91.1% [57].
Nevertheless, serological testing for predicting MASLD in the general population has limitations, including high economic costs and its inability to be used independently for early diagnosis and patient referral. Consequently, it is used as a screening tool less frequently than APRI or FIB-4 [58]. Moreover, the accuracy of ELF in detecting fibrosis varies depending on the etiology of the disease, with ELF showing lower accuracy for staging in MASLD compared to alcohol-associated liver disease and chronic hepatitis C. Therefore, more specific ELF cut-off values for diagnosing different fibrosis stages in MASLD need to be established [59]. Furthermore, not every biomarker was initially designed for MASLD prognostication. For example, ELF score and FibroTest were developed in patients with chronic hepatitis C. Additionally, PRO-C3 is not a liver-specific marker; it is also increased in pulmonary fibrosis [47].
IMAGING BIOMARKERS
In the past decades, imaging biomarkers have been extensively investigated in the MASLD population. VCTE is currently widely used for disease screening and prognosis assessment. In addition, there are emerging technologies and scoring systems that have been developed and examined, and have been shown to be of significant importance in the assessment of liver fibrosis (Table 3) [41,60-86].
Vibration-controlled transient elastography
VCTE is the most commonly employed diagnostic technique. VCTE utilizes pulse-echo ultrasound to evaluate LSM and controlled attenuation parameter (CAP) [87,88]. Steatosis, advanced fibrosis and cirrhosis can be detected with high sensitivity and specificity [23,89]. In a prospective study of MASLD population, the AUROC of LSM by VCTE was 0.89 when evaluating advanced fibrosis [90]. LSM has shown promising results in predicting the presence and severity of hepatic decompensation and is also associated with the risk of HCC [20,91-93]. According to a prospective study of 2,251 MASLD patients with a median follow-up of 27 months, VCTE could accurately predict LREs, with a C-index of 0.911 [92]. Similarly, in a study among 594 participants with MASLD, VCTE demonstrated comparable accuracy to histologically assessed fibrosis in predicting clinical outcomes, outperforming FIB-4 in the whole study cohort [22]. In another multicenter study of 16,603 patients with MASLD, VCTE demonstrated superior performance in predicting hepatic decompensation and HCC [60]. Importantly, a change in VCTE-LSM also appears to be prognostic. In patients with LSM >15 kPa at baseline, the incidence of liver-related events was 7.8 per 1,000 person-years in patients whose LSM decreased to <10 kPa during follow-up, compared with 38.7 per 1,000 person-years if LSM remained >15 kPa. Despite being a blind technique, VCTE is widely utilized for monitoring treatment efficacy, as well as long-term follow-up in diverse chronic liver diseases. Changes in LSM by VCTE were found to be correlated with transitions in histological fibrosis stage, liver biochemistry, and other metabolic-related clinical parameters in phase III clinical trials [94]. 
Although VCTE is recommended by guidelines as a robust and consistent tool for fibrosis assessment, screening high-risk patients and progression monitoring [95,96], it has the limitation of being affected by technical factors and patient-related conditions like obesity, food or alcohol consumption and acute inflammation [97].
Scoring systems combining VCTE with other parameters have been developed to identify patients at high risk [98]. FibroScan-AST (FAST) score, comprised of LSM, CAP and serum AST, is a screening test for fibrotic MASH. Patients with FAST score ≥0.67 are more likely to have at-risk MASH, and FAST score ≤0.35 suggests low risk [99]. The numerical variation of FAST score can also reflect changes in histological fibrosis stage [100]. FAST score has good accuracy (C-index 0.80) and excellent sensitivity for the 0.35 cut-off and specificity for the 0.67 cut-off [99]. However, its PPV is relatively low, and the 0.67 cut-off is associated with a high case missing rate of 51.7% [99]. Agile 3+ and Agile 4 are novel scores based on LSM by VCTE, routinely available clinical parameters and demographic features to identify advanced fibrosis or cirrhosis [101]. In cohort studies verifying its performance, the Agile scores were superior to FIB-4 and LSM in terms of AUROC, percentage of patients with indeterminate outcomes, and PPV to rule-in cirrhosis or advanced fibrosis [101]. For predicting cirrhosis, Agile 4 achieved optimal combinations with 85% sensitivity for ruling out and 95% specificity for ruling in. In the case of predicting advanced fibrosis, Agile 3+ demonstrated rule-out sensitivity of 85% and rule-in specificity of 90% [102]. Agile scores also showed excellent capability in LRE prediction [102,103]. The potential use of VCTE-based scores to monitor disease progression and their use to predict clinical outcomes needs to be further investigated.
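The dual cut-off interpretation of the FAST score described above can be sketched as a simple three-way classification. This is a minimal illustration that assumes the FAST score has already been calculated (it is reported on a 0 to 1 scale); the function name and category labels are hypothetical, and the score formula itself is not reproduced here.

```python
def classify_fast(fast_score):
    """Classify a precomputed FAST score using the 0.35 and 0.67 cut-offs."""
    if not 0.0 <= fast_score <= 1.0:
        raise ValueError("FAST score is expected to lie between 0 and 1")
    if fast_score <= 0.35:
        return "low risk of at-risk MASH (rule-out zone)"
    if fast_score >= 0.67:
        return "likely at-risk MASH (rule-in zone)"
    return "indeterminate"


# Hypothetical scores, for illustration only.
for score in (0.20, 0.50, 0.80):
    print(score, "->", classify_fast(score))
```

Using two cut-offs rather than one trades a larger indeterminate group for higher sensitivity at the lower threshold and higher specificity at the upper threshold.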
Point shear wave elastography and two-dimensional shear wave elastography
Shear wave elastography (SWE) is another imaging technique that uses an acoustic radiation force impulse to generate shear waves within visualized liver tissue, which can reduce measurement failure [104]. Point shear wave elastography (pSWE) shows excellent performance in predicting advanced fibrosis and cirrhosis in patients with MASLD [105,106], regardless of the presence of ascites [41,107-109]. Previous studies mainly focused on predicting liver fibrosis and associated complications in populations with chronic viral hepatitis [110]. In a recent systematic review, only MRE and pSWE met the ideal criteria of greater than 80% sensitivity and specificity for the diagnosis of advanced fibrosis [106]. However, the performance of pSWE varied in different studies [111-113]. Compared to other imaging biomarkers, pSWE results are significantly affected by steatosis [113]. In addition, the narrow range of pSWE values makes it difficult to define accurate thresholds for each fibrosis stage, which leads to potential variations in the results [114].
Two-dimensional shear wave elastography (2D-SWE) can measure a larger area in the liver and enables the generation of real-time and quantitative elasticity maps of liver tissue, providing visual differentiation of stiff tissues [108]. Unlike VCTE, inflammation does not affect the results of 2D-SWE in MASLD patients [115]. Recent studies developed deep learning radiomics of elastography which applies the radiomic methodology to perform quantitative analysis of the heterogeneity observed in 2D-SWE images, showing excellent diagnostic performance in assessing liver fibrosis in patients with chronic hepatitis B [116]. Among patients with MASLD, a prospective study reported an AUROC of 0.82 for predicting fibrosis risk and 0.90 for diagnosing advanced fibrosis when utilizing 2D-SWE [117]. Currently, 2D-SWE is primarily utilized in research due to insufficient evidence supporting its efficacy in monitoring disease progression and diagnosing associated complications [87].
Magnetic resonance elastography and Iron-corrected T1
Magnetic resonance elastography (MRE) is based on an improved phase-contrast method to image the propagation of shear waves within liver tissue, providing a more quantitative assessment of significant fibrosis, advanced fibrosis, cirrhosis and MASH than ultrasound-based elastography [15,118,119]. As reported by Xiao et al. [15], MRE performed better than other NITs for detecting advanced fibrosis, with an AUROC of 0.96, sensitivity of 0.84, and specificity of 0.90. MRE is also a useful tool for predicting and monitoring cirrhotic complications. In a systematic review of 1,707 MASLD patients, individuals whose LSM by MRE was 5 to 8 kPa had a hazard ratio of 11.0 for liver-related outcomes compared to participants with MRE measurements <5 kPa. For those with LSM results ≥8 kPa, the hazard ratio was 15.9, which indicates a significantly higher risk [120]. Other cohort studies also showed that elevated liver stiffness by MRE was correlated with the presence of ascites, hepatic encephalopathy, variceal bleeding, and overall mortality [119,121]. In contrast, reduction in MRE reflects fibrosis stage improvement with an AUROC of 0.62 [122]. The combination of MRE and FIB-4 has been developed as MEFIB (MRE plus FIB-4) for assessing fibrosis stage. MEFIB showed associations with disease progression such as decompensation and HCC [98], and the NPV was over 95% in ruling out liver disease progression [123,124]. Due to its diagnostic accuracy for liver fibrosis, MRE is recommended as a promising surrogate measure for monitoring liver disease progression and evaluating therapeutic endpoints [125].
Iron-corrected T1 (cT1) by magnetic resonance imaging (MRI) was used in the UK Biobank population health study, and was designed to evaluate a combination of liver inflammation and fibrosis [126-128]. A meta-analysis examined the diagnostic ability of cT1 and reported its high sensitivity but relatively low specificity [106]. In a study of 50 patients with biopsy-proven MASLD, cT1 differentiated simple steatosis from MASH with an AUROC of 0.69 [129]. In a general hepatology outpatient setting, cT1 showed predictive ability for various liver-related outcomes [130]. In longitudinal disease monitoring, cT1 had better prediction ability for LREs in patients with cirrhosis when taking into account technical failure and unreliability (HR 9.9) [131]. Based on MRI technique, MRI-proton density fat fraction (MRI-PDFF) has been examined as a useful tool for treatment response evaluation, providing more accurate results than CAP in detecting all grades of steatosis in MASLD patients [118]. By combining MRI-PDFF, MRE result and serum AST level, MRI-AST (MAST) score was designed for fibrotic MASH diagnosis. It had an AUROC of 0.93, which outperformed FAST score, FIB-4, and NFS [132].
MRI-based techniques have the benefit of being accurate, operator-independent and less influenced by patient-related factors [133], but their utilization is largely restricted to clinical trials due to the high cost and limited availability of facilities for daily clinical practice [98].
PORTAL HYPERTENSION IN MASLD
MASLD ranges from simple steatosis and MASH with or without liver fibrosis, to cirrhosis. With the progressive distortion in architecture of the liver parenchyma and hepatic microcirculatory dysfunction leading to increased intrahepatic vascular resistance, portal hypertension ensues and is an important component in the natural history of cirrhosis [134,135].
As the gold standard, portal pressure is assessed by hepatic venous pressure gradient (HVPG), in which a balloon-tipped catheter is inserted into the hepatic vein under fluoroscopic guidance to acquire the wedged hepatic venous pressure (WHVP) and free hepatic venous pressure (FHVP), and HVPG is determined by subtracting the FHVP from the WHVP [136-138]. An HVPG of ≥5 mmHg signifies portal hypertension and that of ≥10 mmHg suggests CSPH [138]. CSPH is an important hallmark as it determines a significantly higher risk of hepatic decompensation [139,140]. However, HVPG measurement is constrained by its invasiveness and potential complications, limiting it to academic purposes most of the time. In addition, several observational studies suggested that HVPG tends to underestimate the portal pressure gradient in MASH-related cirrhosis because WHVP underestimates the actual portal vein pressure, with decompensating events developing while HVPG is <10 mmHg [141-144]. On the other hand, portal hypertension may precede liver fibrosis in MASLD, suggesting a bidirectional effect in these pathological processes [145]. Given the increasing burden of MASLD worldwide, the number of patients with MASH-related cirrhosis and thus portal hypertension will surge [146]. It is thus unrealistic to perform HVPG in a huge number of patients, particularly with the uncertainty in its diagnostic accuracy for MASH-related portal hypertension.
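The HVPG calculation and the thresholds described above can be expressed in a few lines. The sketch below is illustrative only; the function names, category labels, and example pressures are hypothetical.

```python
def hvpg_mmhg(whvp_mmhg, fhvp_mmhg):
    """HVPG is the wedged minus the free hepatic venous pressure (mmHg)."""
    return whvp_mmhg - fhvp_mmhg


def classify_hvpg(hvpg):
    """Apply the >=5 mmHg and >=10 mmHg thresholds described in the text."""
    if hvpg >= 10:
        return "clinically significant portal hypertension (CSPH)"
    if hvpg >= 5:
        return "portal hypertension"
    return "below the portal hypertension threshold"


# Hypothetical measurements, for illustration only.
gradient = hvpg_mmhg(whvp_mmhg=22, fhvp_mmhg=9)
print(gradient, "mmHg ->", classify_hvpg(gradient))
```

As noted above, a gradient below 10 mmHg does not exclude decompensation risk in MASH-related cirrhosis, so this classification should not be read as a standalone decision rule.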
Application of Baveno VII consensus and its limitation in MASLD
Numerous validation studies have been conducted over the past two decades to show high accuracy in liver fibrosis assessment using LSM measured by VCTE [147-150]. With the broader use of LSM as a noninvasive tool to assess liver fibrosis, the term “compensated advanced chronic liver disease” (cACLD) depicts a spectrum consisting of advanced liver fibrosis to compensated cirrhosis [138,151]. Moreover, LSM is shown to correlate with not only the degree of fibrosis but also the severity of portal hypertension [152,153]. From the ANTICIPATE model which studied 518 patients with cACLD, the combination of LSM and platelet count had an AUROC of 0.85 in identifying CSPH by HVPG, and that could also predict <5% of varices needing treatment using LSM of 20 kPa and platelet count ≥150×10⁹/L [154]. On this basis, the updated Baveno VII consensus endorsed the noninvasive modality in the determination of cACLD and CSPH using LSM and platelet count [138]. For instance, LSM ≥10 kPa is suggestive of cACLD. LSM <15 kPa with platelet count ≥150×10⁹/L rules out CSPH whereas that of ≥25 kPa depicts CSPH in which non-selective beta-blockers should be initiated to prevent hepatic decompensation [138]. Nonetheless, a significant proportion of cACLD patients (around 40%) have LSM 15–24.9 kPa and/or platelet count <150×10⁹/L, a category known as the gray zone [154]. Patients in gray zone still carry a substantial risk of decompensation compared to non-cACLD patients [155,156].
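The LSM and platelet thresholds summarized above can be written as a small decision function. This is a simplified sketch of the rules as stated in this section, not a full implementation of the Baveno VII consensus; the function names, category labels, and example values are hypothetical.

```python
def suggests_cacld(lsm_kpa):
    """LSM >= 10 kPa is suggestive of cACLD."""
    return lsm_kpa >= 10


def csph_category(lsm_kpa, platelets_10e9_l):
    """Categorize CSPH likelihood from LSM (kPa) and platelet count (10^9/L)."""
    if lsm_kpa >= 25:
        return "CSPH ruled in"
    if lsm_kpa < 15 and platelets_10e9_l >= 150:
        return "CSPH ruled out"
    # LSM 15-24.9 kPa and/or platelet count < 150 x 10^9/L
    return "gray zone"


# Hypothetical patients, for illustration only.
for lsm, plt in ((12, 200), (18, 200), (12, 120), (30, 90)):
    print(lsm, plt, suggests_cacld(lsm), csph_category(lsm, plt))
```

The gray-zone branch is the fall-through case, which reflects how large that group can be in practice when only LSM and platelet count are available.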
Despite the adoption of Baveno VII consensus in diagnosing cACLD patients, there is a pitfall in its use in obese patients with MASLD as it is difficult for the shear wave from VCTE to penetrate through the thick subcutaneous and prehepatic fat, causing inaccurate LSM. Studies reported a body mass index (BMI) of 28–30 kg/m2 as a factor associated with failed or unreliable LSM [157]. A study with 248 patients with MASH confirmed suboptimal performance of the ANTICIPATE model in detecting CSPH among obese (i.e., BMI ≥30 kg/m2) patients with MASH across different LSM categories, and that LSM ≥25 kPa only had a PPV of 62.8% in identifying CSPH [154]. Thus, the ANTICIPATE model, as well as the Baveno VII consensus, cannot be applied to obese patients with MASH. However, the ANTICIPATE-NASH model, which was developed from the same cohort, considered the phenomenon of lower HVPG in higher BMIs leading to inaccuracy in the original ANTICIPATE model [154]. With the addition of BMI to LSM and platelet count, the ANTICIPATE-NASH model serves as a new tool for prediction of CSPH in obese patients with MASH, but further validation is needed.
Combining spleen stiffness measurement into Baveno VII consensus
Alongside LSM, spleen stiffness measurement (SSM) can be obtained with the same VCTE machine, and it appears to be a reliable adjunct to address the limitations of the Baveno VII criteria. First, regardless of the liver disease etiology, combining SSM (40 kPa as cutoff) with LSM and platelet count allows a significant proportion of patients in the gray zone to be categorized as having or not having CSPH, reducing the proportion in the gray zone to less than 20% while maintaining high discrimination in the risk prediction of hepatic decompensation between the CSPH and non-CSPH groups [155,158]. Second, SSM, as a direct surrogate of portal pressure, enhances the performance of VCTE in MASLD and hence aids prognostication [159]. In a study involving 154 patients with MASLD, SSM <40 kPa had a 100% NPV for high-risk varices and SSM >40 kPa had 100% sensitivity in predicting esophageal varices. SSM ≥21 kPa had a high sensitivity and PPV of 96% and 88%, respectively, for cirrhosis [160]. In another study, SSM correlated well with HVPG and was accurate, with an AUROC of 0.95, in ruling in (using SSM ≥50 kPa) and ruling out (using SSM <40.9 kPa) CSPH in patients with MASLD [161]. Although there are no unified cutoff values for SSM in MASLD, these validation data support its utility in assisting the diagnosis of portal hypertension in patients with MASLD alongside the Baveno VII criteria, especially in obese patients. Figure 1 summarizes the role of Baveno VII consensus and SSM in MASLD.

Figure 1. The role of Baveno VII consensus and spleen stiffness measurement (SSM) in metabolic dysfunction-associated steatotic liver disease (MASLD). cACLD, compensated advanced chronic liver disease; CSPH, clinically significant portal hypertension; MASLD, metabolic dysfunction-associated steatotic liver disease; SSM, spleen stiffness measurement; BMI, body mass index. *Liver stiffness measurement by transient elastography.
Apart from LSM/SSM-based assessments, the von Willebrand factor antigen-to-platelet ratio (VITRO) appears to have prognostic performance similar to that of HVPG and the ANTICIPATE±NASH models in patients with cACLD [162]. In a study of 420 patients, the incidence of hepatic decompensation at 1 year was 0% in patients with VITRO <2.5, compared with 10.4% in those with VITRO ≥2.5.
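As a rough illustration of how such a ratio might be computed, the Python sketch below divides the von Willebrand factor antigen by the platelet count and applies the 2.5 cutoff quoted above. The unit convention (antigen in %, platelets in 10⁹/L) is an assumption based on how the score is commonly reported, and the function names `vitro` and `vitro_group` are hypothetical.

```python
def vitro(vwf_antigen_pct: float, platelets_e9_per_l: float) -> float:
    """Return the VWF antigen-to-platelet ratio.

    Assumed units: VWF antigen in %, platelet count in 10^9/L.
    """
    if platelets_e9_per_l <= 0:
        raise ValueError("platelet count must be positive")
    return vwf_antigen_pct / platelets_e9_per_l


def vitro_group(score: float, cutoff: float = 2.5) -> str:
    """Split patients at the 2.5 cutoff quoted in the text."""
    return "VITRO >= 2.5" if score >= cutoff else "VITRO < 2.5"


# Example: VWF antigen 300% with platelets 100 x 10^9/L.
score = vitro(300, 100)
print(score, vitro_group(score))
```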
CONCLUSION AND FUTURE PERSPECTIVES
Various noninvasive tests, especially those for liver fibrosis, have demonstrated the ability to predict HCC and hepatic decompensation at a level that is non-inferior to that of liver biopsy, thus raising the question of whether liver biopsy should remain the only acceptable surrogate marker for prognostication and drug approval. In recent years, a number of regional and international guidelines have converged towards recommending FIB-4 as the initial assessment in patients with MASLD due to its low cost and wide availability, followed by LSM or other specific fibrosis biomarkers if FIB-4 falls into the gray zone (i.e., between 1.3 and 2.67) [14,163,164]. However, the algorithm has much room for improvement. Despite its high NPV, FIB-4 still misses a significant proportion of patients with advanced fibrosis [165]. Conversely, at the low cutoff of 1.3, FIB-4 has a rather low PPV, meaning that the majority of patients referred for LSM or hepatology consultation are not expected to have advanced fibrosis. To have a sustainable care model, it is imperative to make a second fibrosis test available to primary care physicians and colleagues in other specialties [145]. Additionally, as colleagues outside the hepatology field need to attend to many other medical conditions, it is difficult for them to prioritize MASLD assessment even in a high-risk population. To address this, we have to make the process routine or automatic. For example, many groups have demonstrated the possibility of including fibrosis assessment as part of a comprehensive screening for diabetic complications [167,168]. Some medical centers have also started automatic FIB-4 calculation based on existing laboratory values, and flagging abnormal test results in the electronic health record can increase the identification of patients for further evaluation [169].
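To make the idea of automatic FIB-4 calculation concrete, the following Python sketch computes the score from routine laboratory values and applies the two cutoffs mentioned above. It uses the standard FIB-4 formula (age × AST divided by platelet count × the square root of ALT); the function names, the triage labels, and the handling of values exactly at the cutoffs are assumptions made for illustration and do not describe any particular center's system.

```python
import math


def fib4(age_years: float, ast_u_l: float, alt_u_l: float,
         platelets_e9_per_l: float) -> float:
    """Return the FIB-4 index: (age x AST) / (platelets x sqrt(ALT))."""
    if alt_u_l <= 0 or platelets_e9_per_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_l) / (platelets_e9_per_l * math.sqrt(alt_u_l))


def fib4_triage(score: float) -> str:
    """Triage using the 1.3 and 2.67 cutoffs quoted in the text.

    Treating scores exactly at either cutoff as gray zone is an assumption.
    """
    if score < 1.3:
        return "low risk"
    if score <= 2.67:
        return "gray zone: second-line test (e.g., LSM)"
    return "high risk"


# Example: age 60, AST 30 U/L, ALT 36 U/L, platelets 150 x 10^9/L.
score = fib4(60, 30, 36, 150)
print(round(score, 2), fib4_triage(score))
```

Because every input is a routinely collected value, a rule of this kind could in principle run on existing laboratory results and flag only the gray-zone and high-risk groups for follow-up.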
Case selection is another issue for consideration. Current guidelines recommend MASLD and fibrosis assessment in patients with type 2 diabetes, obesity and other cardiometabolic risk factors [164,170]. Among them, type 2 diabetes has the strongest association with advanced liver disease and liver-related complications [171]. While the high pretest probability of advanced liver disease makes it logical to consider fibrosis assessment in this high-risk group, some studies suggest that the performance of noninvasive tests might be worse in patients with type 2 diabetes [172]. Further refinement of the diagnostic process is necessary.
Although this article focuses on liver fibrosis, there is increasing interest in diagnosing “at-risk” or “high-risk” MASH, which represents patients with steatohepatitis and F2-F3 fibrosis. This is in keeping with the patient population in most clinical trials in MASH and is indeed the target population for treatment once a drug comes to the market, as exemplified by the recent conditional approval of resmetirom. The relative roles of MASH and fibrosis biomarkers in patient management and clinical care pathways should be determined.
In addition, most of the prognostic data on specific fibrosis biomarkers came from secondary and tertiary centers, while the prognostic performance of simple fibrosis scores such as FIB-4 has been extensively evaluated in the general population or primary care settings due to the possibility of retrospective score calculation based on existing laboratory results. Determining the performance of specific fibrosis biomarkers in primary care should be a research priority, especially as most patients with MASLD are seen there [166]. Adoption is also key. Table 4 summarizes the regional guidelines/guidance on the use of noninvasive tests in MASLD.
In the past 16 years, a number of gene polymorphisms (e.g., PNPLA3, TM6SF2, MBOAT7, HSD17B13) have been found to be associated with MASLD, its histological severity, and the future risk of HCC and hepatic decompensation [173]. Although polygenic risk scores using these genetic markers have been proposed for the prediction of HCC [174], their added value over existing clinical prediction models is marginal. Nonetheless, new genetic markers have been identified in recent years using better-powered cohorts [175]. Besides, the identification of epigenetic markers and multiomic approaches will likely refine noninvasive assessments further. Currently, such approaches are limited by the low availability and prohibitive cost of the tests. However, the field should keep an open mind as technological advances and innovative approaches may drive down the cost and improve test availability. Notably, molecular tests are already routinely used in oncology practice. With concerted effort, we can produce noninvasive tests and clinical care pathways that can make precision medicine in MASLD a reality.
Notes
Authors’ contribution
All authors were responsible for the writing plan, content, drafting and critical revision of the manuscript for important intellectual content.
Conflicts of Interest
Jimmy Lai has served as an advisory committee member for Gilead Sciences and Boehringer Ingelheim, and a speaker for Gilead Sciences. Grace Wong has served as an advisory committee member for Gilead Sciences and Janssen, as a speaker for Abbott, Abbvie, Bristol-Myers Squibb, Echosens, Furui, Gilead Sciences, Janssen and Roche, and received research grant from Gilead Sciences. Vincent Wong has served as a consultant or advisory board member for AbbVie, Boehringer Ingelheim, Echosens, Gilead Sciences, Intercept, Inventiva, Novo Nordisk, Pfizer, Sagimet Biosciences, TARGET PharmaSolutions, and Visirna; and a speaker for Abbott, AbbVie, Echosens, Gilead Sciences, Novo Nordisk, and Unilab. He has received a research grant from Gilead Sciences, and is a co-founder of Illuminatio Medical Technology. Terry Yip has served as a speaker and an advisory committee member for Gilead Sciences. The other authors declare that they have no competing interests.
Acknowledgements
Vincent Wong’s research in steatotic liver disease is supported in part by the General Research Fund from the Hong Kong SAR Government (project reference 1410 6923).
Abbreviations
MASLD, metabolic dysfunction-associated steatotic liver disease
NAFLD, nonalcoholic fatty liver disease
HCC, hepatocellular carcinoma
MASH, metabolic dysfunction-associated steatohepatitis
LREs, liver-related events
CSPH, clinically significant portal hypertension
AST, aspartate aminotransferase
APRI, AST-to-platelet ratio index
FIB-4, Fibrosis-4 index
NFS, NAFLD fibrosis score
AUROC, area under the receiver-operating characteristic curve
NPV, negative predictive value
VCTE, vibration-controlled transient elastography
SAFE, steatosis-associated fibrosis estimator
PPV, positive predictive value
ELF, enhanced liver fibrosis
LSM, liver stiffness measurement
CAP, controlled attenuation parameter
FAST, FibroScan-AST
SWE, shear wave elastography
pSWE, point SWE
2D-SWE, two-dimensional SWE
DLRE, deep learning radiomics of elastography
MRE, magnetic resonance elastography
MRI, magnetic resonance imaging
MRI-PDFF, MRI-proton density fat fraction
HVPG, hepatic venous pressure gradient
WHVP, wedged hepatic venous pressure
FHVP, free hepatic venous pressure
cACLD, compensated advanced chronic liver disease
VNT, varices needing treatment
BMI, body mass index
SSM, spleen stiffness measurement