Overview and recent trends of systematic reviews and meta-analyses in hepatology
Abstract
A systematic review (SR) is a research methodology that involves a comprehensive search for and analysis of relevant studies on a specific topic. A strict and objective research process is conducted that comprises a systematic and comprehensive literature search in accordance with predetermined inclusion/exclusion criteria, and an assessment of the risk of bias of the selected literature. SRs require a multidisciplinary approach that necessitates cooperation with clinical experts, methodologists, other experts, and statisticians. A meta-analysis (MA) is a statistical method of quantitatively synthesizing data, where possible, from the primary literature selected for the SR. Review articles differ from SRs in that they lack a systematic methodology such as a literature search, selection of studies according to strict criteria, assessment of the risk of bias, and synthesis of the study results. The importance of evidence-based medicine (EBM) in public policy decision-making has recently been increasing thanks to the realization that such decisions should be based on scientific research data. SRs and MAs are essential for EBM strategy and evidence-based clinical practice guidelines. This review addresses the current trends in SRs and MAs in the field of hepatology via a search of recently published articles in the Cochrane Library and Ovid-MEDLINE.
INTRODUCTION
With the development of electronic publication, the volume of medical literature published yearly exceeds what can be reviewed by experts, and studies with conflicting results on the same topic are commonplace. This situation can make it difficult to draw definitive conclusions, and so a systematic review (SR) provides the best and most trustworthy objective analysis of the existing evidence.
In an SR, the primary research within a specific topic is collected, the evidence is filtered based on systematically established criteria, the information quality is evaluated, the data are analyzed, and a comprehensive conclusion is drawn with minimized bias. SRs were originally considered to lack originality and to be of little significance, but the advent of evidence-based medicine (EBM) has led to evolution of the field to focus on the therapeutic effects, intervention effects, diagnostic accuracy, and expansion into diverse topics such as disease prognosis, incidence, policy, and qualitative research.
The popularity of SRs increased markedly with the establishment of the Cochrane Center, named in honor of the Scottish scholar Archie Cochrane. Established in the UK in 1992, this center stressed the necessity of EBM, and approximately 8,000 reviews had been published by 2010, a dramatic increase from the several hundred yearly publications up to 1990. SRs have been increasingly used in Korea since 2008, not only in the scientific literature but also in investigations of national healthcare policy and organization, where health technology assessments guide decisions on including new technologies within insurance coverage.
While Korean health-care experts are increasingly aware of SRs, SRs remain less used in clinical practice in Korea than in other countries. In the USA, SR findings are reviewed by the Evidence-Based Practice Center of the Agency for Healthcare Research and Quality within the US Department of Health and Human Services, and are applied to clinical practice guidelines following consensus development conferences held by the National Institutes of Health; results produced in the UK by the Scottish Intercollegiate Guidelines Network (SIGN), National Institute for Health and Clinical Excellence, and National Coordinating Centre for Health Technology Assessment are also applied clinically.
This article introduces the general concept and overall process of the SR with the aim of encouraging SR generation and clinical application. In addition, the trends of SRs in the hepatology field during 2013 and 2014 are considered, using the Cochrane Library and Ovid-MEDLINE databases (DBs).
Overview of the SR and meta-analysis
An SR involves searching for studies that are appropriate to the specific topic of the research, choosing them using clear and reproducible methods, and then appraising the design and characteristics of the individual primary studies. A strategy is then used to minimize bias and random error in individual studies, and a summary of all of the included primary studies is generated. While a narrative review considers the existing literature based on experts' subjective viewpoints, the SR applies a distinct methodology to existing studies that is scientific and objective, and it addresses a narrowly defined question in order to suggest generalizable estimates. In this process, researchers can use a quantitative method, meta-analysis (MA), to synthesize the primary studies. Statistical synthesis of the data is an optional part of an SR; in other words, statistical pooling is not always needed, or indeed may not be possible, in which case qualitative synthesis or descriptive (narrative) synthesis is used instead.
The SR requires a process of defining the review question, searching for studies, selecting studies and collecting (retrieving) data, assessing the risk of bias in the included studies, analyzing data, undertaking an MA, and interpreting results. The procedure is described in detail below.
In common with other studies, the subject of an SR of a clinical topic must be medically meaningful. Furthermore, the field of the subject must include unmet needs such as uncertain evidence or differential views on specific subjects. Once a review question for the study topic has been chosen and a protocol for the study has been written, the researchers need to clarify and focus the review question using PICO: P-patients, populations, problems (how the focus patient groups will be managed); I-interventions, index test (which interventions or results from the diagnostic test will be evaluated); C-comparators, comparison, control (what parameters will be compared); and O-outcomes (which outcome variables will be researched).
After the specific review question has been chosen, the researchers must select the main search terms following the PICO process, and then search the literature by establishing a search strategy that defines the range of the literature search and which DB(s) to use. The three main (core) bibliographic DBs (Cochrane Library, Ovid-MEDLINE, and EMBASE) are generally considered to be the most important sources for literature searches.1 Depending on the characteristics of the review topic and DBs, the following may also be used: Web of Science, DARE, PsycINFO, ERIC, CANCERLIT, AMED, and CINAHL. The main domestic search providers in Korea, which may also be used, are KoreaMed, KMbase, NDSL, KSITI, and KISS. The researchers must also establish a sensitive and specific search strategy. Most DBs can be searched using standardized subject terms (e.g., MeSH and EMTREE) assigned by indexers, which are useful. In order to be as comprehensive as possible, it is necessary when designing a search strategy to include a wide range of free-text terms for each of the selected concepts. A search strategy should combine controlled vocabulary terms, text words, synonyms, Boolean operators (AND, OR, and NOT), truncation or wild cards (*, ?, and $), and a suitable search filter. The retrieved studies are then selected by a standardized process using suitable inclusion/exclusion criteria. The assessment of study eligibility and extraction of data from the included studies should be conducted independently by at least two people.
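The way search terms are combined can be sketched in a few lines of code. The example below is only an illustration: the function name build_query, the PICO terms (loosely based on a telbivudine vs entecavir question in chronic HBV), and the quoting convention are all assumptions, and the exact query syntax differs between DBs. Synonyms within one PICO concept are joined with OR to raise sensitivity, and the concepts are joined with AND to raise specificity.

```python
def build_query(concepts):
    """Combine PICO concepts into one Boolean search string.

    Synonyms inside a concept are joined with OR; the concepts
    themselves are joined with AND. Concepts with no terms are skipped.
    """
    groups = []
    for terms in concepts.values():
        if not terms:
            continue  # PICO element left out of the search
        # Quote multi-word phrases (assumed convention; syntax varies by DB).
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hypothetical search terms, for illustration only.
pico = {
    "population": ["chronic hepatitis B", "HBV"],
    "intervention": ["telbivudine"],
    "comparator": ["entecavir"],
    "outcome": [],  # left empty in this sketch
}

print(build_query(pico))
# prints: ("chronic hepatitis B" OR HBV) AND (telbivudine) AND (entecavir)
```

A real strategy would also add controlled vocabulary terms (e.g., MeSH), truncation, and a study-design filter in the syntax of each DB.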
After selecting the studies, their validity should be assessed by evaluating the risk of bias in their results (i.e., the risk that they will overestimate or underestimate the true intervention effect). Studies for which the conclusions are not based on valid and objective evidence, or for which the validity of the methodology is not robust, cannot provide reliable answers to the SR question. Various tools are used to evaluate study quality. The representative quality-assessment tools are the risk of bias tool (Cochrane Library) and checklists (SIGN); however, Quality Assessment of Diagnostic Accuracy Studies and the Newcastle-Ottawa Scale are also used.
Data should be collected for analysis from the individual studies, and should include not only general information such as the publication year, authors, study design, and participant characteristics, but also the interventions, outcome results, main findings, and conclusions.
After extracting the data, MA methods allow quantification of the direction, size, and consistency of any effects. If suitable numerical data are not available for MA, or if MA is considered inappropriate, the findings may instead be summarized by narrative, descriptive, or qualitative synthesis. MA produces a summary estimate of the intervention effect as a weighted average of the effect sizes (e.g., mean difference, relative risk, odds ratio, and number needed to treat) extracted from the primary study results. Researchers can conduct an MA using the Review Manager program provided by the Cochrane group. They may also use other statistical software such as SAS, Stata, Comprehensive Meta-Analysis, and R.
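As a rough illustration of what a weighted average of effect sizes means, the sketch below pools odds ratios with inverse-variance weights under a fixed-effect model. This is one common approach and is chosen here as an assumption; it is not necessarily what any of the programs named above do by default. The function names and the three 2x2 tables are hypothetical, and no continuity correction is applied, so tables containing a zero cell would fail.

```python
import math


def log_odds_ratio(events_t, total_t, events_c, total_c):
    """Return (log odds ratio, variance) for one study's 2x2 table."""
    a, b = events_t, total_t - events_t  # treatment: events, non-events
    c, d = events_c, total_c - events_c  # control: events, non-events
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d  # large-sample approximation
    return log_or, variance


def pool_fixed_effect(effects):
    """Inverse-variance weighted average of (estimate, variance) pairs.

    Returns the pooled estimate and its standard error.
    """
    weights = [1 / v for _, v in effects]
    total = sum(weights)
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / total
    return pooled, math.sqrt(1 / total)


# Hypothetical studies: (events_t, total_t, events_c, total_c).
studies = [(12, 100, 20, 100), (8, 50, 15, 50), (30, 200, 45, 200)]
effects = [log_odds_ratio(*s) for s in studies]
pooled, se = pool_fixed_effect(effects)
low, high = pooled - 1.96 * se, pooled + 1.96 * se

# Back-transform from the log scale to report an odds ratio with a 95% CI.
print(f"Pooled OR = {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(low):.2f} to {math.exp(high):.2f})")
```

Larger and more precise studies have smaller variances and therefore receive more weight, which is why the pooled estimate is not a simple mean of the study results.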
When conducting the MA, the researchers should examine whether quantitative pooling is appropriate by considering the heterogeneity of the individual studies. If there is a high degree of heterogeneity, the researchers could perform a subgroup analysis or meta-regression to explore its causes. Pooling assumes that the primary studies are sufficiently homogeneous, and so if there is substantial heterogeneity among the included primary studies, integration of the effects will be inappropriate. Furthermore, if the reason for the heterogeneity cannot be fully explained, the validity of the findings of the SR may be limited.
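Heterogeneity is often quantified before deciding whether to pool. The sketch below computes Cochran's Q and the I² statistic from study estimates and their variances. These two statistics are not named in the text above and are used here as an assumption about how the degree of heterogeneity might be measured; the function name and the input format are likewise hypothetical.

```python
def heterogeneity(effects):
    """Return (Cochran's Q, degrees of freedom, I^2 in percent).

    `effects` is a list of (estimate, variance) pairs on a common scale,
    for example log odds ratios.
    """
    weights = [1 / v for _, v in effects]
    pooled = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation beyond what chance alone would give.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2


# Two hypothetical studies with equal variance and different estimates.
q, df, i2 = heterogeneity([(0.0, 1.0), (2.0, 1.0)])
print(q, df, i2)  # prints: 2.0 1 50.0
```

A large I² indicates that the studies disagree more than sampling error would explain, which is the situation in which subgroup analysis or meta-regression is considered.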
Furthermore, it is possible that only statistically significant results are reported in the searched literature. The researchers therefore need to examine the publication bias using, for example, funnel plots. Methodological corrections can sometimes be applied if there is a high risk of publication bias.
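A funnel plot is inspected visually, but its asymmetry can also be summarized numerically. The sketch below uses Egger's regression, which is a different technique from the funnel plot itself and is swapped in here as an assumption: each study's standardized effect (estimate divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero suggests small-study effects such as publication bias. The function name and the data are hypothetical, and no p-value is computed.

```python
import math


def egger_intercept(effects):
    """Egger's regression for funnel-plot asymmetry.

    `effects` is a list of (estimate, standard_error) pairs; at least
    three studies with differing standard errors are required.
    Returns (intercept, standard error of the intercept).
    """
    x = [1 / se for _, se in effects]     # precision
    y = [est / se for est, se in effects]  # standardized effect
    n = len(effects)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance of the ordinary least-squares fit.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1 / n + mean_x ** 2 / sxx))
    return intercept, se_intercept


# Hypothetical data in which smaller studies (larger SE) report larger effects.
asymmetric = [(0.2, 0.1), (0.5, 0.2), (0.9, 0.4)]
intercept, se_intercept = egger_intercept(asymmetric)
print(f"Egger intercept = {intercept:.2f} (SE {se_intercept:.2f})")
```

With only a handful of studies such a test has little power, so it complements rather than replaces inspection of the plot.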
SRs and MAs should be performed using the processes described above, by referring either to the 'reporting guidelines' presented by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) group,2 or to the quality-evaluating Measurement Tool to Assess the Methodological Quality of Systematic Reviews (AMSTAR).3
Recent trends of SRs and MAs in hepatology
A search for the SRs and MAs written in the field of hepatology during 2013 and 2014 yielded about 30 studies published in the Cochrane Library and about 120 studies from Ovid-MEDLINE (about 100 if those on the pancreas and biliary tract are excluded). Hepatology is being actively studied using SR and MA methodology in the UK, China, the USA, the Netherlands, and Germany, and similar methodologies are being used in Canada, Ireland, Greece, Thailand, Switzerland, France, Saudi Arabia, Italy, Romania, Egypt, Denmark, Australia, Brazil, Mexico, Qatar, Japan, Taiwan, Iran, Poland, and Croatia. The most frequently reviewed interventions/index tests in SRs and MAs are comparisons between treatment medicines or interventions, the accuracy of diagnosis, and prevalence. The main populations of enrolled patients are those with hepatocellular carcinoma (HCC), hepatitis C virus (HCV) infection, and liver transplantation, followed by those with hepatic fibrosis, liver cirrhosis, hepatitis B virus (HBV) infection, nonalcoholic fatty liver disease (NAFLD), and liver failure. The literature is summarized herein according to the disorder.
HBV
Nine of the 11 SRs related to HBV were conducted in China, with the remainder performed in Germany and the UK. Almost all compared the effects, safety, and efficacy of medicines. Only four studies used a core DB (Table 1).
The review topics comprised the relative effects of telbivudine and entecavir in chronic HBV patients,4 the effects of lamivudine and telbivudine,5 the effects of nucleos(t)ide analogs,6 the effects of tenofovir in HBV/HIV coinfected patients,7 the effects of glucocorticoids,8 the effects of immunoglobulin administration and antiviral treatment in mother-to-child transmission (MTCT) interruption,9 and the efficacy of telbivudine in MTCT interruption.10 There were also several reviews about determinants of long-term protection after hepatitis B vaccination in infancy,11 and the association between persistent HBV infection and cytotoxic T-lymphocyte-associated antigen 4 +49A/G polymorphism in an Asian population.12
HCV
There were 31 studies about HCV infection, comprising 11 studies about the safety and efficacy of medicines, 6 studies about prevalence or epidemiology, 3 studies of the predictors of patient compliance or response to treatment, 7 on the effects of education, and 4 studies about relationships with other disorders (e.g., thyroid dysfunction, HIV, retinopathy, and Schistosomiasis mansoni). Unlike HBV, there were only two studies from China, while eight were performed in the USA, and the rest were from Canada, the UK, Germany, the Netherlands, Brazil, France, Ireland, Denmark, Qatar, Saudi Arabia, Romania, Australia, and Thailand. Only 26.7% (8/30) of the studies used a core DB, and only 16.7% (5/30) of the studies were published in the Cochrane Library, Hepatology, Hepatogastroenterology, or Liver International. Furthermore, some of the studies used inadequate methodologies in that they did not refer to the PRISMA reporting guidelines or AMSTAR (Table 2).
The current gold standard of HCV treatment is combination therapy with pegylated interferon-α (a weekly subcutaneous injection) and ribavirin (administered orally).15 There were SRs and MAs comparing the benefits and harms of pegylated interferon plus ribavirin and interferon plus ribavirin,16 the safety and effects of pegylated interferon α-2a or α-2b plus ribavirin,17,18 the efficacy of adding statins to interferon α plus ribavirin,19 the relative efficacy of boceprevir and telaprevir,20 the efficacy and safety of telaprevir,21 the comparative effectiveness of antiviral treatment,22 antiviral therapy (interferon-α plus cyclosporine A and tacrolimus) for recurrent HCV after liver transplantation,23 and interferon for interferon-nonresponding and relapsing patients with chronic HCV.24 Moreover, there were studies on the effects of nitazoxanide (a synthetic nitrothiazolyl-salicylamide derivative and an antiprotozoal agent),25 and the association between 25-hydroxyvitamin D and a sustained virological response (SVR).26
With regard to HCV prevalence or epidemiology, new estimates of age-specific antibodies to HCV seroprevalence27 and HCV prevalence in prisons and other closed settings28 have been published (in Global Epidemiology and Hepatology, respectively). Studies of HCV prevalence have also been conducted in Europe, Egypt, Asia, and Saudi Arabia.29,30,31,32
In addition, there were reviews of interleukin-28B polymorphism as a predictor of SVR in patients with chronic HCV treated with triple therapy,33 treatment of HCV among people who inject drugs,34 and the positive ratio of specific antibodies to F protein in serum samples from chronic HCV using an enzyme-linked immunosorbent assay.35 Reviews of research into psychological, lifestyle, and social predictors of HCV treatment response36 have been published in Liver International, and reviews about patient adherence to antiviral treatment37 and the effects of education interventions38 have also been published.
There were reviews of the relationship with other disorders such as thyroid dysfunction during single and combination therapy with interferon-α,39 outcomes of treatment-naïve HCV patients coinfected with HIV,40 interferon-associated retinopathy during the treatment of chronic HCV,41 and the association between Schistosomiasis mansoni infection and HCV.42 Moreover, there were studies about the determinants of HCV treatment completion and efficacy in drug users,43 the effects of mode of delivery, labor management strategies, and breastfeeding practices on risk reduction for mother-to-infant transmission of HCV,44 the survival advantage of kidney transplantation over dialysis in patients with HCV,45 and the benefits and harms of HCV screening in asymptomatic adults.46
NAFLD/AFLD
The search for reviews of NAFLD and nonalcoholic steatohepatitis yielded three reviews from China, and one each from the UK, Iran, Italy, and New Zealand. Only two (28.6%) of these studies used a core DB (Table 3). Four (57.1%) of the reviews looked at the efficacy of medicines such as statins (lovastatin, atorvastatin, simvastatin, pravastatin, rosuvastatin, and fluvastatin),47 ursodeoxycholic acid,48 probiotics,49 and pentoxifylline,50 and one (14.3%) compared the benefits and harms of herbal medicines in NAFLD/AFLD (alcoholic fatty liver disease) patients.51 The remaining two reviews focused on the association between obstructive sleep apnea and the presence and severity of NAFLD52 and the relationship between hepatic steatosis and hepatic ischemia reperfusion injury.53
Liver cirrhosis and hepatic fibrosis
There were more reviews on the accuracy of diagnosis than on the effects of intervention for cirrhosis patients. Three reviews were from China, two were from the USA, and there was one each from Romania and Egypt. Two of the reviews (28.6%) were published in a major journal, and two (28.6%) used a core DB (Table 4).
Reviews on acoustic-radiation-force impulse elastography and transient elastography were published in Liver International.54,55 In addition, there were reviews on the diagnostic accuracy of using the Fibroscan device (for transient elastography) and the aspartate aminotransferase/platelet ratio index,56 the effects of entecavir and lamivudine for hepatitis-B-decompensated cirrhosis,57 and a comparison of laparoscopic splenectomy with or without devascularization of the stomach for liver cirrhosis and portal hypertension.58
Hepatic encephalopathy
There were three reviews focusing on hepatic encephalopathy patients, two of which were from Denmark (Table 5). These reviews were on the effects of dopamine agents versus placebo or no intervention,59 the effects of oral zinc,60 and the effects of oral branched-chain amino acids (BCAA).61
Liver failure
There were three reviews on liver failure patients, among which only one was published in a major journal; none of the reviews used a core DB (Table 6). The three reviews concerned the prognostic indicators of acute-on-chronic liver failure and their predictive value for mortality,62 echocardiography in chronic liver disease,63 and the efficacy and safety of nucleos(t)ide analogs in the treatment of HBV-related acute-on-chronic liver failure.64
HCC
There were 34 SRs about HCC patients, comprising 11 reviews on the effects of surgical method, 6 reviews on the diagnostic accuracy of various tools [e.g., contrast-enhanced ultrasound (CEUS) and diffusion-weighted imaging (DWI)], 8 reviews of transcatheter arterial chemoembolization (TACE), and 6 reviews about the efficacy of medicines such as statins and steroids. Twelve of the reviews came from China, 5 from the UK, 4 from the USA, 3 from the Netherlands, 2 each from Germany and Canada, and 1 each from Italy, Australia, Egypt, and Switzerland. Of the 34 reviews, 9 (26.5%) were published in a major journal and 12 (35.3%) used a core DB (Table 7).
A review of the diagnostic accuracy of CEUS for the differentiation of benign and malignant focal liver lesions was published in Liver International,65 and a review of the accuracy of DWI compared with conventional contrast-enhanced magnetic resonance imaging for the detection of HCC, aimed at chronic liver disease patients, was published in the Journal of Gastroenterology and Hepatology.66 In addition, there were reviews about the prognostic usability of microvascular invasion after resection or liver transplantation in HCC67 and the diagnostic accuracy of Ras-association domain family 1A promoter methylation,68 the diagnostic accuracy of circulating tumor-cell detection in gastric cancer and HCC,69 and the effects of three common functional polymorphisms in microRNA-encoding genes on the susceptibility to HCC.70
Some of the reviews compared the surgical method of radiofrequency (thermal) ablation and other interventions,71 surgical resection and radiofrequency ablation,72 radiofrequency ablation and hepatic resection for the treatment of early-stage HCC,73 and laparoscopic and open liver resection,74,75,76 while others looked at robotic liver resection,77 the benefits and harms of surgical resection vs liver transplant,78 living-donor vs deceased-donor liver transplantation for HCC,82 and the relationship between the responses to HCV therapy and the development of HCC.83
With regard to interventions, there were reviews about TACE: a comparison of the effects of TACE combined with local ablative therapy vs monotherapy,84 the effects of combination TACE plus sorafenib for the management of unresectable HCC,85 and the indications, technique, and outcome of portal vein embolization before liver resection/preoperative portal vein embolization.86
Liver transplantation
There were six reviews about liver transplant patients, 50% of which were published in a major journal. However, all of these studies were written by the same author (Table 8). They covered the benefits and harms of antiviral interventions on liver transplant patients with recurrent graft infection due to HCV,100 the benefits and harms of prophylactic antiviral therapy for the prevention of chronic HCV while undergoing liver transplantation,101 comparisons of methods of preventing bacterial sepsis and wound complications after liver transplantation,102 the use of high genetic barrier nucleos(t)ide analog(s) for prophylaxis from HBV recurrence after liver transplantation,103 as well as the association between cytokine gene polymorphisms and graft rejection in liver transplantation104 and a comparison of the Celsior and Custodiol solutions for liver transplantation in adults.105
CONCLUSION
The SR constitutes the most important research method in EBM, and SRs are widely used worldwide for establishing national health-care policies. SRs can relay information with minimal bias by summarizing the available evidence. By combining small-scale studies, they increase the overall sample size, reduce the risk of type II errors, enhance the reliability of the conclusions, and increase the potential for general application. Individual studies are limited in scope, whereas SRs can determine the treatment efficacy among various population groups. Expanding the subject scope allows generalization to larger populations and determination of the appropriate target group. In addition, by verifying the treatment effectiveness without requiring additional studies, SRs can speed up the clinical introduction of effective remedies. For these reasons, in the last 20 years the SR has become a critical component of both quantitative and methodological medical research.
However, there remain many challenges in incorporating SRs into clinical practice. Although SRs are viewed as providing a higher level of evidence, and greater confidence, than individual studies, the quality of the individual studies included varies, and so SRs and their clinical applications require close inspection. Over 50% of SRs and MAs that are published in major journals have employed inappropriate methodology.
Accepting the results of a statistical analysis without also examining the clinical implications is a grave error; data assessment and the clinical significance must be determined through the collaboration between clinical experts, and any conclusions must fully reflect the consensus of the entire field. The clinical utility is enhanced when the decision to incorporate new treatments is made after weighing both the benefits and risks relative to existing technology.
The future production and utilization of objective evidence concerning medical technology, intervention, and diagnosis requires SRs and MAs that are performed through guided methodology when unclear or controversial evidence exists. Despite the limitations described above, high-quality SRs and MAs can facilitate EBM, minimize financial waste, and assist in policy decision-making. The use of SRs and MAs to assess and secure evidence, as detailed in this review, is expected to increase in the field of hepatology.
Notes
The authors have no conflicts to disclose.
Abbreviations
AHRQ: Agency for Healthcare Research and Quality
AMSTAR: Assess the methodological quality of systematic reviews
CDC: Consensus development conference
CPG: Clinical practice guideline
DB: Database
EBM: Evidence-based medicine
HBV: Hepatitis B virus
HCC: Hepatocellular carcinoma
HCV: Hepatitis C virus
MA: Meta-analysis
NAFLD: Nonalcoholic fatty liver disease
PICO: Patient, intervention, comparator, outcome
PRISMA: Preferred reporting items for systematic reviews and meta-analyses
QUADAS: Quality assessment of diagnostic accuracy studies
RoB: Risk of bias
SIGN: Scottish Intercollegiate Guidelines Network
SR: Systematic review