Introduction
It is estimated that 30% of the European population is affected, at least once in their lives, by gastrointestinal (GI) diseases [1]. Among the most prevalent pathologies in the Portuguese population, we highlight Helicobacter pylori infection (60-84%), dyspepsia (20-40%), gastroesophageal reflux disease (35%), and irritable bowel syndrome (estimated to affect between 500,000 and 1 million Portuguese) [1, 2]. Symptoms may be harmless or can result from more serious conditions, such as inflammatory bowel disease (IBD) or GI tumors. IBD is thought to affect 15,000-20,000 people, most of them young and active individuals [1]. GI tumors are responsible for approximately 10% of deaths in Portugal. Colorectal cancer (CRC) is one of the most common cancers, and in 2020 it had a prevalence of 17.4% and a mortality rate of 14.2% in the Portuguese population [3].
Multiple GI societies have developed clinical practice guidelines (CPG) to guide physicians in the diagnosis of and therapeutic approach toward patients with GI diseases. These CPG systematically combine scientific evidence and clinical judgment, culminating in recommendations that have been shown to improve patient care [4, 5]. CPG are used widely: to assist in clinical decisions, further education, assess the quality of care, guide resource allocation, and prioritize research [6].
There are different methodologies to classify the strength of evidence for a particular recommendation, but the GRADE (Grading of Recommendations Assessment, Development and Evaluation) system is one of the most accepted and used in international CPG [7]. Compared to other systems used to classify evidence, GRADE is the most comprehensive, considering a wider range of information [8]. GRADE classifies both the quality of evidence and the strength of health recommendations. The level of evidence represents the confidence in the information on which a recommendation is based. This approach provides a universal and comprehensive system for rating the quality of evidence, which is increasingly being adopted worldwide. In addition, it gives physicians and patients a way to quickly and confidently assess the quality of the evidence behind recommendations. With the increasing number of CPG, systematic reviews, and randomized controlled trials (RCT), it is reasonable to assume that CPG recommendations would be based on a greater degree of high-quality evidence [9]. Unfortunately, as has been shown in studies related to other specialties, this is often not true [10, 11]. There have been no studies surveying the quality of evidence behind the gastroenterology guidelines of different GI societies. Therein lies the rationale for our investigation: to examine the proportion of high-, moderate-, low-, and very-low-quality evidence in order to draw conclusions about the availability of evidence to gastroenterologists. The aim of this research was therefore to review the guidelines published by some of the most relevant GI societies between 2018 and 2019. Our goal was to report the level of evidence supporting their recommendations and to identify areas where evidence can be improved with additional research.
Materials and Methods
A list of CPG published between 2018 and 2019 in the field of gastroenterology was obtained by searching the websites of some of the international societies that guide our daily clinical practice. European CPG were obtained from the United European Gastroenterology (UEG) database [12]. American CPG data were collected from the American Gastroenterological Association (AGA), the American College of Gastroenterology (ACG), the American Association for the Study of Liver Diseases (AASLD), and the American Society for Gastrointestinal Endoscopy (ASGE) [13-16]. A total of 89 European and North American CPG published in the gastroenterology field between 2018 and 2019 were analyzed. In order to standardize the collected sample, we excluded CPG that did not use the GRADE system to rate the quality of evidence, CPG that were still in development, and CPG without levels of evidence for their recommendations. After these exclusions, 29 guidelines were included. The proportion of CPG analyzed and included in the study is shown in Figure 1. All of the guidelines included in the research are presented in the online supplementary material (for all online suppl. material, see www.karger.com/doi/10.1159/000518322).
The quality of the evidence is classified into the following 4 levels: high, moderate, low, and very low (Table 1) [7]. A total of 1,233 recommendations from the 29 guidelines were analyzed. All 1,233 recommendations were extracted into an SPSS spreadsheet, each with its associated level of evidence. We also stratified the recommendations by the guideline in which they were published and by the year of publication.
The statistical analysis was performed using statistical software (SPSS version 23). Frequencies and proportions were calculated to describe the collected data.
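The descriptive step can be illustrated outside SPSS. The following is a minimal Python sketch, not the authors' SPSS procedure: it assumes one GRADE label per recommendation and rebuilds those labels from the totals reported in the Results. The names `describe_levels`, `reported_counts`, and `labels` are hypothetical.

```python
from collections import Counter

# GRADE levels in the order used by the paper (Table 1).
GRADE_LEVELS = ("high", "moderate", "low", "very low")


def describe_levels(levels):
    """Return {level: (count, percent)} for a list of GRADE labels."""
    counts = Counter(levels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no recommendations to describe")
    return {
        level: (counts[level], round(100 * counts[level] / total, 1))
        for level in GRADE_LEVELS
    }


# Assumption: labels are rebuilt from the totals reported in the Results,
# not read from the authors' original data file.
reported_counts = {"high": 336, "moderate": 446, "low": 324, "very low": 127}
labels = [level for level, n in reported_counts.items() for _ in range(n)]

table = describe_levels(labels)
for level, (count, percent) in table.items():
    print(f"{level:>9}: {count:4d} ({percent}%)")
```

Working from per-recommendation labels rather than pre-aggregated counts means the same function could be reused on any subset, for example a single guideline or a single year.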
Results
A total of 29 guidelines (7 from American societies and 22 from European societies) were included in this study, representing 1,233 recommendations. Of the 1,233 recommendations collected, 324 (26.3%) were based on a low level of evidence and 127 (10.3%) were based on a very low level of evidence, indicating poor evidence or expert opinion. Four hundred forty-six (36.2%) were based on a moderate level of evidence and 336 (27.3%) were based on high levels of evidence, with 277 (82.44%) of these being related to liver disease. These results are detailed in Table 2.
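As a quick arithmetic check of the figures above, the sketch below pools the low and very low categories and recomputes the liver-related share of high-level recommendations. The pooled percentage is derived here from the reported counts and is not itself stated in the text. The helper `percent` and the variable names are hypothetical.

```python
def percent(part, whole, digits=1):
    """Percentage of `part` in `whole`, rounded to `digits` decimals."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, digits)


# Counts as reported in the Results.
total = 1233
low, very_low = 324, 127
high, high_liver = 336, 277

# Derived value: low and very low evidence pooled together.
low_or_very_low = low + very_low
pooled_low_percent = percent(low_or_very_low, total)

# Share of high-level recommendations that concern liver disease.
liver_share_percent = percent(high_liver, high, digits=2)

print(f"low or very low: {low_or_very_low} ({pooled_low_percent}%)")
print(f"liver share of high-level evidence: {liver_share_percent}%")
```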
Only 2 guidelines - the European Association for the Study of the Liver (EASL) Recommendations on Treatment of Hepatitis C (2018) [17] and the EASL Clinical Practice Guidelines: Management of Alcohol-Related Liver Disease [18] - had over 50% of recommendations supported by high-level evidence.
Of the 29 publications analyzed, 14 (48.3%) did not present any recommendation with a high level of evidence, 15 contained high-level recommendations, and only 2 (6.9%) did not present recommendations with a low or very low level of evidence (Fig. 2).
Of the recommendations evaluated, 77 were from North American societies and the remaining 1,156 were European recommendations. In relation to the first group, only 3 (3.9%) had a high level of evidence, belonging to the same guideline - Guidelines for Sedation and Anesthesia in GI Endoscopy. The analysis of recommendations per field showed that only the ones related to liver and bowel diseases (not IBD) had a proportion with a high level of evidence higher than 30% (34.37 and 33.3%, respectively). Table 3 reports the proportion of recommendations with high-level evidence per “field” of gastroenterology.
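The regional comparison can be sketched as a two-stratum calculation. This is an illustrative Python sketch under one stated assumption: the European count of high-level recommendations is not reported directly and is derived here as the overall total of 336 minus the 3 North American ones. The `Stratum` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stratum:
    """A group of recommendations and how many have high-level evidence."""
    name: str
    total: int
    high: int

    @property
    def high_percent(self) -> float:
        return round(100 * self.high / self.total, 1)


north_america = Stratum("North American", total=77, high=3)
# Derived, not reported: 336 high-level recommendations overall minus the
# 3 North American ones, since every recommendation is in one of two groups.
europe = Stratum("European", total=1156, high=336 - 3)

for stratum in (north_america, europe):
    print(f"{stratum.name}: {stratum.high}/{stratum.total} "
          f"({stratum.high_percent}%) with high-level evidence")
```

The two strata differ greatly in size (77 versus 1,156 recommendations), so the percentages should be read with that imbalance in mind.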
Most of the recommendations supported by a high level of evidence were published in 2018 (298 of 336, 88.7%). Between 2018 and 2019, the percentage of recommendations supported by a low level of evidence increased from 20.8 to 37.3%.
Discussion
The 29 guidelines analyzed contained 1,233 recommendations. More than a quarter (26.3%) were based on a low evidence level and only 27.3% were supported by high evidence levels.
A similar study performed by the American College of Emergency Physicians found that less than 10% of their recommendations were based on high-quality evidence, and the majority were based on expert opinion [19]. Similarly, the American College of Chest Physicians found that only 0.4% of recommendations for the treatment of thromboembolism were based on high-level evidence [20]. In the field of gastroenterology, a similar study by Meyer et al. [21], analyzing the scientific evidence underlying the American College of Gastroenterology (ACG) CPG, also concluded that very few recommendations made by the ACG are supported by high levels of evidence. More than half of all recommendations made by the ACG are based on low-quality evidence or expert opinion [21]. Feuerstein et al. [22] also reported that, when gastroenterology guidelines rate the quality of evidence for their recommendations, most recommendations are based on lower-quality evidence. A systematic analysis and critical appraisal of the quality of the scientific evidence in Practice Guidelines for Barrett’s Esophagus revealed that nearly 50% of the recommendations are based on expert opinion or poor-quality evidence [23]. A critical review of scientific evidence and evolving recommendations of AASLD CPG also concluded that, despite significant increases in the number of recommendations within AASLD practice guidelines over time, only a minority are supported by grade I evidence, highlighting the need for well-designed investigations to provide evidence for areas of uncertainty and to improve the quality of future guidelines in hepatobiliary diseases [24].
Although the amount of high-level evidence supporting the CPG included in our study (27.3%) compares favorably to the previously mentioned papers, this is far from what would be expected. The low proportions of recommendations with a high level of evidence shown in Tables 2 and 3 affect the overall quality of the guidelines reported. These results confirm that, although guidelines should be followed, interpretation with caution and careful clinical judgment is mandatory.
A limitation of our study was that we could not analyze how evidence levels differ between recommendations on the same topic in different guidelines. In this research we only included guidelines published in 2018 and 2019, and consequently the topics covered are almost all different, even within the same area (online suppl. material). Given this limitation, the only 2 guidelines that address the same topic are the ACG Clinical Guideline: Diagnosis and Management of Pancreatic Cysts and the European evidence-based guidelines on pancreatic cystic neoplasms [25, 26]. There were no large discrepancies between them: both were published in the same year (2018), and neither contained recommendations with a high level of evidence. The proportion of recommendations with a low or very low level of evidence was 100% in the American guideline and 81.7% in the European guideline.
RCT are the cornerstone of clinical decision making, and the field of gastroenterology has a poor history of producing influential RCT [27]. Nearly 25,000 RCT are published each year but, given that 14 of the analyzed guidelines do not contain recommendations with a high level of evidence, it appears that few RCT find their way into CPG in the field of gastroenterology [28].
The discrepancy between the total number of gastroenterology RCT and those supporting guideline recommendations may be due to 2 factors: overlap between RCT and practical barriers to conducting RCT. The potential overlap between RCT, also known as research waste, may delay the advent of treatments for patients with preventable diseases [29, 30]. For example, in our study we found that, of the 14 guidelines with no high-quality evidence, almost all were focused on prevention, treatment, or management. While some of these individual recommendations may not be amenable to an RCT due to ethical or practical concerns, others could be tested in a randomized fashion.
Koh et al. [24] reported a 36% increase in the number of AASLD recommendations since the guidelines were first developed in 1998. Despite this substantial increase, less than 15% were based on high-grade evidence. The National Institutes of Health budget for digestive disease research has plateaued since 2003, while corporate funding for gastroenterology research has dropped by more than 60% since 2008 [31, 32]. This translates into increased competition for grant applications, which are being awarded at the lowest rate in decades [33]. RCT produce the highest level of evidence but are among the most time-consuming and expensive research studies. We suggest that gastroenterology organizations encourage future research to strengthen recommendations that are currently supported by expert opinion or a low evidence level.
Some studies have estimated that 54-70% of physicians consistently use CPG in practice, so their quality is of the greatest importance [5, 34]. The scarcity of high-quality evidence therefore affects physicians seeking evidence-based treatment options and patients seeking evidence-based care. Recommendations that are based on low levels of evidence are important areas for research as they may expose patients to unnecessary risks and inflate health care costs [35]. CPG can give physicians a false sense of security, causing them to rely more on the guideline than on critical thinking and updated research [28]. This shows the importance of basing guideline recommendations on high-level evidence. Guidelines created from expert consensus are subject to bias, and conflicts of interest are potential sources of bias in the development of CPG [36]. Another major problem in basing recommendations on expert consensus is that opinions vary between experts. This is illustrated in a study where Marras et al. [37] found that the highest percentage of expert agreement on any recommendation was 81%. Without further evidence validating one opinion over another, physicians will use their judgment to treat patients, leading to variability in care. The number of gastroenterology recommendations supported by low-level evidence and expert opinion highlights the need for further research leading to better evidence and improved patient outcomes.
Limitations
A limitation of our study is the use of guideline repositories, which are prone to more variation than public databases. The reduced number of guidelines and recommendations included in some fields calls for careful interpretation of the results.
Because the guidelines were published before the current year, they may not be an accurate reflection of the current levels of evidence in gastroenterology literature and therefore our study may underestimate the current research quality in the field.
Conclusion
More than a quarter of all recommendations are based on low-quality evidence or expert opinion. Of the 29 CPG included, 14 do not contain recommendations supported by high-level evidence. Not all CPG recommendations should be considered equally relevant, since their supporting evidence ranges from high quality (RCT) to low quality (expert opinion). Research should focus on the development of RCT and systematic reviews to improve the evidence supporting the CPG that guide our daily practice.