Abstract
Background Evaluating the quality of evidence in systematic reviews (SRs) is essential for sound decision-making. Although Grading of Recommendations Assessment, Development and Evaluation (GRADE) offers an established approach for rating the level of evidence, its application is complex and time-consuming. Artificial intelligence (AI) can be used to overcome these barriers.
Design Analytical experimental study.
Objective To develop and appraise a proof-of-concept AI-powered tool for the semiautomation of an adaptation of the GRADE classification system to determine levels of evidence in SRs with meta-analyses compiled from randomised clinical trials.
Methods The URSE-automated system was based on an algorithm created to enhance the objectivity of the GRADE classification. It was developed using the Python language and the React library to create user-friendly interfaces. Evaluation of the URSE-automated system was performed by analysing 115 SRs from the Cochrane Library and comparing the predicted levels of evidence with those generated by human evaluators.
Results The open-source URSE code is available on GitHub (http://www.github.com/alisson-mfc/urse). The agreement between the URSE-automated GRADE system and human evaluators regarding the quality of evidence was 63.2%, with a Cohen’s kappa coefficient of 0.44. For the individual GRADE domains evaluated, accuracy and F1-score were, respectively, 0.97 and 0.94 for imprecision (number of participants), 0.73 and 0.7 for risk of bias, 0.9 and 0.9 for I² values (heterogeneity) and 0.98 and 0.99 for quality of methodology (A Measurement Tool to Assess Systematic Reviews).
Conclusion The results demonstrate the potential use of AI in assessing the quality of evidence. However, because the GRADE approach emphasises subjective judgement and an understanding of the context in which evidence is produced, full automation of the classification process is not advisable. Nevertheless, combining the URSE-automated system with human evaluation, or integrating the tool into other platforms, are promising directions for future work.
- Evidence-Based Practice
- Systematic Reviews as Topic
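The agreement and per-domain metrics reported in the Results (percent agreement, Cohen’s kappa, accuracy and F1-score) can be illustrated with a minimal Python sketch. This is not the URSE code: the function names and the small label lists below are hypothetical and chosen only for illustration, and the unweighted form of Cohen’s kappa is assumed because the abstract does not state whether a weighted variant was used.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of items on which the two raters give the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over labels (a label unused by one rater contributes 0).
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


def accuracy_and_f1(y_true, y_pred, positive):
    """Accuracy and F1-score for one domain, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical overall GRADE ratings for eight reviews (illustrative data only).
human = ["high", "moderate", "low", "very low", "moderate", "low", "high", "low"]
tool = ["high", "moderate", "moderate", "very low", "low", "low", "high", "very low"]

# Hypothetical risk-of-bias judgements for six reviews (illustrative data only).
rob_human = ["serious", "serious", "not serious", "not serious", "serious", "not serious"]
rob_tool = ["serious", "not serious", "not serious", "not serious", "serious", "serious"]

agreement = percent_agreement(human, tool)
kappa = cohens_kappa(human, tool)
rob_accuracy, rob_f1 = accuracy_and_f1(rob_human, rob_tool, positive="serious")
```

In this sketch the human ratings are treated as the reference standard, mirroring the comparison described in the Methods. Kappa is lower than raw agreement because it discounts the agreement expected by chance from each rater's label frequencies.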
Data availability statement
Data are available in a public, open access repository. All data relevant to the study are included in the article or uploaded as online supplemental information. The open-source code for the URSE-automated system is also available at https://doi.org/10.5281/ZENODO.13916887. Data are available from the corresponding author on reasonable request.
Footnotes
X @alisson_ufop
Contributors AOdS conceptualised and designed the study. AOdS and TMM created AI protocols. AOdS and VSB performed data analysis. AOdS and TMM wrote the original draft, and VSB and ESdS reviewed and edited the manuscript. AOdS was responsible for the overall content as the guarantor. All authors read and approved the final manuscript.
Funding This work was supported by Universidade Federal de São João del-Rei, Universidade Federal de Mato Grosso do Sul and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Brazil, Finance Code 001. Two authors hold productivity fellowships from the Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brazil: ESdS, process number 305665/2023-5, and VSB, process number 311309/2023-2.
Competing interests None declared.
Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.
Provenance and peer review Not commissioned; externally peer reviewed.
Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.