8 research outputs found

    Systematic literature review (SLR) automation: a systematic literature review

    Context: A systematic literature review (SLR) is a methodology used to find and aggregate all relevant studies about a specific research question or topic of interest. Most SLR processes are conducted manually, and automating them can reduce the workload and time required of human reviewers. Method: We used an SLR as the methodology to survey the literature on technologies used to automate SLR processes. Results: From the collected data, we found substantial work on automating the study selection process, but no evidence of automation of the planning and reporting processes. Most authors use machine learning classifiers to automate study selection. Our survey also identified processes similar to SLR sub-processes for which automatic techniques already exist. Conclusion: Based on these results, we conclude that more research is needed on the planning, reporting, data extraction and synthesis processes of SLR.
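
    Most of the automation reported in this survey relies on machine learning classifiers for study selection; the following is a minimal sketch of such a screening classifier, assuming a TF-IDF text representation and a small labelled set of included/excluded abstracts (the data and feature choices are illustrative, not taken from the reviewed studies).

        # Minimal citation-screening classifier sketch (illustrative data only).
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.svm import LinearSVC
        from sklearn.pipeline import make_pipeline

        abstracts = [
            "randomised controlled trial of drug X for condition Y",
            "case report of a rare adverse event",
            "systematic review of interventions for condition Y",
        ]
        labels = [1, 0, 1]  # 1 = include, 0 = exclude

        # TF-IDF features plus a linear classifier, a common baseline for study selection.
        screener = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=1), LinearSVC())
        screener.fit(abstracts, labels)

        # Rank unscreened citations so reviewers see the most likely inclusions first.
        print(screener.decision_function(["controlled trial of drug X"]))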

    Reporting Statistical Validity and Model Complexity in Machine Learning based Computational Studies

    Background: Statistical validity and model complexity are both important concepts for understanding and assessing the correctness of computational models. However, information about them is often missing from publications applying machine learning. Aim: The aim of this study is to show the importance of providing details that indicate the statistical validity and complexity of models in publications. This is explored in the context of citation screening automation using machine learning techniques. Method: We built 15 Support Vector Machine (SVM) models, each developed using word2vec (averaged word vector) features and data for one of 15 review topics from the Drug Evaluation Review Program (DERP) of the Agency for Healthcare Research and Quality (AHRQ). Results: The word2vec features were found to be sufficiently linearly separable by the SVM, so linear kernels were used. In 11 of the 15 models, over 80% of the negative (majority) class training data and approximately 45% of the positive training data were retained as support vectors (SVs). Conclusions: In this context, examining the SVs revealed that the models are overly complex relative to the ideal expectation that no more than 2%-5% (and preferably far fewer) of the training vectors become support vectors.
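
    A minimal sketch of the kind of complexity check described here, assuming averaged word-vector features and scikit-learn's SVC with a linear kernel (the data below are random placeholders, not the DERP/AHRQ review data):

        # Fraction of each class's training data retained as support vectors (illustrative).
        import numpy as np
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 50))        # placeholder averaged word-vector features
        y = np.array([0] * 170 + [1] * 30)    # imbalanced: 0 = excluded (majority), 1 = included

        model = SVC(kernel="linear").fit(X, y)

        # Large per-class fractions of support vectors suggest an overly complex decision boundary.
        for cls in model.classes_:
            n_sv = model.n_support_[list(model.classes_).index(cls)]
            n_train = int((y == cls).sum())
            print(f"class {cls}: {n_sv}/{n_train} = {n_sv / n_train:.0%} support vectors")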

    Artificial Intelligence in Systematic Reviews: Uncharted Waters for Librarians

    Background: For over 10 years, informatics journals have published articles about the potential for artificial intelligence (AI) tools to automate parts of the time-consuming and labor-intensive systematic review process. Despite this vision, automation of the systematic review process is still uncommon for many research teams. This study will investigate why these potentially time-saving tools have not been incorporated into librarians’ systematic review workflows, especially given the recent surge in systematic review requests. Methods: This study will involve a review of the literature exploring the proposed uses of AI to facilitate the systematic review process. In particular, the review will identify tools that can incorporate AI into the systematic review screening process, as well as resources and best practices for their use. Finally, the study will examine the facilitators of and barriers to librarians’ adoption of AI tools in systematic reviews. Results: The researchers will present key findings from the literature review. They will explain the types of AI and the options available to librarians for expediting the systematic review screening process, and will share a list of available tools and recommendations for librarians beginning to incorporate AI into their systematic review workflows. Conclusions: Explaining the history, uses, and potential benefits and challenges of AI in systematic reviews may help librarians understand how best to incorporate these tools into their expert searching services.

    Dynamic summarization of bibliographic-based data

    Background: Traditional information retrieval techniques typically return excessive output when directed at large bibliographic databases. Natural language processing applications strive to extract salient content from the excessive data. Semantic MEDLINE, a National Library of Medicine (NLM) natural language processing application, highlights relevant information in PubMed data. However, Semantic MEDLINE implements manually coded schemas, accommodating few information needs: currently there are only five such schemas, while many more would be needed to realistically accommodate all potential users. The aim of this project was to develop and evaluate a statistical algorithm that automatically identifies relevant bibliographic data; the new algorithm could be incorporated into a dynamic schema to accommodate various information needs in Semantic MEDLINE and eliminate the need for multiple schemas. Methods: We developed a flexible algorithm named Combo that combines three statistical metrics, the Kullback-Leibler Divergence (KLD), Riloff's RlogF metric (RlogF), and a new metric called PredScal, to automatically identify salient data in bibliographic text. We downloaded citations from a PubMed search query addressing the genetic etiology of bladder cancer. The citations were processed with SemRep, an NLM rule-based application that produces semantic predications. SemRep output was processed by Combo, by the standard Semantic MEDLINE genetics schema, and independently by the two individual KLD and RlogF metrics. We evaluated each summarization method using an existing reference standard within the task-based context of genetic database curation. Results: Combo asserted 74 genetic entities implicated in bladder cancer development, whereas the traditional schema asserted 10 genetic entities; the KLD and RlogF metrics individually asserted 77 and 69 genetic entities, respectively. Combo achieved 61% recall and 81% precision, with an F-score of 0.69. The traditional schema achieved 23% recall and 100% precision, with an F-score of 0.37. The KLD metric achieved 61% recall and 70% precision, with an F-score of 0.65. The RlogF metric achieved 61% recall and 72% precision, with an F-score of 0.66. Conclusions: Semantic MEDLINE summarization using the new Combo algorithm outperformed a conventional summarization schema in a genetic database curation task. It could potentially streamline information acquisition for other needs without having to hand-build multiple saliency schemas.
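
    Combo combines KLD, RlogF, and PredScal scores over SemRep predications; the following is a minimal sketch of the two published metrics it builds on, using made-up entity counts and common formulations of each metric (PredScal and the exact combination rule used by Combo are described in the paper and are not reproduced here).

        # Per-entity KLD contribution and Riloff's RlogF over illustrative counts.
        import math
        from collections import Counter

        # How often each candidate entity occurs in predications from topic-relevant
        # citations versus a background set (made-up numbers).
        relevant = Counter({"TP53": 40, "FGFR3": 25, "aspirin": 3})
        background = Counter({"TP53": 60, "FGFR3": 30, "aspirin": 300})

        rel_total = sum(relevant.values())
        bg_total = sum(background.values())

        for entity, freq_rel in relevant.items():
            p = freq_rel / rel_total                      # probability in relevant predications
            q = background[entity] / bg_total             # probability in the background set
            kld_term = p * math.log2(p / q)               # this entity's contribution to the KL divergence
            relevance_rate = freq_rel / (freq_rel + background[entity])
            rlogf = relevance_rate * math.log2(freq_rel)  # Riloff's RlogF: relevance rate times log frequency
            print(f"{entity}: KLD term {kld_term:.3f}, RlogF {rlogf:.3f}")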

    Doctor of Philosophy

    Dissertation: The objective of this work is to examine the efficacy of natural language processing (NLP) in summarizing bibliographic text for multiple purposes. Researchers have noted the accelerating growth of bibliographic databases. Information seekers using traditional information retrieval techniques when searching large bibliographic databases are often overwhelmed by excessive, irrelevant data. Scientists have applied natural language processing technologies to improve retrieval. Text summarization, a natural language processing approach, simplifies bibliographic data while filtering it to address a user's need. Traditional text summarization can necessitate the use of multiple software applications to accommodate diverse processing refinements known as "points-of-view." A new, statistical approach to text summarization can transform this process. Combo, a statistical algorithm comprising three individual metrics, determines which elements within input data are relevant to a user's specified information need, thus enabling a single software application to summarize text for many points-of-view. In this dissertation, I describe this algorithm and the research process used to develop and test it. Four studies comprised the research process. The goal of the first study was to create a conventional schema accommodating a genetic disease etiology point-of-view, along with an evaluative reference standard; this was accomplished by simulating the task of secondary genetic database curation. The second study addressed the development and initial evaluation of the algorithm, comparing its performance to the conventional schema using the previously established reference standard, again within the task of secondary genetic database curation. The third and fourth studies evaluated the algorithm's performance in accommodating additional points-of-view in a simulated clinical decision support task. The third study explored prevention, while the fourth evaluated performance for prevention and drug treatment, comparing results to a conventional treatment schema's output. Both summarization methods identified data that were salient to their tasks. The conventional genetic disease etiology and treatment schemas located salient information for database curation and decision support, respectively. The Combo algorithm located salient genetic disease etiology, treatment, and prevention data for the associated tasks. Dynamic text summarization could potentially serve additional purposes, such as consumer health information delivery, systematic review creation, and primary research, and this technology may benefit many user groups.

    Revues systématiques et méta-analyses en chirurgie cardiaque : défis et solutions

    Objective: To explore, adapt and develop new methodologies for performing systematic reviews and meta-analyses in cardiac surgery. Methods: Text mining and citation chasing were used to optimize the efficiency and sensitivity of the search process. We participated in the evaluation of new tools (Risk of Bias 2.0 and Risk of Bias in Non-randomized Studies of Interventions) for the quality assessment of randomized and non-randomized studies and have adopted them for our future projects. A new graphical methodology was developed for performing meta-analyses of time-to-event data. Results: These approaches were used to answer research questions touching on different aspects of cardiac surgery: 1) writing the first Enhanced Recovery After Cardiac Surgery guidelines, 2) a systematic review of the results of valvular and aortic surgery in heart transplant recipients, demonstrating good results of these procedures in a high-risk population and the emergence of transcatheter techniques in the management of these pathologies, 3) a meta-analysis of supraventricular arrhythmias in patients who had undergone a Fontan procedure, finding a beneficial effect of the extracardiac conduit technique, and 4) a meta-analysis of aortic insufficiency in patients with a left ventricular assist device, showing an underestimated incidence of this clinical entity with a significant impact on survival in this population. Conclusion: This thesis addresses some of the shortcomings of the cardiac surgery literature, such as the suboptimal sensitivity of systematic searching and the meta-analysis of time-to-event data, and proposes solutions. Other issues, such as the need to summarize a comprehensive and coherent set of comparisons, remain. Future research focused on new approaches such as network meta-analysis or Bayesian methods could address these issues.
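
    The thesis mentions a new graphical method for meta-analysis of time-to-event data; as a point of reference, the following is a minimal sketch of standard fixed-effect inverse-variance pooling of log hazard ratios (the study values are illustrative, and this generic textbook approach is not the method developed in the thesis).

        # Fixed-effect inverse-variance pooling of hazard ratios (illustrative values).
        import math

        # (name, hazard ratio, lower 95% CI, upper 95% CI)
        studies = [("Study A", 1.40, 1.05, 1.86),
                   ("Study B", 1.25, 0.90, 1.74),
                   ("Study C", 1.60, 1.10, 2.33)]

        weights, weighted_logs = [], []
        for name, hr, low, high in studies:
            log_hr = math.log(hr)
            se = (math.log(high) - math.log(low)) / (2 * 1.96)  # standard error from the CI width
            w = 1 / se ** 2                                     # inverse-variance weight
            weights.append(w)
            weighted_logs.append(w * log_hr)

        pooled_log = sum(weighted_logs) / sum(weights)
        pooled_se = math.sqrt(1 / sum(weights))
        print(f"pooled HR = {math.exp(pooled_log):.2f} "
              f"(95% CI {math.exp(pooled_log - 1.96 * pooled_se):.2f}"
              f"-{math.exp(pooled_log + 1.96 * pooled_se):.2f})")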

    Information Behaviour bei der Erstellung systematischer Reviews. Informationsverhalten von Information Professionals bei der Durchführung systematischer Übersichtsarbeiten im Kontext der evidenzbasierten Medizin

    Conducting systematic reviews is a central element of evidence-based medicine. The development of reproducible search strategies that this requires is a critical and resource-intensive task. Its goal is to represent an initial clinical question optimally, robustly against errors, and in a database-specific way, in the form of an exhaustive Boolean search query. The exponential growth of primary medical literature, biases in study design and publication, and the increasing diversity of information resources make this process more difficult. A detailed analysis of the underlying task structures, information needs, and the resulting, contextually shaped information behaviour can help to automate sub-processes of systematic reviewing and to derive recommendations for the design of bibliographic databases and search systems. The present work analyses task structures and information behaviour using qualitative methods, places them in the context of the underlying review methodology, and culminates in a contextual model of the iterative development of search strategies. Concluding recommendations offer starting points for the further development of technical support for this time-consuming search task.
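
    The abstract describes representing a clinical question as an exhaustive, database-specific Boolean query; a minimal sketch of assembling such a query from concept blocks follows (the concepts, synonyms, and block names are illustrative, not taken from the thesis).

        # Assemble a Boolean search strategy from concept blocks (illustrative terms).
        concepts = {
            "population": ["heart failure", "cardiac failure"],
            "intervention": ["beta blocker*", "metoprolol"],
            "outcome": ["mortality", "survival"],
        }

        # OR the synonyms within each block, then AND the blocks together,
        # mirroring the structure of a typical exhaustive Boolean search strategy.
        blocks = ["(" + " OR ".join(f'"{term}"' if " " in term else term for term in terms) + ")"
                  for terms in concepts.values()]
        query = " AND ".join(blocks)
        print(query)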