67 research outputs found

    Practices and Infrastructures for Machine Learning Systems: An Interview Study in Finnish Organizations

    Get PDF
    Using interviews, we investigated the practices and toolchains for machine learning (ML)-enabled systems from 16 organizations across various domains in Finland. We observed some well-established artificial intelligence engineering approaches, but practices and tools are still needed for the testing and monitoring of ML-enabled systems.Peer reviewe

    Cross-Platform Text Mining and Natural Language Processing Interoperability - Proceedings of the LREC2016 conference

    Get PDF
    No abstract available

    Cross-Platform Text Mining and Natural Language Processing Interoperability - Proceedings of the LREC2016 conference

    Get PDF
    No abstract available

    Natural Language Processing – Finding the Missing Link for Oncologic Data, 2022

    Get PDF
    Oncology like most medical specialties, is undergoing a data revolution at the center of which lie vast and growing amounts of clinical data in unstructured, semi-structured and structed formats. Artificial intelligence approaches are widely employed in research endeavors in an attempt to harness electronic medical records data to advance patient outcomes. The use of clinical oncologic data, although collected on large scale, particularly with the increased implementation of electronic medical records, remains limited due to missing, incorrect or manually entered data in registries and the lack of resource allocation to data curation in real world settings. Natural Language Processing (NLP) may provide an avenue to extract data from electronic medical records and as a result has grown considerably in medicine to be employed for documentation, outcome analysis, phenotyping and clinical trial eligibility. Barriers to NLP persist with inability to aggregate findings across studies due to use of different methods and significant heterogeneity at all levels with important parameters such as patient comorbidities and performance status lacking implementation in AI approaches. The goal of this review is to provide an updated overview of natural language processing (NLP) and the current state of its application in oncology for clinicians and researchers that wish to implement NLP to augment registries and/or advance research projects

    Addendum to Informatics for Health 2017: Advancing both science and practice

    Get PDF
    This article presents presentation and poster abstracts that were mistakenly omitted from the original publication

    Mining clinical attributes of genomic variants through assisted literature curation in Egas

    Get PDF
    The veritable deluge of biological data over recent years has led to the establishment of a considerable number of knowledge resources that compile curated information extracted from the literature and store it in structured form, facilitating its use and exploitation. In this article, we focus on the curation of inherited genetic variants and associated clinical attributes, such as zygosity, penetrance or inheritance mode, and describe the use of Egas for this task. Egas is a web-based platform for text-mining assisted literature curation that focuses on usability through modern design solutions and simple user interactions. Egas offers a flexible and customizable tool that allows defining the concept types and relations of interest for a given annotation task, as well as the ontologies used for normalizing each concept type. Further, annotations may be performed on raw documents or on the results of automated concept identification and relation extraction tools. Users can inspect, correct or remove automatic text-mining results, manually add new annotations, and export the results to standard formats. Egas is compatible with the most recent versions of Google Chrome, Mozilla Firefox, Internet Explorer and Safari and is available for use at https://demo.bmd-software.com/egas/

    Infrastructure for Semantic Annotation in the Genomics Domain

    Get PDF
    We describe a novel super-infrastructure for biomedical text mining which incorporates an end-to-end pipeline for the collection, annotation, storage, retrieval and analysis of biomedical and life sciences literature, combining NLP and corpus linguistics methods. The infrastructure permits extreme-scale research on the open access PubMed Central archive. It combines an updatable Gene Ontology Semantic Tagger (GOST) for entity identification and semantic markup in the literature, with a NLP pipeline scheduler (Buster) to collect and process the corpus, and a bespoke columnar corpus database (LexiDB) for indexing. The corpus database is distributed to permit fast indexing, and provides a simple web front-end with corpus linguistics methods for sub-corpus comparison and retrieval. GOST is also connected as a service in the Language Application (LAPPS) Grid, in which context it is interoperable with other NLP tools and data in the Grid and can be combined with them in more complex workflows. In a literature based discovery setting, we have created an annotated corpus of 9,776 papers with 5,481,543 words
    • …
    corecore