    Improving Term Extraction with Terminological Resources

    Studies of different term extractors on a corpus of the biomedical domain revealed decreasing performances when applied to highly technical texts. The difficulty or impossibility of customising them to new domains is an additional limitation. In this paper, we propose to use external terminologies to influence generic linguistic data in order to augment the quality of the extraction. The tool we implemented exploits testified terms at different steps of the process: chunking, parsing and extraction of term candidates. Experiments reported here show that, using this method, more term candidates can be acquired with a higher level of reliability. We further describe the extraction process involving endogenous disambiguation implemented in the term extractor YaTeA

    Characteristics, Outcomes, and Severity Risk Factors Associated with SARS-CoV-2 Infection among Children in the US National COVID Cohort Collaborative

    Importance: Understanding of SARS-CoV-2 infection in US children has been limited by the lack of large, multicenter studies with granular data. Objective: To examine the characteristics, changes over time, outcomes, and severity risk factors of children with SARS-CoV-2 within the National COVID Cohort Collaborative (N3C). Design, Setting, and Participants: A prospective cohort study of encounters with end dates before September 24, 2021, was conducted at 56 N3C facilities throughout the US. Participants included children younger than 19 years at initial SARS-CoV-2 testing. Main Outcomes and Measures: Case incidence and severity over time, demographic and comorbidity severity risk factors, vital sign and laboratory trajectories, clinical outcomes, and acute COVID-19 vs multisystem inflammatory syndrome in children (MIS-C), and Delta vs pre-Delta variant differences for children with SARS-CoV-2. Results: A total of 1068410 children were tested for SARS-CoV-2 and 167262 test results (15.6%) were positive (82882 [49.6%] girls; median age, 11.9 [IQR, 6.0-16.1] years). Among the 10245 children (6.1%) who were hospitalized, 1423 (13.9%) met the criteria for severe disease: mechanical ventilation (796 [7.8%]), vasopressor-inotropic support (868 [8.5%]), extracorporeal membrane oxygenation (42 [0.4%]), or death (131 [1.3%]). Male sex (odds ratio [OR], 1.37; 95% CI, 1.21-1.56), Black/African American race (OR, 1.25; 95% CI, 1.06-1.47), obesity (OR, 1.19; 95% CI, 1.01-1.41), and several pediatric complex chronic condition (PCCC) subcategories were associated with higher severity disease. Vital signs and many laboratory test values from the day of admission were predictive of peak disease severity. 
Variables associated with increased odds for MIS-C vs acute COVID-19 included male sex (OR, 1.59; 95% CI, 1.33-1.90), Black/African American race (OR, 1.44; 95% CI, 1.17-1.77), younger than 12 years (OR, 1.81; 95% CI, 1.51-2.18), obesity (OR, 1.76; 95% CI, 1.40-2.22), and not having a pediatric complex chronic condition (OR, 0.72; 95% CI, 0.65-0.80). The children with MIS-C had a more inflammatory laboratory profile and severe clinical phenotype, with higher rates of invasive ventilation (117 of 707 [16.5%] vs 514 of 8241 [6.2%]; P <.001) and need for vasoactive-inotropic support (191 of 707 [27.0%] vs 426 of 8241 [5.2%]; P <.001) compared with those who had acute COVID-19. Comparing children during the Delta vs pre-Delta eras, there was no significant change in hospitalization rate (1738 [6.0%] vs 8507 [6.2%]; P =.18) and lower odds for severe disease (179 [10.3%] vs 1242 [14.6%]) (decreased by a factor of 0.67; 95% CI, 0.57-0.79; P <.001). Conclusions and Relevance: In this cohort study of US children with SARS-CoV-2, there were observed differences in demographic characteristics, preexisting comorbidities, and initial vital sign and laboratory values between severity subgroups. Taken together, these results suggest that early identification of children likely to progress to severe disease could be achieved using readily available data elements from the day of admission. Further work is needed to translate this knowledge into improved outcomes

    A Simple Standard for Sharing Ontological Mappings (SSSOM)

    Despite progress in the development of standards for describing and exchanging scientific information, the lack of easy-to-use standards for mapping between different representations of the same or similar objects in different databases poses a major impediment to data integration and interoperability. Mappings often lack the metadata needed to be correctly interpreted and applied. For example, are two terms equivalent or merely related? Are they narrow or broad matches? Or are they associated in some other way? Such relationships between the mapped terms are often not documented, which leads to incorrect assumptions and makes them hard to use in scenarios that require a high degree of precision (such as diagnostics or risk prediction). Furthermore, the lack of descriptions of how mappings were done makes it hard to combine and reconcile mappings, particularly curated and automated ones. We have developed the Simple Standard for Sharing Ontological Mappings (SSSOM) which addresses these problems by: (i) Introducing a machine-readable and extensible vocabulary to describe metadata that makes imprecision, inaccuracy and incompleteness in mappings explicit. (ii) Defining an easy-to-use simple table-based format that can be integrated into existing data science pipelines without the need to parse or query ontologies, and that integrates seamlessly with Linked Data principles. (iii) Implementing open and community-driven collaborative workflows that are designed to evolve the standard continuously to address changing requirements and mapping practices. (iv) Providing reference tools and software libraries for working with the standard. In this paper, we present the SSSOM standard, describe several use cases in detail and survey some of the existing work on standardizing the exchange of mappings, with the goal of making mappings Findable, Accessible, Interoperable and Reusable (FAIR). The SSSOM specification can be found at http://w3id.org/sssom/spec
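
As an illustration of the table-based format described in this abstract, the following minimal sketch reads a small SSSOM-style TSV with Python's standard `csv` module and keeps only high-confidence exact matches. The column names and the `skos:` and `semapv:` values are taken to follow the SSSOM specification as an assumption; the `EX:` and `OTHER:` identifiers, the confidence values, and the helper names `load_mappings` and `precise_mappings` are hypothetical.

```python
import csv
import io

# Illustrative rows only: the EX:/OTHER: identifiers are made up, not curated mappings.
SSSOM_TSV = (
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\tconfidence\n"
    "EX:0001\tskos:exactMatch\tOTHER:1001\tsemapv:ManualMappingCuration\t0.95\n"
    "EX:0002\tskos:broadMatch\tOTHER:1002\tsemapv:LexicalMatching\t0.80\n"
    "EX:0003\tskos:exactMatch\tOTHER:1003\tsemapv:LexicalMatching\t0.60\n"
)


def load_mappings(text):
    """Parse SSSOM-style TSV text into a list of dicts, skipping '#' header lines."""
    lines = [line for line in text.splitlines() if not line.startswith("#")]
    return list(csv.DictReader(io.StringIO("\n".join(lines)), delimiter="\t"))


def precise_mappings(rows, min_confidence=0.9):
    """Keep rows that assert equivalence with at least the given confidence."""
    return [
        row
        for row in rows
        if row["predicate_id"] == "skos:exactMatch"
        and float(row["confidence"]) >= min_confidence
    ]


if __name__ == "__main__":
    for row in precise_mappings(load_mappings(SSSOM_TSV)):
        print(row["subject_id"], row["predicate_id"], row["object_id"])
```

Because the predicate and the justification are explicit columns, a pipeline can separate equivalent from merely related terms, and curated from automated mappings, without parsing the ontologies themselves.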

    The GA4GH Phenopacket schema defines a computable representation of clinical data.

    In the clinical domain, substantial work has been dedicated to the development of computational phenotypes [1]. Traditionally, these approaches have largely relied on rule-based methods and large sources of clinical data to identify cohorts of patients with or without a specific disease [2–5]. However, they were not developed to enable deep phenotyping of abnormalities, to facilitate computational analysis of interpatient phenotypic similarity, or to support computational decision support. To address this, the Global Alliance for Genomics and Health [6] (GA4GH) has developed the Phenopacket schema, which supports exchange of computable longitudinal case-level phenotypic information for diagnosis of and research on all types of disease, including Mendelian and complex genetic diseases, cancer, and infectious diseases. A Phenopacket characterizes an individual person or biosample, linking that individual to detailed phenotypic descriptions, genetic information, diagnoses, and treatments (Fig 1). The Phenopacket software is available at https://github.com/phenopackets/
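
To make the idea of a case-level record concrete, here is a rough sketch of what a Phenopacket-like JSON document might look like, built as a plain Python dictionary. The field names reflect one reading of version 2 of the schema and should be checked against the official documentation; every identifier, date, and version string is a placeholder, and `observed_terms` is a hypothetical helper rather than part of the Phenopacket software.

```python
import json

# Hypothetical record: all identifiers, dates, and versions are placeholders.
phenopacket = {
    "id": "example-phenopacket-1",
    "subject": {"id": "example-patient-1", "sex": "FEMALE"},
    "phenotypicFeatures": [
        # Observed feature, identified by its Human Phenotype Ontology term.
        {"type": {"id": "HP:0012378", "label": "Fatigue"}},
        # A feature explicitly recorded as absent.
        {"type": {"id": "HP:0002018", "label": "Nausea"}, "excluded": True},
    ],
    "metaData": {
        "created": "2021-01-01T00:00:00Z",
        "createdBy": "example-curator",
        "resources": [
            {
                "id": "hp",
                "name": "Human Phenotype Ontology",
                "namespacePrefix": "HP",
                "url": "http://purl.obolibrary.org/obo/hp.owl",
                "version": "placeholder",
                "iriPrefix": "http://purl.obolibrary.org/obo/HP_",
            }
        ],
        "phenopacketSchemaVersion": "2.0",
    },
}


def observed_terms(packet):
    """Return labels of phenotypic features not marked as excluded."""
    return [
        feature["type"]["label"]
        for feature in packet["phenotypicFeatures"]
        if not feature.get("excluded", False)
    ]


if __name__ == "__main__":
    print(json.dumps(phenopacket, indent=2))
    print(observed_terms(phenopacket))
```

Recording excluded findings alongside observed ones is what lets such records support similarity analysis between patients, since an absent feature carries information that a missing one does not.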

    Characterizing Long COVID: Deep Phenotype of a Complex Condition

    Background: Numerous publications describe the clinical manifestations of post-acute sequelae of SARS-CoV-2 (PASC or “long COVID”), but they are difficult to integrate because of heterogeneous methods and the lack of a standard for denoting the many phenotypic manifestations. Patient-led studies are of particular importance for understanding the natural history of COVID-19, but integration is hampered because they often use different terms to describe the same symptom or condition. This significant disparity in patient versus clinical characterization motivated the proposed ontological approach to specifying manifestations, which will improve capture and integration of future long COVID studies. Methods: The Human Phenotype Ontology (HPO) is a widely used standard for exchange and analysis of phenotypic abnormalities in human disease but has not yet been applied to the analysis of COVID-19. Findings: We identified 303 articles published before April 29, 2021, curated 59 relevant manuscripts that described clinical manifestations in 81 cohorts three weeks or more following acute COVID-19, and mapped 287 unique clinical findings to HPO terms. We present layperson synonyms and definitions that can be used to link patient self-report questionnaires to standard medical terminology. Long COVID clinical manifestations are not assessed consistently across studies, and most manifestations have been reported with a wide range of synonyms by different authors. Across at least 10 cohorts, authors reported 31 unique clinical features corresponding to HPO terms; the most commonly reported feature was Fatigue (median 45.1%) and the least commonly reported was Nausea (median 3.9%), but the reported percentages varied widely between studies. Interpretation: Translating long COVID manifestations into computable HPO terms will improve analysis, data capture, and classification of long COVID patients. 
If researchers, clinicians, and patients share a common language, then studies can be compared/pooled more effectively. Furthermore, mapping lay terminology to HPO will help patients assist clinicians and researchers in creating phenotypic characterizations that are computationally accessible, thereby improving the stratification, diagnosis, and treatment of long COVID. Funding: U24TR002306; UL1TR001439; P30AG024832; GBMF4552; R01HG010067; UL1TR002535; K23HL128909; UL1TR002389; K99GM145411. © 2021 The Author(s