1,331 research outputs found

    Desiderata for the development of next-generation electronic health record phenotype libraries

    Get PDF
    Background High-quality phenotype definitions are desirable to enable the extraction of patient cohorts from large electronic health record repositories and are characterized by properties such as portability, reproducibility, and validity. Phenotype libraries, where definitions are stored, have the potential to contribute significantly to the quality of the definitions they host. In this work, we present a set of desiderata for the design of a next-generation phenotype library that is able to ensure the quality of hosted definitions by combining the functionality currently offered by disparate tooling. Methods A group of researchers examined work to date on phenotype models, implementation, and validation, as well as contemporary phenotype libraries developed as a part of their own phenomics communities. Existing phenotype frameworks were also examined. This work was translated and refined by all the authors into a set of best practices. Results We present 14 library desiderata that promote high-quality phenotype definitions, in the areas of modelling, logging, validation, and sharing and warehousing. Conclusions There are a number of choices to be made when constructing phenotype libraries. Our considerations distil the best practices in the field and include pointers towards their further development to support portable, reproducible, and clinically valid phenotype design. The provision of high-quality phenotype definitions enables electronic health record data to be more effectively used in medical domains

    Desiderata for the development of next-generation electronic health record phenotype libraries

    Get PDF
    BackgroundHigh-quality phenotype definitions are desirable to enable the extraction of patient cohorts from large electronic health record repositories and are characterized by properties such as portability, reproducibility, and validity. Phenotype libraries, where definitions are stored, have the potential to contribute significantly to the quality of the definitions they host. In this work, we present a set of desiderata for the design of a next-generation phenotype library that is able to ensure the quality of hosted definitions by combining the functionality currently offered by disparate tooling.MethodsA group of researchers examined work to date on phenotype models, implementation, and validation, as well as contemporary phenotype libraries developed as a part of their own phenomics communities. Existing phenotype frameworks were also examined. This work was translated and refined by all the authors into a set of best practices.ResultsWe present 14 library desiderata that promote high-quality phenotype definitions, in the areas of modelling, logging, validation, and sharing and warehousing.ConclusionsThere are a number of choices to be made when constructing phenotype libraries. Our considerations distil the best practices in the field and include pointers towards their further development to support portable, reproducible, and clinically valid phenotype design. The provision of high-quality phenotype definitions enables electronic health record data to be more effectively used in medical domains

    Unmanned Aerial Vehicles for High-Throughput Phenotyping and Agronomic Research

    Get PDF
    Advances in automation and data science have led agriculturists to seek real-time, high-quality, high-volume crop data to accelerate crop improvement through breeding and to optimize agronomic practices. Breeders have recently gained massive data-collection capability in genome sequencing of plants. Faster phenotypic trait data collection and analysis relative to genetic data leads to faster and better selections in crop improvement. Furthermore, faster and higher-resolution crop data collection leads to greater capability for scientists and growers to improve precision-agriculture practices on increasingly larger farms; e.g., site-specific application of water and nutrients. Unmanned aerial vehicles (UAVs) have recently gained traction as agricultural data collection systems. Using UAVs for agricultural remote sensing is an innovative technology that differs from traditional remote sensing in more ways than strictly higher-resolution images; it provides many new and unique possibilities, as well as new and unique challenges. Herein we report on processes and lessons learned from year 1-the summer 2015 and winter 2016 growing seasons-of a large multidisciplinary project evaluating UAV images across a range of breeding and agronomic research trials on a large research farm. Included are team and project planning, UAV and sensor selection and integration, and data collection and analysis workflow. The study involved many crops and both breeding plots and agronomic fields. The project's goal was to develop methods for UAVs to collect high-quality, high-volume crop data with fast turnaround time to field scientists. The project included five teams: Administration, Flight Operations, Sensors, Data Management, and Field Research. Four case studies involving multiple crops in breeding and agronomic applications add practical descriptive detail. Lessons learned include critical information on sensors, air vehicles, and configuration parameters for both. As the first and most comprehensive project of its kind to date, these lessons are particularly salient to researchers embarking on agricultural research with UAVs

    Approaches to three-dimensional reconstruction of plant shoot topology and geometry

    Get PDF
    There are currently 805 million people classified as chronically undernourished, and yet the World’s population is still increasing. At the same time, global warming is causing more frequent and severe flooding and drought, thus destroying crops and reducing the amount of land available for agriculture. Recent studies show that without crop climate adaption, crop productivity will deteriorate. With access to 3D models of real plants it is possible to acquire detailed morphological and gross developmental data that can be used to study their ecophysiology, leading to an increase in crop yield and stability across hostile and changing environments. Here we review approaches to the reconstruction of 3D models of plant shoots from image data, consider current applications in plant and crop science, and identify remaining challenges. We conclude that although phenotyping is receiving an increasing amount of attention – particularly from computer vision researchers – and numerous vision approaches have been proposed, it still remains a highly interactive process. An automated system capable of producing 3D models of plants would significantly aid phenotyping practice, increasing accuracy and repeatability of measurements

    High-Throughput System for the Early Quantification of Major Architectural Traits in Olive Breeding Trials Using UAV Images and OBIA Techniques

    Get PDF
    The need for the olive farm modernization have encouraged the research of more efficient crop management strategies through cross-breeding programs to release new olive cultivars more suitable for mechanization and use in intensive orchards, with high quality production and resistance to biotic and abiotic stresses. The advancement of breeding programs are hampered by the lack of efficient phenotyping methods to quickly and accurately acquire crop traits such as morphological attributes (tree vigor and vegetative growth habits), which are key to identify desirable genotypes as early as possible. In this context, an UAV-based high-throughput system for olive breeding program applications was developed to extract tree traits in large-scale phenotyping studies under field conditions. The system consisted of UAV-flight configurations, in terms of flight altitude and image overlaps, and a novel, automatic, and accurate object-based image analysis (OBIA) algorithm based on point clouds, which was evaluated in two experimental trials in the framework of a table olive breeding program, with the aim to determine the earliest date for suitable quantifying of tree architectural traits. Two training systems (intensive and hedgerow) were evaluated at two very early stages of tree growth: 15 and 27 months after planting. Digital Terrain Models (DTMs) were automatically and accurately generated by the algorithm as well as every olive tree identified, independently of the training system and tree age. The architectural traits, specially tree height and crown area, were estimated with high accuracy in the second flight campaign, i.e. 27 months after planting. Differences in the quality of 3D crown reconstruction were found for the growth patterns derived from each training system. These key phenotyping traits could be used in several olive breeding programs, as well as to address some agronomical goals. In addition, this system is cost and time optimized, so that requested architectural traits could be provided in the same day as UAV flights. This high-throughput system may solve the actual bottleneck of plant phenotyping of "linking genotype and phenotype," considered a major challenge for crop research in the 21st century, and bring forward the crucial time of decision making for breeders

    AD-BERT: Using Pre-trained contextualized embeddings to Predict the Progression from Mild Cognitive Impairment to Alzheimer's Disease

    Full text link
    Objective: We develop a deep learning framework based on the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model using unstructured clinical notes from electronic health records (EHRs) to predict the risk of disease progression from Mild Cognitive Impairment (MCI) to Alzheimer's Disease (AD). Materials and Methods: We identified 3657 patients diagnosed with MCI together with their progress notes from Northwestern Medicine Enterprise Data Warehouse (NMEDW) between 2000-2020. The progress notes no later than the first MCI diagnosis were used for the prediction. We first preprocessed the notes by deidentification, cleaning and splitting, and then pretrained a BERT model for AD (AD-BERT) based on the publicly available Bio+Clinical BERT on the preprocessed notes. The embeddings of all the sections of a patient's notes processed by AD-BERT were combined by MaxPooling to compute the probability of MCI-to-AD progression. For replication, we conducted a similar set of experiments on 2563 MCI patients identified at Weill Cornell Medicine (WCM) during the same timeframe. Results: Compared with the 7 baseline models, the AD-BERT model achieved the best performance on both datasets, with Area Under receiver operating characteristic Curve (AUC) of 0.8170 and F1 score of 0.4178 on NMEDW dataset and AUC of 0.8830 and F1 score of 0.6836 on WCM dataset. Conclusion: We developed a deep learning framework using BERT models which provide an effective solution for prediction of MCI-to-AD progression using clinical note analysis

    InfraPhenoGrid: A scientific workflow infrastructure for Plant Phenomics on the Grid

    Get PDF
    International audiencePlant phenotyping consists in the observation of physical and biochemical traits of plant genotypes in response to environmental conditions. Challenges , in particular in context of climate change and food security, are numerous. High-throughput platforms have been introduced to observe the dynamic growth of a large number of plants in different environmental conditions. Instead of considering a few genotypes at a time (as it is the case when phenomic traits are measured manually), such platforms make it possible to use completely new kinds of approaches. However, the data sets produced by such widely instrumented platforms are huge, constantly augmenting and produced by increasingly complex experiments, reaching a point where distributed computation is mandatory to extract knowledge from data. In this paper, we introduce InfraPhenoGrid, the infrastructure we designed and deploy to efficiently manage data sets produced by the PhenoArch plant phenomics platform in the context of the French Phenome Project. Our solution consists in deploying scientific workflows on a Grid using a middle-ware to pilot workflow executions. Our approach is user-friendly in the sense that despite the intrinsic complexity of the infrastructure, running scientific workflows and understanding results obtained (using provenance information) is kept as simple as possible for end-users
    • 

    corecore