77 research outputs found

    Labeling poststorm coastal imagery for machine learning: measurement of interrater agreement

    © The Author(s), 2021. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Goldstein, E. B., Buscombe, D., Lazarus, E. D., Mohanty, S. D., Rafique, S. N., Anarde, K. A., Ashton, A. D., Beuzen, T., Castagno, K. A., Cohn, N., Conlin, M. P., Ellenson, A., Gillen, M., Hovenga, P. A., Over, J.-S. R., Palermo, R., Ratliff, K. M., Reeves, I. R. B., Sanborn, L. H., Straub, J. A., Taylor, L. A., Wallace, E. J., Warrick, J., Wernette, P., Williams, H. E. Labeling poststorm coastal imagery for machine learning: measurement of interrater agreement. Earth and Space Science, 8(9), (2021): e2021EA001896, https://doi.org/10.1029/2021EA001896.
    Classifying images using supervised machine learning (ML) relies on labeled training data: classes or text descriptions, for example, associated with each image. Data-driven models are only as good as the data used for training, and this points to the importance of high-quality labeled data for developing a ML model that has predictive skill. Labeling data is typically a time-consuming, manual process. Here, we investigate the process of labeling data, with a specific focus on coastal aerial imagery captured in the wake of hurricanes that affected the Atlantic and Gulf Coasts of the United States. The imagery data set is a rich observational record of storm impacts and coastal change, but the imagery requires labeling to render that information accessible. We created an online interface that served labelers a stream of images and a fixed set of questions. A total of 1,600 images were labeled by at least two or as many as seven coastal scientists. We used the resulting data set to investigate interrater agreement: the extent to which labelers labeled each image similarly. Interrater agreement scores, assessed with percent agreement and Krippendorff's alpha, are higher when the questions posed to labelers are relatively simple, when the labelers are provided with a user manual, and when images are smaller. Experiments in interrater agreement point toward the benefit of multiple labelers for understanding the uncertainty in labeling data for machine learning research.
    The authors gratefully acknowledge support from the U.S. Geological Survey (G20AC00403 to EBG and SDM), NSF (1953412 to EBG and SDM; 1939954 to EBG), Microsoft AI for Earth (to EBG and SDM), The Leverhulme Trust (RPG-2018-282 to EDL and EBG), and an Early Career Research Fellowship from the Gulf Research Program of the National Academies of Sciences, Engineering, and Medicine (to EBG). U.S. Geological Survey researchers (DB, J-SRO, JW, and PW) were supported by the U.S. Geological Survey Coastal and Marine Hazards and Resources Program as part of the response and recovery efforts under congressional appropriations through the Additional Supplemental Appropriations for Disaster Relief Act, 2019 (Public Law 116-20; 133 Stat. 871).
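
As an illustration of the two agreement measures named in this abstract, the following Python sketch computes pairwise percent agreement and Krippendorff's alpha for nominal labels from per-image label lists. It is not the authors' code: the function names and the toy labels are hypothetical, and it assumes categorical labels with missing ratings simply omitted from each list.

```python
from collections import Counter
from itertools import combinations, permutations


def percent_agreement(labels_per_image):
    """Fraction of rater pairs (within each image) that chose the same label."""
    agree = 0
    total = 0
    for labels in labels_per_image:
        for a, b in combinations(labels, 2):
            total += 1
            agree += int(a == b)
    if total == 0:
        raise ValueError("need at least one image with two or more labels")
    return agree / total


def krippendorff_alpha_nominal(labels_per_image):
    """Krippendorff's alpha for nominal data, built from a coincidence matrix."""
    coincidences = Counter()
    for labels in labels_per_image:
        m = len(labels)
        if m < 2:
            continue  # images with a single label carry no agreement information
        # Every ordered pair of labels from different raters, weighted by 1/(m - 1).
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per category; their sum n is the number of pairable labels.
    totals = Counter()
    for (a, _), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one image with two or more labels")

    observed = sum(w for (a, b), w in coincidences.items() if a != b) / n
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n * (n - 1))
    if expected == 0:
        return 1.0  # only one category was ever used, so no disagreement is possible
    return 1.0 - observed / expected


# Hypothetical toy data: four images, each labeled by two raters.
example_labels = [["a", "a"], ["a", "b"], ["b", "b"], ["b", "b"]]
example_percent = percent_agreement(example_labels)
example_alpha = krippendorff_alpha_nominal(example_labels)
```

For the toy data, three of the four rater pairs agree, so percent agreement is 0.75, while alpha is lower (8/15, about 0.53) because it discounts the agreement expected by chance.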

    Hereditary leiomyomatosis and renal cell cancer presenting as metastatic kidney cancer at 18 years of age: implications for surveillance

    Hereditary leiomyomatosis and renal cell cancer (HLRCC) is an autosomal dominant syndrome characterized by skin piloleiomyomas, uterine leiomyomas and papillary type 2 renal cancer caused by germline mutations in the fumarate hydratase (FH) gene. Previously, we proposed renal imaging for FH mutation carriers starting at the age of 20 years. Recently, however, an 18-year-old woman from a Dutch family with HLRCC presented with metastatic renal cancer. We describe the patient and family data, evaluate current evidence on renal cancer risk and surveillance in HLRCC and consider the advantages and disadvantages of starting surveillance for renal cancer in childhood. We also discuss the targeted therapies administered to our patient.

    Evaluation of a joint Bioinformatics and Medical Informatics international course in Peru

    Background: New technologies that emerge at the interface of computational and biomedical science could drive new advances in global health; therefore, more training in technology is needed among health care workers. To assess the potential for informatics training using an approach designed to foster interaction at this interface, the University of Washington and the Universidad Peruana Cayetano Heredia developed and assessed a one-week course that included a new Bioinformatics (BIO) track along with an established Medical/Public Health Informatics track (MI) for participants in Peru. Methods: We assessed the background of the participants, and measured the knowledge gained by track-specific (MI or BIO) 30-minute pre- and post-tests. Participants' attitudes were evaluated both by daily evaluations and by an end-course evaluation. Results: Forty-three participants enrolled in the course: 20 in the MI track and 23 in the BIO track. Of 20 questions, the mean % score for the MI track increased from 49.7 pre-test (standard deviation or SD = 17.0) to 59.7 (SD = 15.2) for the post-test (P = 0.002, n = 18). The BIO track mean score increased from 33.6 pre-test to 51.2 post-test (P < 0.001, n = 21). Most comments (76%) about any aspect of the course were positive. The main perceived strength of the course was the quality of the speakers, and the main perceived weakness was the short duration of the course. Overall, the course acceptability was very good to excellent with a rating of 4.1 (scale 1-5), and the usefulness of the course was rated as very good. Most participants (62.9%) expressed a positive opinion about having had the BIO and MI tracks come together for some of the lectures. Conclusion: Pre- and post-test results and the positive evaluations by the participants indicate that this first joint Bioinformatics and Medical/Public Health Informatics (MI and BIO) course was a success.
    Funding: The University of Washington AMAUTA Global Training in Health Informatics, a Fogarty International Center/NIH funded grant (5D43TW007551), and the AMAUTA Research Practica Program, a Puget Sound Partners for Global Health-funded grant.

    No outcome disparities in patients with diffuse large B-cell lymphoma and a low socioeconomic status

    Introduction: In patients with diffuse large B-cell lymphoma (DLBCL), socioeconomic status (SES) is associated with outcome in several population-based studies. The aim of this study was to further investigate the existence of disparities in treatment and survival. Methods: A population-based cohort study was performed including 343 consecutive patients with DLBCL, diagnosed between 2005 and 2012, in the North-west of the Netherlands. SES was based on the socioeconomic position within the Netherlands by use of postal code and categorized as low, intermediate or high. With multivariable logistic regression and Cox proportional hazard models, the association between SES and, respectively, treatment and overall survival (OS) was evaluated. Results: Two-thirds of patients were positioned in low SES. Irrespective of SES, an equal proportion of patients received standard immunochemotherapy. SES was not a significant risk indicator for OS (intermediate versus low SES: hazard ratio (HR) 1.31 (95% CI 0.78-2.18); high versus low SES: HR 0.83 (95% CI 0.48-1.46)). The mortality risk remained significantly increased with higher age, advanced performance status, elevated LDH and presence of comorbidity. Conclusion: Within the setting of free access to health care, in this cohort of patients with DLBCL no disparities in treatment and survival were seen in those with lower SES. © 2017 Elsevier Ltd. All rights reserved.

    Biomedical informatics and translational medicine

    Biomedical informatics involves a core set of methodologies that can provide a foundation for crossing the "translational barriers" associated with translational medicine. To this end, the fundamental aspects of biomedical informatics (e.g., bioinformatics, imaging informatics, clinical informatics, and public health informatics) may be essential in helping improve the ability to bring basic research findings to the bedside, evaluate the efficacy of interventions across communities, and enable the assessment of the eventual impact of translational medicine innovations on health policies. Here, a brief description is provided for a selection of key biomedical informatics topics (Decision Support, Natural Language Processing, Standards, Information Retrieval, and Electronic Health Records) and their relevance to translational medicine. Based on contributions and advancements in each of these topic areas, the article proposes that biomedical informatics practitioners ("biomedical informaticians") can be essential members of translational medicine teams.