18 research outputs found

    Enhancing the Performance of the MtCNN for the Classification of Cancer Pathology Reports: From Data Annotation to Model Deployment

    Information contained in electronic health records (EHRs), combined with the latest advances in machine learning (ML), has the potential to revolutionize the medical sciences. In particular, information contained in cancer pathology reports is essential to investigate cancer trends across the country. Unfortunately, large parts of the information in EHRs are stored as unstructured free text, which limits its usability and research potential. To overcome this accessibility barrier, cancer registries depend on expert personnel who read, interpret, and extract relevant information. Naturally, as the number of stored pathology reports increases every day, depending on human experts presents scalability challenges. Recently, researchers have attempted to automate the information extraction process from cancer pathology reports using ML techniques commonly found in natural language processing (NLP). However, clinical text is inherently different from other common forms of text, and state-of-the-art NLP approaches often exhibit mediocre performance. In this study, we narrow the literature gap by investigating methods to tackle overfitting and improve the performance of ML models for the classification of cancer pathology reports so that we can reduce the dependency on human expert annotators. We (1) show that using active learning can mitigate extreme class imbalance by increasing the representation of documents belonging to rare cancer types, (2) investigate the feasibility of ensemble learning and a mixture-of-experts variant to boost minority class performance, and (3) demonstrate that ensemble model distillation provides a strategy for quantifying the uncertainty inherent in labeled data, offering an effective low-resource solution that can be easily deployed by cancer registries.

    Modeling the impact of wild harvest on plant-disperser mutualisms: Plant and disperser co-harvest model

    Across the tropics, millions of rural families rely on non-timber forest products for protein, subsistence, and other financial or cultural uses. Often, communities exploit both biotically dispersed trees and their mammalian or avian seed dispersers. Empirical findings have indicated that many plant and animal resources are overexploited, presenting challenges for biodiversity conservation and sustainable rural livelihoods. However, there has been limited research investigating the impacts of harvest that targets both seed dispersers and zoochoric trees. We formulated a discrete-time model for interacting seed dispersers and plants under harvest. We found that the more dependent species will dictate the sustainable threshold level of harvest, and that higher levels of dependence could drive the species pair to local extinction. We illustrated the application of sensitivity analysis to our modeling framework in order to facilitate future analyses and applications using this approach.

    Deep Active Learning for Classifying Cancer Pathology Reports

    Background: Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. Active learning techniques may mitigate this challenge by reducing the amount of labelled data required to effectively train a model. In this study, we analyze the effectiveness of 11 active learning algorithms on classifying subsite and histology from cancer pathology reports using a Convolutional Neural Network as the text classification model. Results: We compare the performance of each active learning strategy using two differently sized datasets and two different classification tasks. Our results show that on all tasks and dataset sizes, all active learning strategies except diversity-sampling strategies outperformed random sampling, i.e., no active learning. On our large dataset (15K initial labelled samples, adding 15K additional labelled samples each iteration of active learning), there was no clear winner between the different active learning strategies. On our small dataset (1K initial labelled samples, adding 1K additional labelled samples each iteration of active learning), marginal and ratio uncertainty sampling performed better than all other active learning techniques. We found that compared to random sampling, active learning strongly helps performance on rare classes by focusing on underrepresented classes. Conclusions: Active learning can save annotation cost by helping human annotators efficiently and intelligently select which samples to label. Our results show that a dataset constructed using effective active learning techniques requires less than half the amount of labelled data to achieve the same performance as a dataset constructed using random sampling.
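
To make the two uncertainty-sampling strategies named in this abstract concrete, here is a minimal sketch of how margin and ratio scores are commonly defined. The abstract does not give formulas, so the definitions below are the usual textbook ones and may differ from the paper's; the function names and the toy probability pool are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def top_two(probs: Sequence[float]) -> Tuple[float, float]:
    """Return the largest and second-largest class probabilities."""
    ordered = sorted(probs, reverse=True)
    return ordered[0], ordered[1]


def margin_score(probs: Sequence[float]) -> float:
    """Assumed margin definition: a small gap between the top two classes
    means high uncertainty, so we return 1 - (p1 - p2)."""
    p1, p2 = top_two(probs)
    return 1.0 - (p1 - p2)


def ratio_score(probs: Sequence[float]) -> float:
    """Assumed ratio definition: p2 / p1, which approaches 1 when the model
    cannot separate its two most likely classes."""
    p1, p2 = top_two(probs)
    return p2 / p1


def select_for_labelling(
    pool_probs: List[Sequence[float]],
    k: int,
    score: Callable[[Sequence[float]], float] = margin_score,
) -> List[int]:
    """Return indices of the k most uncertain unlabelled documents."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: score(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical predicted class probabilities for four unlabelled reports.
pool = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # spread across classes
    [0.50, 0.49, 0.01],  # near tie between two classes
    [0.70, 0.20, 0.10],
]

chosen_margin = select_for_labelling(pool, 2, margin_score)
chosen_ratio = select_for_labelling(pool, 2, ratio_score)
print(chosen_margin, chosen_ratio)
```

In an active learning loop, the selected documents would be sent to annotators, added to the training set, and the model retrained before scoring the pool again.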

    LSST: from Science Drivers to Reference Design and Anticipated Data Products

    (Abridged) We describe here the most ambitious survey currently planned in the optical, the Large Synoptic Survey Telescope (LSST). A vast array of science will be enabled by a single wide-deep-fast sky survey, and LSST will have unique survey capability in the faint time domain. The LSST design is driven by four main science themes: probing dark energy and dark matter, taking an inventory of the Solar System, exploring the transient optical sky, and mapping the Milky Way. LSST will be a wide-field ground-based system sited at Cerro Pachón in northern Chile. The telescope will have an 8.4 m (6.5 m effective) primary mirror, a 9.6 deg² field of view, and a 3.2 Gigapixel camera. The standard observing sequence will consist of pairs of 15-second exposures in a given field, with two such visits in each pointing in a given night. With these repeats, the LSST system is capable of imaging about 10,000 square degrees of sky in a single filter in three nights. The typical 5σ point-source depth in a single visit in r will be ∼24.5 (AB). The project is in the construction phase and will begin regular survey operations by 2022. The survey area will be contained within 30,000 deg² with δ < +34.5°, and will be imaged multiple times in six bands, ugrizy, covering the wavelength range 320–1050 nm. About 90% of the observing time will be devoted to a deep-wide-fast survey mode which will uniformly observe an 18,000 deg² region about 800 times (summed over all six bands) during the anticipated 10 years of operations, and yield a coadded map to r ∼ 27.5. The remaining 10% of the observing time will be allocated to projects such as a Very Deep and Fast time domain survey.
    The goal is to make LSST data products, including a relational database of about 32 trillion observations of 40 billion objects, available to the public and scientists around the world. Comment: 57 pages, 32 color figures, version with high-resolution figures available from https://www.lsst.org/overvie
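
The figures quoted in this abstract can be cross-checked with some back-of-the-envelope arithmetic. The sketch below is only a rough estimate: it assumes non-overlapping pointings, treats one visit as a pair of 15-second exposures (our reading of the abstract), and ignores readout, slew, and weather overheads. All variable names are hypothetical.

```python
# Values taken from the abstract.
FIELD_OF_VIEW_DEG2 = 9.6        # camera field of view
MAIN_SURVEY_AREA_DEG2 = 18_000  # deep-wide-fast survey region
VISITS_PER_POINTING = 800       # summed over all six bands
EXPOSURES_PER_VISIT = 2         # assumption: one visit = one pair of exposures
EXPOSURE_SECONDS = 15
TOTAL_OBSERVATIONS = 32e12      # "about 32 trillion observations"
TOTAL_OBJECTS = 40e9            # "40 billion objects"

# Number of non-overlapping pointings needed to tile the main survey area.
pointings = MAIN_SURVEY_AREA_DEG2 / FIELD_OF_VIEW_DEG2

# Total visits over the whole survey, and the open-shutter time they imply.
total_visits = pointings * VISITS_PER_POINTING
open_shutter_hours = (
    total_visits * EXPOSURES_PER_VISIT * EXPOSURE_SECONDS / 3600
)

# Average observations per catalogued object; compare with the roughly
# 800 visits per pointing quoted for the main survey.
observations_per_object = TOTAL_OBSERVATIONS / TOTAL_OBJECTS

print(f"pointings:               {pointings:,.0f}")
print(f"total visits:            {total_visits:,.0f}")
print(f"open-shutter hours:      {open_shutter_hours:,.0f}")
print(f"observations per object: {observations_per_object:,.0f}")
```

Under these assumptions, the average number of observations per object matches the stated number of visits per pointing, which suggests the database figures are consistent with the survey cadence.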

    Molecular mechanisms of cell death: recommendations of the Nomenclature Committee on Cell Death 2018.

    Over the past decade, the Nomenclature Committee on Cell Death (NCCD) has formulated guidelines for the definition and interpretation of cell death from morphological, biochemical, and functional perspectives. Since the field continues to expand and novel mechanisms that orchestrate multiple cell death pathways are unveiled, we propose an updated classification of cell death subroutines focusing on mechanistic and essential (as opposed to correlative and dispensable) aspects of the process. As we provide molecularly oriented definitions of terms including intrinsic apoptosis, extrinsic apoptosis, mitochondrial permeability transition (MPT)-driven necrosis, necroptosis, ferroptosis, pyroptosis, parthanatos, entotic cell death, NETotic cell death, lysosome-dependent cell death, autophagy-dependent cell death, immunogenic cell death, cellular senescence, and mitotic catastrophe, we discuss the utility of neologisms that refer to highly specialized instances of these processes. The mission of the NCCD is to provide a widely accepted nomenclature on cell death in support of the continued development of the field.

    Agaricus subrufescens: A review

    Medicinal mushrooms have recently become a topic of wide interest because of their various therapeutic properties. Of these, Agaricus subrufescens, also known as the "almond mushroom", has long been valued by many societies (e.g., Brazil, China, France, and USA). Since its discovery in 1893, this mushroom has been cultivated throughout the world, especially in Brazil, where several strains of A. subrufescens have been developed and used as health food and alternative medicine. This article presents up-to-date information on this mushroom, including its taxonomy and health-promoting benefits. The medicinal properties of A. subrufescens are emphasized in several studies, which are reviewed here. In addition, safety issues concerning the use of this fungus are discussed.