
    New insights into the classification and nomenclature of cortical GABAergic interneurons.

    A systematic classification and accepted nomenclature of neuron types is much needed but is currently lacking. This article describes a possible taxonomical solution for classifying GABAergic interneurons of the cerebral cortex based on a novel, web-based interactive system that allows experts to classify neurons with pre-determined criteria. Using Bayesian analysis and clustering algorithms on the resulting data, we investigated the suitability of several anatomical terms and neuron names for cortical GABAergic interneurons. Moreover, we show that supervised classification models could automatically categorize interneurons in agreement with experts' assignments. These results demonstrate a practical and objective approach to the naming, characterization and classification of neurons based on community consensus.

    Crowd Learning with Candidate Labeling: an EM-based Solution

    Crowdsourcing is widely used nowadays in machine learning for data labeling. Although in the traditional case annotators are asked to provide a single label for each instance, novel approaches allow annotators, in case of doubt, to choose a subset of labels as a way to extract more information from them. In both the traditional and these novel approaches, the reliability of the labelers can be modeled based on the collections of labels that they provide. In this paper, we propose an Expectation-Maximization-based method for crowdsourced data with candidate sets: the likelihood of the parameters that model the reliability of the labelers is iteratively maximized while the ground truth is estimated. The experimental results suggest that the proposed method performs better than the baseline aggregation schemes in terms of estimated accuracy.
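    The iterative scheme described above follows the classic single-label Dawid-Skene formulation. As a point of reference, here is a minimal sketch of that baseline (not the paper's candidate-set extension, and not the authors' code; all names are illustrative): the E-step estimates a posterior over each item's true label, and the M-step re-estimates class priors and per-annotator confusion matrices.

    ```python
    import numpy as np

    def dawid_skene(labels, n_classes, n_iter=50):
        """Minimal Dawid-Skene EM. labels[i][j] is the label annotator j
        gave item i (or None if missing). Returns (T, conf): a posterior
        over true labels per item, and per-annotator confusion matrices."""
        n_items, n_annot = len(labels), len(labels[0])
        # Initialise the posterior with per-item vote proportions.
        T = np.zeros((n_items, n_classes))
        for i, row in enumerate(labels):
            for l in row:
                if l is not None:
                    T[i, l] += 1.0
        T /= T.sum(axis=1, keepdims=True)
        for _ in range(n_iter):
            # M-step: class priors and annotator confusion matrices,
            # weighted by the current soft label assignments.
            pi = T.mean(axis=0)
            conf = np.full((n_annot, n_classes, n_classes), 1e-6)
            for i, row in enumerate(labels):
                for j, l in enumerate(row):
                    if l is not None:
                        conf[j, :, l] += T[i]
            conf /= conf.sum(axis=2, keepdims=True)
            # E-step: posterior over each item's true label.
            logT = np.log(pi) + np.zeros((n_items, n_classes))
            for i, row in enumerate(labels):
                for j, l in enumerate(row):
                    if l is not None:
                        logT[i] += np.log(conf[j, :, l])
            T = np.exp(logT - logT.max(axis=1, keepdims=True))
            T /= T.sum(axis=1, keepdims=True)
        return T, conf
    ```

    With two reliable annotators and one unreliable one, the reliable pair dominates the inferred ground truth, and the confusion matrices expose who is trustworthy.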

    The Extended Dawid-Skene Model: Fusing Information from Multiple Data Schemas

    While label fusion from multiple noisy annotations is a well-understood concept in data wrangling (tackled, for example, by the Dawid-Skene (DS) model), we consider the extended problem of carrying out learning when the labels themselves are not consistently annotated with the same schema. We show that even if annotators use disparate, albeit related, label-sets, we can still draw inferences for the underlying full label-set. We propose the Inter-Schema AdapteR (ISAR) to translate the fully-specified label-set to the one used by each annotator, enabling learning under such heterogeneous schemas without the need to re-annotate the data. We apply our method to a mouse behavioural dataset, achieving significant gains over DS in out-of-sample log-likelihood (-3.40 to -2.39) and F1-score (0.785 to 0.864). (Author preprint, following publication in P. Cellier and K. Driessens (Eds.): ECML PKDD 2019 Workshops, CCIS 1167, pp. 121-136.)
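    The core adapter idea can be illustrated with a toy calculation (a sketch of the principle, not the paper's actual ISAR implementation): if each annotator's coarser schema is expressed as a 0/1 mapping matrix from the full label-set, then any distribution over the full label-set induces a distribution over that annotator's coarse labels via a single matrix product.

    ```python
    import numpy as np

    def coarse_distribution(p_fine, schema_map):
        """Project a distribution over the full (fine) label-set onto an
        annotator's coarser schema. schema_map[f][c] = 1 iff fine label f
        maps to coarse label c; each fine label maps to exactly one."""
        return np.asarray(p_fine, dtype=float) @ np.asarray(schema_map, dtype=float)
    ```

    For example, if fine labels 0 and 1 are merged into one coarse label and fine label 2 kept separate, the probability mass of the merged labels simply adds up, so inference can still be carried out against the full label-set.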

    Learning from crowds in digital pathology using scalable variational Gaussian processes

    This work was supported by the Agencia Estatal de Investigacion of the Spanish Ministerio de Ciencia e Innovacion under contract PID2019-105142RB-C22/AEI/10.13039/501100011033, and by the United States National Institutes of Health National Cancer Institute grants U01CA220401 and U24CA19436201. P.M.'s contribution was made mostly before joining Microsoft Research, when he was supported by La Caixa Banking Foundation (ID 100010434, Barcelona, Spain) through La Caixa Fellowship for Doctoral Studies LCF/BQ/ES17/11600011.
    The volume of labeled data is often the primary determinant of success in developing machine learning algorithms. This has increased interest in methods for leveraging crowds to scale data labeling efforts, and in methods to learn from noisy crowd-sourced labels. The need to scale labeling is acute, but particularly challenging, in medical applications like pathology, due to the expertise required to generate quality labels and the limited availability of qualified experts. In this paper we investigate the application of Scalable Variational Gaussian Processes for Crowdsourcing (SVGPCR) in digital pathology. We compare SVGPCR with other crowdsourcing methods using a large multi-rater dataset in which pathologists, pathology residents, and medical students annotated tissue regions of breast cancer. Our study shows that SVGPCR is competitive with equivalent methods trained using gold-standard pathologist-generated labels, and that SVGPCR meets or exceeds the performance of other crowdsourcing methods based on deep learning. We also show how SVGPCR can effectively learn the class-conditional reliabilities of individual annotators and demonstrate that Gaussian-process classifiers have comparable performance to similar deep learning methods. These results suggest that SVGPCR can meaningfully engage non-experts in pathology labeling tasks, and that the class-conditional reliabilities estimated by SVGPCR may assist in matching annotators to tasks where they perform well.

    Learning to segment when experts disagree

    Recent years have seen an increasing use of supervised learning methods for segmentation tasks. However, the predictive performance of these algorithms depends on the quality of labels, especially in the medical imaging domain, where both the annotation cost and inter-observer variability are high. In a typical annotation collection process, different clinical experts provide their estimates of the “true” segmentation labels under the influence of their levels of expertise and biases. Treating these noisy labels blindly as the ground truth can adversely affect the performance of supervised segmentation models. In this work, we present a neural network architecture for jointly learning, from noisy observations alone, both the reliability of individual annotators and the true segmentation label distributions. The separation of the annotators' characteristics from the true segmentation label is achieved by encouraging the estimated annotators to be maximally unreliable while achieving high fidelity with the training data. Our method can also be viewed as a translation of STAPLE, the established label aggregation framework proposed in Warfield et al. [1], to the supervised learning paradigm. We demonstrate our method first on a generic segmentation task using MNIST data and then adapt it for use with MRI scans of multiple sclerosis (MS) patients for lesion labelling. Our method shows considerable improvement over the relevant baselines on both datasets in terms of segmentation accuracy and estimation of annotator reliability, particularly when only a single label is available per image. An open-source implementation of our approach can be found at https://github.com/UCLBrain/MSLS

    Improving a gold standard: treating human relevance judgments of MEDLINE document pairs

    Given prior human judgments of the condition of an object, it is possible to use these judgments to make a maximum likelihood estimate of what future human judgments of the condition of that object will be. However, if one has a reasonably large collection of similar objects and the prior human judgments of a number of judges regarding the condition of each object in the collection, then it is possible to make predictions of future human judgments for the whole collection that are superior to the simple maximum likelihood estimate for each object in isolation. This is possible because the multiple judgments over the collection allow an analysis to determine the relative value of a judge as compared with the other judges in the group, and this value can be used to augment or diminish a particular judge's influence in predicting future judgments. Here we study and compare five different methods for making such improved predictions and show that each is superior to simple maximum likelihood estimates.
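    A minimal sketch of one way a judge's estimated value can reweight predictions (a hypothetical illustration, not necessarily one of the five methods studied): once each judge's accuracy has been estimated from their agreement with the group, binary relevance judgments can be combined by log-odds weighting rather than a plain majority vote, so a highly accurate judge can outvote several mediocre ones.

    ```python
    import numpy as np

    def weighted_judgment(judgments, accuracy):
        """judgments: (n_pairs, n_judges) 0/1 matrix of relevance calls;
        accuracy: per-judge estimated accuracy in (0.5, 1). Returns the
        log-odds-weighted prediction per pair; with equal accuracies this
        reduces to a simple majority vote."""
        acc = np.asarray(accuracy, dtype=float)
        w = np.log(acc / (1.0 - acc))            # log-odds weight per judge
        signed = 2 * np.asarray(judgments) - 1   # map {0, 1} -> {-1, +1}
        return (signed @ w > 0).astype(int)
    ```

    For instance, two 90%-accurate judges agreeing will override a dissenting 60%-accurate judge, whereas under a plain majority vote the outcome would be the same but a single strong judge could never outweigh two weak ones.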

    Finding the “Dark Matter” in Human and Yeast Protein Network Prediction and Modelling

    Accurate modelling of biological systems requires a deeper and more complete knowledge about the molecular components and their functional associations than we currently have. Traditionally, new knowledge on protein associations generated by experiments has played a central role in systems modelling, in contrast to generally less trusted bio-computational predictions. However, we will not achieve realistic modelling of complex molecular systems if the current experimental designs lead to biased screenings of real protein networks and leave large, functionally important areas poorly characterised. To assess the likelihood of this, we have built comprehensive network models of the yeast and human proteomes by using a meta-statistical integration of diverse computationally predicted protein association datasets. We have compared these predicted networks against combined experimental datasets from seven biological resources at different levels of statistical significance. These eukaryotic predicted networks resemble all the topological and noise features of the experimentally inferred networks in both species, and we also show that this observation is not due to random behaviour. In addition, the topology of the predicted networks contains information on true protein associations, beyond the constitutive first-order binary predictions. We also observe that most of the reliable predicted protein associations are experimentally uncharacterised in our models, constituting the hidden or “dark matter” of networks, by analogy to astronomical systems. Some of this dark matter shows enrichment of particular functions and contains key functional elements of protein networks, such as hubs associated with important functional areas like the regulation of Ras protein signal transduction in human cells. Thus, characterising this large and functionally important dark matter, elusive to established experimental designs, may be crucial for modelling biological systems. In any case, these predictions provide a valuable guide to these experimentally elusive regions.

    The gene normalization task in BioCreative III

    BACKGROUND: We report the Gene Normalization (GN) challenge in BioCreative III where participating teams were asked to return a ranked list of identifiers of the genes detected in full-text articles. For training, 32 fully and 500 partially annotated articles were prepared. A total of 507 articles were selected as the test set. Due to the high annotation cost, it was not feasible to obtain gold-standard human annotations for all test articles. Instead, we developed an Expectation Maximization (EM) algorithm approach for choosing a small number of test articles for manual annotation that were most capable of differentiating team performance. Moreover, the same algorithm was subsequently used for inferring ground truth based solely on team submissions. We report team performance on both gold standard and inferred ground truth using a newly proposed metric called Threshold Average Precision (TAP-k). RESULTS: We received a total of 37 runs from 14 different teams for the task. When evaluated using the gold-standard annotations of the 50 articles, the highest TAP-k scores were 0.3297 (k=5), 0.3538 (k=10), and 0.3535 (k=20), respectively. Higher TAP-k scores of 0.4916 (k=5, 10, 20) were observed when evaluated using the inferred ground truth over the full test set. When combining team results using machine learning, the best composite system achieved TAP-k scores of 0.3707 (k=5), 0.4311 (k=10), and 0.4477 (k=20) on the gold standard, representing improvements of 12.4%, 21.8%, and 26.6% over the best team results, respectively. CONCLUSIONS: By using full text and being species non-specific, the GN task in BioCreative III has moved closer to a real literature curation task than similar tasks in the past and presents additional challenges for the text mining community, as revealed in the overall team results. 
    By evaluating teams using the gold standard, we show that the EM algorithm allows team submissions to be differentiated while keeping the manual annotation effort feasible. Using the inferred ground truth, we show measures of comparative performance between teams. Finally, by comparing team rankings on the gold standard vs. the inferred ground truth, we further demonstrate that the inferred ground truth is as effective as the gold standard for detecting good team performance.

    Decentralized Learning Framework of Meta-Survival Analysis for Developing Robust Prognostic Signatures
