10 research outputs found

    Deep learning from crowds

    Over the last few years, deep learning has revolutionized the field of machine learning by dramatically improving the state-of-the-art in various domains. However, as the size of supervised artificial neural networks grows, typically so does the need for larger labeled datasets. Recently, crowdsourcing has established itself as an efficient and cost-effective solution for labeling large sets of data in a scalable manner, but it often requires aggregating labels from multiple noisy contributors with different levels of expertise. In this paper, we address the problem of learning deep neural networks from crowds. We begin by describing an EM algorithm for jointly learning the parameters of the network and the reliabilities of the annotators. Then, a novel general-purpose crowd layer is proposed, which allows us to train deep neural networks end-to-end, directly from the noisy labels of multiple annotators, using only backpropagation. We empirically show that the proposed approach is able to internally capture the reliability and biases of different annotators and achieve new state-of-the-art results for various crowdsourced datasets across different settings, namely classification, regression and sequence labeling. Comment: 10 pages, The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), 201
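
    The crowd-layer idea in the abstract above can be illustrated with a rough sketch. The abstract does not state the exact form of the layer, so the following assumes one particular variant: each annotator gets its own square weight matrix (initialized to the identity) that maps the network's class probabilities to that annotator's label distribution, and missing labels are masked out of the loss. The names `crowd_layer_forward` and `masked_cross_entropy`, and the use of `-1` for a missing label, are illustrative choices, not the authors' code.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def crowd_layer_forward(class_probs, annotator_weights):
    """Map shared class probabilities to per-annotator label distributions.

    class_probs:        (n_samples, n_classes) output of the base network
    annotator_weights:  (n_annotators, n_classes, n_classes), one matrix per
                        annotator (assumed form; identity = "trust the network")
    returns:            (n_samples, n_annotators, n_classes)
    """
    logits = np.einsum("rkc,nc->nrk", annotator_weights, class_probs)
    return softmax(logits, axis=-1)

def masked_cross_entropy(annotator_probs, labels):
    """Mean cross-entropy over observed labels only.

    labels: (n_samples, n_annotators) integer class ids, -1 where an
            annotator did not label the sample (assumed convention).
    """
    mask = labels >= 0
    safe = np.where(mask, labels, 0)  # placeholder index for missing labels
    picked = np.take_along_axis(annotator_probs, safe[..., None], axis=-1)[..., 0]
    losses = -np.log(picked + 1e-12) * mask
    return losses.sum() / max(mask.sum(), 1)

# Toy usage: 2 samples, 3 classes, 4 annotators with identity weights.
probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]])
weights = np.stack([np.eye(3)] * 4)
per_annotator = crowd_layer_forward(probs, weights)
noisy_labels = np.array([[0, -1, 0, 1], [2, 2, -1, -1]])
loss = masked_cross_entropy(per_annotator, noisy_labels)
```

    In a real setup the annotator matrices would be trained jointly with the network by backpropagating this loss; the sketch only shows the forward computation.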

    A methodology for peripheral nerve segmentation using a multiple annotators approach based on Centered Kernel Alignment

    Peripheral Nerve Blocking (PNB) is a technique commonly used to perform regional anesthesia and for pain management. The success of PNB procedures depends on accurately locating the target nerve. Recently, ultrasound imaging has been widely used to locate nerve structures for PNB, because it enables non-invasive visualization of the target nerve and the anatomical structures around it. However, ultrasound images are affected by several artifacts that make accurate delimitation of nerves difficult. In the literature, several approaches have been proposed to carry out automatic or semi-automatic segmentation. Nevertheless, these methods are designed assuming that the gold standard is available, and for this segmentation problem the gold standard cannot be obtained, since it corresponds to a subjective interpretation. Consequently, when building these segmentation models, we do not have access to the actual label but only to a set of subjective annotations provided by multiple experts. To deal with this drawback we use the concepts of a relatively new area of machine learning known as “Learning from crowds”, which deals with supervised learning problems in which the gold standard is not available. In this project, we develop a nerve segmentation system that includes a preprocessing stage, a feature extraction methodology based on adaptive methods, and a Centered Kernel Alignment (CKA) based representation that measures the annotators' performance in order to build a classifier with multiple annotators that supports peripheral nerve segmentation. Our CKA-based approach to classification with multiple annotators is tested on both simulated and real data; similarly, the automatic segmentation methodology proposed in this work was tested on ultrasound images labeled by a set of specialists who gave their opinion about the location of nerve structures.
According to the results, we conclude that our methodology can be used to locate nerve structures in ultrasound images even if the gold standard (the actual location of nerve structures) is not available in the training stage. Moreover, we determine that the approach proposed in this work could be implemented as a guiding tool for the anesthesiologist to carry out PNB procedures assisted by ultrasound imaging.
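
    For readers unfamiliar with Centered Kernel Alignment, the quantity itself is the normalized Frobenius inner product of two centered kernel matrices. The sketch below computes that standard definition with NumPy. How the thesis turns CKA into an annotator performance measure is not described in the abstract; comparing a kernel over input features with a kernel over one annotator's labels, as in the toy usage, is only an assumed illustration, and the variable names are hypothetical.

```python
import numpy as np

def centered_kernel_alignment(K, L):
    """Standard CKA between two n-by-n kernel (Gram) matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    Kc = H @ K @ H
    Lc = H @ L @ H
    # Frobenius inner product divided by the product of Frobenius norms.
    return float(np.sum(Kc * Lc) / (np.linalg.norm(Kc) * np.linalg.norm(Lc)))

# Toy usage (assumed setup): alignment between a feature kernel and a
# kernel built from one annotator's one-hot labels.
rng = np.random.default_rng(0)
features = rng.normal(size=(20, 5))
annotator_labels = rng.integers(0, 2, size=20)
one_hot = np.eye(2)[annotator_labels]

feature_kernel = features @ features.T    # linear kernel on features
label_kernel = one_hot @ one_hot.T        # 1 where two labels agree
alignment = centered_kernel_alignment(feature_kernel, label_kernel)
self_alignment = centered_kernel_alignment(feature_kernel, feature_kernel)
```

    A kernel is perfectly aligned with itself, and the value is unchanged when either kernel is multiplied by a positive constant.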

    Scalable and Ensemble Learning for Big Data

    University of Minnesota Ph.D. dissertation. May 2019. Major: Electrical/Computer Engineering. Advisor: Georgios Giannakis. 1 computer file (PDF); xi, 126 pages. The turn of the decade has marked society and computing research with a “data deluge.” As the number of smart, highly accurate and Internet-capable devices increases, so does the amount of data that is generated and collected. While this sheer amount of data has the potential to enable high-quality inference and mining of information, it introduces numerous challenges in processing and pattern analysis, since available statistical inference and machine learning approaches do not necessarily scale well with the volume of data and its dimensionality. In addition to the challenges related to scalability, the data gathered are often noisy, dynamic, contaminated by outliers, or deliberately corrupted to inhibit the inference task. Moreover, many machine learning approaches have been shown to be susceptible to adversarial attacks. At the same time, the cost of cloud and distributed computing is rapidly declining. Therefore, there is a pressing need for statistical inference and machine learning tools that are robust to attacks and scale with the volume and dimensionality of the data by efficiently harnessing the available computational resources. This thesis is centered on analytical and algorithmic foundations that aim to enable statistical inference and data analytics from large volumes of high-dimensional data. The vision is to establish a comprehensive framework based on state-of-the-art machine learning, optimization and statistical inference tools to enable truly large-scale inference, which can tap into the available (possibly distributed) computational resources and be resilient to adversarial attacks. The ultimate goal is to demonstrate, both analytically and numerically, how valuable insights from signal processing can lead to markedly improved and accelerated learning tools.
To this end, the present thesis investigates two main research thrusts: i) large-scale subspace clustering; and ii) unsupervised ensemble learning. Both thrusts introduce novel algorithms that aim to tackle the issues of large-scale learning. The potential of the proposed algorithms is showcased by rigorous theoretical results and extensive numerical tests.

    Physician Participation in Crowdsourcing: Effect of Intrinsic and Extrinsic Motivation

    Physicians must participate in developing medical protocols to ensure that medical best practices are adopted for patients' social benefit. Healthcare leaders have struggled to gain sufficient physician participation in developing medical protocols. Using technology-based crowdsourcing to assimilate knowledge from physicians may help healthcare managers improve medical protocol development. Using self-determination theory, this quantitative causal-comparative design aimed to determine whether differences in intrinsic and extrinsic motivation existed among the 132 participating physicians who did or did not participate in developing medical protocols in a crowdsourcing environment. Participants were recruited by e-mail through an independent physician association. Motivation levels were measured by the Aspirations Index via an online survey. A total of 55.3% of respondents participated in developing medical protocols. Differences were anticipated in the levels of participation in developing medical protocols between intrinsically and extrinsically motivated physicians. Rank correlations were computed between the number of protocols completed and all of the motivation scores. Personal growth and community contribution were significantly correlated with the number of addressed protocols. Positive social change may occur through improving medical protocols and healthcare outcomes by informing healthcare leaders about physicians' motivation to participate in developing medical protocols. By understanding these motivators, leaders can highlight the benefits of protocol development to encourage physician participation. If participation is enhanced, protocol quality and healthcare effectiveness may be improved, benefitting patients and healthy individuals.

    Gamifying Language Resource Acquisition

    PhD Thesis. Natural Language Processing is an important collection of methods for processing the vast amounts of natural language text we continually produce. These methods make use of supervised learning, an approach that learns from large amounts of annotated data. As humans, we’re able to provide information about text that such systems can learn from. Historically, this was carried out by small groups of experts. However, this did not scale, which led to various crowdsourcing approaches that used large pools of non-experts. The traditional form of crowdsourcing was to pay users small amounts of money to complete tasks. As time progressed, gamification approaches such as GWAPs showed various benefits over the micro-payment methods used before. These included cost savings, worker training opportunities, increased worker engagement and the potential to far exceed the scale of crowdsourcing. While these were successful in domains such as image labelling, they struggled in the domain of text annotation, which wasn’t such a natural fit. Despite many challenges, there were also clearly many opportunities and benefits to applying this approach to text annotation. Many of these are demonstrated by Phrase Detectives. Based on lessons learned from Phrase Detectives and investigations into other GWAPs, in this work we attempt to create full GWAPs for NLP, extracting the benefits of the methodology. This includes training, high-quality output from non-experts and a truly game-like GWAP design that players are happy to play voluntarily.

    Sequence labeling with multiple annotators

    The increasingly popular use of crowdsourcing as a resource to obtain labeled data has made the machine learning community widely aware of the problem of supervised learning from multiple annotators. Several approaches have been proposed to deal with this issue, but they disregard sequence labeling problems. However, these are very common, for example, among the Natural Language Processing and Bioinformatics communities. In this paper, we present a probabilistic approach for sequence labeling using Conditional Random Fields (CRF) for situations where label sequences from multiple annotators are available but there is no actual ground truth. The approach uses the Expectation-Maximization algorithm to jointly learn the CRF model parameters, the reliability of the annotators and the estimated ground truth. In terms of performance, the proposed method (CRF-MA) significantly outperforms typical approaches such as majority voting.
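
    The majority-voting baseline mentioned at the end of the abstract above is simple enough to sketch. This is not CRF-MA, only the kind of token-level baseline it is compared against, written under two assumptions the abstract does not state: every annotator labels every token, and ties go to the label seen first among the annotators. The function name `majority_vote_sequence` is hypothetical.

```python
from collections import Counter

def majority_vote_sequence(annotations):
    """Aggregate label sequences from several annotators token by token.

    annotations: list of label sequences, one per annotator, all of the
                 same length (assumed: no missing labels).
    returns:     list with the most frequent label at each position.
    """
    if not annotations:
        return []
    length = len(annotations[0])
    if any(len(seq) != length for seq in annotations):
        raise ValueError("all annotators must label the same tokens")

    voted = []
    for token_labels in zip(*annotations):
        counts = Counter(token_labels)
        # max() returns the first label with the highest count, so ties
        # are resolved in favor of the label encountered first.
        voted.append(max(counts, key=counts.get))
    return voted

# Toy usage: three annotators tagging a four-token sentence with B/I/O labels.
crowd_tags = [
    ["B", "I", "O", "O"],
    ["B", "O", "O", "O"],
    ["B", "I", "I", "O"],
]
consensus = majority_vote_sequence(crowd_tags)
```

    Voting per token treats all annotators as equally reliable and ignores dependencies between neighboring labels, which are the two things the abstract says the EM-trained CRF model accounts for.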