106 research outputs found

    Differentially Private Learning with Noisy Labels

    Get PDF
    Supervised machine learning tasks require large labelled datasets. However, obtaining such datasets is difficult and often yields noisy labels due to human error or adversarial perturbation. Recent studies have proposed multiple methods to tackle this problem in the non-private scenario, yet it remains unsolved when the dataset is private. In this work, we aim to train a model on a sensitive dataset that contains noisy labels such that (i) the model has high test accuracy and (ii) the training process satisfies (ε,δ)-differential privacy. Noisy labels, as studied in our work, are generated by flipping labels in the training set from the true source label(s) to other target(s). Our approach, Diffindo, constructs a differentially private stochastic gradient descent algorithm which removes suspicious points based on their noisy gradients. We show experiments on datasets across multiple domains with different class balance properties. Our results show that the proposed algorithm can remove up to 100% of the points with noisy labels in the private scenario while restoring the precision of the targeted label and the test accuracy to their no-noise counterparts.
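    The abstract does not spell out Diffindo's exact removal criterion, but the general idea can be sketched: run DP-SGD with per-example gradient clipping and Gaussian noise, and drop points whose gradient norms stay suspiciously large (a common signature of flipped labels). Everything below, including function names and thresholds, is an illustrative assumption rather than the paper's algorithm:

```python
# Hypothetical sketch of DP-SGD combined with gradient-based filtering
# of suspected noisy-label points. Names and thresholds are illustrative.
import numpy as np

def dp_sgd_filter_step(params, per_example_grads, clip_norm=1.0,
                       noise_multiplier=1.1, lr=0.1, removal_quantile=0.95):
    # Clip each per-example gradient to bound per-point sensitivity.
    norms = np.linalg.norm(per_example_grads, axis=1)
    scale = np.minimum(1.0, clip_norm / (norms + 1e-12))
    clipped = per_example_grads * scale[:, None]

    # Flag points with suspiciously large gradient norms; flipped-label
    # points tend to produce persistently large gradients. (A production
    # implementation would privatize this statistic as well.)
    threshold = np.quantile(norms, removal_quantile)
    keep = norms <= threshold

    # Aggregate kept gradients and add Gaussian noise for (eps, delta)-DP.
    summed = clipped[keep].sum(axis=0)
    noise = np.random.normal(0.0, noise_multiplier * clip_norm,
                             size=summed.shape)
    noisy_mean = (summed + noise) / max(keep.sum(), 1)

    return params - lr * noisy_mean, keep
```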

    Eliciting and Leveraging Input Diversity in Crowd-Powered Intelligent Systems

    Full text link
    Collecting high-quality annotations plays a crucial role in supporting machine learning algorithms and, thus, the creation of intelligent systems. Over the past decade, crowdsourcing has become a widely adopted means of manually creating annotations for various intelligent tasks, spanning from object boundary detection in images to sentiment understanding in text. This thesis presents new crowdsourcing workflows and answer aggregation algorithms that can effectively and efficiently improve collective annotation quality from crowd workers. While conventional microtask crowdsourcing approaches generally focus on improving annotation quality by promoting consensus among workers, this thesis proposes the novel concept of a diversity-driven approach. We show that leveraging diversity in workers' responses is effective in improving the accuracy of aggregate annotations because it compensates for biases or uncertainty caused by the system, the tool, or the data. We then present techniques that elicit diversity in workers' responses. These techniques are orthogonal to other quality control methods, such as filtering, training, or incentives, which means they can be used in combination with existing methods. The crowd-powered intelligent systems presented in this thesis are evaluated through visual perception tasks in order to demonstrate the effectiveness of our proposed approach. The advantage of our approach is an improvement in collective quality even in settings where worker skill may vary widely, potentially lowering barriers to entry for novice workers and making it easier for requesters to find workers who can make productive contributions. This thesis demonstrates that crowd workers' input diversity can be a useful property that yields better aggregate performance than any homogeneous set of inputs.
    PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/153428/1/jyskwon_1.pd
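    As a toy illustration of the consensus-versus-diversity contrast described above (the thesis's actual workflows and aggregation algorithms are more sophisticated), consider a 1-D boundary-annotation task where each worker's tool introduces its own systematic offset:

```python
# Toy contrast between consensus-driven and diversity-driven aggregation
# of crowd annotations for a 1-D boundary-position task. Illustrative only.
import numpy as np

def consensus_aggregate(responses, bin_width=2.0):
    """Majority-style aggregation: snap responses to bins, take the mode."""
    bins = np.round(np.asarray(responses) / bin_width) * bin_width
    values, counts = np.unique(bins, return_counts=True)
    return values[np.argmax(counts)]

def diversity_aggregate(responses):
    """Diversity-driven aggregation: keep the spread of responses and
    average it, so systematic per-worker/per-tool offsets cancel out."""
    return float(np.mean(responses))

# Workers trace the same boundary (true position 50.0) with varied tools,
# each tool introducing a different systematic offset.
responses = [48.1, 52.3, 49.0, 51.2, 50.6]
print(consensus_aggregate(responses))  # mode of binned values
print(diversity_aggregate(responses))  # ~50.2, closer to the truth
```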

    Colour morphological sieves for scale-space image processing

    Get PDF
    EThOS - Electronic Theses Online Service, GB, United Kingdom

    Exploring variability in medical imaging

    Get PDF
    Although recent successes of deep learning and novel machine learning techniques have improved the performance of classification and (anomaly) detection in computer vision problems, the application of these methods in the medical imaging pipeline remains a very challenging task. One of the main reasons for this is the amount of variability that is encountered and encapsulated in human anatomy and subsequently reflected in medical images. This fundamental factor impacts most stages of modern medical image processing pipelines. The variability of human anatomy makes it virtually impossible to build large datasets for each disease with labels and annotations for fully supervised machine learning. An efficient way to cope with this is to learn only from normal samples, since such data is much easier to collect. A case study of such an automatic anomaly detection system based on normative learning is presented in this work: we present a framework for detecting fetal cardiac anomalies during ultrasound screening using generative models trained only on normal/healthy subjects. However, despite the significant improvement in automatic abnormality detection systems, clinical routine continues to rely exclusively on overburdened medical experts to diagnose and localise abnormalities. Integrating human expert knowledge into the medical imaging processing pipeline entails uncertainty which is mainly correlated with inter-observer variability. From the perspective of building an automated medical imaging system, it is still an open issue to what extent this kind of variability and the resulting uncertainty are introduced during the training of a model and how they affect the final performance of the task. Consequently, it is very important to explore the effect of inter-observer variability both on the reliable estimation of a model's uncertainty and on the model's performance in a specific machine learning task. A thorough investigation of this issue is presented in this work by leveraging automated estimates of machine learning model uncertainty, inter-observer variability, and segmentation task performance in lung CT scan images. Finally, we present an overview of existing anomaly detection methods in medical imaging. This state-of-the-art survey includes both conventional pattern recognition methods and deep learning based methods, and is one of the first literature surveys attempted in this specific research area.
    Open Access
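    A minimal sketch of the normative-learning idea described above, assuming a plain convolutional autoencoder and a reconstruction-error score thresholded on held-out healthy data; the architecture, score, and threshold rule are illustrative assumptions, not the thesis's exact model:

```python
# Sketch of normative anomaly detection: train a generative model (here a
# plain autoencoder) only on healthy scans, then flag images whose
# reconstruction error exceeds a threshold calibrated on healthy data.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def anomaly_scores(model, images):
    # Per-image mean squared reconstruction error; images: (N, 1, H, W).
    model.eval()
    with torch.no_grad():
        recon = model(images)
    return ((images - recon) ** 2).mean(dim=(1, 2, 3))

def calibrate_threshold(model, healthy_images, quantile=0.99):
    # Threshold = high quantile of scores on held-out healthy data.
    return torch.quantile(anomaly_scores(model, healthy_images), quantile)
```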

    Anomaly detection in brain imaging

    Get PDF
    Modern healthcare systems employ a variety of medical imaging technologies, such as X-ray, MRI and CT, to improve patient outcomes, time and cost efficiency, and enable further research. Artificial intelligence and machine learning have shown promise in enhancing medical image analysis systems, leading to a proliferation of research in the field. However, many proposed approaches, such as image classification or segmentation, require large amounts of professional annotations, which are costly and time-consuming to acquire. Anomaly detection is an approach that requires less manual effort and thus can benefit from scaling to datasets of ever-increasing size. In this thesis, we focus on anomaly localisation for pathology detection with models trained on healthy data without dense annotations. We identify two key weaknesses of current image reconstruction-based anomaly detection methods: poor image reconstruction and over-dependence on pixel/voxel intensity for identification of anomalies. To address these weaknesses, we develop two novel methods: a denoising autoencoder and context-to-local feature matching, respectively. Finally, we apply both methods to in-hospital data in collaboration with NHS Greater Glasgow and Clyde. We discuss the issues of data collection, filtering, processing, and evaluation that arise in applying anomaly detection methods beyond curated datasets. We design and run a clinical evaluation contrasting our proposed methods and revealing difficulties in gauging the performance of anomaly detection systems. Our findings suggest that further research is needed to fully realise the potential of anomaly detection for practical medical imaging applications. Specifically, we suggest investigating anomaly detection methods that are able to take advantage of more types of supervision (e.g. weak labels), more context (e.g. prior scans), and make structured end-to-end predictions (e.g. bounding boxes).
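    The denoising-autoencoder idea can be sketched as follows: corrupt healthy images with noise during training so the model learns to map inputs back onto the healthy appearance manifold, then localise anomalies as large pixel-wise residuals. The noise model and training details below are assumptions, and `model` stands for any image-to-image network such as the autoencoder sketched earlier:

```python
# Hedged sketch of denoising-autoencoder training for anomaly
# localisation; noise model and loss are illustrative assumptions.
import torch

def train_denoising_step(model, optimizer, healthy_batch, noise_std=0.2):
    model.train()
    noisy = healthy_batch + noise_std * torch.randn_like(healthy_batch)
    recon = model(noisy)
    # Learn to reconstruct the *clean* image from its corrupted version.
    loss = torch.nn.functional.mse_loss(recon, healthy_batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def anomaly_map(model, image):
    model.eval()
    with torch.no_grad():
        recon = model(image)
    # High residual = region the healthy-trained model cannot explain.
    return (image - recon).abs()
```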

    Data-driven 3D Reconstruction and View Synthesis of Dynamic Scene Elements

    Get PDF
    Our world is filled with living beings and other dynamic elements. It is important to record dynamic things and events for the sake of education, archeology, and cultural heritage. From ancient to modern times, people have recorded dynamic scene elements in different ways, from sequences of cave paintings to frames of motion pictures. This thesis focuses on two key computer vision techniques by which dynamic element representation moves beyond video capture: 3D reconstruction and view synthesis. Although previous methods for these two tasks have been used to model and represent static scene elements, dynamic scene elements present unique and difficult challenges. This thesis focuses on three types of dynamic scene elements, namely 1) dynamic textures with static shape, 2) dynamic shapes with static texture, and 3) dynamic illumination of static scenes. Two research aspects are explored to represent and visualize them: dynamic 3D reconstruction and dynamic view synthesis. Dynamic 3D reconstruction aims to recover the 3D geometry of dynamic objects and, by modeling the objects' movements, bring 3D reconstructions to life. Dynamic view synthesis, on the other hand, summarizes or predicts the dynamic appearance change of dynamic objects, for example the daytime-to-nighttime illumination of a building or the future movements of a rigid body. We first target the problem of reconstructing dynamic textures of objects that have (approximately) fixed 3D shape but time-varying appearance. Examples of such objects include waterfalls, fountains, and electronic billboards. Since the appearance of dynamic-textured objects can be random and complicated, estimating the 3D geometry of these objects from 2D images/video requires novel tools beyond the appearance-based point correspondence methods of traditional 3D computer vision. To perform this 3D reconstruction, we introduce a method that simultaneously 1) segments dynamically textured scene objects in the input images and 2) reconstructs the 3D geometry of the entire scene, assuming a static 3D shape for the dynamically textured objects. Compared to dynamic textures, the appearance change of dynamic shapes is due to physically defined motions like rigid body movements. In these cases, assumptions can be made about the object's motion constraints in order to identify corresponding points on the object at different timepoints. For example, two points on a rigid object remain a constant distance apart in 3D space, no matter how the object moves. Based on this assumption of local rigidity, we propose a robust method to correctly identify point correspondences between two images viewing the same moving object from different viewpoints and at different times. Dense 3D geometry can be obtained from the computed point correspondences. We apply this method to unsynchronized video streams and observe that the number of inlier correspondences found by the method can be used as an indicator of frame alignment among the different streams. To model dynamic scene appearance caused by illumination changes, we propose a framework to find a sequence of images that have a geometric composition similar to that of a single reference image and also show a smooth transition in illumination throughout the day. These images can be registered to visualize patterns of illumination change from a single viewpoint. The final topic of this thesis involves predicting the movements of dynamic shapes in the image domain. Towards this end, we propose deep neural network architectures to predict future views of dynamic motions, such as rigid body movements and flowers blooming. Instead of predicting image pixels directly, our methods predict pixel offsets and iteratively synthesize future views.
    Doctor of Philosophy
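    The local-rigidity cue described above admits a compact sketch: for a rigid object, the 3D distance between any pair of object points is preserved across timepoints, so candidate correspondences whose pairwise distances are preserved can vote for one another as inliers. The tolerance and voting rule below are illustrative assumptions, not the thesis's exact method:

```python
# Toy inlier test using preservation of pairwise 3-D distances on a
# rigid object between two timepoints. Thresholds are illustrative.
import numpy as np

def rigidity_inliers(pts_t0, pts_t1, tol=0.01, min_votes=0.5):
    """pts_t0[i] and pts_t1[i]: 3-D positions of candidate
    correspondence i at the two timepoints, shape (n, 3)."""
    pts_t0, pts_t1 = np.asarray(pts_t0), np.asarray(pts_t1)
    n = len(pts_t0)
    d0 = np.linalg.norm(pts_t0[:, None] - pts_t0[None, :], axis=-1)
    d1 = np.linalg.norm(pts_t1[:, None] - pts_t1[None, :], axis=-1)
    # Pair (i, j) is consistent if |d0 - d1| is small relative to d0.
    consistent = np.abs(d0 - d1) <= tol * (d0 + 1e-12)
    votes = consistent.sum(axis=1) - 1    # ignore self-votes
    return votes >= min_votes * (n - 1)   # boolean inlier mask
```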

    Kimera: from SLAM to Spatial Perception with 3D Dynamic Scene Graphs

    Full text link
    Humans are able to form a complex mental model of the environment they move in. This mental model captures geometric and semantic aspects of the scene, describes the environment at multiple levels of abstraction (e.g., objects, rooms, buildings), and includes static and dynamic entities and their relations (e.g., a person being in a room at a given time). In contrast, current robots' internal representations still provide a partial and fragmented understanding of the environment, either in the form of a sparse or dense set of geometric primitives (e.g., points, lines, planes, voxels) or as a collection of objects. This paper attempts to reduce the gap between robot and human perception by introducing a novel representation, a 3D Dynamic Scene Graph (DSG), that seamlessly captures metric and semantic aspects of a dynamic environment. A DSG is a layered graph where nodes represent spatial concepts at different levels of abstraction, and edges represent spatio-temporal relations among nodes. Our second contribution is Kimera, the first fully automatic method to build a DSG from visual-inertial data. Kimera includes state-of-the-art techniques for visual-inertial SLAM, metric-semantic 3D reconstruction, object localization, human pose and shape estimation, and scene parsing. Our third contribution is a comprehensive evaluation of Kimera on real-life datasets and photo-realistic simulations, including a newly released dataset, uHumans2, which simulates a collection of crowded indoor and outdoor scenes. Our evaluation shows that Kimera achieves state-of-the-art performance in visual-inertial SLAM, estimates an accurate 3D metric-semantic mesh model in real time, and builds a DSG of a complex indoor environment with tens of objects and humans in minutes. Our final contribution shows how to use a DSG for real-time hierarchical semantic path-planning. The core modules in Kimera are open-source.
    Comment: 34 pages, 25 figures, 9 tables. arXiv admin note: text overlap with arXiv:2002.0628
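    The layered-graph structure of a DSG can be sketched as a small data structure: nodes are spatial concepts assigned to abstraction layers, and edges carry (optionally timestamped) spatio-temporal relations. This mirrors the description above, not Kimera's actual open-source API; all names here are hypothetical:

```python
# Minimal, hypothetical sketch of a layered 3D Dynamic Scene Graph.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class DsgNode:
    node_id: str
    layer: str                      # e.g. "object", "room", "building"
    attributes: dict = field(default_factory=dict)  # pose, semantics, ...

@dataclass
class DsgEdge:
    source: str
    target: str
    relation: str                   # e.g. "inside", "near"
    timestamp: Optional[float] = None  # None for static relations

class DynamicSceneGraph:
    LAYERS = ("mesh", "agent", "object", "place", "room", "building")

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node):
        assert node.layer in self.LAYERS, f"unknown layer: {node.layer}"
        self.nodes[node.node_id] = node

    def relate(self, source, target, relation, timestamp=None):
        self.edges.append(DsgEdge(source, target, relation, timestamp))

# e.g. "a person is in a room at a given time":
g = DynamicSceneGraph()
g.add_node(DsgNode("person_1", "agent"))
g.add_node(DsgNode("room_3", "room"))
g.relate("person_1", "room_3", "inside", timestamp=12.5)
```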

    Interactional Slingshots: Providing Support Structure to User Interactions in Hybrid Intelligence Systems

    Full text link
    The proliferation of artificial intelligence (AI) systems has enabled us to engage more deeply and powerfully with our digital and physical environments, from chatbots to autonomous vehicles to robotic assistive technology. Unfortunately, these state-of-the-art systems often fail in contexts that require human understanding, involve never-before-seen situations, or are highly complex. In such cases, though AI-only approaches cannot solve the full task, their ability to solve a piece of the task can be combined with human effort to handle complexity and uncertainty more robustly. A hybrid intelligence system, one that combines human and machine skill sets, can make intelligent systems more operable in real-world settings. In this dissertation, we propose the idea of using interactional slingshots as a means of providing support structure to user interactions in hybrid intelligence systems. Much like how gravitational slingshots provide boosts to spacecraft en route to their final destinations, interactional slingshots provide boosts to user interactions en route to solving tasks. Several challenges arise: What does this support structure look like? How much freedom does the user have in their interactions? How is user expertise paired with that of the machine? To make this a tractable socio-technical problem, we explore the idea in the context of data annotation, especially in domains where AI methods fail to solve the overall task. Annotated (labeled) data is crucial for successful AI methods, and it is especially difficult to obtain in domains where AI fails, since problems in such domains require human understanding to fully solve, but also present challenges related to annotator expertise, annotation freedom, and context curation from the data. To explore data annotation problems in this space, we develop techniques and workflows whose interactional slingshot support structure harnesses the user's interaction with data. First, we explore providing support by nudging non-expert users' interactions as they annotate text data for the task of creating conversational memory. Second, we add support structure by assisting non-expert users during the annotation process itself for the task of grounding natural language references to objects in 3D point clouds. Finally, we supply support by guiding both expert and non-expert users before and during their annotations for the task of conversational disentanglement across multiple domains. We demonstrate that building hybrid intelligence systems with each of these interactional slingshot support mechanisms (nudging, assisting, and guiding a user's interaction with data) improves annotation outcomes such as annotation speed, accuracy, and effort level, even when annotators' expertise and skill levels vary. Thesis statement: by providing support structure that nudges, assists, and guides user interactions, it is possible to create hybrid intelligence systems that enable more efficient (faster and/or more accurate) data annotation.
    PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/163138/1/sairohit_1.pd