20,627 research outputs found

    Active learning in annotating micro-blogs dealing with e-reputation

    Full text link
    Elections unleash strong political views on Twitter, but what do people really think about politics? Opinion and trend mining on micro blogs dealing with politics has recently attracted researchers in several fields including Information Retrieval and Machine Learning (ML). Since the performance of ML and Natural Language Processing (NLP) approaches are limited by the amount and quality of data available, one promising alternative for some tasks is the automatic propagation of expert annotations. This paper intends to develop a so-called active learning process for automatically annotating French language tweets that deal with the image (i.e., representation, web reputation) of politicians. Our main focus is on the methodology followed to build an original annotated dataset expressing opinion from two French politicians over time. We therefore review state of the art NLP-based ML algorithms to automatically annotate tweets using a manual initiation step as bootstrap. This paper focuses on key issues about active learning while building a large annotated data set from noise. This will be introduced by human annotators, abundance of data and the label distribution across data and entities. In turn, we show that Twitter characteristics such as the author's name or hashtags can be considered as the bearing point to not only improve automatic systems for Opinion Mining (OM) and Topic Classification but also to reduce noise in human annotations. However, a later thorough analysis shows that reducing noise might induce the loss of crucial information.Comment: Journal of Interdisciplinary Methodologies and Issues in Science - Vol 3 - Contextualisation digitale - 201

    Multi-Atlas Segmentation using Partially Annotated Data: Methods and Annotation Strategies

    Get PDF
    Multi-atlas segmentation is a widely used tool in medical image analysis, providing robust and accurate results by learning from annotated atlas datasets. However, the availability of fully annotated atlas images for training is limited due to the time required for the labelling task. Segmentation methods requiring only a proportion of each atlas image to be labelled could therefore reduce the workload on expert raters tasked with annotating atlas images. To address this issue, we first re-examine the labelling problem common in many existing approaches and formulate its solution in terms of a Markov Random Field energy minimisation problem on a graph connecting atlases and the target image. This provides a unifying framework for multi-atlas segmentation. We then show how modifications in the graph configuration of the proposed framework enable the use of partially annotated atlas images and investigate different partial annotation strategies. The proposed method was evaluated on two Magnetic Resonance Imaging (MRI) datasets for hippocampal and cardiac segmentation. Experiments were performed aimed at (1) recreating existing segmentation techniques with the proposed framework and (2) demonstrating the potential of employing sparsely annotated atlas data for multi-atlas segmentation

    Crowdsourcing in Computer Vision

    Full text link
    Computer vision systems require large amounts of manually annotated data to properly learn challenging visual concepts. Crowdsourcing platforms offer an inexpensive method to capture human knowledge and understanding, for a vast number of visual perception tasks. In this survey, we describe the types of annotations computer vision researchers have collected using crowdsourcing, and how they have ensured that this data is of high quality while annotation effort is minimized. We begin by discussing data collection on both classic (e.g., object recognition) and recent (e.g., visual story-telling) vision tasks. We then summarize key design decisions for creating effective data collection interfaces and workflows, and present strategies for intelligently selecting the most important data instances to annotate. Finally, we conclude with some thoughts on the future of crowdsourcing in computer vision.Comment: A 69-page meta review of the field, Foundations and Trends in Computer Graphics and Vision, 201

    DoctorEye: A clinically driven multifunctional platform, for accurate processing of tumors in medical images

    Get PDF
    Copyright @ Skounakis et al.This paper presents a novel, open access interactive platform for 3D medical image analysis, simulation and visualization, focusing in oncology images. The platform was developed through constant interaction and feedback from expert clinicians integrating a thorough analysis of their requirements while having an ultimate goal of assisting in accurately delineating tumors. It allows clinicians not only to work with a large number of 3D tomographic datasets but also to efficiently annotate multiple regions of interest in the same session. Manual and semi-automatic segmentation techniques combined with integrated correction tools assist in the quick and refined delineation of tumors while different users can add different components related to oncology such as tumor growth and simulation algorithms for improving therapy planning. The platform has been tested by different users and over large number of heterogeneous tomographic datasets to ensure stability, usability, extensibility and robustness with promising results. AVAILABILITY: THE PLATFORM, A MANUAL AND TUTORIAL VIDEOS ARE AVAILABLE AT: http://biomodeling.ics.forth.gr. It is free to use under the GNU General Public License

    Coronary Artery Centerline Extraction in Cardiac CT Angiography Using a CNN-Based Orientation Classifier

    Full text link
    Coronary artery centerline extraction in cardiac CT angiography (CCTA) images is a prerequisite for evaluation of stenoses and atherosclerotic plaque. We propose an algorithm that extracts coronary artery centerlines in CCTA using a convolutional neural network (CNN). A 3D dilated CNN is trained to predict the most likely direction and radius of an artery at any given point in a CCTA image based on a local image patch. Starting from a single seed point placed manually or automatically anywhere in a coronary artery, a tracker follows the vessel centerline in two directions using the predictions of the CNN. Tracking is terminated when no direction can be identified with high certainty. The CNN was trained using 32 manually annotated centerlines in a training set consisting of 8 CCTA images provided in the MICCAI 2008 Coronary Artery Tracking Challenge (CAT08). Evaluation using 24 test images of the CAT08 challenge showed that extracted centerlines had an average overlap of 93.7% with 96 manually annotated reference centerlines. Extracted centerline points were highly accurate, with an average distance of 0.21 mm to reference centerline points. In a second test set consisting of 50 CCTA scans, 5,448 markers in the coronary arteries were used as seed points to extract single centerlines. This showed strong correspondence between extracted centerlines and manually placed markers. In a third test set containing 36 CCTA scans, fully automatic seeding and centerline extraction led to extraction of on average 92% of clinically relevant coronary artery segments. The proposed method is able to accurately and efficiently determine the direction and radius of coronary arteries. The method can be trained with limited training data, and once trained allows fast automatic or interactive extraction of coronary artery trees from CCTA images.Comment: Accepted in Medical Image Analysi
    • …
    corecore