266 research outputs found

    Click Carving: Segmenting Objects in Video with Point Clicks

    We present a novel form of interactive video object segmentation in which a few clicks by the user help the system produce a full spatio-temporal segmentation of the object of interest. Whereas conventional interactive pipelines take the user's initialization as a starting point, we show the value in the system taking the lead even in initialization. In particular, for a given video frame, the system precomputes a ranked list of thousands of possible segmentation hypotheses (also referred to as object region proposals) using image and motion cues. Then, the user looks at the top-ranked proposals and clicks on the object boundary to carve away erroneous ones. This process iterates (typically 2-3 times), and each time the system revises the top-ranked proposal set, until the user is satisfied with a resulting segmentation mask. Finally, the mask is propagated across the video to produce a spatio-temporal object tube. On three challenging datasets, we provide extensive comparisons with both existing work and simpler alternative methods. In all, the proposed Click Carving approach strikes an excellent balance of accuracy and human effort. It outperforms all similarly fast methods, and is competitive with or better than those requiring 2 to 12 times the effort. Comment: A preliminary version of the material in this document was filed as University of Texas technical report no. UT AI16-0
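
    The click-and-carve loop described above can be illustrated with a minimal sketch. It assumes each proposal is a binary mask and scores a proposal by how many user clicks fall on or near its boundary; the function names, the `radius` parameter, and the scoring rule are illustrative assumptions, not the scoring used in the paper.

```python
import numpy as np

def boundary(mask):
    """Foreground pixels that have at least one 4-neighbour in the background."""
    m = np.asarray(mask, dtype=bool)
    p = np.pad(m, 1, constant_values=False)
    # A pixel is interior if its up, down, left and right neighbours are all foreground.
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    return m & ~interior

def click_score(mask, clicks, radius=2):
    """Number of clicks lying within `radius` pixels of the mask boundary (hypothetical score)."""
    b = boundary(mask)
    h, w = b.shape
    hits = 0
    for r, c in clicks:
        window = b[max(r - radius, 0):min(r + radius + 1, h),
                   max(c - radius, 0):min(c + radius + 1, w)]
        hits += bool(window.any())
    return hits

def rerank(proposals, clicks, radius=2):
    """Return proposal indices sorted by click agreement; ties keep their prior rank."""
    scores = [click_score(m, clicks, radius) for m in proposals]
    order = sorted(range(len(proposals)), key=lambda i: -scores[i])
    return order, scores
```

    Each new click would be appended to `clicks` and `rerank` called again, which mirrors the iterate-until-satisfied loop in the abstract; the stable sort keeps the system's original ranking among proposals that agree equally well with the clicks.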

    Combining crowd worker, algorithm, and expert efforts to find boundaries of objects in images

    While traditional approaches to image analysis have typically relied upon either manual annotation by experts or purely algorithmic approaches, the rise of crowdsourcing now provides a new source of human labor to create training data or perform computations at run-time. Given this richer design space, how should we utilize algorithms, crowds, and experts to better annotate images? To answer this question for the important task of finding the boundaries of objects or regions in images, I focus on image segmentation, an important precursor to solving a variety of fundamental image analysis problems, including recognition, classification, tracking, registration, retrieval, and 3D visualization. The first part of the work includes a detailed analysis of the relative strengths and weaknesses of three different approaches to demarcating object boundaries in images: by experts, by crowdsourced laymen, and by automated computer vision algorithms. The second part of the work describes three hybrid system designs that integrate computer vision algorithms and crowdsourced laymen to demarcate boundaries in images. Experiments revealed that hybrid system designs yielded more accurate results than relying on algorithms or crowd workers alone and could yield segmentations that are indistinguishable from those created by biomedical experts. To encourage a community-wide effort to keep developing methods and systems for image-based studies that can have a real and measurable impact benefiting society at large, datasets and code are publicly shared (http://www.cs.bu.edu/~betke/BiomedicalImageSegmentation/)

    BUOCA: Budget-Optimized Crowd Worker Allocation

    Due to concerns about human error in crowdsourcing, it is standard practice to collect labels for the same data point from multiple internet workers. We show here that the resulting budget can be used more effectively with a flexible worker assignment strategy that asks fewer workers to analyze easy-to-label data and more workers to analyze data that requires extra scrutiny. Our main contribution is to show how the allocation of the number of workers to a task can be computed optimally based on task features alone, without using worker profiles. Our target tasks are delineating cells in microscopy images and analyzing the sentiment toward the 2016 U.S. presidential candidates in tweets. We first propose an algorithm that computes budget-optimized crowd worker allocation (BUOCA). We next train a machine learning system (BUOCA-ML) that predicts an optimal number of crowd workers needed to maximize the accuracy of the labeling. We show that the computed allocation can yield large savings in the crowdsourcing budget (up to 49 percentage points) while maintaining labeling accuracy. Finally, we envisage a human-machine system for performing budget-optimized data analysis at a scale beyond the feasibility of crowdsourcing. First author draf
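
    The idea of sending fewer workers to easy items and more to hard ones can be illustrated with a small sketch. It is not the BUOCA algorithm itself: it assumes a per-task probability `p_correct` that a single worker labels the task correctly (a hypothetical quantity standing in for whatever is predicted from task features), aggregates labels by majority vote, and spends the budget greedily, which is a heuristic rather than a provably optimal allocation.

```python
from math import comb

def majority_accuracy(p, n):
    """Probability that a majority of n independent workers (n odd) is correct."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

def allocate(p_correct, budget):
    """Greedy allocation: every task gets one worker, then pairs of extra workers
    go to the task whose majority-vote accuracy improves the most."""
    alloc = [1] * len(p_correct)
    remaining = budget - len(p_correct)
    if remaining < 0:
        raise ValueError("budget must cover at least one worker per task")
    while remaining >= 2:
        gains = [majority_accuracy(p, n + 2) - majority_accuracy(p, n)
                 for p, n in zip(p_correct, alloc)]
        best = max(range(len(gains)), key=gains.__getitem__)
        if gains[best] <= 0:
            break  # no task benefits from more workers
        alloc[best] += 2  # keep worker counts odd so majority votes cannot tie
        remaining -= 2
    return alloc
```

    With `p_correct = [0.99, 0.7]` and a budget of 6 labels, this sketch assigns one worker to the easy task and five to the hard one, instead of the uniform three each.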

    Salient Object Detection for Images Taken by People With Vision Impairments

    Salient object detection is the task of producing a binary mask for an image that deciphers which pixels belong to the foreground object versus background. We introduce a new salient object detection dataset using images taken by people who are visually impaired who were seeking to better understand their surroundings, which we call VizWiz-SalientObject. Compared to seven existing datasets, VizWiz-SalientObject is the largest (i.e., 32,000 human-annotated images) and contains unique characteristics including a higher prevalence of text in the salient objects (i.e., in 68% of images) and salient objects that occupy a larger ratio of the images (i.e., on average, ~50% coverage). We benchmarked seven modern salient object detection methods on our dataset and found they struggle most with images featuring salient objects that are large, have less complex boundaries, and lack text as well as for lower quality images. We invite the broader community to work on our new dataset challenge by publicly sharing the dataset at https://vizwiz.org/tasks-and-datasets/salient-object. Comment: Computer Vision and Pattern Recognitio