    Superpixels: An Evaluation of the State-of-the-Art

    Superpixels group perceptually similar pixels to create visually meaningful entities while heavily reducing the number of primitives for subsequent processing steps. Because of these properties, superpixel algorithms have received much attention since their naming in 2003. Today, publicly available superpixel algorithms have turned into standard tools in low-level vision. As such, and due to their quick adoption in a wide range of applications, appropriate benchmarks are crucial for algorithm selection and comparison. Until now, the rapidly growing number of algorithms as well as varying experimental setups hindered the development of a unifying benchmark. We present a comprehensive evaluation of 28 state-of-the-art superpixel algorithms utilizing a benchmark focussing on fair comparison and designed to provide new insights relevant for applications. To this end, we explicitly discuss parameter optimization and the importance of strictly enforcing connectivity. Furthermore, by extending well-known metrics, we are able to summarize algorithm performance independent of the number of generated superpixels, thereby overcoming a major limitation of available benchmarks. In addition, we discuss runtime, robustness against noise, blur and affine transformations, implementation details as well as aspects of visual quality. Finally, we present an overall ranking of superpixel algorithms which redefines the state-of-the-art and enables researchers to easily select appropriate algorithms and the corresponding implementations, which themselves are made publicly available as part of our benchmark at davidstutz.de/projects/superpixel-benchmark/

    Image processing for the extraction of nutritional information from food labels

    Current techniques for tracking nutritional data require undesirable amounts of either time or man-power. People must choose between tediously recording and updating dietary information or depending on unreliable crowd-sourced or costly maintained databases. Our project looks to overcome these pitfalls by providing a programming interface for image analysis that will read and report the information present on a nutrition label directly. Our solution involves a C++ library that combines image pre-processing, optical character recognition, and post-processing techniques to pull the relevant information from an image of a nutrition label. We apply an understanding of a nutrition label's content and data organization to approach the accuracy of traditional data-entry methods. Our system currently provides around 80% accuracy for most label images, and we will continue to work to improve our accuracy.

    Weakly Supervised Domain-Specific Color Naming Based on Attention

    The majority of existing color naming methods focuses on the eleven basic color terms of the English language. However, in many applications, different sets of color names are used for the accurate description of objects. Labeling data to learn these domain-specific color names is an expensive and laborious task. Therefore, in this article we aim to learn color names from weakly labeled data. For this purpose, we add an attention branch to the color naming network. The attention branch is used to modulate the pixel-wise color naming predictions of the network. In experiments, we illustrate that the attention branch correctly identifies the relevant regions. Furthermore, we show that our method obtains state-of-the-art results for pixel-wise and image-wise classification on the EBAY dataset and is able to learn color names for various domains. Comment: Accepted at ICPR201

    Smart Photos

    Recent technological leaps have been a great catalyst for changing how people interact with the world around them. Specifically, the field of Augmented Reality has led to many software and hardware advances that have formed a digital intermediary between humans and their environment. As of now, Augmented Reality is available to the select few with the means of obtaining Google Glass, Oculus Rifts, and other relatively expensive platforms. Be that as it may, the tech industry's current goal has been integration of this technology into the public's smartphones and everyday devices. One inhibitor of this goal is the difficulty of finding an Augmented Reality application whose usage could satisfy an everyday need or attraction. Augmented reality presents our world in a unique perspective that can be found nowhere else in the natural world. However, visual impact is weak without substance or meaning. The best technology is invisible, and what makes a good product is its ability to fill a void in a person's life. The most important researchers in this field are those who have been augmenting the tasks that most would consider mundane, such as overlaying nutritional information directly onto a meal [4]. In the same vein, we hope to incorporate Augmented Reality into everyday life by unlocking the full potential of a technology often believed to have already reached its peak. The humble photograph, a classic invention and unwavering enhancement to the human experience, captures moments in space and time and compresses them into a single permanent state. These two-dimensional assortments of pixels give us a physical representation of the memories we form in specific periods of our lives. We believe this representation can be further enhanced in what we like to call a Smart Photo. The idea behind a Smart Photo is to unlock the full potential in the way that people can interact with photographs.
    This same notion is explored in the field of Virtual Reality with inventions such as 3D movies, which provide a special appeal that ordinary 2D films cannot. The 3D technology places the viewer inside the film's environment. We intend to marry this seemingly mutually exclusive dichotomy by processing 2D photos alongside their 3D counterparts.

    RGB-D datasets using Microsoft Kinect or similar sensors: a survey

    RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance to benchmark the state-of-the-art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D-simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.

    A graph-based mathematical morphology reader

    This survey paper aims at providing a "literary" anthology of mathematical morphology on graphs. It describes in the English language many ideas stemming from a large number of different papers, hence providing a unified view of an active and diverse field of research