Superpixels: An Evaluation of the State-of-the-Art
Superpixels group perceptually similar pixels to create visually meaningful
entities while heavily reducing the number of primitives for subsequent
processing steps. Because of these properties, superpixel algorithms have received
much attention since their naming in 2003. By today, publicly available
superpixel algorithms have turned into standard tools in low-level vision. As
such, and due to their quick adoption in a wide range of applications,
appropriate benchmarks are crucial for algorithm selection and comparison.
Until now, the rapidly growing number of algorithms as well as varying
experimental setups hindered the development of a unifying benchmark. We
present a comprehensive evaluation of 28 state-of-the-art superpixel algorithms
utilizing a benchmark focussing on fair comparison and designed to provide new
insights relevant for applications. To this end, we explicitly discuss
parameter optimization and the importance of strictly enforcing connectivity.
Furthermore, by extending well-known metrics, we are able to summarize
algorithm performance independent of the number of generated superpixels,
thereby overcoming a major limitation of available benchmarks. In addition, we
discuss runtime, robustness against noise, blur and affine transformations,
implementation details as well as aspects of visual quality. Finally, we
present an overall ranking of superpixel algorithms which redefines the
state-of-the-art and enables researchers to easily select appropriate
algorithms and the corresponding implementations which themselves are made
publicly available as part of our benchmark at
davidstutz.de/projects/superpixel-benchmark/
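The abstract above mentions summarizing algorithm performance independently of the number of generated superpixels. One plausible way to do this, shown here only as a minimal sketch and not as the benchmark's actual definition, is to take the area under a metric-versus-K curve and normalize it by the covered range of K. The function name `average_metric` and the sample values are hypothetical.

```python
def average_metric(ks, values):
    """Summarize a metric measured at several superpixel counts K.

    Assumption: the summary is the trapezoidal area under the
    metric-vs-K curve divided by the covered K range. This is an
    illustrative choice, not necessarily the benchmark's definition.
    """
    if len(ks) != len(values) or len(ks) < 2:
        raise ValueError("need at least two (K, value) pairs of equal length")
    pairs = sorted(zip(ks, values))  # order the samples by K
    k_range = pairs[-1][0] - pairs[0][0]
    if k_range <= 0:
        raise ValueError("K values must span a non-empty range")
    area = 0.0
    for (k0, v0), (k1, v1) in zip(pairs, pairs[1:]):
        area += 0.5 * (v0 + v1) * (k1 - k0)  # trapezoid between neighbours
    return area / k_range


# Hypothetical error values measured at K = 200, 400 and 600 superpixels.
summary = average_metric([200, 400, 600], [0.10, 0.06, 0.04])
```

A single number of this kind lets two algorithms be compared even when they cannot be forced to produce exactly the same number of superpixels.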
Image processing for the extraction of nutritional information from food labels
Current techniques for tracking nutritional data require undesirable amounts of either time or manpower. People must choose between tediously recording and updating dietary information or depending on databases that are either crowd-sourced and unreliable or costly to maintain. Our project looks to overcome these pitfalls by providing a programming interface for image analysis that will read and report the information present on a nutrition label directly. Our solution involves a C++ library that combines image pre-processing, optical character recognition, and post-processing techniques to pull the relevant information from an image of a nutrition label. We apply an understanding of a nutrition label's content and data organization to approach the accuracy of traditional data-entry methods. Our system currently provides around 80% accuracy for most label images, and we will continue to work to improve our accuracy.
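The post-processing stage mentioned in the abstract above can be pictured roughly as follows. This is only an illustrative sketch in Python (the project itself is described as a C++ library), and the field list, the regular expression and the function name `parse_label_text` are all assumptions made for the example, not the authors' method. It takes text as it might come out of an OCR step and keeps only lines that look like known nutrition fields.

```python
import re

# Hypothetical set of fields we expect on a label (assumption for this sketch).
KNOWN_FIELDS = ("calories", "total fat", "sodium", "total carbohydrate", "protein")

# A field name, then an amount, then an optional unit.
LINE_PATTERN = re.compile(
    r"^\s*(?P<name>[A-Za-z][A-Za-z .]*?)\s+"
    r"(?P<amount>\d+(?:\.\d+)?)\s*(?P<unit>mg|mcg|g|kcal)?\b",
    re.IGNORECASE,
)


def parse_label_text(ocr_text):
    """Map known field names to (amount, unit) pairs found in OCR text."""
    result = {}
    for line in ocr_text.splitlines():
        match = LINE_PATTERN.match(line)
        if not match:
            continue  # not a "name amount unit" line
        name = " ".join(match.group("name").lower().split())
        if name not in KNOWN_FIELDS:
            continue  # use label knowledge to discard unrelated lines
        unit = (match.group("unit") or "").lower()
        result[name] = (float(match.group("amount")), unit)
    return result


sample = "Nutrition Facts\nCalories 230\nTotal Fat 8g 10%\nSodium 160 mg\nServing size 55g\n"
parsed = parse_label_text(sample)
```

Restricting the output to an expected vocabulary is one simple way to exploit the fixed content and layout of nutrition labels when the OCR output is noisy.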
Weakly Supervised Domain-Specific Color Naming Based on Attention
The majority of existing color naming methods focuses on the eleven basic
color terms of the English language. However, in many applications, different
sets of color names are used for the accurate description of objects. Labeling
data to learn these domain-specific color names is an expensive and laborious
task. Therefore, in this article we aim to learn color names from weakly
labeled data. For this purpose, we add an attention branch to the color naming
network. The attention branch is used to modulate the pixel-wise color naming
predictions of the network. In experiments, we illustrate that the attention
branch correctly identifies the relevant regions. Furthermore, we show that our
method obtains state-of-the-art results for pixel-wise and image-wise
classification on the EBAY dataset and is able to learn color names for various
domains. Comment: Accepted at ICPR201
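The abstract above describes an attention branch that modulates pixel-wise color naming predictions. One simple way to picture this, given here as a sketch under assumptions rather than as the paper's architecture, is to weight each pixel's class probabilities by its attention value and pool the result into an image-level prediction that can be trained from a weak, image-level label. The function name and the toy values are hypothetical.

```python
def attention_pooled_prediction(pixel_probs, attention):
    """Pool per-pixel class probabilities into one image-level prediction.

    pixel_probs: one list of class probabilities per pixel.
    attention:   one non-negative weight per pixel (assumed to come
                 from an attention branch; hypothetical in this sketch).
    """
    if not pixel_probs or len(pixel_probs) != len(attention):
        raise ValueError("need one attention weight per pixel")
    total = sum(attention)
    if total <= 0:
        raise ValueError("attention weights must sum to a positive value")
    pooled = [0.0] * len(pixel_probs[0])
    for probs, weight in zip(pixel_probs, attention):
        for c, p in enumerate(probs):
            pooled[c] += weight * p / total  # attention-weighted average
    return pooled


# Three pixels, two color names; the first pixel is background and gets
# zero attention, so it does not influence the image-level prediction.
image_level = attention_pooled_prediction(
    [[0.9, 0.1], [0.2, 0.8], [0.1, 0.9]],
    [0.0, 1.0, 1.0],
)
```

In this toy example the background pixel would dominate a plain average, while the attention-weighted average reflects only the pixels marked as relevant.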
Smart Photos
Recent technological leaps have been a great catalyst for changing how people interact with the world around them. Specifically, the field of Augmented Reality has led to many software and hardware advances that have formed a digital intermediary between humans and their environment. As of now, Augmented Reality is available to the select few with the means of obtaining Google Glass, Oculus Rifts, and other relatively expensive platforms. Be that as it may, the tech industry's current goal has been integration of this technology into the public's smartphones and everyday devices. One inhibitor of this goal is the difficulty of finding an Augmented Reality application whose usage could satisfy an everyday need or attraction. Augmented reality presents our world in a unique perspective that can be found nowhere else in the natural world. However, visual impact is weak without substance or meaning. The best technology is invisible, and what makes a good product is its ability to fill a void in a person's life. The most important researchers in this field are those who have been augmenting the tasks that most would consider mundane, such as overlaying nutritional information directly onto a meal [4].
In the same vein, we hope to incorporate Augmented Reality into everyday life by unlocking the full potential of a technology often believed to have already reached its peak. The humble photograph, a classic invention and unwavering enhancement to the human experience, captures moments in space and time and compresses them into a single permanent state. These two-dimensional assortments of pixels give us a physical representation of the memories we form in specific periods of our lives. We believe this representation can be further enhanced in what we like to call a Smart Photo. The idea behind a Smart Photo is to unlock the full potential in the way that people can interact with photographs. This same notion is explored in the field of Virtual Reality with inventions such as 3D movies, which provide a special appeal that ordinary 2D films cannot. The 3D technology places the viewer inside the film's environment. We intend to marry this seemingly mutually exclusive dichotomy by processing 2D photos alongside their 3D counterparts.
RGB-D datasets using Microsoft Kinect or similar sensors: a survey
RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.
A graph-based mathematical morphology reader
This survey paper aims at providing a "literary" anthology of mathematical
morphology on graphs. It describes in the English language many ideas stemming
from a large number of different papers, hence providing a unified view of an
active and diverse field of research