Image processing for the extraction of nutritional information from food labels
Current techniques for tracking nutritional data require undesirable amounts of either time or manpower. People must choose between tediously recording and updating dietary information or depending on unreliable crowd-sourced or costly maintained databases. Our project aims to overcome these pitfalls by providing a programming interface for image analysis that reads and reports the information present on a nutrition label directly. Our solution is a C++ library that combines image pre-processing, optical character recognition, and post-processing techniques to pull the relevant information from an image of a nutrition label. We apply an understanding of a nutrition label's content and data organization to approach the accuracy of traditional data-entry methods. Our system currently provides around 80% accuracy for most label images, and we will continue to work to improve this accuracy.
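The post-processing step described above can be sketched as pattern matching over raw OCR output. This is a minimal illustration, assuming the OCR stage has already produced text; the field names and regular expressions below are invented for the sketch and are not the library's actual API.

```python
import re

def parse_label(ocr_text):
    """Parse raw OCR text from a nutrition label into structured fields.
    Patterns are illustrative stand-ins for the library's post-processing."""
    patterns = {
        "calories": r"calories\D*(\d+)",
        "total_fat_g": r"total fat\D*(\d+(?:\.\d+)?)\s*g",
        "sodium_mg": r"sodium\D*(\d+)\s*mg",
        "protein_g": r"protein\D*(\d+(?:\.\d+)?)\s*g",
    }
    text = ocr_text.lower()
    fields = {}
    for name, pat in patterns.items():
        m = re.search(pat, text)
        if m:
            fields[name] = float(m.group(1))
    return fields

# Typical OCR output for a label, already cleaned of recognition noise:
sample = "Calories 230\nTotal Fat 8g\nSodium 160mg\nProtein 3g"
print(parse_label(sample))
```

In practice this stage would also correct common OCR confusions (e.g. "O" read for "0") using knowledge of a label's fixed vocabulary, which is where the domain understanding mentioned in the abstract pays off.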
A method of insect recognition based on spectrogram
A novel approach to insect recognition is presented in this paper. Unlike traditional methods, the proposed method starts from the perspective of the image and combines voice-processing algorithms with image-processing algorithms. The classification is based on voice activity detection (VAD) and the spectrogram. We show, by means of examples, that this approach can recognize different insects correctly. However, despite the potential for correct recognition, further justification of the method's reliability needs to be provided by larger-scale experiments. Hence, some improvements will be proposed later.
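The two building blocks named above, the spectrogram and energy-based VAD, can be sketched with NumPy alone. The window length, hop size, and threshold below are illustrative choices; the paper does not state its exact settings.

```python
import numpy as np

def spectrogram(signal, win_len=256, hop=128):
    """Magnitude spectrogram via a windowed short-time FFT."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(signal) - win_len) // hop
    frames = np.stack([signal[i * hop : i * hop + win_len] * window
                       for i in range(n_frames)])
    # rfft of a length-256 frame yields 129 frequency bins per frame
    return np.abs(np.fft.rfft(frames, axis=1))

def vad(spec, ratio=0.1):
    """Simple energy-based voice activity detection: a frame is 'active'
    when its total spectral energy exceeds a fraction of the maximum."""
    energy = spec.sum(axis=1)
    return energy > ratio * energy.max()

# Synthetic recording: half a second of silence, then a 1 kHz tone,
# standing in for an insect call.
fs = 8000
t = np.arange(fs) / fs
call = np.sin(2 * np.pi * 1000 * t) * (t > 0.5)
spec = spectrogram(call)
active = vad(spec)
```

The active frames of the spectrogram would then be passed to an image classifier, which is where the abstract's combination of audio and image processing happens.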
Color inference from semantic labeling for person search in videos
We propose an explainable model to generate semantic color labels for person search. In this context, persons are described by their semantic parts, such as hat, shirt, etc. Person search consists of looking for people based on these descriptions. In this work, we aim to improve the accuracy of color labels for people; our goal is to handle the high variability of human perception. Existing solutions are based on hand-crafted features or learnt features that are not explainable, and most of them focus on only a limited set of colors. We propose a method based on binary search trees and a large peer-labelled color name dataset, which allows us to synthesize the human perception of colors. Using semantic segmentation and our color-labeling method, we label segments of pedestrians with their associated colors. We evaluate our solution on person search on datasets such as PCN, and show a precision as high as 80.4%. Comment: 8 pages, 7 figures, ICIAR 202
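The color-labeling step can be illustrated with a toy nearest-name lookup over a tiny hand-picked palette. The paper instead queries binary search trees built from a large peer-labelled color-name dataset; the palette entries and names below are purely illustrative.

```python
# Toy stand-in for assigning a semantic color name to a segment's
# average RGB value. Real perceptual models work in a perceptually
# uniform space (e.g. Lab) rather than raw RGB.
PALETTE = {
    "black": (0, 0, 0),
    "white": (255, 255, 255),
    "red": (220, 30, 30),
    "blue": (30, 60, 200),
    "green": (30, 160, 60),
}

def color_name(rgb):
    """Return the palette name nearest to rgb in squared distance."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(PALETTE, key=lambda name: dist2(rgb, PALETTE[name]))

print(color_name((200, 40, 50)))  # a dark-red shirt segment
```

In the full pipeline, semantic segmentation first isolates each body part (hat, shirt, etc.), and a lookup like this one labels each part so the person can be retrieved from a textual description.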
Deep Manifold Traversal: Changing Labels with Convolutional Features
Many tasks in computer vision can be cast as a "label changing" problem, where the goal is to make a semantic change to the appearance of an image, or of some subject in an image, in order to alter its class membership. Although successful task-specific methods have been developed for some label-changing applications, to date no general-purpose method exists. Motivated by this, we propose deep manifold traversal, a method that addresses the problem in its most general form: it first approximates the manifold of natural images and then morphs a test image along a traversal path away from a source class and towards a target class while staying near the manifold throughout. The resulting algorithm is surprisingly effective and versatile. It is completely data-driven, requiring only an example set of images from the desired source and target domains. We demonstrate deep manifold traversal on highly diverse label-changing tasks: changing an individual's appearance (age and hair color), changing the season of an outdoor image, and transforming a city skyline towards nighttime.
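The core traversal idea can be caricatured in feature space: given convolutional features for example sets from the source and target domains, move the test image's features along the direction from the source mean to the target mean. This is only a linear sketch with random stand-in features; the actual method optimizes the path so the morphed image stays near the natural-image manifold.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for deep convolutional features of the two example sets,
# e.g. "brown hair" (source) vs. "blond hair" (target).
source_feats = rng.normal(0.0, 1.0, size=(50, 128))
target_feats = rng.normal(2.0, 1.0, size=(50, 128))
test_feat = rng.normal(0.0, 1.0, size=128)

# Direction of the semantic change, estimated from the two example sets.
direction = target_feats.mean(axis=0) - source_feats.mean(axis=0)

# Walk the test image's features along the traversal path.
for lam in (0.0, 0.5, 1.0):
    morphed = test_feat + lam * direction
```

A final inversion step (not shown) would reconstruct an image from the morphed features, which is where the manifold constraint matters most.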
ATMSeer: Increasing Transparency and Controllability in Automated Machine Learning
To relieve the pain of manually selecting machine learning algorithms and tuning hyperparameters, automated machine learning (AutoML) methods have been developed to automatically search for good models. Due to the huge model search space, it is impossible to try all models. Users tend to distrust automatic results and increase the search budget as much as they can, thereby undermining the efficiency of AutoML. To address these issues, we design and implement ATMSeer, an interactive visualization tool that supports users in refining the search space of AutoML and analyzing the results. To guide the design of ATMSeer, we derive a workflow of using AutoML based on interviews with machine learning experts. A multi-granularity visualization is proposed to enable users to monitor the AutoML process, analyze the searched models, and refine the search space in real time. We demonstrate the utility and usability of ATMSeer through two case studies, expert interviews, and a user study with 13 end users. Comment: Published in the ACM Conference on Human Factors in Computing Systems (CHI), 2019, Glasgow, Scotland U
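The refine-and-resume loop described above can be sketched with a toy random search: the user inspects partial results and narrows the hyperparameter ranges before resuming. The model names, the search space, and the scoring function are all invented for this sketch and have no relation to ATMSeer's actual internals.

```python
import random

random.seed(0)

def score(model, depth):
    """Stand-in for cross-validated accuracy of a candidate model."""
    best_depth = {"tree": 8, "knn": 3}[model]
    return 1.0 - abs(depth - best_depth) / 10.0

def search(space, budget):
    """Random search over a (model, depth) space; returns the best trial."""
    trials = [(random.choice(space["model"]), random.choice(space["depth"]))
              for _ in range(budget)]
    return max(trials, key=lambda t: score(*t))

# Broad initial space, as an AutoML system would start with.
broad = {"model": ["tree", "knn"], "depth": list(range(1, 11))}
best = search(broad, budget=20)

# After monitoring the trials, the user narrows the space around the
# promising tree models and resumes with a smaller budget.
refined = {"model": ["tree"], "depth": list(range(6, 11))}
best_refined = search(refined, budget=10)
```

ATMSeer's contribution is the visualization that makes this narrowing decision informed rather than blind, surfacing the searched models at multiple granularities.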