A Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts
Most existing zero-shot learning methods consider the problem as a visual
semantic embedding one. Given the demonstrated capability of Generative
Adversarial Networks (GANs) to generate images, we instead leverage GANs to
imagine unseen categories from text descriptions and hence recognize novel
classes with no examples being seen. Specifically, we propose a simple yet
effective generative model that takes as input noisy text descriptions about an
unseen class (e.g., Wikipedia articles) and generates synthesized visual features
for this class. With added pseudo data, zero-shot learning is naturally
converted to a traditional classification problem. Additionally, to preserve
the inter-class discrimination of the generated features, a visual pivot
regularization is proposed as an explicit supervision. Unlike previous methods
using complex engineered regularizers, our approach can suppress the noise well
without additional regularization. Empirically, we show that our method
consistently outperforms the state of the art on the largest available
benchmarks on Text-based Zero-shot Learning.
Comment: To appear in CVPR 2018
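The pipeline the abstract describes (synthesize pseudo visual features for each unseen class from its text embedding, then train an ordinary classifier on the synthetic data) can be sketched as follows. This is a minimal illustration of the data flow only: the generator here is a fixed random linear map standing in for the paper's adversarially trained network, and all dimensions are made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: text embedding -> visual feature space.
TEXT_DIM, NOISE_DIM, VIS_DIM = 32, 8, 16

# Stand-in for a trained conditional generator G(text, z) -> visual feature.
# A fixed random linear map, purely to illustrate the shapes involved; in
# the paper G is trained adversarially, with a visual pivot regularizer
# pulling each class's generated mean toward the real class centroid.
W = rng.normal(size=(TEXT_DIM + NOISE_DIM, VIS_DIM))

def generate_features(text_emb, n_samples):
    """Synthesize pseudo visual features for one unseen class."""
    z = rng.normal(size=(n_samples, NOISE_DIM))          # noise input
    cond = np.hstack([np.tile(text_emb, (n_samples, 1)), z])
    return np.tanh(cond @ W)

# One noisy text embedding per unseen class (e.g. from a Wikipedia article).
unseen_texts = rng.normal(size=(3, TEXT_DIM))

# Zero-shot learning becomes ordinary classification on synthesized data:
X = np.vstack([generate_features(t, 50) for t in unseen_texts])
y = np.repeat(np.arange(3), 50)

# Nearest-centroid classifier fitted on the pseudo data.
centroids = np.stack([X[y == c].mean(axis=0) for c in range(3)])

def predict(feats):
    d = ((feats[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

print(predict(X[:5]))  # predicted labels for the first few pseudo samples
```

Any classifier could replace the nearest-centroid step; the point is that once pseudo features exist, no zero-shot machinery remains at classification time.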
STEFANN: Scene Text Editor using Font Adaptive Neural Network
Textual information in a captured scene plays an important role in scene
interpretation and decision making. Though there exist methods that can
successfully detect and interpret complex text regions present in a scene, to
the best of our knowledge, there is no significant prior work that aims to
modify the textual information in an image. The ability to edit text directly
on images has several advantages including error correction, text restoration
and image reusability. In this paper, we propose a method to modify text in an
image at the character level. We approach the problem in two stages. First, the
unobserved character (target) is generated from an observed character (source)
being modified. We propose two different neural network architectures: (a)
FANnet to achieve structural consistency with source font and (b) Colornet to
preserve source color. Next, we replace the source character with the generated
character maintaining both geometric and visual consistency with neighboring
characters. Our method works as a unified platform for modifying text in
images. We demonstrate the effectiveness of our method on the COCO-Text and ICDAR
datasets both qualitatively and quantitatively.
Comment: Accepted to the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020
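The two-stage flow can be illustrated with a toy sketch in which a binary target mask stands in for FANnet's structurally consistent output and a mean-color transfer stands in for Colornet. The shapes, masks, and colors below are fabricated for illustration and do not reflect the actual network outputs.

```python
import numpy as np

def dominant_text_color(src_rgb, src_mask):
    """Mean color over source glyph pixels -- a crude stand-in for Colornet."""
    return src_rgb[src_mask].mean(axis=0)

def edit_character(src_rgb, src_mask, tgt_mask, bg_color):
    """Replace the source glyph with a recolored target glyph.
    tgt_mask plays the role of FANnet's font-consistent target shape."""
    out = np.empty_like(src_rgb)
    out[:] = bg_color                 # restore background where source was
    out[tgt_mask] = dominant_text_color(src_rgb, src_mask)
    return out

# Toy 8x8 crop: a red 'I'-like bar edited into an 'L'-like shape.
src_rgb = np.zeros((8, 8, 3))
src_rgb[:] = (255, 255, 255)                     # white background
src_mask = np.zeros((8, 8), bool)
src_mask[1:7, 3] = True                          # vertical stroke
src_rgb[src_mask] = (200, 30, 30)                # red source glyph

tgt_mask = np.zeros((8, 8), bool)
tgt_mask[1:7, 2] = True                          # vertical stroke
tgt_mask[6, 2:6] = True                          # horizontal foot

edited = edit_character(src_rgb, src_mask, tgt_mask, bg_color=(255, 255, 255))
```

In the actual system the background restoration and blending also have to maintain geometric consistency with neighboring characters; this sketch only shows the shape-then-color decomposition.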
Matching Image Sets via Adaptive Multi Convex Hull
Traditional nearest points methods use all the samples in an image set to
construct a single convex or affine hull model for classification. However,
strong artificial features and noisy data may be generated from combinations of
training samples when significant intra-class variations and/or noise occur in
the image set. Existing multi-model approaches extract local models by
clustering each image set individually only once, with fixed clusters used for
matching with various image sets. This may not be optimal for discrimination,
as undesirable environmental conditions (e.g., illumination and pose variations)
may result in the two closest clusters representing different characteristics
of an object (e.g., a frontal face being compared to a non-frontal face). To address
the above problem, we propose a novel approach to enhance nearest points based
methods by integrating affine/convex hull classification with an adapted
multi-model approach. We first extract multiple local convex hulls from a query
image set via maximum margin clustering to diminish the artificial variations
and constrain the noise in local convex hulls. We then propose adaptive
reference clustering (ARC) to constrain the clustering of each gallery image
set by forcing the clusters to have resemblance to the clusters in the query
image set. By applying ARC, noisy clusters in the query set can be discarded.
Experiments on Honda, MoBo and ETH-80 datasets show that the proposed method
outperforms single model approaches and other recent techniques, such as Sparse
Approximated Nearest Points, Mutual Subspace Method and Manifold Discriminant
Analysis.
Comment: IEEE Winter Conference on Applications of Computer Vision (WACV), 2014
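The nearest-points idea underlying convex hull matching can be sketched as a small optimization: the distance between the convex hulls of two point sets. The solver below uses a plain Frank-Wolfe iteration, chosen only for brevity; it is not the paper's method, and the clustering machinery (maximum margin clustering, adaptive reference clustering) is omitted entirely.

```python
import numpy as np

def hull_distance(X, Y, iters=200):
    """Distance between the convex hulls of the column sets X (d x m)
    and Y (d x n): minimize ||X a - Y b|| over the two probability
    simplices, via Frank-Wolfe. A minimal sketch of the nearest-points
    idea, not the paper's solver."""
    m, n = X.shape[1], Y.shape[1]
    a = np.full(m, 1.0 / m)
    b = np.full(n, 1.0 / n)
    for t in range(iters):
        r = X @ a - Y @ b                  # current residual vector
        ga, gb = X.T @ r, -Y.T @ r         # gradients w.r.t. a and b
        sa = np.eye(m)[ga.argmin()]        # simplex vertex minimizing the
        sb = np.eye(n)[gb.argmin()]        # linearized objective
        step = 2.0 / (t + 2.0)             # standard Frank-Wolfe step size
        a += step * (sa - a)
        b += step * (sb - b)
    return np.linalg.norm(X @ a - Y @ b)

# Two 2-D point sets whose hulls are separated by a unit gap.
X = np.array([[0.0, 0.0], [0.0, 1.0]])     # columns: (0,0), (0,1)
Y = np.array([[1.0, 1.0], [0.0, 1.0]])     # columns: (1,0), (1,1)
print(hull_distance(X, Y))                 # close to 1.0
```

In the multi-model setting, this distance would be computed between local hulls extracted from each image set, with the closest pair of clusters deciding the match.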
A versatile maskless microscope projection photolithography system and its application in light-directed fabrication of DNA microarrays
We present a maskless microscope projection lithography system (MPLS), in
which photomasks have been replaced by a Digital Micromirror Device type
spatial light modulator (DMD, Texas Instruments). Employing video projector
technology, high-resolution patterns, designed as bitmap images on a computer,
are displayed using a micromirror array consisting of about 786,000 tiny
individually addressable tilting mirrors. The DMD, which is located in the
image plane of an infinity corrected microscope, is projected onto a substrate
placed in the focal plane of the microscope objective. With a 5x (0.25 NA) Fluar
microscope objective, a fivefold reduction of the image to a total size of 9
mm2 and a minimum feature size of 3.5 microns is achieved. Our system can be
used in the visible range as well as in the near UV (with a light intensity of
up to 76 mW/cm2 around the 365 nm Hg line). We developed an inexpensive and
simple method to enable exact focusing and control of the image quality of
the projected patterns. Our MPLS was originally designed for the
light-directed in situ synthesis of DNA microarrays. One requirement is a high
UV intensity to keep the fabrication process reasonably short. Another demand
is a sufficient contrast ratio over small distances (of about 5 microns). This
is necessary to achieve a high density of features (i.e. separated sites on the
substrate at which different DNA sequences are synthesized in parallel fashion)
while at the same time the number of stray light induced DNA sequence errors is
kept reasonably small. We demonstrate the performance of the apparatus in
light-directed DNA chip synthesis and discuss its advantages and limitations.
Comment: 12 pages, 9 figures, journal article
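The quoted numbers can be cross-checked with a short back-of-the-envelope calculation. The 17 micron mirror pitch is an assumption (typical of XGA-class DMDs of that era, not stated in the abstract), and the required UV dose is purely illustrative; the intensity, reduction factor, and mirror count come from the abstract.

```python
# Assumed DMD geometry: XGA-class array (1024 x 768 mirrors, ~786,000
# total) with 17 um mirror pitch -- an assumption, not from the paper.
MIRRORS_X, MIRRORS_Y = 1024, 768
PITCH_UM = 17.0            # assumed mirror pitch
REDUCTION = 5.0            # 5x Fluar objective, from the abstract

# One mirror projects to ~3.4 um on the substrate, consistent with the
# quoted 3.5 micron minimum feature size.
mirror_on_substrate = PITCH_UM / REDUCTION        # um per mirror
image_w = MIRRORS_X * mirror_on_substrate / 1000  # mm
image_h = MIRRORS_Y * mirror_on_substrate / 1000  # mm
area_mm2 = image_w * image_h                      # close to the quoted 9 mm2

# Exposure time for a hypothetical required UV dose at the quoted intensity.
INTENSITY_MW_CM2 = 76.0    # near the 365 nm Hg line, from the abstract
DOSE_J_CM2 = 6.0           # illustrative dose only
exposure_s = DOSE_J_CM2 / (INTENSITY_MW_CM2 / 1000.0)

print(f"{area_mm2:.1f} mm2 image, {exposure_s:.0f} s per exposure")
```

The assumed pitch reproduces both quoted figures (about 9 mm2 image area and a per-mirror footprint near the 3.5 micron minimum feature), which is why high intensity matters: total chip fabrication time scales linearly with the per-exposure time computed above.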