12,274 research outputs found
Metric Learning for Projections Bias of Generalized Zero-shot Learning
Generalized zero-shot learning models (GZSL) aim to recognize samples from
seen or unseen classes using only samples from seen classes as training data.
During inference, GZSL methods are often biased towards seen classes due to the
visibility of seen class samples during training. Most current GZSL methods try
to learn an accurate projection function (from visual space to semantic space)
to avoid bias and ensure the effectiveness of GZSL methods. However, during
inference, the computation of distance will be important when we classify the
projection of any sample into its nearest class since we may learn a biased
projection function in the model. In our work, we attempt to learn a
parameterized Mahalanobis distance within the framework of VAEGAN (Variational
Autoencoder \& Generative Adversarial Networks), where the weight matrix
depends on the network's output. In particular, we improved the network
structure of VAEGAN to leverage the discriminative models of two branches to
separately predict the seen samples and the unseen samples generated by this
seen one. We proposed a new loss function with two branches to help us learn
the optimized Mahalanobis distance representation. Comprehensive evaluation
benchmarks on four datasets demonstrate the superiority of our method over the
state-of-the-art counterparts. Our codes are available at
https://anonymous.4open.science/r/111hxr.Comment: 9 pages, 2 figure
Using Deep Learning and Google Street View to Estimate the Demographic Makeup of the US
The United States spends more than $1B each year on initiatives such as the
American Community Survey (ACS), a labor-intensive door-to-door study that
measures statistics relating to race, gender, education, occupation,
unemployment, and other demographic factors. Although a comprehensive source of
data, the lag between demographic changes and their appearance in the ACS can
exceed half a decade. As digital imagery becomes ubiquitous and machine vision
techniques improve, automated data analysis may provide a cheaper and faster
alternative. Here, we present a method that determines socioeconomic trends
from 50 million images of street scenes, gathered in 200 American cities by
Google Street View cars. Using deep learning-based computer vision techniques,
we determined the make, model, and year of all motor vehicles encountered in
particular neighborhoods. Data from this census of motor vehicles, which
enumerated 22M automobiles in total (8% of all automobiles in the US), was used
to accurately estimate income, race, education, and voting patterns, with
single-precinct resolution. (The average US precinct contains approximately
1000 people.) The resulting associations are surprisingly simple and powerful.
For instance, if the number of sedans encountered during a 15-minute drive
through a city is higher than the number of pickup trucks, the city is likely
to vote for a Democrat during the next Presidential election (88% chance);
otherwise, it is likely to vote Republican (82%). Our results suggest that
automated systems for monitoring demographic trends may effectively complement
labor-intensive approaches, with the potential to detect trends with fine
spatial resolution, in close to real time.Comment: 41 pages including supplementary material. Under review at PNA
Online Visual Robot Tracking and Identification using Deep LSTM Networks
Collaborative robots working on a common task are necessary for many
applications. One of the challenges for achieving collaboration in a team of
robots is mutual tracking and identification. We present a novel pipeline for
online visionbased detection, tracking and identification of robots with a
known and identical appearance. Our method runs in realtime on the limited
hardware of the observer robot. Unlike previous works addressing robot tracking
and identification, we use a data-driven approach based on recurrent neural
networks to learn relations between sequential inputs and outputs. We formulate
the data association problem as multiple classification problems. A deep LSTM
network was trained on a simulated dataset and fine-tuned on small set of real
data. Experiments on two challenging datasets, one synthetic and one real,
which include long-term occlusions, show promising results.Comment: IEEE/RSJ International Conference on Intelligent Robots and Systems
(IROS), Vancouver, Canada, 2017. IROS RoboCup Best Paper Awar
- …