Leveraging Overhead Imagery for Localization, Mapping, and Understanding
Ground-level and overhead images provide complementary viewpoints of the world. This thesis proposes methods that leverage dense overhead imagery, in addition to sparsely distributed ground-level imagery, to advance traditional computer vision problems, such as ground-level image localization and fine-grained urban mapping. Our work focuses on three primary research areas: learning a joint feature representation between ground-level and overhead imagery to enable direct comparison for the task of image geolocalization, incorporating unlabeled overhead images by inferring labels from nearby ground-level images to improve image-driven mapping, and fusing ground-level imagery with overhead imagery to enhance understanding. The ultimate contribution of this thesis is a general framework for estimating geospatial functions, such as land cover or land use, which integrates visual evidence from both ground-level and overhead image viewpoints.
Learning to Map the Visual and Auditory World
The appearance of the world varies dramatically not only from place to place but also from hour to hour and month to month. Billions of images that capture this complex relationship are uploaded to social-media websites every day and are often associated with precise time and location metadata. This rich source of data can help improve our understanding of the globe. In this work, we propose a general framework that uses these publicly available images to construct dense maps of different ground-level attributes from overhead imagery. In particular, we use well-defined probabilistic models and a weakly-supervised, multi-task training strategy to estimate the expected visual and auditory ground-level attributes, consisting of the types of scenes, objects, and sounds a person can experience at a location. Through a large-scale evaluation on real data, we show that our learned models can be used for applications including mapping, image localization, image retrieval, and metadata verification.
Modeling and Mapping Location-Dependent Human Appearance
Human appearance is highly variable and depends on individual preferences, such as fashion, facial expression, and makeup. These preferences depend on many factors including a person's sense of style, what they are doing, and the weather. These factors, in turn, are dependent upon geographic location and time. In our work, we build computational models to learn the relationship between human appearance, geographic location, and time. The primary contributions are a framework for collecting and processing geotagged imagery of people, a large dataset collected by our framework, and several generative and discriminative models that use our dataset to learn the relationship between human appearance, location, and time. Additionally, we build interactive maps that allow for inspection and demonstration of what our models have learned.
An Investigation into the Performance of Ethnicity Verification Between Humans and Machine Learning Algorithms
Interest in the task of classifying demographic profiles, i.e. race and
ethnicity, has increased significantly. Ethnicity is a significant human
characteristic, and applying facial image data for the discrimination of ethnicity is
integral to face-related biometric systems. Given the diversity in the application
of ethnicity-specific information, such as face recognition and iris recognition, and
the availability of image datasets for more commonly represented human
populations, i.e. Caucasian, African-American, Asian, and South-Asian Indian,
a gap has been identified for the development of a system which analyses the
full face and its individual feature components (eyes, nose and mouth) for the
Pakistani ethnic group. An efficient system is proposed for the verification of
Pakistani ethnicity, which incorporates a two-tier (computer vs human) approach.
Firstly, hand-crafted features were used to ascertain the descriptive nature of a
frontal image and facial profile for the Pakistani ethnicity. A total of 26 facial
landmarks were selected (16 frontal and 10 for the profile), and 2 models for
redundant information removal were incorporated, along with a linear classifier for
the binary task. The experimental results indicated that the facial profile image of a
Pakistani face is distinct from those of other ethnicities. However, the methodology
had limitations, for example low accuracy, the laborious
nature of manual data annotation (i.e. facial landmarks), and the small facial image
dataset. To make the system more accurate and robust, Deep Learning models
are employed for ethnicity classification. Various state-of-the-art Deep Learning models
are trained on a range of facial image conditions, i.e. full-face and partial-face
images, plus standalone feature components such as the nose and mouth. Since
ethnicity is pertinent to the research, a novel facial image database, entitled the
Pakistani Face Database (PFDB), was created using a criterion-specific selection
process to ensure confidence in each of the assigned class memberships, i.e.
Pakistani and Non-Pakistani. Comparative analysis of 6 Deep Learning
models was carried out on augmented image datasets, and the analysis
demonstrates that Deep Learning yields better accuracy than
low-level features. The human phase of the ethnicity classification framework
tested the discrimination ability of novice Pakistani and Non-Pakistani
participants using a computerised ethnicity task. The results suggest that
humans are better at discriminating between Pakistani and Non-Pakistani full-face
images than between individual face-feature components (eyes, nose, mouth),
struggling the most with the nose when making judgements of ethnicity. To
understand the effects of display conditions on ethnicity discrimination accuracy,
two conditions were tested: (i) a Two-Alternative Forced Choice (2-AFC) procedure and
(ii) a single-image procedure. The results showed that participants perform
significantly better in trials where the target (Pakistani) image is shown alongside
a distractor (Non-Pakistani) image. Finally, directions for future study are
suggested to advance the current understanding of
image-based ethnicity verification.
Acumé Forensi