44 research outputs found

    Multivariate Regression on the Grassmannian for Predicting Novel Domains

    Get PDF
    This work was supported by EPSRC (EP/L023385/1), and the European Union’s Horizon 2020 research and innovation program under grant agreement No 640891

    Zero-Shot Domain Adaptation via Kernel Regression on the Grassmannian

    Get PDF
    Most visual recognition methods implicitly assume the data distribution remains unchanged from training to testing. However, in practice domain shift often exists, where real-world factors such as lighting and sensor type change between train and test, and classifiers do not generalise from source to target domains. It is impractical to train separate models for all possible situations because collecting and labelling the data is expensive. Domain adaptation algorithms aim to ameliorate domain shift, allowing a model trained on a source to perform well on a different target domain. However, even for the setting of unsupervised domain adaptation, where the target domain is unlabelled, collecting data for every possible target domain is still costly. In this paper, we propose a new domain adaptation method that has no need to access either data or labels of the target domain when it can be described by a parametrised vector and there exits several related source domains within the same parametric space. It greatly reduces the burden of data collection and annotation, and our experiments show some promising results.Comment: Accepted to BMVC 2015 Workshop on Differential Geometry in Computer Vision (DIFF-CV

    AdaGraph: Unifying Predictive and Continuous Domain Adaptation through Graphs

    Full text link
    The ability to categorize is a cornerstone of visual intelligence, and a key functionality for artificial, autonomous visual machines. This problem will never be solved without algorithms able to adapt and generalize across visual domains. Within the context of domain adaptation and generalization, this paper focuses on the predictive domain adaptation scenario, namely the case where no target data are available and the system has to learn to generalize from annotated source images plus unlabeled samples with associated metadata from auxiliary domains. Our contributionis the first deep architecture that tackles predictive domainadaptation, able to leverage over the information broughtby the auxiliary domains through a graph. Moreover, we present a simple yet effective strategy that allows us to take advantage of the incoming target data at test time, in a continuous domain adaptation scenario. Experiments on three benchmark databases support the value of our approach.Comment: CVPR 2019 (oral

    Generalising Fine-Grained Sketch-Based Image Retrieval

    Get PDF
    Fine-grained sketch-based image retrieval (FG-SBIR) addresses matching specific photo instance using free-hand sketch as a query modality. Existing models aim to learn an embedding space in which sketch and photo can be directly compared. While successful, they require instance-level pairing within each coarse-grained category as annotated training data. Since the learned embedding space is domain-specific, these models do not generalise well across categories. This limits the practical applicability of FGSBIR. In this paper, we identify cross-category generalisation for FG-SBIR as a domain generalisation problem, and propose the first solution. Our key contribution is a novel unsupervised learning approach to model a universal manifold of prototypical visual sketch traits. This manifold can then be used to paramaterise the learning of a sketch/photo representation. Model adaptation to novel categories then becomes automatic via embedding the novel sketch in the manifold and updating the representation and retrieval function accordingly. Experiments on the two largest FG-SBIR datasets, Sketchy and QMUL-Shoe-V2, demonstrate the efficacy of our approach in enabling crosscategory generalisation of FG-SBIR

    Doctor of Philosophy

    Get PDF
    dissertationWith the ever-increasing amount of available computing resources and sensing devices, a wide variety of high-dimensional datasets are being produced in numerous fields. The complexity and increasing popularity of these data have led to new challenges and opportunities in visualization. Since most display devices are limited to communication through two-dimensional (2D) images, many visualization methods rely on 2D projections to express high-dimensional information. Such a reduction of dimension leads to an explosion in the number of 2D representations required to visualize high-dimensional spaces, each giving a glimpse of the high-dimensional information. As a result, one of the most important challenges in visualizing high-dimensional datasets is the automatic filtration and summarization of the large exploration space consisting of all 2D projections. In this dissertation, a new type of algorithm is introduced to reduce the exploration space that identifies a small set of projections that capture the intrinsic structure of high-dimensional data. In addition, a general framework for summarizing the structure of quality measures in the space of all linear 2D projections is presented. However, identifying the representative or informative projections is only part of the challenge. Due to the high-dimensional nature of these datasets, obtaining insights and arriving at conclusions based solely on 2D representations are limited and prone to error. How to interpret the inaccuracies and resolve the ambiguity in the 2D projections is the other half of the puzzle. This dissertation introduces projection distortion error measures and interactive manipulation schemes that allow the understanding of high-dimensional structures via data manipulation in 2D projections

    Image and Shape Analysis for Spatiotemporal Data

    Get PDF
    In analyzing brain development or identifying disease it is important to understand anatomical age-related changes and shape differences. Data for these studies is frequently spatiotemporal and collected from normal and/or abnormal subjects. However, images and shapes over time often have complex structures and are best treated as elements of non-Euclidean spaces. This dissertation tackles problems of uncovering time-varying changes and statistical group differences in image or shape time-series. There are three major contributions: 1) a framework of parametric regression models on manifolds to capture time-varying changes. These include a metamorphic geodesic regression approach for image time-series and standard geodesic regression, time-warped geodesic regression, and cubic spline regression on the Grassmann manifold; 2) a spatiotemporal statistical atlas approach, which augments a commonly used atlas such as the median with measures of data variance via a weighted functional boxplot; 3) hypothesis testing for shape analysis to detect group differences between populations. The proposed method for cross-sectional data uses shape ordering and hence does not require dense shape correspondences or strong distributional assumptions on the data. For longitudinal data, hypothesis testing is performed on shape trajectories which are estimated from individual subjects. Applications of these methods include 1) capturing brain development and degeneration; 2) revealing growth patterns in pediatric upper airways and the scoring of airway abnormalities; 3) detecting group differences in longitudinal corpus callosum shapes of subjects with dementia versus normal controls.Doctor of Philosoph

    Machine Learning in Aerodynamic Shape Optimization

    Get PDF
    Machine learning (ML) has been increasingly used to aid aerodynamic shape optimization (ASO), thanks to the availability of aerodynamic data and continued developments in deep learning. We review the applications of ML in ASO to date and provide a perspective on the state-of-the-art and future directions. We first introduce conventional ASO and current challenges. Next, we introduce ML fundamentals and detail ML algorithms that have been successful in ASO. Then, we review ML applications to ASO addressing three aspects: compact geometric design space, fast aerodynamic analysis, and efficient optimization architecture. In addition to providing a comprehensive summary of the research, we comment on the practicality and effectiveness of the developed methods. We show how cutting-edge ML approaches can benefit ASO and address challenging demands, such as interactive design optimization. Practical large-scale design optimizations remain a challenge because of the high cost of ML training. Further research on coupling ML model construction with prior experience and knowledge, such as physics-informed ML, is recommended to solve large-scale ASO problems

    Face Recognition and Facial Attribute Analysis from Unconstrained Visual Data

    Get PDF
    Analyzing human faces from visual data has been one of the most active research areas in the computer vision community. However, it is a very challenging problem in unconstrained environments due to variations in pose, illumination, expression, occlusion and blur between training and testing images. The task becomes even more difficult when only a limited number of images per subject is available for modeling these variations. In this dissertation, different techniques for performing classification of human faces as well as other facial attributes such as expression, age, gender, and head pose in uncontrolled settings are investigated. In the first part of the dissertation, a method for reconstructing the virtual frontal view from a given non-frontal face image using Markov Random Fields (MRFs) and an efficient variant of the Belief Propagation (BP) algorithm is introduced. In the proposed approach, the input face image is divided into a grid of overlapping patches and a globally optimal set of local warps is estimated to synthesize the patches at the frontal view. A set of possible warps for each patch is obtained by aligning it with images from a training database of frontal faces. The alignments are performed efficiently in the Fourier domain using an extension of the Lucas-Kanade (LK) algorithm that can handle illumination variations. The problem of finding the optimal warps is then formulated as a discrete labeling problem using an MRF. The reconstructed frontal face image can then be used with any face recognition technique. The two main advantages of our method are that it does not require manually selected facial landmarks as well as no head pose estimation is needed. In the second part, the task of face recognition in unconstrained settings is formulated as a domain adaptation problem. The domain shift is accounted for by deriving a latent subspace or domain, which jointly characterizes the multifactor variations using appropriate image formation models for each factor. The latent domain is defined as a product of Grassmann manifolds based on the underlying geometry of the tensor space, and recognition is performed across domain shift using statistics consistent with the tensor geometry. More specifically, given a face image from the source or target domain, multiple images of that subject are first synthesized under different illuminations, blur conditions, and 2D perturbations to form a tensor representation of the face. The orthogonal matrices obtained from the decomposition of this tensor, where each matrix corresponds to a factor variation, are used to characterize the subject as a point on a product of Grassmann manifolds. For cases with only one image per subject in the source domain, the identity of target domain faces is estimated using the geodesic distance on product manifolds. When multiple images per subject are available, an extension of kernel discriminant analysis is developed using a novel kernel based on the projection metric on product spaces. Furthermore, a probabilistic approach to the problem of classifying image sets on product manifolds is introduced. Understanding attributes such as expression, age class, and gender from face images has many applications in multimedia processing including content personalization, human-computer interaction, and facial identification. To achieve good performance in these tasks, it is important to be able to extract pertinent visual structures from the input data. In the third part of the dissertation, a fully automatic approach for performing classification of facial attributes based on hierarchical feature learning using sparse coding is presented. The proposed approach is generative in the sense that it does not use label information in the process of feature learning. As a result, the same feature representation can be applied for different tasks such as expression, age, and gender classification. Final classification is performed by linear SVM trained with the corresponding labels for each task. The last part of the dissertation presents an automatic algorithm for determining the head pose from a given face image. The face image is divided into a regular grid and represented by dense SIFT descriptors extracted from the grid points. Random Projection (RP) is then applied to reduce the dimension of the concatenated SIFT descriptor vector. Classification and regression using Support Vector Machine (SVM) are combined in order to obtain an accurate estimate of the head pose. The advantage of the proposed approach is that it does not require facial landmarks such as the eye and mouth corners, the nose tip to be extracted from the input face image as in many other methods
    corecore