17 research outputs found
An Analysis on Adversarial Machine Learning: Methods and Applications
Deep learning has witnessed astonishing advancement in the last decade and revolutionized many fields ranging from computer vision to natural language processing. A prominent field of research that enabled such achievements is adversarial learning, which investigates the behavior and functionality of a learning model in the presence of an adversary. Adversarial learning consists of two major trends. The first trend analyzes the susceptibility of machine learning models to manipulation in the decision-making process and aims to improve the robustness to such manipulations. The second trend exploits adversarial games between components of the model to enhance the learning process. This dissertation aims to provide an analysis of these two sides of adversarial learning and harness their potential for improving the robustness and generalization of deep models.
In the first part of the dissertation, we study the adversarial susceptibility of deep learning models. We provide an empirical analysis of the extent of vulnerability by proposing two adversarial attacks that explore the geometric and frequency-domain characteristics of inputs to manipulate deep decisions. Afterward, we formalize the susceptibility of deep networks using the first-order approximation of the predictions and extend the theory to the ensemble classification scheme. Inspired by theoretical findings, we formalize a reliable and practical defense against adversarial examples to robustify ensembles. We extend this part by investigating the shortcomings of adversarial training (AT) and highlight that the popular momentum stochastic gradient descent, developed essentially for natural training, is ill-suited to optimization in adversarial training since it is not designed to be robust against the chaotic behavior of gradients in this setup. Motivated by these observations, we develop an optimization method that is more suitable for adversarial training.
In the second part of the dissertation, we harness adversarial learning to enhance the generalization and performance of deep networks in discriminative and generative tasks. We develop several models for biometric identification including fingerprint distortion rectification and latent fingerprint reconstruction. In particular, we develop a ridge reconstruction model based on generative adversarial networks that estimates the missing ridge information in latent fingerprints. We introduce a novel modification that enables the generator network to preserve the ID information during the reconstruction process. To address the scarcity of data, e.g., in latent fingerprint analysis, we develop a supervised augmentation technique that combines input examples based on their salient regions. Our findings advocate that adversarial learning improves the performance and reliability of deep networks in a wide range of applications.
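For readers unfamiliar with the idea, a first-order analysis of adversarial susceptibility usually starts from a linearization of the model output. The block below is a standard textbook sketch, not necessarily the formulation used in the dissertation; the symbols $f$ (a scalar model output or loss), $x$ (input), $\delta$ (perturbation) and $\epsilon$ (perturbation budget) are assumptions introduced here for illustration.

```latex
f(x + \delta) \approx f(x) + \nabla_x f(x)^{\top} \delta,
\qquad
\max_{\|\delta\|_\infty \le \epsilon} \nabla_x f(x)^{\top} \delta
  = \epsilon \, \|\nabla_x f(x)\|_1,
\quad \text{attained at } \delta^{\star} = \epsilon \, \operatorname{sign}\!\big(\nabla_x f(x)\big).
```

Under this approximation, the worst-case change in the output within an $\ell_\infty$ ball grows with the $\ell_1$ norm of the input gradient, which is one common way to quantify how vulnerable a model is at a given input.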
Application of Prior Information to Discriminative Feature Learning
Learning discriminative feature representations has attracted a great deal of attention since it is a critical step to facilitate subsequent classification, retrieval and recommendation tasks. In this dissertation, besides incorporating prior knowledge about image labels into image classification as most prevalent feature learning methods currently do, we also explore some other general-purpose priors and verify their effectiveness in discriminative feature learning. As a more powerful representation can be learned by implementing such general priors, our approaches achieve state-of-the-art results on challenging benchmarks. We elaborate on these general-purpose priors and highlight where we have made novel contributions.
We apply sparsity and hierarchical priors to the explanatory factors that describe the data, in order to better discover the data structure. More specifically, in the first approach we incorporate only sparse priors into the feature learning. To this end, we present a support discrimination dictionary learning method, which finds a dictionary under which the feature representations of images from the same class have a common sparse structure while the size of the overlapped signal support of different classes is minimised. Then we incorporate sparse priors and hierarchical priors into a unified framework that is capable of controlling the sparsity of the neuron activation in deep neural networks. Our proposed approach automatically selects the most useful low-level features and effectively combines them into more powerful and discriminative features for our specific image classification problem.
We also explore priors on the relationships between multiple factors. When multiple independent factors exist in the image generation process and only some of them are of interest to us, we propose a novel multi-task adversarial network to learn a disentangled feature which is optimized with respect to the factor of interest while being agnostic to the distracting factors. When common factors exist in multiple tasks, leveraging them can not only make the learned feature representation more robust, but also enable the model to generalise from very few labelled samples. More specifically, we address the domain adaptation problem and propose the re-weighted adversarial adaptation network to reduce the feature distribution divergence and adapt the classifier from source to target domains.
Wafer defect recognition method based on multi-scale feature fusion
Wafer defect recognition is an important process in chip manufacturing. As different process flows can lead to different defect types, the correct identification of defect patterns is important for recognizing manufacturing problems and fixing them in good time. To achieve high-precision identification of wafer defects and improve the quality and production yield of wafers, this paper proposes a Multi-Feature Fusion Perceptual Network (MFFP-Net) inspired by human visual perception mechanisms. The MFFP-Net can process information at various scales and then aggregate it so that the next stage can abstract features from the different scales simultaneously. The proposed feature fusion module can obtain finer-grained and richer features to capture key texture details and avoid the loss of important information. The final experiments show that MFFP-Net achieves good generalization ability and state-of-the-art results on the real-world dataset WM-811K, with an accuracy of 96.71%. This provides an effective way for the chip manufacturing industry to improve the yield rate.
Representation based regression for object distance estimation
In this study, we propose a novel approach to predict the distances of the detected objects in an observed scene. The proposed approach modifies the recently proposed Convolutional Support Estimator Networks (CSENs). CSENs are designed to compute a direct mapping for the Support Estimation (SE) task in a representation-based classification problem. We further propose and demonstrate that representation-based methods (sparse or collaborative representation) can be used in well-designed regression problems, especially over scarce data. To the best of our knowledge, this is the first representation-based method proposed for performing a regression task by utilizing the modified CSENs; hence, we name this novel approach Representation-based Regression (RbR). The initial version of CSENs has a proxy mapping stage (i.e., a coarse estimation for the support set) that is required for the input. In this study, we improve the CSEN model by proposing Compressive Learning CSEN (CL-CSEN), which has the ability to jointly optimize the so-called proxy mapping stage along with convolutional layers. The experimental evaluations using the KITTI 3D Object Detection distance estimation dataset show that the proposed method can achieve a significantly improved distance estimation performance over all competing methods. Finally, the software implementations of the methods are publicly shared at https://github.com/meteahishali/CSENDistance.
Depth-adaptive methodologies for 3D image categorization
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Image classification is an active topic of computer vision research. This topic deals with the learning of patterns in order to allow efficient classification of visual information. However, most research efforts have focused on 2D image classification. In recent years, advances in 3D imaging have enabled the development of applications and provided new research directions. In this thesis, we present methodologies and techniques for image classification using 3D image data. We conducted our research focusing on the attributes and limitations of depth information regarding possible uses. This research led us to the development of depth feature extraction methodologies that contribute to the representation of images, thus enhancing the recognition efficiency. We proposed a new classification algorithm that adapts to the needs of image representations by implementing a scale-based decision that exploits discriminant parts of representations. Learning from the design of image representation methods, we introduced our own, which describes each image by its depicted content, providing a more discriminative image representation. We also propose a dictionary learning method that exploits the relation of training features by assessing the similarity of features originating from similar context regions. Finally, we present our research on deep learning algorithms combined with data and techniques used in 3D imaging. Our novel methods provide state-of-the-art results, thus contributing to the research of 3D image classification.
Discriminative Video Representation Learning
Representation learning is a fundamental research problem in the area of machine learning, refining the raw data to discover representations needed for various applications. However, real-world data, particularly video data, is neither mathematically nor computationally convenient to process due to its semantic redundancy and complexity. Video data, as opposed to images, includes temporal correlation and motion dynamics, but the ground truth label is normally limited to category labels, which makes video representation learning a challenging problem. To this end, this thesis addresses the problem of video representation learning, specifically discriminative video representation learning, which focuses on capturing useful data distributions and reliable feature representations that improve the performance of varied downstream tasks. We argue that not all frames in a video, nor all dimensions of a feature vector, are useful, and that they should not be treated equally for video representation learning. Based on this argument, several novel algorithms are investigated in this thesis under multiple application scenarios, such as action recognition, action detection and one-class video anomaly detection. These proposed video representation learning methods produce discriminative video features in both deep and non-deep learning setups.
Specifically, they are presented in the form of: 1) an early fusion layer that adopts a temporal ranking SVM formulation, agglomerating several optical flow images from consecutive frames into a novel compact representation, named dynamic optical flow images; 2) an intermediate feature aggregation layer that applies weakly-supervised contrastive learning techniques, learning discriminative video representations by contrasting positive and negative samples from a sequence; 3) a new formulation for one-class feature learning that learns a set of discriminative subspaces with orthonormal hyperplanes to flexibly bound the one-class data distribution using Riemannian optimisation methods. We provide extensive experiments to gain intuitions into why the learned representations are discriminative and useful. All the proposed methods in this thesis are evaluated on standard publicly available benchmarks, demonstrating state-of-the-art performance.
Learning Discriminative Feature Representations for Visual Categorization
Learning discriminative feature representations has attracted a great deal of attention due to its potential value and wide usage in a variety of areas, such as image/video recognition and retrieval, human activities analysis, intelligent surveillance and human-computer interaction.
In this thesis we first introduce a new boosted key-frame selection scheme for action recognition. Specifically, we propose to select a subset of key poses for the representation of each action via AdaBoost, and a new classifier, namely WLNBNN, is then developed for final classification. The experimental results of the proposed method are 0.6% - 13.2% better than previous work. After that, a domain-adaptive learning approach based on multiobjective genetic programming (MOGP) has been developed for image classification. In this method, a set of primitive 2-D operators are randomly combined to construct feature descriptors through the MOGP evolution and then evaluated by two objective fitness criteria, i.e., the classification error and the tree complexity. The (near-)optimal feature descriptor can then be obtained. The proposed approach achieves 0.9% - 25.9% better performance compared with state-of-the-art methods. Moreover, effective dimensionality reduction algorithms have also been widely used for obtaining better representations. In this thesis, we have proposed a novel linear unsupervised algorithm, termed Discriminative Partition Sparsity Analysis (DPSA), which explicitly considers the different probabilistic distributions that exist over the data points while preserving the natural locality relationship among the data. All of the above methods have been systematically evaluated on several public datasets, showing their accurate and robust performance (0.44% - 6.69% better than previous work) for action and image categorization. Targeting efficient image classification, we also introduce a novel unsupervised framework termed evolutionary compact embedding (ECE), which can automatically learn task-specific binary hash codes. It is regarded as an optimization algorithm which combines genetic programming (GP) and a boosting trick. The experimental results show that ECE significantly outperforms other methods by 1.58% - 2.19% for classification tasks. In addition, a supervised framework, bilinear local feature hashing (BLFH), has also been proposed to learn highly discriminative binary codes on the local descriptors for large-scale image similarity search. We address it as a nonconvex optimization problem to seek orthogonal projection matrices for hashing, which can successfully preserve the pairwise similarity between different local features and simultaneously take image-to-class (I2C) distances into consideration. BLFH produces outstanding results (0.017% - 0.149% better) compared to the state-of-the-art hashing techniques.
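The abstract above evaluates each candidate feature descriptor on two fitness criteria, classification error and tree complexity. One common way to compare candidates under two minimized objectives is Pareto dominance. The following is a minimal sketch of that comparison only, not the thesis's MOGP implementation; the function names and the score values are hypothetical and chosen purely for illustration.

```python
from typing import List, Tuple

# Each candidate descriptor is scored as (classification_error, tree_complexity).
# Both objectives are minimized. The scores used below are made-up illustrations.
Score = Tuple[float, int]


def dominates(a: Score, b: Score) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(scores: List[Score]) -> List[Score]:
    """Keep the candidates that no other candidate dominates."""
    return [s for s in scores if not any(dominates(t, s) for t in scores)]


# Hypothetical candidates: (error rate, number of tree nodes).
candidates = [(0.10, 40), (0.12, 15), (0.10, 55), (0.20, 15), (0.08, 90)]
front = pareto_front(candidates)
print(front)
```

In this sketch, a candidate survives only if no other candidate is at least as accurate and at least as simple while being strictly better on one of the two. An evolutionary search would typically select parents from such a non-dominated set.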
A NOVEL APPROACH FOR IMPROVING THE QUALITY OF DATA USING AGGREGATION MECHANISM
With the rise of big data applications, it is becoming increasingly important to manage and analyze large volumes of data. However, it is not always possible to efficiently analyze very large chunks of detailed data. Thus, data aggregation techniques have emerged as an efficient solution for reducing the data size and providing a summary of the key information in the original data. For example, yearly stock sales are used instead of daily sales to provide a general summary of the sales. Data aggregation aims to group raw data elements in order to facilitate the assessment of higher-level concepts. However, data aggregation can result in the loss of some important details in the original data, which means that the aggregation should be designed carefully in order to keep the data informative even if some details are lost. In some cases, we may have only aggregated versions of the data due to data collection constraints as well as the high storage and processing requirements of big data. In these cases, we need to find the relationship between aggregated datasets and original datasets. Data disaggregation is one solution for this issue. However, accurate disaggregation is not always possible or easy to apply.
In this dissertation, we introduce a novel approach to improve the quality of data, making it more informative without disaggregating the data. We propose an information-preserving, signature-based preprocessing strategy, as well as an aggregation-based information retrieval architecture using signatures. We compensate for the loss of details in the raw data by highlighting the most informative parts of the aggregated data. Our approach can be used to assess similarity and correspondence between datasets and to link aggregated historical data with the most closely related datasets. We extended our approach to be used with time series datasets. We also created hybrid signatures to be used at any aggregation level.
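The yearly-versus-daily sales example in the abstract above can be made concrete with a few lines of code. This is a minimal sketch of plain sum aggregation only; it does not implement the signature-based approach of the dissertation, and the records and the function name `aggregate_by_year` are hypothetical.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Tuple

# Hypothetical daily sales records: (day, amount). Values are illustrative only.
daily_sales: List[Tuple[date, float]] = [
    (date(2020, 1, 2), 120.0),
    (date(2020, 6, 15), 80.0),
    (date(2021, 3, 9), 200.0),
    (date(2021, 11, 30), 50.0),
]


def aggregate_by_year(records: List[Tuple[date, float]]) -> Dict[int, float]:
    """Sum the amounts of all records that fall in the same calendar year."""
    totals: Dict[int, float] = defaultdict(float)
    for day, amount in records:
        totals[day.year] += amount
    return dict(totals)


yearly_sales = aggregate_by_year(daily_sales)
print(yearly_sales)
```

The aggregated view is smaller than the daily one, but the individual daily amounts cannot be recovered from the yearly totals alone. That loss of detail is the problem the abstract describes.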
Evaluation of optimal solutions in multicriteria models for intelligent decision support
This dissertation is framed within optimization and its use in decision making. The logical sequence followed has been modelling, implementation, solution and validation, leading to a decision. To this end, we have used tools from multicriteria analysis, multiobjective optimization and artificial intelligence.
The work is structured in two parts (each divided into three chapters), corresponding to the theoretical part and the experimental part. The first part analyses the context of the field of study through a review of its historical background. A chapter is then devoted to multicriteria optimization, gathering well-known models together with original contributions of this work. The third chapter, devoted to artificial intelligence, presents the foundations of statistical learning and the machine learning and deep learning techniques needed for the contributions in the second part.
The second part contains seven real cases to which the described techniques have been applied. The first chapter studies two cases: the academic performance of students at the Universidad Industrial de Santander (Colombia) and an objective system for awarding the NBA MVP prize. The next chapter applies artificial intelligence techniques to musical similarity (plagiarism detection on YouTube), the prediction of the closing price of a company on the New York stock market, and the automatic classification of spatial acoustic signals in immersive environments. In the last chapter, multicriteria analysis techniques are combined with artificial intelligence to detect university academic failure at an early stage (at the Universidad Industrial de Santander), and multicriteria methods are used to establish a ranking of artificial intelligence models.
To close the dissertation, although each chapter contains a partial conclusion, Chapter 8 gathers the main conclusions of the whole dissertation together with a fairly exhaustive bibliography of the topics covered. In addition, the work concludes with three appendices containing the programs and tools which, although useful for understanding the dissertation, have been placed separately so that the chapters read more fluidly.