17 research outputs found

    An Analysis on Adversarial Machine Learning: Methods and Applications

    Get PDF
    Deep learning has witnessed astonishing advancement in the last decade and revolutionized many fields ranging from computer vision to natural language processing. A prominent field of research that enabled such achievements is adversarial learning, investigating the behavior and functionality of a learning model in presence of an adversary. Adversarial learning consists of two major trends. The first trend analyzes the susceptibility of machine learning models to manipulation in the decision-making process and aims to improve the robustness to such manipulations. The second trend exploits adversarial games between components of the model to enhance the learning process. This dissertation aims to provide an analysis on these two sides of adversarial learning and harness their potential for improving the robustness and generalization of deep models. In the first part of the dissertation, we study the adversarial susceptibility of deep learning models. We provide an empirical analysis on the extent of vulnerability by proposing two adversarial attacks that explore the geometric and frequency-domain characteristics of inputs to manipulate deep decisions. Afterward, we formalize the susceptibility of deep networks using the first-order approximation of the predictions and extend the theory to the ensemble classification scheme. Inspired by theoretical findings, we formalize a reliable and practical defense against adversarial examples to robustify ensembles. We extend this part by investigating the shortcomings of \gls{at} and highlight that the popular momentum stochastic gradient descent, developed essentially for natural training, is not proper for optimization in adversarial training since it is not designed to be robust against the chaotic behavior of gradients in this setup. Motivated by these observations, we develop an optimization method that is more suitable for adversarial training. In the second part of the dissertation, we harness adversarial learning to enhance the generalization and performance of deep networks in discriminative and generative tasks. We develop several models for biometric identification including fingerprint distortion rectification and latent fingerprint reconstruction. In particular, we develop a ridge reconstruction model based on generative adversarial networks that estimates the missing ridge information in latent fingerprints. We introduce a novel modification that enables the generator network to preserve the ID information during the reconstruction process. To address the scarcity of data, {\it e.g.}, in latent fingerprint analysis, we develop a supervised augmentation technique that combines input examples based on their salient regions. Our findings advocate that adversarial learning improves the performance and reliability of deep networks in a wide range of applications

    Wafer defect recognition method based on multi-scale feature fusion

    Get PDF
    Wafer defect recognition is an important process of chip manufacturing. As different process flows can lead to different defect types, the correct identification of defect patterns is important for recognizing manufacturing problems and fixing them in good time. To achieve high precision identification of wafer defects and improve the quality and production yield of wafers, this paper proposes a Multi-Feature Fusion Perceptual Network (MFFP-Net) inspired by human visual perception mechanisms. The MFFP-Net can process information at various scales and then aggregate it so that the next stage can abstract features from the different scales simultaneously. The proposed feature fusion module can obtain higher fine-grained and richer features to capture key texture details and avoid important information loss. The final experiments show that MFFP-Net achieves good generalized ability and state-of-the-art results on real-world dataset WM-811K, with an accuracy of 96.71%, this provides an effective way for the chip manufacturing industry to improve the yield rate

    Representation based regression for object distance estimation

    Get PDF
    In this study, we propose a novel approach to predict the distances of the detected objects in an observed scene. The proposed approach modifies the recently proposed Convolutional Support Estimator Networks (CSENs). CSENs are designed to compute a direct mapping for the Support Estimation (SE) task in a representation-based classification problem. We further propose and demonstrate that representation-based methods (sparse or collaborative representation) can be used in well-designed regression problems especially over scarce data. To the best of our knowledge, this is the first representation-based method proposed for performing a regression task by utilizing the modified CSENs; and hence, we name this novel approach as Representation-based Regression (RbR). The initial version of CSENs has a proxy mapping stage (i.e., a coarse estimation for the support set) that is required for the input. In this study, we improve the CSEN model by proposing Compressive Learning CSEN (CL-CSEN) that has the ability to jointly optimize the so-called proxy mapping stage along with convolutional layers. The experimental evaluations using the KITTI 3D Object Detection distance estimation dataset show that the proposed method can achieve a significantly improved distance estimation performance over all competing methods. Finally, the software implementations of the methods are publicly shared at https://github.com/meteahishali/CSENDistance.publishedVersionPeer reviewe

    Discriminative Video Representation Learning

    Get PDF
    Representation learning is a fundamental research problem in the area of machine learning, refining the raw data to discover representations needed for various applications. However, real-world data, particularly video data, is neither mathematically nor computationally convenient to process due to its semantic redundancy and complexity. Video data, as opposed to images, includes temporal correlation and motion dynamics, but the ground truth label is normally limited to category labels, which makes the video representation learning a challenging problem. To this end, this thesis addresses the problem of video representation learning, specifically discriminative video representation learning, which focuses on capturing useful data distributions and reliable feature representations improving the performance of varied downstream tasks. We argue that neither all frames in one video nor all dimensions in one feature vector are useful and should be equally treated for video representation learning. Based on this argument, several novel algorithms are investigated in this thesis under multiple application scenarios, such as action recognition, action detection and one-class video anomaly detection. These proposed video representation learning methods produce discriminative video features in both deep and non-deep learning setups. Specifically, they are presented in the form of: 1) an early fusion layer that adopts a temporal ranking SVM formulation, agglomerating several optical flow images from consecutive frames into a novel compact representation, named as dynamic optical flow images; 2) an intermediate feature aggregation layer that applies weakly-supervised contrastive learning techniques, learning discriminative video representations via contrasting positive and negative samples from a sequence; 3) a new formulation for one-class feature learning that learns a set of discriminative subspaces with orthonormal hyperplanes to flexibly bound the one-class data distribution using Riemannian optimisation methods. We provide extensive experiments to gain intuitions into why the learned representations are discriminative and useful. All the proposed methods in this thesis are evaluated on standard publicly available benchmarks, demonstrating state-of-the-art performance

    Learning Discriminative Feature Representations for Visual Categorization

    Get PDF
    Learning discriminative feature representations has attracted a great deal of attention due to its potential value and wide usage in a variety of areas, such as image/video recognition and retrieval, human activities analysis, intelligent surveillance and human-computer interaction. In this thesis we first introduce a new boosted key-frame selection scheme for action recognition. Specifically, we propose to select a subset of key poses for the representation of each action via AdaBoost and a new classifier, namely WLNBNN, is then developed for final classification. The experimental results of the proposed method are 0.6% - 13.2% better than previous work. After that, a domain-adaptive learning approach based on multiobjective genetic programming (MOGP) has been developed for image classification. In this method, a set of primitive 2-D operators are randomly combined to construct feature descriptors through the MOGP evolving and then evaluated by two objective fitness criteria, i.e., the classification error and the tree complexity. Later, the (near-)optimal feature descriptor can be obtained. The proposed approach can achieve 0.9% ∼ 25.9% better performance compared with state-of-the-art methods. Moreover, effective dimensionality reduction algorithms have also been widely used for obtaining better representations. In this thesis, we have proposed a novel linear unsupervised algorithm, termed Discriminative Partition Sparsity Analysis (DPSA), explicitly considering different probabilistic distributions that exist over the data points, simultaneously preserving the natural locality relationship among the data. All these above methods have been systematically evaluated on several public datasets, showing their accurate and robust performance (0.44% - 6.69% better than the previous) for action and image categorization. Targeting efficient image classification , we also introduce a novel unsupervised framework termed evolutionary compact embedding (ECE) which can automatically learn the task-specific binary hash codes. It is regarded as an optimization algorithm which combines the genetic programming (GP) and a boosting trick. The experimental results manifest ECE significantly outperform others by 1.58% - 2.19% for classification tasks. In addition, a supervised framework, bilinear local feature hashing (BLFH), has also been proposed to learn highly discriminative binary codes on the local descriptors for large-scale image similarity search. We address it as a nonconvex optimization problem to seek orthogonal projection matrices for hashing, which can successfully preserve the pairwise similarity between different local features and simultaneously take image-to-class (I2C) distances into consideration. BLFH produces outstanding results (0.017% - 0.149% better) compared to the state-of-the-art hashing techniques

    A NOVEL APPROACH FOR IMPROVING THE QUALITY OF DATA USING AGGREGATION MECHANISM

    Get PDF
    Due to the inception of the big data applications, it is becoming increasingly important to manage and analyze large volumes of data. However, it is not always possible to efficiently analyze very big chunks of detailed data. Thus, data aggregation techniques emerged as an efficient solution for reducing the data size and providing summary of the key information in the original data. For example, yearly stock sales are used instead of daily sales to provide a general summary of the sales. Data aggregation aims to group raw data elements in order to facilitate the assessment of higher-level concepts. However, data aggregation can result in the loss of some important details in the original data, which means that the aggregation should be done in a creative manner in order to keep the data informative even if there is a loss in some details. In some cases, we may have only aggregated versions of the data due to the data collection constraints as well as high storage and processing requirements of the big data. In these cases, we need to find the relationship between aggregated datasets and original datasets. Data disaggregation is one solution for this issue. However, accurate disaggregation is not always possible and easy to utilize. In this dissertation, we introduce a novel approach to improve the quality of data to be more informative without disaggregating the data. We propose information preserving signature based preprocessing strategy, as well as an aggregation-based information retrieval architecture using signatures. We compensate the loss of details in the raw data by highlighting the most informative parts in the aggregated data. Our approach can be used to assess similarity and correspondence between datasets and to link aggregated historical data with most related datasets. We extended our approach to be used with time series datasets. We also created hybrid signatures to be used at any aggregation level

    Evaluation of optimal solutions in multicriteria models for intelligent decision support

    Get PDF
    La memoria se enmarca dentro de la optimización y su uso para la toma de decisiones. La secuencia lógica ha sido la modelación, implementación, resolución y validación que conducen a una decisión. Para esto, hemos utilizado herramientas del análisis multicrerio, optimización multiobjetivo y técnicas de inteligencia artificial. El trabajo se ha estructurado en dos partes (divididas en tres capítulos cada una) que se corresponden con la parte teórica y con la parte experimental. En la primera parte se analiza el contexto del campo de estudio con un análisis del marco histórico y posteriormente se dedica un capítulo a la optimización multicriterio en el se recogen modelos conocidos, junto con aportaciones originales de este trabajo. En el tercer capítulo, dedicado a la inteligencia artificial, se presentan los fundamentos del aprendizaje estadístico , las técnicas de aprendizaje automático y de aprendizaje profundo necesarias para las aportaciones en la segunda parte. La segunda parte contiene siete casos reales a los que se han aplicado las técnicas descritas. En el primer capítulo se estudian dos casos: el rendimiento académico de los estudiantes de la Universidad Industrial de Santander (Colombia) y un sistema objetivo para la asignación del premio MVP en la NBA. En el siguiente capítulo se utilizan técnicas de inteligencia artificial a la similitud musical (detección de plagios en Youtube), la predicción del precio de cierre de una empresa en el mercado bursátil de Nueva York y la clasificación automática de señales espaciales acústicas en entornos envolventes. En el último capítulo a la potencia de la inteligencia artificial se le incorporan técnicas de análisis multicriterio para detectar el fracaso escolar universitario de manera precoz (en la Universidad Industrial de Santander) y, para establecer un ranking de modelos de inteligencia artificial de se recurre a métodos multicriterio. Para acabar la memoria, a pesar de que cada capítulo contiene una conclusión parcial, en el capítulo 8 se recogen las principales conclusiones de toda la memoria y una bibliografía bastante exhaustiva de los temas tratados. Además, el trabajo concluye con tres apéndices que contienen los programas y herramientas, que a pesar de ser útiles para la comprensión de la memoria, se ha preferido poner por separado para que los capítulos resulten más fluidos
    corecore