14 research outputs found

    Identification of Anisomerous Motor Imagery EEG Signals Based on Complex Algorithms

    Motor imagery (MI) electroencephalogram (EEG) signals are widely applied in brain-computer interfaces (BCIs). However, the number of MI states that can be classified is limited, and classification accuracy rates are low, because the signals are nonlinear and nonstationary. This study proposes a novel MI pattern recognition system based on complex algorithms for classifying MI EEG signals. For electrooculogram (EOG) artifact preprocessing, band-pass filtering first isolates the frequency band of MI-related signals; canonical correlation analysis (CCA) combined with wavelet threshold denoising (WTD) then removes the EOG artifacts. We propose a regularized common spatial pattern (R-CSP) algorithm for EEG feature extraction that incorporates the principle of generic learning. A new classifier combining the K-nearest neighbor (KNN) and support vector machine (SVM) approaches classifies four anisomerous states: imaginary movements of the left hand, right foot, and right shoulder, and the resting state. The highest classification accuracy rate is 92.5%, and the average classification accuracy rate is 87%. The proposed complex-algorithm identification method significantly improves the identification rate of the minority samples and the overall classification performance.
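    A minimal sketch of such a pipeline is shown below, assuming synthetic two-class trials, an 8-30 Hz band, identity-shrinkage covariance regularization, and naive probability averaging for the SVM+KNN combination; none of these choices are claimed to match the paper's exact R-CSP formulation or fusion rule.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.signal import butter, filtfilt
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def bandpass(trials, lo=8.0, hi=30.0, fs=250.0, order=4):
    # Band-pass each trial (n_trials, n_channels, n_samples) to the MI band.
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, trials, axis=-1)

def csp_filters(class_a, class_b, reg=0.1, n_pairs=2):
    # Regularized two-class CSP: shrink covariances toward the identity
    # (illustrative stand-in for R-CSP), then solve the generalized
    # eigenproblem C_a w = lambda (C_a + C_b) w.
    def reg_cov(trials):
        c = np.mean([t @ t.T / np.trace(t @ t.T) for t in trials], axis=0)
        return (1 - reg) * c + reg * np.trace(c) / c.shape[0] * np.eye(c.shape[0])
    ca, cb = reg_cov(class_a), reg_cov(class_b)
    _, vecs = eigh(ca, ca + cb)                      # eigenvalues ascending
    return np.hstack([vecs[:, :n_pairs], vecs[:, -n_pairs:]])

def log_var_features(trials, W):
    # Log-variance of the spatially filtered trials.
    return np.array([np.log(np.var(W.T @ t, axis=1)) for t in trials])

# Toy usage on synthetic trials (one binary pair of the four states):
rng = np.random.default_rng(0)
X_a = bandpass(rng.standard_normal((20, 8, 500)))
X_b = bandpass(1.5 * rng.standard_normal((20, 8, 500)))
W = csp_filters(X_a, X_b)
X = np.vstack([log_var_features(X_a, W), log_var_features(X_b, W)])
y = np.r_[np.zeros(20), np.ones(20)]
svm = SVC(probability=True).fit(X, y)
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
fused = (svm.predict_proba(X) + knn.predict_proba(X)) / 2  # naive SVM+KNN fusion
```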

    Adaptive Similarity Measures for Material Identification in Hyperspectral Imagery

    Remotely-sensed hyperspectral imagery has become one of the most advanced tools for analyzing the processes that shape the Earth and other planets. Effective, rapid analysis of high-volume, high-dimensional hyperspectral image data sets demands efficient, automated techniques for identifying signatures of known materials in such imagery. In this thesis, we develop a framework for automatic material identification in hyperspectral imagery using adaptive similarity measures. We frame material identification as a multiclass similarity-based classification problem, where our goal is to predict material labels for unlabeled target spectra based upon their similarities to source spectra with known material labels. Because differences in capture conditions affect the spectral representations of materials, we divide the material identification problem into intra-domain (i.e., source and target spectra captured under identical conditions) and inter-domain (i.e., source and target spectra captured under different conditions) settings. The first component of this thesis develops adaptive similarity measures for intra-domain settings that measure the relevance of spectral features to the given classification task using small amounts of labeled data. We propose a technique based on multiclass Linear Discriminant Analysis (LDA) that combines several distinct similarity measures into a single hybrid measure capturing the strengths of each individual measure. We also provide a comparative survey of techniques for low-rank Mahalanobis metric learning, and demonstrate that regularized LDA yields results competitive with the state-of-the-art at substantially lower computational cost. The second component of this thesis shifts the focus to inter-domain settings and proposes a multiclass domain adaptation framework that reconciles systematic differences between spectra captured under similar, but not identical, conditions. Our framework computes a similarity-based mapping that captures structured, relative relationships between classes shared between the source and target domains, allowing us to apply a classifier trained on labeled source spectra to classify target spectra. We demonstrate improved domain adaptation accuracy in comparison to recently proposed multitask learning and manifold alignment techniques in several case studies involving state-of-the-art synthetic and real-world hyperspectral imagery.
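    The following sketch illustrates the intra-domain idea under stated assumptions: shrinkage-regularized LDA learns a low-rank projection (a Mahalanobis-like similarity adapted to the task), and unlabeled target spectra are labeled by nearest neighbors in the projected space. The synthetic spectra, shrinkage value, and neighbor count are illustrative, not the thesis' configuration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n_bands, n_classes = 200, 4
means = rng.standard_normal((n_classes, n_bands))       # one mean spectrum per material
source = np.vstack([m + 0.3 * rng.standard_normal((50, n_bands)) for m in means])
labels = np.repeat(np.arange(n_classes), 50)
target = means[labels[::5]] + 0.3 * rng.standard_normal((len(labels[::5]), n_bands))

# Shrinkage-regularized LDA projects to at most C-1 dimensions; Euclidean
# distance there acts as a task-adapted, low-rank Mahalanobis similarity.
lda = LinearDiscriminantAnalysis(solver="eigen", shrinkage=0.2,
                                 n_components=n_classes - 1)
src_proj = lda.fit_transform(source, labels)
tgt_proj = lda.transform(target)

clf = KNeighborsClassifier(n_neighbors=5).fit(src_proj, labels)
print("predicted material labels:", clf.predict(tgt_proj)[:10])
```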

    Person Re-Identification Using Machine Learning Methods for Public Deployment Environments

    Appearance-based person re-identification in public environments is one of the most challenging, still unsolved computer vision tasks. Many of its sub-tasks can only be solved by combining machine learning with computer vision methods. In this thesis, we use machine learning approaches to improve every processing step of appearance-based person re-identification: we apply convolutional neural networks to learn appearance-based features capable of performing re-identification at a human level. To generate a template describing the person of interest, we apply machine learning approaches that automatically select person-specific, discriminative features. A learned metric helps to compensate for scenario-specific perturbations when matching features. Fusing complementary features at score level improves re-identification performance significantly, which is achieved above all by a learned feature weighting. We evaluate our approach in two exemplary applications, video surveillance and robotics. In the surveillance application, person re-identification enables multi-camera tracking, which helps human operators quickly determine the current location of the person of interest. By applying appearance-based re-identification, a mobile service robot can identify its current user and keep track of that user when following or guiding them. We characterize the quality of appearance-based person re-identification by twelve criteria that enable a comparison with biometric approaches. Owing to the machine learning techniques employed, appearance-based person re-identification in the considered unsupervised, public fields of application performs on par with biometric approaches.
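    A minimal sketch of score-level fusion with a learned weighting follows: per-pair match scores from two complementary feature channels are combined by a logistic regression whose coefficients play the role of learned score weights. The two toy score sources and their noise levels are assumptions for illustration, not the thesis' actual features or fusion model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_pairs = 400
same = rng.integers(0, 2, n_pairs)                 # 1 = same person, 0 = different
# Two complementary match scores (e.g., one appearance channel, one learned
# metric), each informative to a different degree:
score_a = 0.8 * same + 0.4 * rng.standard_normal(n_pairs)
score_b = 0.5 * same + 0.4 * rng.standard_normal(n_pairs)
S = np.c_[score_a, score_b]

fuser = LogisticRegression().fit(S, same)          # learns the score weighting
print("learned score weights:", fuser.coef_[0])
fused = fuser.predict_proba(S)[:, 1]               # fused match probability per pair
```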

    A deep learning system for recognizing facial expression in real-time

    This article presents an image-based real-time facial expression recognition system that can recognize the facial expressions of several subjects on a webcam at the same time. Our proposed methodology combines a supervised transfer learning strategy with a joint supervision method using center loss, which is crucial for facial tasks. A recently proposed Convolutional Neural Network (CNN) model, MobileNet, which offers both accuracy and speed, is deployed both offline and in a real-time framework that enables fast and accurate real-time output. Evaluations are carried out on two publicly available datasets, JAFFE and CK+. The system reaches an accuracy of 95.24% on the JAFFE dataset and 96.92% on the 6-class CK+ dataset, which contains only the last frames of the image sequences. Finally, the average run-time cost of the real-time implementation is around 3.57 ms/frame on an NVIDIA Quadro K4200 GPU.
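    A sketch of the transfer-learning setup is given below, assuming an ImageNet-pretrained MobileNet backbone frozen under a new 7-class softmax head; the head size, input shape, and optimizer are illustrative choices, and the paper's joint center-loss supervision is only indicated by the embedding layer, not reproduced.

```python
import tensorflow as tf

base = tf.keras.applications.MobileNet(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # transfer learning: freeze pretrained features first

inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.applications.mobilenet.preprocess_input(inputs)
x = base(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
embedding = tf.keras.layers.Dense(128, activation="relu")(x)  # feature center loss would act on
outputs = tf.keras.layers.Dense(7, activation="softmax")(embedding)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, ...)  # e.g., JAFFE or CK+ face crops
```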

    Inter-modality image synthesis and recognition.

    Inter-modality image synthesis and recognition has been a hot topic in computer vision. In real-world applications, there are diverse image modalities, such as sketch images for law enforcement and near-infrared images for illumination-invariant face recognition. Because image data in some modalities are difficult to acquire, techniques that transform images from one modality to another or match images across modalities are often useful and provide great flexibility for computer vision applications. This thesis studies three problems: face sketch synthesis, example-based image stylization, and face sketch recognition.
    For face sketch synthesis, we expand the frontier to synthesis from uncontrolled face photos; previous methods only work under well-controlled conditions. We propose a robust algorithm for synthesizing a face sketch from a face photo with lighting and pose variations. It synthesizes local sketch patches using a multiscale Markov Random Field (MRF) model. Robustness to lighting and pose variations is achieved with three components: shape priors specific to facial components that reduce artifacts and distortions, patch descriptors and robust metrics for selecting sketch patch candidates, and intensity and gradient compatibility terms that match neighboring sketch patches effectively. Experiments on the CUHK face sketch database and celebrity photos collected from the web show that our algorithm significantly improves on the state-of-the-art.
    For example-based image stylization, we provide an effective approach for transferring artistic effects from a template image to photos. Most existing methods do not treat content and style separately. We propose a style transfer algorithm based on frequency band decomposition: an image is decomposed into low-frequency (LF), mid-frequency (MF), and high-frequency (HF) components, which describe the content, the main style, and information along boundaries, respectively. The style is then transferred from the template to the photo in the MF and HF components, formulated as MRF optimization. Finally, a reconstruction step combines the LF component of the photo with the obtained style information to generate the artistic result. Compared with other algorithms, our method not only synthesizes the style but also preserves the image content well. Experiments demonstrate that our approach performs excellently in image stylization and personalized artwork.
    For face sketch recognition, we propose a new direction based on learning face descriptors from data. Recent research has focused on transforming photos and sketches into the same modality for matching, or on developing advanced classification algorithms to reduce the modality gap between features extracted from photos and sketches. We instead reduce the modality gap at the feature extraction stage: a face descriptor based on coupled information-theoretic encoding captures discriminative local face structures and effectively matches photos and sketches. Guided by maximizing the mutual information between photos and sketches in the quantized feature spaces, the coupled encoding is achieved by the proposed coupled information-theoretic projection forest. Experiments on the largest available face sketch database show that our approach significantly outperforms the state-of-the-art methods. (Zhang, Wei. Thesis (Ph.D.), Chinese University of Hong Kong, 2012.)
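    A minimal sketch of the band-decomposition step follows, assuming Gaussian filters for the LF/MF/HF split and a direct recombination of the photo's LF with the template's MF/HF; the sigmas and the naive component swap are assumptions, whereas the thesis transfers the MF/HF style via MRF optimization.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def decompose(img, sigma_lf=8.0, sigma_mf=2.0):
    lf = gaussian_filter(img, sigma_lf)           # low frequency: content
    mf = gaussian_filter(img, sigma_mf) - lf      # mid frequency: main style
    hf = img - gaussian_filter(img, sigma_mf)     # high frequency: boundaries
    return lf, mf, hf

rng = np.random.default_rng(0)
photo = rng.random((128, 128))                    # stand-ins for real images
template = rng.random((128, 128))

lf_photo, _, _ = decompose(photo)
_, mf_tmpl, hf_tmpl = decompose(template)
stylized = lf_photo + mf_tmpl + hf_tmpl           # naive reconstruction step
```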

    Robust Computer Vision Against Adversarial Examples and Domain Shifts

    Recent advances in deep learning have achieved remarkable success in various computer vision problems. Driven by progressively growing computing resources and vast amounts of data, deep learning technology is reshaping human life. However, Deep Neural Networks (DNNs) have been shown to be vulnerable to adversarial examples, in which carefully crafted perturbations easily fool DNNs into making wrong predictions. DNNs also generalize poorly under domain shifts, suffering performance degradation when encountering data from new visual distributions. We view these issues from the perspective of robustness: existing deep learning technology is not reliable enough for many scenarios, and adversarial examples and domain shifts are among the most critical problems. This lack of reliability inevitably keeps DNNs out of important computer vision applications with major safety concerns, such as self-driving vehicles and medical instruments. To overcome these challenges, we investigate and address the robustness of deep learning-based computer vision approaches. The first part of this thesis robustifies computer vision models against adversarial examples. We examine adversarial robustness from four aspects: novel attacks for strengthening benchmarks, empirical defenses validated by a third-party evaluator, generalizable defenses that can defend against multiple and unforeseen attacks, and defenses designed specifically for less explored tasks. The second part improves robustness against domain shifts via domain adaptation. We study two important settings: unsupervised domain adaptation, which is the most common, and source-free domain adaptation, which is more practical in real-world scenarios. The last part explores the intersection of adversarial robustness and domain adaptation to provide new insights for robust DNNs, in two directions: adversarial defense for domain adaptation and adversarial defense via domain adaptation. This dissertation aims at more robust, reliable, and trustworthy computer vision.
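    For concreteness, here is a sketch of crafting an adversarial example with FGSM, the classic single-step gradient-sign attack; the dissertation's own attacks and defenses are more elaborate, and the model, epsilon, and random input below are assumptions for illustration.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()            # any differentiable classifier
x = torch.rand(1, 3, 224, 224, requires_grad=True)
y = torch.tensor([0])                            # (assumed) true label

# FGSM: one signed-gradient step on the loss, clipped to the valid pixel range.
loss = F.cross_entropy(model(x), y)
loss.backward()
epsilon = 8 / 255
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1).detach()
print("prediction flipped:",
      model(x).argmax().item() != model(x_adv).argmax().item())
```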

    Attention Mechanism for Recognition in Computer Vision

    It has been proven that humans do not focus their attention on an entire scene at once when they perform a recognition task. Instead, they pay attention to the most important parts of the scene to extract the most discriminative information. Inspired by this observation, this dissertation studies the importance of the attention mechanism in recognition tasks in computer vision by designing novel attention-based models. Specifically, four scenarios are investigated that represent the most important aspects of the attention mechanism. First, an attention-based model is designed to reduce the dimensionality of visual features by selectively processing only a small subset of the data. We study this aspect of the attention mechanism in a framework based on object recognition in distributed camera networks. Second, an attention-based image retrieval system (i.e., person re-identification) is proposed that learns to focus on the most discriminative regions of the person's image and processes those regions with higher computational power using a deep convolutional neural network. Furthermore, we show how visualizing the attention maps can make deep neural networks more interpretable: by visualizing the attention maps, we can observe the regions of the input image that the neural network relies on to make a decision. Third, a model is proposed for estimating the importance of the objects in a scene given a task. More specifically, the proposed model estimates the importance of the road users that a driver (or an autonomous vehicle) should pay attention to in a driving scenario in order to navigate safely. In this scenario, the attention estimate is the final output of the model. Fourth, an attention-based module and a new loss function are proposed for a meta-learning-based few-shot learning system, in order to incorporate the context of the task into the feature representations of the samples and increase few-shot recognition accuracy. In this dissertation, we showed that attention can be multi-faceted, studying the attention mechanism from the perspectives of feature selection, computational cost reduction, interpretable deep learning models, task-driven importance estimation, and context incorporation. Through the study of these four scenarios, we further advanced the field in which "attention is all you need".
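    A generic sketch of the underlying idea follows: a soft spatial-attention block scores each location with a 1x1 convolution, normalizes the scores into an attention map, and reweights the features by it; the map itself can be visualized for interpretability. This is an illustration of the mechanism under simple assumptions, not the dissertation's specific modules.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)  # per-location score

    def forward(self, x):                         # x: (B, C, H, W)
        b, c, h, w = x.shape
        attn = self.score(x).view(b, 1, h * w).softmax(dim=-1).view(b, 1, h, w)
        return x * attn * (h * w), attn           # rescale so the mean weight is 1

feats = torch.randn(2, 64, 16, 16)
out, attn_map = SpatialAttention(64)(feats)       # attn_map can be visualized
```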