10 research outputs found

    Primal-Dual Algorithms for Non-negative Matrix Factorization with the Kullback-Leibler Divergence

    Non-negative matrix factorization (NMF) approximates a given matrix as a product of two non-negative matrices. Multiplicative algorithms deliver reliable results, but they converge slowly on high-dimensional data and may get stuck away from local minima. Gradient descent methods behave better, but only apply to smooth losses such as the least-squares loss. In this article, we propose a first-order primal-dual algorithm, based on the Chambolle-Pock algorithm, for non-negative decomposition problems (where one factor is fixed) with the KL divergence. All required computations may be obtained in closed form, and we provide an efficient heuristic for selecting step sizes. By using alternating optimization, our algorithm readily extends to NMF and, on synthetic examples and on face recognition and music source separation datasets, it is either faster than existing algorithms, or leads to improved local optima, or both.
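
For orientation, the multiplicative baseline that the abstract compares against can be sketched in a few lines. This is a minimal sketch of the classical multiplicative updates for NMF under the generalized KL divergence, not the primal-dual algorithm proposed in the article. The function names, the toy matrix `V`, the rank, the iteration count and the small `eps` guard are all illustrative assumptions.

```python
import math
import random


def matmul(A, B):
    """Plain-Python product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence: sum of v*log(v/wh) - v + wh over all entries."""
    total = 0.0
    for v_row, wh_row in zip(V, WH):
        for v, wh in zip(v_row, wh_row):
            if v > 0:
                total += v * math.log(v / (wh + eps))
            total += wh - v
    return total


def kl_nmf_multiplicative(V, rank, iters=200, seed=0, eps=1e-12):
    """Alternate multiplicative updates of H and W (classical baseline, assumed setup)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    losses = [kl_divergence(V, matmul(W, H))]
    for _ in range(iters):
        # H_kj <- H_kj * (sum_i W_ik * V_ij / (WH)_ij) / (sum_i W_ik)
        WH = matmul(W, H)
        R = [[V[i][j] / (WH[i][j] + eps) for j in range(m)] for i in range(n)]
        for k in range(rank):
            col_sum = sum(W[i][k] for i in range(n)) + eps
            for j in range(m):
                H[k][j] *= sum(W[i][k] * R[i][j] for i in range(n)) / col_sum
        # W_ik <- W_ik * (sum_j H_kj * V_ij / (WH)_ij) / (sum_j H_kj)
        WH = matmul(W, H)
        R = [[V[i][j] / (WH[i][j] + eps) for j in range(m)] for i in range(n)]
        for k in range(rank):
            row_sum = sum(H[k][j] for j in range(m)) + eps
            for i in range(n):
                W[i][k] *= sum(H[k][j] * R[i][j] for j in range(m)) / row_sum
        losses.append(kl_divergence(V, matmul(W, H)))
    return W, H, losses


# Toy non-negative matrix (illustrative data, not from the article).
V = [
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [1.0, 0.0, 0.0, 4.0],
    [0.0, 1.0, 5.0, 4.0],
]
W, H, losses = kl_nmf_multiplicative(V, rank=2)
print("KL divergence before and after:", losses[0], losses[-1])
```

Both factors stay non-negative because each update multiplies by a non-negative ratio. Fixing `W` and running only the `H` update gives the non-negative decomposition setting (one factor fixed) that the abstract refers to.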

    Gradient edge map features for frontal face recognition under extreme illumination changes

    Our aim in this paper is to robustly match frontal faces in the presence of extreme illumination changes, using only a single training image per person and a single probe image. This is a most challenging problem in the illumination conditions we consider, which include those with the dominant light source placed behind and to the side of the user, directly above and pointing downwards, or below and pointing upwards. The presence of sharp cast shadows, large poorly illuminated regions of the face, quantum and quantization noise, and other nuisance effects makes it difficult to extract a sufficiently discriminative yet robust representation. We introduce a representation based on image gradient directions near robust edges that correspond to characteristic facial features. Robust edges are extracted using a cascade of processing steps, each of which seeks to harness further discriminative information or to normalize for a particular source of extra-personal appearance variability. The proposed representation was evaluated on the extremely difficult YaleB data set. Unlike most previous work, we include all available illuminations, perform training using a single image per person, and match these to a single probe image. In this challenging evaluation setup, the proposed gradient edge map achieved a 0.8% error rate, demonstrating nearly perfect receiver operating characteristic curve behaviour. This is by far the best performance reported in the literature for this setup; the best-performing methods previously proposed attain error rates of approximately 6–7%.
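
The core idea of using gradient directions near strong edges can be illustrated with a small sketch. The paper's actual cascade of processing steps is not described in the abstract, so this example substitutes central differences and a plain magnitude threshold for the "robust edges" stage. The function name, the threshold values and the toy image are assumptions made for illustration.

```python
import math


def gradient_directions(img, mag_threshold=0.25):
    """Return the gradient direction (radians) at pixels with a strong gradient.

    Pixels whose gradient magnitude falls below mag_threshold, and border
    pixels, are marked None. Central differences stand in for whatever
    edge detector the original method uses (an assumption).
    """
    h, w = len(img), len(img[0])
    out = [[None] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            if math.hypot(gx, gy) >= mag_threshold:
                out[y][x] = math.atan2(gy, gx)
    return out


# Toy image: dark left half, bright right half (a vertical step edge).
image = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
# The same scene under half the illumination (uniform gain of 0.5).
dim_image = [[0.5 * p for p in row] for row in image]

dirs = gradient_directions(image, mag_threshold=0.25)
dirs_dim = gradient_directions(dim_image, mag_threshold=0.1)
print(dirs[2])
print(dirs_dim[2])
```

A uniform change in gain rescales the gradient magnitude but leaves its direction unchanged, which is one reason direction-based features tolerate illumination changes better than raw intensities. Cast shadows and noise are not handled by this sketch.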

    Low-Rank and Sparse Decomposition for Hyperspectral Image Enhancement and Clustering

    In this dissertation, new algorithms are developed to enhance hyperspectral image (HSI) analysis. A tensor data format is applied to the sparse and low-rank decomposition of hyperspectral datasets, which can improve classification and detection performance, and a multi-view learning technique is applied to hyperspectral image clustering. Furthermore, a kernel version of the multi-view learning technique is proposed, which can improve clustering performance. Most low-rank and sparse decomposition algorithms for HSI analysis are based on a matrix data format. Because HSI has a high spectral dimension, tensor-based extended low-rank and sparse decomposition (TELRSD) is proposed in this dissertation, using the low-rank tensor part for HSI classification and the sparse tensor part for HSI detection. With this tensor-based method, HSI is processed in a 3D data format, and the information shared between spectral bands and pixels remains integrated during decomposition. The proposed algorithm is compared with other state-of-the-art methods, and the experimental results show that TELRSD performs best among the compared algorithms. HSI clustering is an unsupervised task that aims to group pixels without label information. Low-rank sparse subspace clustering (LRSSC) is one of the most popular algorithms for this task. The spatial-spectral based multi-view low-rank sparse subspace clustering (SSMLC) algorithm proposed in this dissertation extends LRSSC with a multi-view learning technique. In this algorithm, spectral and spatial views are created to generate a multi-view dataset from the HSI, with spectral partitioning, morphological component analysis (MCA) and principal component analysis (PCA) applied to create the other views. A kernel version of SSMLC (k-SSMLC) is also investigated. The performance of SSMLC and k-SSMLC is compared with sparse subspace clustering (SSC), LRSSC, and spectral-spatial sparse subspace clustering (S4C). The results show that SSMLC improves on LRSSC and that k-SSMLC performs best. Spectral clustering has been proved to be equivalent to a non-negative matrix factorization (NMF) problem, so NMF can be applied to the clustering problem. To include local and nonlinear features of the data, orthogonal NMF (ONMF), graph-regularized NMF (GNMF) and kernel NMF (k-NMF) have been proposed for better clustering performance. Non-linear orthogonal graph NMF (k-OGNMF) combines kernel, orthogonal and graph constraints in NMF, which further improves clustering performance. In the HSI domain, kernel multi-view based orthogonal graph NMF (k-MOGNMF), which extends k-OGNMF with a multi-view algorithm, is applied to subspace clustering and achieves better performance and computational efficiency.

    Sensing Human Sentiment via Social Media Images: Methodologies and Applications

    Social media refers to computer-based technology that allows the sharing of information and the building of virtual networks and communities. With the development of internet-based services and applications, users can engage with social media via computers and smart mobile devices. In recent years, social media has taken the form of different activities such as social networking, business networking, text sharing, photo sharing, blogging, etc. With its increasing popularity, social media has accumulated a large amount of data, which makes it possible to understand human behavior. Compared with traditional survey-based methods, the analysis of social media provides a golden opportunity to understand individuals at scale, which in turn allows us to design better services tailored to individuals' needs. From this perspective, we can view social media as sensors that provide online signals, from a virtual world with no geographical boundaries, about real-world individuals' activity. One of the key features of social media is that it is social: users actively interact with each other by generating content and expressing opinions, such as posts and comments on Facebook. As a result, sentiment analysis, which refers to computational models that identify, extract or characterize subjective information expressed in a given piece of text, has successfully employed user signals and has led to many real-world applications in domains such as e-commerce, politics, marketing, etc. The goal of sentiment analysis is to classify a user's attitude towards various topics into positive, negative or neutral categories based on textual data in social media. Recently, however, an increasing number of people have started to use photos to express their daily life on social media platforms like Flickr and Instagram. Therefore, analyzing sentiment from visual data is poised to greatly improve user understanding. In this dissertation, I study the problem of understanding human sentiment from large-scale collections of social images, based on both image features and contextual social network features. We show that neither visual features nor textual features are by themselves sufficient for accurate sentiment prediction. Therefore, we provide a way of using both, and formulate the sentiment prediction problem in two scenarios: supervised and unsupervised. First, we show that the proposed framework has the flexibility to incorporate multiple modalities of information and the capability to learn from heterogeneous features jointly, given sufficient training data. Second, we observe that negative sentiment may be related to human mental health issues. Based on this observation, we aim to understand negative social media posts, especially posts related to depression, e.g., self-harm content. Our analysis, the first of its kind, reveals a number of important findings. Third, we extend the proposed sentiment prediction task to a general multi-label visual recognition task to demonstrate the flexibility of the methodology behind our sentiment analysis model. Dissertation/Thesis: Doctoral Dissertation, Computer Science, 201

    Towards Addressing Key Visual Processing Challenges in Social Media Computing

    Visual processing in social media platforms is a key step in gathering and understanding information in the era of the Internet and big data. Online data is rich in content, but its processing faces many challenges, including varying scales for objects of interest, unreliable and/or missing labels, the inadequacy of single-modal data, and the difficulty of analyzing high-dimensional data. To facilitate the processing and understanding of online data, this dissertation focuses primarily on three challenges that I feel are of great practical importance: handling scale differences in computer vision tasks such as facial component detection and face retrieval; developing efficient classifiers using partially labeled data and noisy data; and employing multi-modal models and feature selection to improve multi-view data analysis. For the first challenge, I propose a scale-insensitive algorithm that detects facial landmarks quickly and accurately. For the second challenge, I propose two algorithms that can be used to learn from partially labeled data and noisy data, respectively. For the third challenge, I propose a new framework that incorporates feature selection modules into LDA models. Dissertation/Thesis: Doctoral Dissertation, Computer Science, 201

    Non-negative matrix factorization framework for face recognition

    Non-negative Matrix Factorization (NMF) is a part-based image representation method which adds a non-negativity constraint to matrix factorization. NMF is compatible with the intuitive notion of combining parts to form a whole face. In this paper, we propose a framework for face recognition that adds the NMF constraint and classifier constraints to matrix factorization in order to obtain both intuitive features and good recognition results. Based on this framework, we present two novel subspace methods: Fisher Non-negative Matrix Factorization (FNMF) and PCA Non-negative Matrix Factorization (PNMF). FNMF adds both the non-negativity constraint and the Fisher constraint to matrix factorization. The Fisher constraint maximizes the between-class scatter and minimizes the within-class scatter of face samples. As a result, FNMF improves face recognition capability. PNMF adds to matrix factorization the non-negativity constraint and characteristics of PCA, such as maximizing the variance of output coordinates, orthogonal bases, etc. We can therefore obtain intuitive features together with desirable PCA characteristics. Our experiments show that FNMF and PNMF achieve better face recognition performance than NMF and Local NMF.
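
The two quantities that a Fisher-style constraint trades off can be made concrete with a short sketch. It computes the traces of the standard within-class and between-class scatter matrices for a set of labeled feature vectors, such as NMF coefficient vectors. How FNMF weights these terms inside its objective is not stated in the abstract, so nothing here should be read as the paper's formulation. The helper names and the toy data are assumptions.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def scatter_traces(features, labels):
    """Return (within-class, between-class) scatter traces.

    within  = sum over classes c, samples i in c, of ||h_i - mu_c||^2
    between = sum over classes c of n_c * ||mu_c - mu||^2
    """
    overall = mean(features)
    classes = {}
    for f, y in zip(features, labels):
        classes.setdefault(y, []).append(f)
    within = 0.0
    between = 0.0
    for members in classes.values():
        mu = mean(members)
        within += sum(sq_dist(f, mu) for f in members)
        between += len(members) * sq_dist(mu, overall)
    return within, between


# Toy coefficient vectors for two identities (illustrative data).
features = [[1.0, 0.0], [1.2, 0.2], [0.0, 1.0], [0.2, 1.2]]
labels = [0, 0, 1, 1]
within, between = scatter_traces(features, labels)
total = sum(sq_dist(f, mean(features)) for f in features)
print("within:", within, "between:", between, "total:", total)
```

The total scatter always splits into the within-class and between-class parts, so for fixed total scatter, shrinking the within-class term and growing the between-class term are two views of the same goal: samples of one person cluster tightly while different people separate.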

    Multivariate methods for interpretable analysis of magnetic resonance spectroscopy data in brain tumour diagnosis

    Malignant tumours of the brain are among the most difficult types of cancer to treat because of the sensitive organ they affect. Clinical management of the pathology becomes even more intricate as the tumour mass increases due to proliferation, suggesting that an early and accurate diagnosis is vital to prevent the disease from following its normal course of development. Standard clinical practice for diagnosis includes invasive techniques that might be harmful for the patient, a fact that has fostered intensive research into alternative non-invasive methods for measuring brain tissue, such as nuclear magnetic resonance. One of its variants, magnetic resonance imaging, is already used on a regular basis to locate and bound the brain tumour. A complementary variant, magnetic resonance spectroscopy, lags behind in clinical use, mainly because it is difficult to interpret, despite its higher spatial resolution and its capability to identify, within a delimited area, biochemical metabolites that might become tumour biomarkers. The interpretation of magnetic resonance spectra of brain tissue is thus an interesting field of research for automated methods of knowledge extraction such as machine learning, always understood as playing a secondary role behind the decision making of human medical experts. This thesis aims to contribute to the state of the art in this domain by providing novel techniques to assist radiology experts, focusing on complex problems and delivering interpretable solutions. In this respect, an ensemble learning technique has been designed to accurately discriminate between the most aggressive brain tumours, namely glioblastomas and metastases; moreover, a strategy is provided to increase the stability of biomarker identification in the spectra by means of instance weighting. From a different analytical perspective, a tool based on signal source separation and guided by tumour type-specific information has been developed to assess the existence of different tissues in the tumoural mass, quantifying their influence in the vicinity of tumoural areas. This development has led to the derivation of a probabilistic interpretation of some source separation techniques, which provides support for uncertainty handling and strategies for estimating the most accurate number of differentiated tissues within the analysed tumour volumes. The provided strategies should assist human experts through the use of automated decision support tools, tackling interpretability and accuracy from different angles.