36 research outputs found

    Robust capped norm dual hyper-graph regularized non-negative matrix tri-factorization

    Non-negative matrix factorization (NMF) has been widely used in machine learning and data mining. As an extension of NMF, non-negative matrix tri-factorization (NMTF) provides more degrees of freedom. However, the standard NMTF algorithm measures residual error with the Frobenius norm, which is highly sensitive to noise and outliers. Moreover, the geometric information hidden in the feature manifold and the sample manifold is rarely exploited. Hence, a novel robust capped norm dual hyper-graph regularized non-negative matrix tri-factorization (RCHNMTF) is proposed. First, a robust capped norm is adopted to handle extreme outliers. Second, dual hyper-graph regularization is introduced to exploit the intrinsic geometric information in the feature and sample manifolds. Third, orthogonality constraints are added to learn a unique data representation and improve clustering performance. Experiments on seven datasets demonstrate the robustness and superiority of RCHNMTF.
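    The tri-factorization the abstract extends decomposes X into three non-negative factors, X ≈ F S Gᵀ. A minimal sketch of the plain Frobenius-norm NMTF baseline via multiplicative updates is below; the capped norm, hyper-graph regularizers, and orthogonality constraints of RCHNMTF are not reproduced here, and all names are illustrative.

    ```python
    import numpy as np

    def nmtf(X, k1, k2, n_iter=500, eps=1e-9, seed=0):
        """Plain non-negative matrix tri-factorization X ~ F @ S @ G.T
        via multiplicative updates on the Frobenius objective.
        This is only the baseline that RCHNMTF builds on; its robust
        and regularization terms are omitted."""
        rng = np.random.default_rng(seed)
        m, n = X.shape
        F = rng.random((m, k1))   # row (sample) factor
        S = rng.random((k1, k2))  # middle factor: extra degrees of freedom vs NMF
        G = rng.random((n, k2))   # column (feature) factor
        for _ in range(n_iter):
            F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
            G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
            S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        return F, S, G
    ```

    The multiplicative form keeps every factor non-negative by construction, which is why the Frobenius residual is so exposed to outliers: a single large corrupted entry dominates the squared error, which is exactly what the capped norm in RCHNMTF bounds.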

    Gene Ranking of RNA-Seq Data via Discriminant Non-Negative Matrix Factorization

    RNA sequencing is rapidly becoming the method of choice for studying the full complexity of transcriptomes; however, with increasing dimensionality, accurate gene ranking is becoming increasingly challenging. This paper proposes an accurate and sensitive gene ranking method based on discriminant non-negative matrix factorization (DNMF) for RNA-seq data. To the best of our knowledge, this is the first work to explore the utility of DNMF for gene ranking. Incorporating Fisher's discriminant criterion and setting the reduced dimension to two, DNMF learns two factors that approximate the original gene expression data, abstracting the up-regulated and down-regulated metagenes with the help of sample label information. The first factor holds every gene's weights on the two metagenes, expressing each metagene as an additive combination of all genes, while the second factor holds the expression values of the two metagenes. In the ranking stage, genes are sorted in descending order of the difference between their metagene weights. By leveraging the non-negativity of NMF and Fisher's criterion, DNMF robustly boosts gene ranking performance. Area-under-the-curve analysis of differential expression on two benchmarking tests over four RNA-seq data sets with similar phenotypes showed that the proposed DNMF-based gene ranking method outperforms other widely used methods, and Gene Set Enrichment Analysis confirmed this advantage. DNMF is also computationally efficient, substantially outperforming all other benchmarked methods. Consequently, we suggest DNMF as an effective method for differential gene expression analysis and gene ranking in RNA-seq data.
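    The ranking step the abstract describes can be sketched with a plain rank-2 NMF in place of the discriminant variant (the Fisher-criterion term and label information are omitted, so this is only the shape of the pipeline, not the paper's method):

    ```python
    import numpy as np

    def rank_genes(X, n_iter=300, eps=1e-9, seed=0):
        """Rank genes by the difference of their two metagene weights.
        X is a non-negative genes x samples matrix. Uses plain rank-2
        NMF multiplicative updates; the discriminant (Fisher) term of
        DNMF is deliberately left out of this sketch."""
        rng = np.random.default_rng(seed)
        m, n = X.shape
        W = rng.random((m, 2))   # gene weights of the two metagenes
        H = rng.random((2, n))   # metagene expression across samples
        for _ in range(n_iter):
            W *= (X @ H.T) / (W @ H @ H.T + eps)
            H *= (W.T @ X) / (W.T @ W @ H + eps)
        score = np.abs(W[:, 0] - W[:, 1])   # differential metagene weight
        return np.argsort(score)[::-1]      # descending rank, as in the paper
    ```

    The returned array is a permutation of gene indices: genes whose weights differ most between the two metagenes, i.e. the candidates for differential expression, come first.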

    Demand Forecasting for Food Production Using Machine Learning Algorithms: A Case Study of University Refectory

    Accurate food demand forecasting is critical to successfully managing restaurants, cafeterias, canteens, and refectories. This paper develops demand forecasting models for a university refectory. Our study focuses on Machine Learning-based forecasting models that take the calendar effect and meal ingredients into account to predict heavy food demand within a limited timeframe (e.g., lunch) and without pre-booking. We developed eighteen prediction models gathered under five main techniques: three Artificial Neural Network models (Feed Forward, Function Fitting, and Cascade Forward), four Gaussian Process Regression models (Rational Quadratic, Squared Exponential, Matern 5/2, and Exponential), six Support Vector Regression models (Linear, Quadratic, Cubic, Fine Gaussian, Medium Gaussian, and Coarse Gaussian), three Regression Tree models (Fine, Medium, and Coarse), two Ensemble Decision Tree (EDT) models (Boosted and Bagged), and one Linear Regression model. In terms of method diversity, prediction performance, and application area, to the best of our knowledge, this study offers a contribution distinct from previous studies. The EDT Boosted model obtained the best prediction performance (Mean Squared Error = 0.51, Mean Absolute Error = 0.50, and R = 0.96).
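    The "calendar effect" features the study mentions can be illustrated with the simplest of its eighteen models, the Linear Regression baseline, fit by ordinary least squares. The feature names below (weekday one-hot, holiday flag) are illustrative assumptions, not taken from the paper:

    ```python
    import numpy as np

    def forecast_demand(day_of_week, is_holiday, meals_served):
        """Minimal calendar-effect regression sketch for refectory demand.
        Builds one-hot weekday dummies plus a holiday indicator and fits
        ordinary least squares (the paper's Linear Regression baseline;
        the actual feature set, e.g. meal ingredients, is not modeled)."""
        X = np.column_stack([
            np.eye(7)[day_of_week],       # one-hot weekday dummies
            is_holiday.astype(float),     # calendar effect: holiday flag
            np.ones(len(day_of_week)),    # intercept
        ])
        # lstsq returns the minimum-norm solution, which also handles the
        # collinearity between the dummies and the intercept column.
        coef, *_ = np.linalg.lstsq(X, meals_served, rcond=None)
        return X @ coef, coef
    ```

    A boosted tree ensemble such as the study's best-performing EDT Boosted model would consume the same feature matrix but capture non-additive interactions (e.g., holiday effects that differ by weekday) that this linear sketch cannot.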

    Factorized second order methods in neural networks

    First-order optimization methods (gradient descent) have enabled impressive successes in training artificial neural networks. Second-order methods can in theory accelerate the optimization of a function, but for neural networks the number of variables is far too large. In this master's thesis, I present the second-order methods usually applied in optimization, as well as approximations that make them applicable to deep neural networks. I introduce a new algorithm based on an approximation of second-order methods and validate empirically that it is of practical interest. I also introduce a modification of the backpropagation algorithm used to efficiently compute the gradients these optimization methods require.
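    The scaling problem the abstract refers to is easy to see from the full Newton step, which solves a linear system in the Hessian. On a toy quadratic this is exact in one step; for a network with millions of parameters the Hessian cannot even be stored, which is what motivates factorized approximations. A minimal sketch (not the thesis's algorithm):

    ```python
    import numpy as np

    def newton_step(grad, hess, x):
        """One full Newton step x - H(x)^{-1} g(x). Exact and cheap for a
        tiny problem; infeasible at network scale, where H is d x d for
        d in the millions. Factorized second-order methods approximate
        this solve, but that approximation is not reproduced here."""
        return x - np.linalg.solve(hess(x), grad(x))

    # For a convex quadratic f(x) = 0.5 x^T A x - b^T x, the Hessian is
    # constant (A) and Newton reaches the minimizer A^{-1} b in one step.
    A = np.array([[3.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, 0.0])
    grad = lambda x: A @ x - b
    hess = lambda x: A
    x_star = newton_step(grad, hess, np.zeros(2))
    ```

    Gradient descent on the same quadratic would need many iterations when A is ill-conditioned; the second-order step removes that dependence on conditioning, which is the acceleration the abstract alludes to.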

    Advances in epilepsy monitoring by detection and analysis of brain epileptiform discharges

    Brain interictal and pre-ictal epileptiform discharges (EDs) are transient events, occurring between seizures or before seizure onset, that are visible in intracranial electroencephalograms. For diagnosing epilepsy and localizing seizure sources, both interictal and ictal recordings are extremely informative. For this purpose, computerized intelligent spike and seizure detection techniques have been researched and are constantly improving, aiming not only to detect more EDs from scalp recordings but also to distinguish epileptic from non-epileptic discharges. Tensor factorization and deep learning are two advanced and powerful techniques that have recently been suggested for ED detection. Here, our main contribution is to review recent ED detection methods, with emphasis on multi-way analysis and deep learning approaches. These techniques have opened a new window onto epilepsy diagnosis and management.
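    As a point of reference for the detectors the review surveys, the classical baseline is a simple amplitude threshold on the EEG trace. The sketch below flags samples exceeding a robust (median/MAD-based) threshold; it is only this baseline, not the tensor-factorization or deep-learning methods the review emphasizes, and the parameter k is an illustrative choice:

    ```python
    import numpy as np

    def detect_spikes(signal, k=5.0):
        """Illustrative baseline spike detector: flag samples whose
        deviation from the median exceeds k robust standard deviations.
        MAD is used instead of np.std so that the spikes themselves do
        not inflate the noise estimate."""
        med = np.median(signal)
        mad = np.median(np.abs(signal - med)) + 1e-12
        sigma = 1.4826 * mad                 # MAD -> std for Gaussian noise
        return np.flatnonzero(np.abs(signal - med) > k * sigma)
    ```

    Threshold detectors like this miss low-amplitude discharges and fire on artifacts, which is precisely the gap that the multi-way (tensor) and deep-learning detectors reviewed above aim to close.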