119 research outputs found

    Hybrid Collaborative Filtering with Autoencoders

    Collaborative Filtering aims at exploiting the feedback of users to provide personalised recommendations. Such algorithms look for latent variables in a large sparse matrix of ratings. They can be enhanced by adding side information to tackle the well-known cold start problem. While Neural Networks have had tremendous success in image and speech recognition, they have received less attention in Collaborative Filtering. This is all the more surprising given that Neural Networks are able to discover latent variables in large and heterogeneous datasets. In this paper, we introduce a Collaborative Filtering Neural network architecture, CFN, which computes a non-linear Matrix Factorization from sparse rating inputs and side information. We show experimentally on the MovieLens and Douban datasets that CFN outperforms the state of the art and benefits from side information. We provide an implementation of the algorithm as a reusable plugin for Torch, a popular Neural Network framework.
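
As a rough illustration of the idea in this abstract, the sketch below feeds sparse ratings and side information through a one-hidden-layer autoencoder and scores the reconstruction only on observed ratings. It is not the authors' CFN architecture or their Torch plugin: the layer sizes, the zero-filling of missing ratings, the `forward` and `masked_mse` helpers, and the toy data are all assumptions made for the example.

```python
import numpy as np

def forward(ratings, side, params):
    """One-hidden-layer autoencoder; unobserved ratings are fed in as zeros."""
    w_enc, b_enc, w_dec, b_dec = params
    x = np.concatenate([ratings, side], axis=1)  # ratings + side information
    hidden = np.tanh(x @ w_enc + b_enc)          # non-linear latent factors
    return hidden @ w_dec + b_dec                # reconstructed ratings

def masked_mse(pred, ratings, mask):
    """Mean squared error over observed entries only (mask == 1)."""
    err = (pred - ratings) * mask
    return float((err ** 2).sum() / mask.sum())

# Hypothetical toy data: 4 users, 5 items, 3 side features, 2 latent units.
rng = np.random.default_rng(0)
n_users, n_items, n_side, n_hidden = 4, 5, 3, 2
mask = (rng.random((n_users, n_items)) < 0.5).astype(float)
mask[0, 0] = 1.0  # guarantee at least one observed rating
ratings = rng.integers(1, 6, size=(n_users, n_items)) * mask
side = rng.normal(size=(n_users, n_side))
params = (
    rng.normal(scale=0.1, size=(n_items + n_side, n_hidden)),  # encoder weights
    np.zeros(n_hidden),                                        # encoder bias
    rng.normal(scale=0.1, size=(n_hidden, n_items)),           # decoder weights
    np.zeros(n_items),                                         # decoder bias
)
pred = forward(ratings, side, params)
loss = masked_mse(pred, ratings, mask)
```

Masking the loss means the model is never penalised for its output on unrated items, so those outputs can be read as predicted ratings once the weights have been trained.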

    Ranking of Sites for Installation of Hydropower Plant Using MLP Neural Network Trained with GA: A MADM Approach

    Every energy system under consideration is an entity in itself, defined by parameters that are interrelated according to physical laws. In recent years, great importance has been given to research on site selection in an imprecise environment. In this context, deciding on a suitable location for a power plant installation is a relevant issue. Environmental impact assessment has been used as a legislative requirement in site selection for decades. The purpose of the current work is to develop a model that lets decision makers rank or classify power plant projects according to multiple criteria such as air quality, water quality, cost of energy delivery, ecological impact, natural hazard, and project duration. The case study in the paper applies a multilayer perceptron trained by a genetic algorithm to rank power plant locations in India.

    Gene Network Modeling through Semi-Fixed Bayesian Network

    Gene networks describe functional pathways in a given cell or tissue, representing processes such as metabolism, gene expression regulation, and protein or RNA transport. Thus, learning gene networks is a crucial problem in the post-genome era. Most existing works learn gene networks by assuming that one gene directly provokes the expression of another, which leads to an over-simplified model. In this paper, we show that gene regulation is a complex problem with many hidden variables. We propose a semi-fixed model to represent the gene network as a Bayesian network with hidden variables. In addition, an effective algorithm to learn the model is presented. Experiments on artificial and real-life datasets confirm the effectiveness of our approach.

    Discovery of time-delayed gene regulatory networks based on temporal gene expression profiling

    BACKGROUND: One of the ultimate goals of modern biological research is to fully elucidate the intricate interplay and regulation of the molecular determinants that propel and characterize the progression of diverse life phenomena such as cell cycling, developmental biology, aging, and the progressive and recurrent pathogenesis of complex diseases. A vast amount of large-scale, genome-wide, time-resolved data is becoming increasingly available, which provides a golden opportunity to tackle the challenging reverse-engineering problem of time-delayed gene regulatory networks. RESULTS: This methodological paper aims to reconstruct regulatory networks from temporal gene expression data by using delayed correlations between genes, i.e., pairwise overlaps of expression levels shifted in time relative to each other. We have developed a novel model-free computational toolbox termed TdGRN (Time-delayed Gene Regulatory Network) to address the underlying regulations of genes that can span any number of time units. This bioinformatics toolbox provides a unified approach to uncovering time trends of gene regulations through decision analysis of the newly designed time-delayed gene expression matrix. We have applied the proposed method to yeast cell cycling and human HeLa cell cycling and have discovered most of the underlying time-delayed regulations that are supported by multiple lines of experimental evidence and that are remarkably consistent with current knowledge of phase characteristics for cell cycling. CONCLUSION: We established a usable and powerful model-free approach to dissecting high-order dynamic trends of gene-gene interactions. We have carefully validated the proposed algorithm by applying it to two publicly available cell cycling datasets. In addition to uncovering the time trends of gene regulations for cell cycling, this unified approach can also be used to study the complex gene regulations related to the development, aging and progressive pathogenesis of a complex disease where potential dependences between different experimental units might occur.
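
The notion of a delayed correlation between two expression profiles can be sketched as follows. This is a minimal illustration that assumes a plain Pearson correlation between one series and a time-shifted copy of the other; the abstract does not specify TdGRN's actual statistic or its decision analysis, and the `delayed_correlation` and `best_lag` helpers and the toy data are hypothetical.

```python
import numpy as np

def delayed_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag], for lag >= 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if lag < 0 or lag >= len(x) - 1:
        raise ValueError("lag must satisfy 0 <= lag < len(x) - 1")
    # Overlap the first len(x) - lag points of x with y shifted by `lag`.
    return float(np.corrcoef(x[: len(x) - lag], y[lag:])[0, 1])

def best_lag(x, y, max_lag):
    """Return the lag in 0..max_lag with the largest absolute correlation."""
    scores = {lag: delayed_correlation(x, y, lag) for lag in range(max_lag + 1)}
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]

# Hypothetical toy profiles: the target follows the regulator two steps later.
t = np.arange(20)
regulator = np.sin(0.6 * t)
target = np.sin(0.6 * (t - 2))

lag, score = best_lag(regulator, target, max_lag=5)
```

On this toy pair the strongest correlation appears at a lag of two time points, matching the delay built into the data. Real expression profiles are short and noisy, so any lag found this way would need the kind of validation the abstract describes.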

    Deep Learning for genomic data analysis

    Since the Human Genome Project, the availability of genomic data has greatly increased. In recent years, genome sequencing technologies and techniques have been improving at a fast rate, resulting in cheaper and faster genome sequencing. Such an amount of data enables both more complex analysis and advances in research. However, a sequencing process quite often produces a huge amount of data that is highly complex (high in both dimensionality and volume). Considerable computational power and efficient algorithms are required in order to extract useful information in reasonable time, which can be a constraint on the extraction and comprehension of such information. In this work, we focus on the biological aspects of RNA-Seq and its analysis using traditional Machine Learning and Deep Learning methods. We divided our study into two branches. First, we built and compared the accuracy of classifiers that were able to distinguish the RNA-Seq samples of thyroid cancer patients from samples of healthy persons. Second, we investigated the possibility of building comprehensible descriptions for the differences in the RNA-Seq data by using Denoising Autoencoders and Stacked Denoising Autoencoders as base classifiers and then devising post-processing techniques to extract comprehensible and biologically meaningful descriptions out of the constructed models.

    Data driven approaches for investigating molecular heterogeneity of the brain

    It has been proposed that one of the clearest organizing principles for most sensory systems is the existence of parallel subcircuits and processing streams that form orderly and systematic mappings from stimulus space to neurons. Although the spatial heterogeneity of the early olfactory circuitry has long been recognized, we know comparatively little about the circuits that propagate sensory signals downstream. Investigating the potential modularity of the bulb’s intrinsic circuits proves to be a difficult task as termination patterns of converging projections, as with the bulb’s inputs, are not feasibly realized. Thus, if such circuit motifs exist, their detection essentially relies on identifying differential gene expression, or “molecular signatures,” that may demarcate functional subregions. With the arrival of comprehensive (whole genome, cellular resolution) datasets in biology and neuroscience, it is now possible to carry out large-scale, systematic investigations of the molecular topography of the olfactory bulb’s intrinsic circuits, making particular use of the densely catalogued, whole genome expression maps of the Allen Brain Atlas. To address the challenges associated with high-throughput and high-dimensional datasets, a deep learning approach will form the backbone of our informatics pipeline. In the proposed work, we test the hypothesis that the bulb’s intrinsic circuits are parceled into distinct, parallel modules that can be defined by genome-wide patterns of expression. In pursuit of this aim, our deep learning framework will facilitate the group-registration of the mitral cell layers of ~50,000 in-situ olfactory bulb circuits.

    Classification of Explainable Artificial Intelligence Methods through Their Output Formats

    Machine and deep learning have proven their utility in generating data-driven models with high accuracy and precision. However, their non-linear, complex structures are often difficult to interpret. Consequently, many scholars have developed a plethora of methods to explain their functioning and the logic of their inferences. This systematic review aimed to organise these methods into a hierarchical classification system that builds upon and extends existing taxonomies by adding a significant dimension: the output formats. The reviewed scientific papers were retrieved by conducting an initial search on Google Scholar with the keywords “explainable artificial intelligence”, “explainable machine learning”, and “interpretable machine learning”. A subsequent iterative search was carried out by checking the bibliography of these articles. The addition of the dimension of the explanation format makes the proposed classification system a practical tool for scholars, helping them select the most suitable type of explanation format for the problem at hand. Given the wide variety of challenges faced by researchers, the existing XAI methods provide several solutions to meet requirements that differ considerably between the users, problems and application fields of artificial intelligence (AI). The task of identifying the most appropriate explanation can be daunting, hence the need for a classification system that helps with the selection of methods. This work concludes by critically identifying the limitations of the formats of explanations and by providing recommendations and possible future research directions on how to build a more generally applicable XAI method. Future work should be flexible enough to meet the many requirements posed by the widespread use of AI in several fields, and the new regulation.

    Deep Time-Series Clustering: A Review

    We present a comprehensive, detailed review of time-series data analysis, with emphasis on deep time-series clustering (DTSC), and a case study in the context of movement behavior clustering utilizing the deep clustering method. Specifically, we modified the DCAE architectures to suit time-series data at the time of our prior deep clustering work. Lately, several works have been carried out on deep clustering of time-series data. We also review these works, identify the state of the art, and present an outlook on this important field of DTSC from five important perspectives.