3,612 research outputs found

    Message Passing Clustering with Stochastic Merging Based on Kernel Functions

    In this paper, we propose a new Stochastic Message Passing Clustering (SMPC) algorithm for clustering biological data, based on the Message Passing Clustering (MPC) algorithm that we introduced in earlier work. MPC has shown its advantage when applied to describing parallel and spontaneous biological processes. SMPC, as a generalized version of MPC, extends the clustering algorithm from a deterministic process to a stochastic process, adding three major advantages. First, in deciding the merging cluster pair, the influences of all clusters are quantified by probabilities, estimated by kernel functions based on their relative distances. Second, the proposed algorithm properly resolves the “tie” problem, which often occurs for integer distances, as in the case of protein interaction data. Third, clustering can be undone to improve the clustering performance when the algorithm detects objects which don’t have good probabilities inside the cluster and moves them outside. The test results on colon cancer gene-expression data show that SMPC performs better than the deterministic MPC.
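As a rough illustration of the first point in the abstract above (merge decisions driven by kernel-based probabilities), the sketch below converts pairwise cluster distances into merge probabilities and samples one pair. It is not the authors' SMPC implementation: the Gaussian kernel, the `bandwidth` parameter, and the function names are assumptions made for the example.

```python
import math
import random


def merge_probabilities(distances, bandwidth=1.0):
    """Turn pairwise cluster distances into merge probabilities.

    distances: dict mapping a cluster pair (i, j) to a non-negative distance.
    The Gaussian kernel and the bandwidth are assumptions for this sketch.
    """
    weights = {pair: math.exp(-((d / bandwidth) ** 2)) for pair, d in distances.items()}
    total = sum(weights.values())
    # Normalise so the weights form a probability distribution over pairs.
    return {pair: w / total for pair, w in weights.items()}


def pick_merge_pair(distances, bandwidth=1.0, rng=random):
    """Sample the next pair to merge instead of always taking the closest one."""
    probs = merge_probabilities(distances, bandwidth)
    pairs = list(probs)
    return rng.choices(pairs, weights=[probs[p] for p in pairs], k=1)[0]


# Two pairs tie at distance 1; a deterministic rule would need a tie-break,
# whereas sampling gives both the same chance of being merged.
example = {(0, 1): 1, (0, 2): 1, (1, 2): 3}
probs = merge_probabilities(example)
chosen = pick_merge_pair(example, rng=random.Random(0))
```

Because tied pairs receive equal probability, the sampling step handles integer-distance ties without an arbitrary ordering rule, which is the behaviour the abstract attributes to the stochastic variant.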

    A Dynamic Bayesian Network Model for Hierarchical Classification and its Application in Predicting Yeast Genes Functions

    In this paper, we propose a Dynamic Naive Bayesian (DNB) network model for classifying data sets with hierarchical labels. The DNB model is built upon a Naive Bayesian (NB) network, a successful classifier for data with flattened (nonhierarchical) class labels. The problems of using flattened class labels for hierarchical classification are addressed in this paper. The DNB has a top-down structure with each level of the class hierarchy modeled as a random variable. We define augmenting operations to transform the class hierarchy into a form that satisfies the probability law. We present algorithms for efficient learning and inference with the DNB model. The learning algorithm can be used to estimate the parameters of the network. The inference algorithm is designed to find the optimal classification path in the class hierarchy. The methods are tested on yeast gene expression data sets, and the classification accuracy with the DNB classifier is significantly higher than with the previous approach, flattened classification using an NB classifier.

    Dynamics of asynchronous random Boolean networks with asynchrony generated by stochastic processes

    An asynchronous Boolean network with N nodes whose states at each time point are determined by certain parent nodes is considered. We make use of the models developed by Matache and Heidel [Matache, M.T., Heidel, J., 2005. Asynchronous random Boolean network model based on elementary cellular automata rule 126. Phys. Rev. E 71, 026232] for a constant number of parents, and Matache [Matache, M.T., 2006. Asynchronous random Boolean network model with variable number of parents based on elementary cellular automata rule 126. IJMPB 20 (8), 897–923] for a varying number of parents. In both these papers the authors consider an asynchronous updating of all nodes, with asynchrony generated by various random distributions. We supplement those results by using various stochastic processes as generators for the number of nodes to be updated at each time point. In this paper we use the following stochastic processes: Poisson process, random walk, birth and death process, Brownian motion, and fractional Brownian motion. We study the dynamics of the model through sensitivity of the orbits to initial values, bifurcation diagrams, and fixed-point analysis. The dynamics of the system show that the number of nodes to be updated at each time point is of great importance, especially for the random walk, the birth and death, and the Brownian motion processes. Small or moderate values for the number of updated nodes generate order, while large values may generate chaos depending on the underlying parameters. The Poisson process generates order. With fractional Brownian motion, as the values of the Hurst parameter increase, the system exhibits order for a wider range of combinations of the underlying parameters.
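The idea of letting a stochastic process decide how many nodes are updated at each time point can be sketched as follows. This is a simplified illustration and not the model from the cited papers: it assumes a ring where each node's parents are its two neighbours and itself (the elementary rule 126 neighbourhood), draws the update count from a Poisson distribution with a hypothetical rate `lam`, and uses illustrative function names throughout.

```python
import math
import random


def rule126(left, centre, right):
    """Elementary cellular automaton rule 126: 0 if all three inputs agree, else 1."""
    return 0 if left == centre == right else 1


def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def async_step(state, n_updates, rng):
    """Update n_updates randomly chosen nodes; all other nodes keep their value."""
    n = len(state)
    chosen = rng.sample(range(n), n_updates)
    new_state = list(state)
    for i in chosen:
        # Inputs are read from the old state, so the chosen nodes update together.
        new_state[i] = rule126(state[i - 1], state[i], state[(i + 1) % n])
    return new_state


def simulate(state, lam, steps, seed=0):
    """Run the network, drawing the number of updated nodes afresh at every step."""
    rng = random.Random(seed)
    for _ in range(steps):
        n_updates = min(len(state), poisson(lam, rng))
        state = async_step(state, n_updates, rng)
    return state


final_state = simulate([0, 1, 0, 0, 1, 1, 0, 1], lam=3.0, steps=50)
```

Swapping `poisson` for another generator (for example, the position of a random walk clipped to the range 0 to N) would be one way to explore the other processes listed in the abstract.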

    Effects of Hot-Pressing Parameters and Wax Content on the Properties of Fiberboard Made from Paper Mill Sludge

    Primary sludge combined with 20% secondary sludge was used for the manufacture of fiberboard. A factorial design was carried out to determine the effects of panel density, pressing temperature and time, and wax level on the panel properties of fiberboard. Two levels were employed for each of the four variables, and the panel dimensional stability and mechanical properties were analyzed using Design-Expert software. The statistical analysis indicated that internal bonding (IB) was significantly affected by panel density, pressing temperature, and their interaction. Pressing time and wax level were not directly related to IB. Similarly, modulus of rupture (MOR) depended strongly on panel density, pressing temperature, and their interaction, but was not affected by pressing time or wax level. The effect of panel density on modulus of elasticity (MOE) was as strong as on MOR, but the effect of pressing temperature was weaker on MOE than on MOR. MOE was also related to pressing time, but not to wax level. Thickness swelling (TS) was not affected by panel density, but it was significantly dependent on pressing temperature and time. Unexpectedly, wax level did not have a significant impact on TS.

    Cross-platform Analysis of Cancer Biomarkers: A Bayesian Network Approach to Incorporating Mass Spectrometry and Microarray Data

    Many studies have reported inconsistent cancer biomarkers due to bioinformatics artifacts. In this paper we use multiple data sets from microarrays, mass spectrometry, protein sequences, and other biological knowledge in order to improve the reliability of cancer biomarkers. We present a novel Bayesian network (BN) model which integrates and cross-annotates multiple data sets related to prostate cancer. The main contribution of this study is a method designed to find cancer biomarkers whose presence is supported by multiple data sources and biological knowledge. Relevant biological knowledge is explicitly encoded into the model parameters, and the biomarker finding problem is formulated as a Bayesian inference problem. Besides diagnostic accuracy, we introduce reliability as another quality measurement of the biological relevance of biomarkers. Based on the proposed BN model, we develop an empirical scoring scheme and a simulation algorithm for inferring biomarkers. Fourteen genes/proteins, including prostate specific antigen (PSA), are identified as reliable serum biomarkers which are insensitive to the model assumptions. The computational results show that our method is able to find biologically relevant biomarkers with the highest reliability while maintaining competitive predictive power. In addition, by combining biological knowledge and data from multiple platforms, the number of putative biomarkers is greatly reduced to allow more-focused clinical studies.

    Applications of Hidden Markov Models in Microarray Gene Expression Data

    Hidden Markov models (HMMs) are well-developed statistical models for capturing hidden information from observable sequential symbols. They were first used in speech recognition in the 1970s and have been successfully applied to the analysis of biological sequences since the late 1980s, as in finding protein secondary structure, CpG islands, and families of related DNA or protein sequences [1]. In an HMM, the system being modeled is assumed to be a Markov process with unknown parameters, and the challenge is to determine the hidden parameters from the observable parameters. In this chapter, we describe two applications that use HMMs to predict gene functions in yeast and DNA copy number alterations in human tumor cells, based on gene expression microarray data.
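To make the notion of recovering hidden states from observable symbols concrete, here is a minimal Viterbi decoder, the standard dynamic-programming method for finding the most likely hidden state path of an HMM. It is a generic sketch, not the chapter's method: the two states ("neutral", "gain"), the two observation symbols ("low", "high"), and all probabilities are made-up values for illustration, and every probability is assumed to be non-zero so that logarithms are defined.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations.

    Works in log space to avoid underflow on long sequences.
    Assumes every probability used is strictly positive.
    """
    score = {
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this position.
            prev = max(states, key=lambda r: score[r] + math.log(trans_p[r][s]))
            new_score[s] = (
                score[prev] + math.log(trans_p[prev][s]) + math.log(emit_p[s][obs])
            )
            pointers[s] = prev
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy parameters: hidden copy-number state, observed expression level.
states = ["neutral", "gain"]
start_p = {"neutral": 0.5, "gain": 0.5}
trans_p = {
    "neutral": {"neutral": 0.9, "gain": 0.1},
    "gain": {"neutral": 0.1, "gain": 0.9},
}
emit_p = {
    "neutral": {"low": 0.8, "high": 0.2},
    "gain": {"low": 0.2, "high": 0.8},
}
decoded = viterbi(["low", "low", "high", "high", "high"], states, start_p, trans_p, emit_p)
```

In practice the transition and emission probabilities would be estimated from data rather than fixed by hand, and continuous expression values would typically call for continuous emission densities instead of the two discrete symbols used here.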