108 research outputs found

    Deep Stacked Stochastic Configuration Networks for Lifelong Learning of Non-Stationary Data Streams

    The concept of the stochastic configuration network (SCN) offers a fast framework with a universal approximation guarantee for lifelong learning of non-stationary data streams. Its adaptive scope selection property enables proper random generation of hidden unit parameters, an advance over conventional randomized approaches that are constrained to a fixed scope of random parameters. This paper proposes a deep stacked stochastic configuration network (DSSCN) for continual learning of non-stationary data streams, with two major contributions: 1) DSSCN features a self-constructing methodology for the deep stacked network structure, in which hidden units and hidden layers are extracted automatically from continuously generated data streams; 2) the concept of SCN is extended to randomly assign the inverse covariance matrix of the multivariate Gaussian function in the hidden node addition step, bypassing its computationally prohibitive tuning phase. Numerical evaluation and comparison with prominent data stream algorithms under two procedures, periodic hold-out and prequential test-then-train, demonstrate the advantage of the proposed methodology. Comment: This paper has been published in Information Sciences.
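
    The prequential test-then-train procedure named in the abstract can be sketched as follows. This is a minimal illustration, not the DSSCN evaluation code: the `MajorityClassLearner` class, the `prequential_accuracy` function, and the toy stream are all hypothetical stand-ins, and any incremental model with `predict` and `learn` methods could take the learner's place.

```python
from collections import Counter


class MajorityClassLearner:
    """Placeholder incremental learner (an assumption, not DSSCN):
    predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Before any label has been seen there is nothing to predict.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(stream, learner):
    """Test-then-train: every sample is first used to test the current
    model and only afterwards used to update it."""
    correct = 0
    total = 0
    for x, y in stream:
        if learner.predict(x) == y:  # test step
            correct += 1
        total += 1
        learner.learn(x, y)          # train step
    return correct / total if total else 0.0


# Toy labelled stream of (feature, label) pairs, for illustration only.
stream = [(0, "a"), (1, "a"), (2, "b"), (3, "a"), (4, "b"), (5, "b")]
accuracy = prequential_accuracy(stream, MajorityClassLearner())
```

    Because each sample is scored before the model sees its label, the running accuracy reflects performance on unseen data without a separate hold-out set.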

    Stochastic Configuration Machines: FPGA Implementation

    Neural networks for industrial applications generally face additional constraints such as response speed, memory size and power usage. Randomized learners can address some of these issues. However, hardware solutions can provide further resource reduction whilst maintaining the model's performance. Stochastic configuration networks (SCNs) are a prime choice in industrial applications due to their merits and feasibility for data modelling. Stochastic Configuration Machines (SCMs) extend SCNs to reduce memory requirements by limiting the randomized weights to binary values with one scalar per node, and use a mechanism model to improve learning performance and result interpretability. This paper aims to implement SCM models on a field programmable gate array (FPGA) and to introduce binary-coded inputs to the algorithm. Results are reported for two benchmark and two industrial datasets, including SCMs with single-layer and deep architectures. Comment: 19 pages, 9 figures, 8 tables
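
    The memory saving from binary weights with one scalar per node can be illustrated with a rough sketch. It assumes weights in {-1, +1}, a tanh activation, uniformly drawn scalars and 32-bit floats, and it draws nodes purely at random rather than through the SCM configuration procedure; all function names are hypothetical.

```python
import math
import random


def make_binary_hidden_layer(n_inputs, n_nodes, seed=0):
    """Draw one sign vector in {-1, +1} and one real-valued scalar per node.
    Plain random sampling is an assumption made for this sketch."""
    rng = random.Random(seed)
    signs = [[rng.choice((-1, 1)) for _ in range(n_inputs)]
             for _ in range(n_nodes)]
    scales = [rng.uniform(0.1, 1.0) for _ in range(n_nodes)]
    return signs, scales


def hidden_output(x, signs, scales):
    """Each node computes tanh(scale * <signs, x>)."""
    return [math.tanh(s * sum(w * xi for w, xi in zip(row, x)))
            for row, s in zip(signs, scales)]


def storage_bits(n_inputs, n_nodes, float_bits=32):
    """Hidden-layer weight storage: dense floats versus 1-bit weights
    plus one float scalar per node."""
    dense = n_inputs * n_nodes * float_bits
    binary = n_inputs * n_nodes + n_nodes * float_bits
    return dense, binary


signs, scales = make_binary_hidden_layer(n_inputs=4, n_nodes=3)
h = hidden_output([0.5, -1.0, 0.25, 2.0], signs, scales)
dense_bits, binary_bits = storage_bits(n_inputs=64, n_nodes=100)
```

    Under these assumptions, a 64-input, 100-node layer needs 204,800 bits with dense 32-bit weights and 9,600 bits in the binary-plus-scalar form.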

    SOMEA: self-organizing map based extraction algorithm for DNA motif identification with heterogeneous model

    Background: Discrimination of transcription factor binding sites (TFBS) from background sequences plays a key role in computational motif discovery. Current clustering-based algorithms employ a homogeneous model, which assumes that motifs and background signals can be characterized in the same way. This assumption is limiting because the two kinds of sequence signal have distinct properties. Results: This paper develops a Self-Organizing Map (SOM) based clustering algorithm for extracting binding sites in DNA sequences. Our framework is based on a novel intra-node soft competitive procedure that achieves maximum discrimination of motifs from background signals in datasets. The intra-node competition is based on an adaptive weighting technique applied to two different signal models, which better represent these two classes of signals. Using several real and artificial datasets, we compared our proposed method with several motif discovery tools. Compared with SOMBRERO, a state-of-the-art SOM-based motif discovery tool, our algorithm achieves significant improvements in the average precision rate (about 27%) on the real datasets without compromising sensitivity. Our method also performed favourably against other motif discovery tools. Conclusions: Motif discovery with a model-based clustering framework should consider using a heterogeneous model to represent the two classes of signals in DNA sequences. Such a heterogeneous model can achieve better signal discrimination than a homogeneous model.

    Stochastic Configuration Machines for Industrial Artificial Intelligence

    Real-time predictive modelling with a desired accuracy is highly sought after in industrial artificial intelligence (IAI), where neural networks play a key role. Neural networks in IAI require powerful, high-performance computing devices to process large volumes of floating-point data. Based on stochastic configuration networks (SCNs), this paper proposes a new randomized learner model, termed stochastic configuration machines (SCMs), which emphasizes effective modelling and reduced model size, both valuable for industrial applications. Compared to SCNs and random vector functional-link (RVFL) nets with binarized implementation, the model storage of SCMs can be significantly compressed while retaining favourable prediction performance. Besides the architecture of the SCM learner model and its learning algorithm, as an important part of this contribution, we also provide a theoretical basis for the learning capacity of SCMs by analysing the model's complexity. Experimental studies are carried out on several benchmark datasets and three industrial applications. The results demonstrate that SCMs have great potential for industrial data analytics. Comment: 23 pages, 7 figures, 12 tables

    MISCORE: Mismatch-Based Matrix Similarity Scores for DNA Motif Detection

    To detect or discover motifs in DNA sequences, two important concepts in existing computational approaches are the motif model and the similarity score. One motif model, the position frequency matrix (PFM), has been widely employed to search for putative motifs. Detection and discovery of motifs can be done by comparing k-mers with a motif model, or by clustering k-mers according to some criteria. In the past, information-content-based similarity scores have been widely used in searching tools. In this paper, we present a mismatch-based matrix similarity score (namely, MISCORE) for motif search and discovery. The proposed MISCORE can be biologically interpreted as an evolutionary metric for predicting whether a k-mer is a motif member. Weighting factors, which are meaningful for biological data mining practice, are introduced in the MISCORE. The effectiveness of the MISCORE is investigated by exploring its separability, recognizability and robustness. Three well-known information-content-based matrix similarity scores are compared, and the results show that our MISCORE works well.
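
    Comparing a k-mer with a PFM motif model can be sketched as below. The score shown is a simple expected-mismatch rate chosen for illustration; it is not the published MISCORE formula, which the abstract does not give, and it omits the weighting factors mentioned above. The function names and example sites are hypothetical.

```python
def position_frequency_matrix(kmers, alphabet="ACGT"):
    """Build a PFM of relative base frequencies from aligned,
    equal-length k-mers."""
    k = len(kmers[0])
    pfm = [{base: 0.0 for base in alphabet} for _ in range(k)]
    for kmer in kmers:
        for i, base in enumerate(kmer):
            pfm[i][base] += 1.0
    n = len(kmers)
    for column in pfm:
        for base in column:
            column[base] /= n
    return pfm


def expected_mismatch(kmer, pfm):
    """Illustrative mismatch-based score (an assumption, not MISCORE):
    the average, over positions, of the probability that the model
    emits a base different from the k-mer's base. Lower means closer."""
    return sum(1.0 - pfm[i][base] for i, base in enumerate(kmer)) / len(kmer)


# Hypothetical aligned binding sites used to build the motif model.
sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
pfm = position_frequency_matrix(sites)
score_close = expected_mismatch("ACGT", pfm)
score_far = expected_mismatch("TTTT", pfm)
```

    With these sites, the consensus k-mer `ACGT` scores 0.125 and the distant k-mer `TTTT` scores 0.75, so a lower score indicates a better match to the model.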

    Optimization of MISCORE-based Motif Identification Systems

    Identification of motifs in DNA sequences using classification techniques is one computational approach to discovering novel binding sites. In previous work [16], we proposed a simple and effective method for motif detection using a single crisp rule governed by a mismatch-based matrix similarity score (MISCORE). In this paper, we consider the problem of finding a suitable motif cut-off value for MISCORE-based motif identification systems using a cost-sensitive metric. We use phylogenetic footprinting data to estimate the parameters of the cost function. We also extend the MISCORE to include entropy, which weights each position of the motif model, to minimize the false positive rate. Performance is evaluated on artificial and real DNA sequences. The results demonstrate the feasibility and usefulness of our proposed approach for model-based cut-off value estimation.
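
    Choosing a cut-off with a cost-sensitive criterion can be sketched as a search over candidate thresholds. The sketch assumes a single crisp rule of the form "accept a k-mer as a motif when its score is at or below the cut-off" and a linear cost over false negatives and false positives. The cost weights here are hypothetical inputs, whereas the paper estimates its cost parameters from phylogenetic footprinting data.

```python
def best_cutoff(motif_scores, background_scores, cost_fn=1.0, cost_fp=1.0):
    """Return (cutoff, cost) minimising cost_fn * FN + cost_fp * FP.

    Assumes lower scores mean a closer match, so a k-mer is accepted
    as a motif when its score is <= cutoff."""
    candidates = sorted(set(motif_scores) | set(background_scores))
    best = None
    for cutoff in candidates:
        false_neg = sum(1 for s in motif_scores if s > cutoff)
        false_pos = sum(1 for s in background_scores if s <= cutoff)
        cost = cost_fn * false_neg + cost_fp * false_pos
        if best is None or cost < best[1]:
            best = (cutoff, cost)
    return best


# Hypothetical scores for known motif members and background k-mers.
motif_scores = [0.1, 0.2, 0.3, 0.6]
background_scores = [0.4, 0.5, 0.7, 0.9]

balanced = best_cutoff(motif_scores, background_scores)
recall_heavy = best_cutoff(motif_scores, background_scores, cost_fn=5.0)
```

    With equal costs, the search settles on a cut-off of 0.3 and misses one motif member. Raising the false-negative cost to 5 moves the cut-off to 0.6, which accepts two background k-mers in exchange for recovering every motif member.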

    Realization of Generalized RBF Network

    Neural classifiers have been widely used in many application areas. This paper describes a generalized neural classifier based on the radial basis function network. The contributions of this work are: i) an improvement on the standard radial basis function network architecture, ii) a new cost function for classification, iii) a feature subset selection algorithm for hidden units, and iv) optimization of the neural classifier using a genetic algorithm with the new cost function. Comparative studies of the proposed neural classifier on a protein classification problem are given.

    Computational Discovery of Motifs Using Hierarchical Clustering Techniques

    Discovery of motifs plays a key role in understanding gene regulation in organisms. Existing tools for motif discovery show weaknesses in reliability and scalability, so more advanced algorithms for this problem would be useful. This paper aims to develop data mining techniques for discovering motifs. A mismatch-based hierarchical clustering algorithm is proposed, which employs three heuristic rules for classifying clusters and a post-processing step for ranking and refining the clusters. Our algorithm is evaluated and compared with existing tools on two sets of DNA sequences. Results demonstrate that the proposed techniques outperform MEME, AlignACE and SOMBRERO on most of the testing datasets.

    SOMIX: Motifs Discovery in Gene Regulatory Sequences Using Self-Organizing Maps

    We present a clustering algorithm called Self-Organizing Map Neural Network with mixed signals discrimination (SOMIX) to discover binding sites in a set of regulatory regions. Our framework integrates a novel intra-node soft competitive procedure in each node model to achieve maximum discrimination of motifs from background signals. The intra-node competition is based on an adaptive weighting technique applied to two different signal models: a position-specific scoring matrix and a Markov chain. Simulations on real and artificial datasets showed that SOMIX could achieve significant performance improvement in terms of sensitivity and specificity over SOMBRERO, a well-known SOM-based motif discovery tool. SOMIX has also been found promising compared with other popular motif discovery tools.