1,687 research outputs found

A comparison of graph-theoretic DNA hybridization models

Author: Ben-Jacob
Brijder
Jan Van den Bussche
Jonoska
Joris J.M. Gillis
Robert Brijder
Rothemund
Whitesides
Winfree
Publication venue: 'Elsevier BV'

Data Mining Using the Crossing Minimization Paradigm

Author: Abdullah Ahsan
Publication venue: University of Stirling
Publication date: 01/01/2007

Our capacity to generate, record and store multi-dimensional, apparently unstructured data is increasing rapidly, while the cost of data storage is going down. The recorded data is not perfect, because noise is introduced into it from different sources. Some of the basic forms of noise are incorrectly recorded values and missing values. The formal study of discovering useful hidden information in data is called Data Mining. Because of the size and complexity of the problem, practical data mining problems are best attempted using automatic means. Data Mining can be categorized into two types: supervised learning (classification) and unsupervised learning (clustering). Clustering only the records in a database (or data matrix) gives a global view of the data and is called one-way clustering. For a detailed analysis or a local view, biclustering (also called co-clustering or two-way clustering) is required, which involves the simultaneous clustering of the records and the attributes. In this dissertation, a novel, fast and white-noise-tolerant data mining solution is proposed based on the Crossing Minimization (CM) paradigm; the solution works for one-way as well as two-way clustering for discovering overlapping biclusters. For decades the CM paradigm has been used in graph drawing and VLSI (Very Large Scale Integration) circuit design for reducing wire length and congestion. The utility of the proposed technique is demonstrated by comparing it with other biclustering techniques using simulated noisy data as well as real data from Agriculture, Biology and other domains. Two other interesting and hard problems also addressed in this dissertation are (i) the Minimum Attribute Subset Selection (MASS) problem and (ii) the Bandwidth Minimization (BWM) problem of sparse matrices. The proposed CM technique is demonstrated to provide very convincing results on these problems using real public domain data.
Pakistan is the fourth largest supplier of cotton in the world. An apparent anomaly was observed during 1989-97 between cotton yield and pesticide consumption in Pakistan, showing unexpected periods of negative correlation. By applying the indigenous CM technique for one-way clustering to real Agro-Met data (2001-2002), a possible explanation of the anomaly is presented in this thesis.
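
As a rough illustration of how crossing minimization can expose biclusters, the sketch below treats a 0/1 data matrix as a bipartite graph (records on one side, attributes on the other) and reorders both sides with the classic barycenter heuristic. This is an assumption chosen for illustration: the abstract does not say which CM heuristic the dissertation uses, and the names `barycenter_order` and `count_crossings` are hypothetical.

```python
def barycenter_order(matrix, sweeps=10):
    """Reorder rows and columns of a 0/1 matrix with the barycenter heuristic.

    Each row is placed at the mean position of the columns it touches,
    then each column at the mean position of its rows, repeated `sweeps` times.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_order = list(range(n_rows))
    col_order = list(range(n_cols))

    def mean_position(neighbours, position):
        # Isolated vertices get a neutral key of 0.0 (an arbitrary choice).
        if not neighbours:
            return 0.0
        return sum(position[v] for v in neighbours) / len(neighbours)

    for _ in range(sweeps):
        col_pos = {c: i for i, c in enumerate(col_order)}
        row_order.sort(key=lambda r: mean_position(
            [c for c in range(n_cols) if matrix[r][c]], col_pos))
        row_pos = {r: i for i, r in enumerate(row_order)}
        col_order.sort(key=lambda c: mean_position(
            [r for r in range(n_rows) if matrix[r][c]], row_pos))
    return row_order, col_order


def count_crossings(matrix, row_order, col_order):
    """Count edge crossings of the two-layer drawing induced by the orderings."""
    row_pos = {r: i for i, r in enumerate(row_order)}
    col_pos = {c: i for i, c in enumerate(col_order)}
    edges = [(row_pos[r], col_pos[c])
             for r in range(len(matrix))
             for c in range(len(matrix[0])) if matrix[r][c]]
    crossings = 0
    for i, (a, b) in enumerate(edges):
        for c, d in edges[i + 1:]:
            # Two edges cross when their endpoints appear in opposite order.
            if (a - c) * (b - d) < 0:
                crossings += 1
    return crossings


# Toy matrix (made up): two biclusters whose rows and columns are interleaved.
matrix = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
rows, cols = barycenter_order(matrix)
before = count_crossings(matrix, [0, 1, 2, 3], [0, 1, 2, 3])
after = count_crossings(matrix, rows, cols)
reordered = [[matrix[r][c] for c in cols] for r in rows]
print(before, after)
for line in reordered:
    print(line)
```

After reordering, the ones in the toy matrix gather into two diagonal blocks, which is the visual signature of biclusters that a crossing-minimized layout is meant to reveal. Real data with noise and overlapping biclusters would need a more careful block-extraction step than this sketch provides.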

CiteSeerX

Stirling Online Research Repository

Computational Molecular Biology

Author: Lenhof H.
Mutzel P.
Vingron M.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1996

Computational Biology is a fairly new subject that arose in response to the computational problems posed by the analysis and the processing of biomolecular sequence and structure data. The field was initiated in the late 60's and early 70's largely by pioneers working in the life sciences. Physicists and mathematicians entered the field in the 70's and 80's, while Computer Science became involved with the new biological problems in the late 1980's. Computational problems have gained further importance in molecular biology through the various genome projects which produce enormous amounts of data. For this bibliography we focus on those areas of computational molecular biology that involve discrete algorithms or discrete optimization. We thus neglect several other areas of computational molecular biology, like most of the literature on the protein folding problem, as well as databases for molecular and genetic data, and genetic mapping algorithms. Due to the availability of review papers and a bibliography this bibliography

MPG.PuRe

9th IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS)

Author: Atwal Gurinder Singh “Mickey”
Dimitrova Nevenka
Vikalo Haris
Yoon Byung-Jun
Publication date: 10/11/2010

Cold Spring Harbor Laboratory Institutional Repository

Noise and information transmission in promoters with multiple internal states

Author: Rieckh Georg
Tkačik Gašper
Publication venue: 'Elsevier BV'
Publication date: 29/12/2013

Based on the measurements of noise in gene expression performed during the last decade, it has become customary to think of gene regulation in terms of a two-state model, where the promoter of a gene can stochastically switch between an ON and an OFF state. As experiments are becoming increasingly precise and the deviations from the two-state model start to be observable, we ask about the experimental signatures of complex multi-state promoters, as well as the functional consequences of this additional complexity. In detail, we (i) extend the calculations for noise in gene expression to promoters described by state transition diagrams with multiple states, (ii) systematically compute the experimentally accessible noise characteristics for these complex promoters, and (iii) use information theory to evaluate the channel capacities of complex promoter architectures and compare them to the baseline provided by the two-state model. We find that adding internal states to the promoter generically decreases channel capacity, except in certain cases, three of which (cooperativity, dual-role regulation, promoter cycling) we analyze in detail. Comment: 16 pages, 9 figures

arXiv.org e-Print Archive

Elsevier - Publisher Connector

PubMed Central

IST Austria: PubRep (Institute of Science and Technology)

Tailored graph ensembles as proxies or null models for real networks II: results on directed graphs

Author: A C C Coolen
Annibale A
Coolen A C C
E S Roberts
Memisević V
T Schlitt
Publication venue: 'IOP Publishing'
Publication date: 01/01/1101

We generate new mathematical tools with which to quantify the macroscopic topological structure of large directed networks. This is achieved via a statistical mechanical analysis of constrained maximum entropy ensembles of directed random graphs with prescribed joint distributions for in- and outdegrees and prescribed degree-degree correlation functions. We calculate exact and explicit formulae for the leading orders in the system size of the Shannon entropies and complexities of these ensembles, and for information-theoretic distances. The results are applied to data on gene regulation networks. Comment: 21 pages, 1 figure, submitted to J. Phys.

arXiv.org e-Print Archive

CiteSeerX

Crossref

King's Research Portal

A temporal precedence based clustering method for gene expression microarray data

Author: Buchanan-Wollaston Vicky
Krishna Ritesh V.
Li Chang-Tsun
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2010

Background: Time-course microarray experiments can produce useful data which can help in understanding the underlying dynamics of the system. Clustering is an important stage in microarray data analysis where the data is grouped together according to certain characteristics. The majority of clustering techniques are based on distance or visual similarity measures which may not be suitable for clustering of temporal microarray data where the sequential nature of time is important. We present a Granger causality based technique to cluster temporal microarray gene expression data, which measures the interdependence between two time-series by statistically testing whether one time-series can be used to forecast the other. Results: A gene-association matrix is constructed by testing temporal relationships between pairs of genes using the Granger causality test. The association matrix is further analyzed using a graph-theoretic technique to detect highly connected components representing interesting biological modules. We test our approach on synthesized datasets and real biological datasets obtained for Arabidopsis thaliana. We show the effectiveness of our approach by analyzing the results using the existing biological literature. We also report interesting structural properties of the association network commonly desired in any biological system. Conclusions: Our experiments on synthesized and real microarray datasets show that our approach produces encouraging results. The method is simple in implementation and is statistically traceable at each step. The method can produce sets of functionally related genes which can be further used for reverse-engineering of gene circuits.

Deakin Research Online

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Warwick Research Archives Portal Repository

Extracting Gene Networks for Low-Dose Radiation Using Graph Theoretical Algorithms

Author: Andy D Perkins
Arnold M Saxton
Bhavesh Borate
Brynn H Voy
David A Boothman
Elissa J Chesler
Jon A Scharff
Lisa K Branstetter
Michael A Langston
Publication venue: Public Library of Science
Publication date: 01/01/2006

Genes with common functions often exhibit correlated expression levels, which can be used to identify sets of interacting genes from microarray data. Microarrays typically measure expression across genomic space, creating a massive matrix of co-expression that must be mined to extract only the most relevant gene interactions. We describe a graph theoretical approach to extracting co-expressed sets of genes, based on the computation of cliques. Unlike the results of traditional clustering algorithms, cliques are not disjoint and allow genes to be assigned to multiple sets of interacting partners, consistent with biological reality. A graph is created by thresholding the correlation matrix to include only the correlations most likely to signify functional relationships. Cliques computed from the graph correspond to sets of genes for which significant edges are present between all members of the set, representing potential members of common or interacting pathways. Clique membership can be used to infer function about poorly annotated genes, based on the known functions of better-annotated genes with which they share clique membership (i.e., “guilt-by-association”). We illustrate our method by applying it to microarray data collected from the spleens of mice exposed to low-dose ionizing radiation. Differential analysis is used to identify sets of genes whose interactions are impacted by radiation exposure. The correlation graph is also queried independently of clique to extract edges that are impacted by radiation. We present several examples of multiple gene interactions that are altered by radiation exposure and thus represent potential molecular pathways that mediate the radiation response.
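
The thresholding and clique steps described in this abstract can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the authors' pipeline: it uses Pearson correlation with an absolute-value cutoff (the abstract does not specify the correlation measure or threshold) and a basic Bron-Kerbosch enumeration without pivoting. The function names and the toy expression profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_graph(profiles, threshold):
    """Keep an edge between two genes when |r| >= threshold (assumed rule)."""
    adj = {gene: set() for gene in profiles}
    for g, h in combinations(profiles, 2):
        if abs(pearson(profiles[g], profiles[h])) >= threshold:
            adj[g].add(h)
            adj[h].add(g)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidate vertices, x: already-processed vertices
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Toy expression profiles (made-up numbers, five samples per gene).
profiles = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 10],
    "g4": [5, 1, 4, 2, 3],
    "g5": [10, 2, 8, 4, 6],
}
graph = correlation_graph(profiles, threshold=0.9)
print(maximal_cliques(graph))

# A hand-built graph in which gene "b" belongs to two cliques at once.
overlap = {"a": {"b", "c"}, "b": {"a", "c", "d"}, "c": {"a", "b"}, "d": {"b"}}
print(maximal_cliques(overlap))
```

The second graph shows the property the abstract emphasizes: unlike disjoint clusters, maximal cliques may share genes. On genome-scale graphs, enumerating maximal cliques is expensive in the worst case, so a plain recursion like this is only suitable for small examples.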

CiteSeerX

Crossref

The Jackson Laboratory: The Mouseion at the JAXlibrary

Directory of Open Access Journals

PubMed Central