    Open Problems in (Hyper)Graph Decomposition

    Large networks are useful in a wide range of applications. Sometimes problem instances are composed of billions of entities. Decomposing and analyzing these structures helps us gain new insights about our surroundings. Even if the final application concerns a different problem (such as traversal, finding paths, trees, and flows), decomposing large graphs is often an important subproblem for complexity reduction or parallelization. This report is a summary of discussions that happened at Dagstuhl seminar 23331 on "Recent Trends in Graph Decomposition" and presents currently open problems and future directions in the area of (hyper)graph decomposition.
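
    The decomposition subproblem mentioned in this abstract can be made concrete with a small sketch. The code below is purely illustrative and not one of the algorithms discussed at the seminar: it grows k roughly balanced parts by breadth-first traversal and reports the resulting edge cut; the toy graph and the choice of k are assumptions made only for demonstration.

```python
# A minimal sketch, assuming a toy undirected graph given as an adjacency list:
# grow k roughly balanced parts by breadth-first traversal and report the edge cut.
from collections import deque

def bfs_partition(adj, k):
    """Greedily grow k roughly balanced parts by breadth-first traversal."""
    target = -(-len(adj) // k)                  # ceil(n / k): soft size cap per part
    part, size, pid = {}, [0] * k, 0
    for seed in adj:
        if seed in part:
            continue
        queue = deque([seed])
        while queue:
            v = queue.popleft()
            if v in part:
                continue
            part[v] = pid
            size[pid] += 1
            queue.extend(u for u in adj[v] if u not in part)
            if size[pid] >= target and pid < k - 1:
                pid += 1                        # current part is full; open the next one
    return part

def edge_cut(adj, part):
    """Number of edges whose endpoints land in different parts."""
    return sum(1 for v in adj for u in adj[v] if v < u and part[v] != part[u])

if __name__ == "__main__":
    # Two triangles joined by one bridge edge; a good 2-way cut severs only the bridge.
    adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
    part = bfs_partition(adj, k=2)
    print(part, "edge cut =", edge_cut(adj, part))
```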

    CAD Tools for DNA Micro-Array Design, Manufacture and Application

    Motivation: As the human genome project progresses and a growing number of microbial and eukaryotic genomes are sequenced, biotechnological processes have attracted an increasing number of biologists, bioengineers and computer scientists. These processes rely heavily on the production and analysis of high-throughput experimental data. Numerous sequence libraries of DNA and protein structures for a large number of micro-organisms, together with a variety of other databases related to biology and chemistry, are available. For example, microarray technology, a novel biotechnology, promises to monitor the whole genome at once, so that researchers can study the genome at a global level and obtain a better picture of expression across millions of genes simultaneously. Today, it is widely used in many fields, including disease diagnosis, gene classification, gene regulatory networks, and drug discovery. For example, designing an organism-specific microarray and analysing the experimental data require combining heterogeneous computational tools that usually differ in data format, such as GeneMark for ORF extraction, Promide for DNA probe selection, Chip for probe placement on the microarray chip, BLAST for sequence comparison, MEGA for phylogenetic analysis, and ClustalX for multiple alignment. Solution: Surprisingly, despite the huge research effort invested in DNA array applications, very few works are devoted to computer-aided optimization of DNA array design and manufacturing. Current design practice is dominated by ad-hoc heuristics incorporated into proprietary tools with unknown suboptimality. This will soon become a bottleneck for the new generation of high-density arrays, such as the ones currently being designed at Perlegen [109]. The goal of the research accomplished so far was to develop highly scalable tools, with predictable runtime and quality, for cost-effective, computer-aided design and manufacturing of DNA probe arrays. We illustrate the utility of our approach with a concrete example: combining the design tools of microarray technology for Herpes B virus DNA data.
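
    One of the design steps mentioned above, probe placement on the array, can be illustrated with a small sketch. The code below is not the authors' tool: it greedily places probes on a grid so that neighbouring probes differ in as few positions as possible, a rough proxy for border length in mask-based synthesis; the probe sequences and grid size are invented for illustration.

```python
# A minimal sketch, not the authors' tool: place DNA probes on a small grid so that
# adjacent probes differ in few positions (a rough proxy for border length in
# mask-based synthesis). Probe sequences and grid size are invented for illustration.
from itertools import product

def hamming(a, b):
    """Number of positions at which two equal-length probes differ."""
    return sum(x != y for x, y in zip(a, b))

def greedy_place(probes, rows, cols):
    """Fill the grid in row-major order, always choosing the unplaced probe that is
    closest in Hamming distance to its already-placed left and top neighbours."""
    grid = [[None] * cols for _ in range(rows)]
    remaining = list(probes)
    for r, c in product(range(rows), range(cols)):
        neighbours = [p for p in (grid[r][c - 1] if c else None,
                                  grid[r - 1][c] if r else None) if p]
        best = min(remaining, key=lambda p: sum(hamming(p, n) for n in neighbours))
        remaining.remove(best)
        grid[r][c] = best
    return grid

def border_cost(grid):
    """Total Hamming distance over all horizontally and vertically adjacent pairs."""
    total = 0
    for r, row in enumerate(grid):
        for c, probe in enumerate(row):
            if c + 1 < len(row):
                total += hamming(probe, row[c + 1])
            if r + 1 < len(grid):
                total += hamming(probe, grid[r + 1][c])
    return total

if __name__ == "__main__":
    probes = ["ACGT", "ACGA", "ACTA", "TCTA", "TGTA", "TGCA", "AGCA", "AGGA", "TGGA"]
    layout = greedy_place(probes, rows=3, cols=3)
    print(layout, "border cost =", border_cost(layout))
```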

    A Review of Methodological Approaches for the Design and Optimization of Wind Farms

    This article presents a review of the state of the art of the Wind Farm Design and Optimization (WFDO) problem. The WFDO problem refers to a set of advanced planning actions needed to extremize the performance of wind farms, which may be composed of a few individual Wind Turbines (WTs) up to thousands of WTs. The WFDO problem has been investigated in different scenarios, with substantial differences in main objectives, modelling assumptions, constraints, and numerical solution methods. The aim of this paper is: (1) to present an exhaustive survey of the literature covering the full span of the subject, an analysis of the state-of-the-art models describing the performance of wind farms as well as their extensions, and the numerical approaches used to solve the problem; (2) to provide an overview of the available knowledge and recent progress in the application of such strategies to real onshore and offshore wind farms; and (3) to propose a comprehensive agenda for future research.
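
    As a toy illustration of what a WFDO instance can look like (a sketch under strong simplifying assumptions, not a method taken from the surveyed literature), the code below combines a Jensen-type wake deficit for a single wind direction with greedy selection of turbine positions from a candidate grid; every parameter value is an illustrative assumption.

```python
# A minimal sketch of a WFDO instance under strong simplifying assumptions: one wind
# direction, a Jensen-type wake deficit, and greedy micro-siting on a candidate grid.
# Every parameter value below is an illustrative assumption, not data from the review.
import math

U0, CT, K, R0 = 12.0, 0.8, 0.05, 40.0   # free wind speed [m/s], thrust coeff., wake decay, rotor radius [m]

def deficit(up, down):
    """Fractional speed deficit that turbine 'up' imposes on 'down' for wind along +x
    (Jensen model); zero if 'down' lies outside the linearly expanding wake cone."""
    dx, dy = down[0] - up[0], down[1] - up[1]
    if dx <= 0:
        return 0.0
    radius = R0 + K * dx
    if abs(dy) > radius:
        return 0.0
    return (1 - math.sqrt(1 - CT)) * (R0 / radius) ** 2

def farm_power(layout):
    """Power proxy (sum of cubed wind speeds), combining wakes by root-sum-of-squares."""
    total = 0.0
    for t in layout:
        d = math.sqrt(sum(deficit(o, t) ** 2 for o in layout if o != t))
        total += (U0 * (1.0 - d)) ** 3
    return total

def greedy_layout(candidates, n_turbines):
    """Add one turbine at a time at the candidate position that maximizes farm power."""
    layout = []
    for _ in range(n_turbines):
        best = max((c for c in candidates if c not in layout),
                   key=lambda c: farm_power(layout + [c]))
        layout.append(best)
    return layout

if __name__ == "__main__":
    grid = [(x, y) for x in range(0, 2001, 400) for y in range(0, 2001, 400)]
    print(greedy_layout(grid, n_turbines=5))
```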

    Bayesian inference for protein signalling networks

    Cellular response to a changing chemical environment is mediated by a complex system of interactions involving molecules such as genes, proteins and metabolites. In particular, genetic and epigenetic variation ensure that cellular response is often highly specific to individual cell types, or to different patients in the clinical setting. Conceptually, cellular systems may be characterised as networks of interacting components together with biochemical parameters specifying rates of reaction. Taken together, the network and parameters form a predictive model of cellular dynamics which may be used to simulate the effect of hypothetical drug regimens. In practice, however, both network topology and reaction rates remain partially or entirely unknown, depending on individual genetic variation and environmental conditions. Prediction under parameter uncertainty is a classical statistical problem. Yet, doubly uncertain prediction, where both parameters and the underlying network topology are unknown, leads to highly non-trivial probability distributions which currently require gross simplifying assumptions to analyse. Recent advances in molecular assay technology now permit high-throughput data-driven studies of cellular dynamics. This thesis sought to develop novel statistical methods in this context, focussing primarily on the problems of (i) elucidating biochemical network topology from assay data and (ii) prediction of dynamical response to therapy when both network and parameters are uncertain.
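
    The "doubly uncertain" prediction problem can be sketched with a toy example. The code below is not the thesis's actual method: it places a uniform prior over two invented model structures and a parameter grid, weights each (structure, parameter) pair by its likelihood on made-up observations, and averages predictions over the joint posterior.

```python
# A minimal sketch of doubly uncertain prediction, not the thesis's models: two invented
# candidate structures for how a protein level decays, a parameter grid with a uniform
# prior, and made-up noisy observations; predictions average over the joint posterior.
import math

MODELS = {
    "exponential": lambda theta, t: math.exp(-theta * t),
    "hyperbolic":  lambda theta, t: 1.0 / (1.0 + theta * t),
}
THETAS = [0.1 * i for i in range(1, 31)]    # uniform prior over a coarse parameter grid
SIGMA = 0.05                                # assumed Gaussian measurement noise
DATA = [(0.0, 1.00), (1.0, 0.62), (2.0, 0.37), (4.0, 0.14)]   # (time, normalised level)

def log_likelihood(model, theta):
    """Gaussian log-likelihood (up to a constant) of the observations under one model."""
    return sum(-0.5 * ((y - MODELS[model](theta, t)) / SIGMA) ** 2 for t, y in DATA)

# Joint posterior over (structure, parameter), assuming equal prior mass on every pair.
weights = {(m, th): math.exp(log_likelihood(m, th)) for m in MODELS for th in THETAS}
Z = sum(weights.values())

def predict(t):
    """Posterior-predictive mean of the protein level at time t, averaging over both
    structural hypotheses and their parameters."""
    return sum(w * MODELS[m](th, t) for (m, th), w in weights.items()) / Z

print("P(exponential structure | data) =",
      sum(w for (m, _), w in weights.items() if m == "exponential") / Z)
print("predicted level at t = 6:", predict(6.0))
```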

    Multipartite Graph Algorithms for the Analysis of Heterogeneous Data

    The explosive growth in the rate of data generation in recent years threatens to outpace the growth in computer power, motivating the need for new, scalable algorithms and big data analytic techniques. No field may be more emblematic of this data deluge than the life sciences, where technologies such as high-throughput mRNA arrays and next generation genome sequencing are routinely used to generate datasets of extreme scale. Data from experiments in genomics, transcriptomics, metabolomics and proteomics are continuously being added to existing repositories. A goal of exploratory analysis of such omics data is to illuminate the functions and relationships of biomolecules within an organism. This dissertation describes the design, implementation and application of graph algorithms, with the goal of seeking dense structure in data derived from omics experiments in order to detect latent associations between often heterogeneous entities, such as genes, diseases and phenotypes. Exact combinatorial solutions are developed and implemented, rather than relying on approximations or heuristics, even when problems are exceedingly large and/or difficult. Datasets on which the algorithms are applied include time series transcriptomic data from an experiment on the developing mouse cerebellum, gene expression data measuring acute ethanol response in the prefrontal cortex, and the analysis of a predicted protein-protein interaction network. A bipartite graph model is used to integrate heterogeneous data types, such as genes with phenotypes and microbes with mouse strains. The techniques are then extended to a multipartite algorithm to enumerate dense substructure in multipartite graphs, constructed using data from three or more heterogeneous sources, with applications to functional genomics. Several new theoretical results are given regarding multipartite graphs and the multipartite enumeration algorithm. In all cases, practical implementations are demonstrated to expand the frontier of computational feasibility.
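
    The dense-substructure idea can be sketched with a toy bipartite example. The code below is a brute-force illustration rather than the dissertation's exact enumeration algorithms: it lists maximal gene-phenotype bicliques in a small, invented association table.

```python
# A brute-force sketch, not the dissertation's algorithms: enumerate maximal
# gene-phenotype bicliques in a tiny, invented association table.
from itertools import combinations

ASSOC = {  # gene -> set of associated phenotypes (toy data)
    "geneA": {"ph1", "ph2", "ph3"},
    "geneB": {"ph1", "ph2"},
    "geneC": {"ph2", "ph3"},
    "geneD": {"ph1", "ph2", "ph3"},
}

def maximal_bicliques(adj, min_genes=2, min_phen=2):
    """List maximal (gene set, shared phenotype set) pairs: every gene in the set is
    linked to every shared phenotype, and no further gene can be added."""
    genes, found = list(adj), []
    for r in range(min_genes, len(genes) + 1):
        for subset in combinations(genes, r):
            shared = set.intersection(*(adj[g] for g in subset))
            if len(shared) < min_phen:
                continue
            if any(shared <= adj[g] for g in genes if g not in subset):
                continue                      # a gene outside the set also fits: not maximal
            found.append((set(subset), shared))
    return found

for gene_set, phen_set in maximal_bicliques(ASSOC):
    print(sorted(gene_set), "<->", sorted(phen_set))
```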

    Structural Optimization with Fatigue Constraints
