583,901 research outputs found

    MuPlex: multi-objective multiplex PCR assay design

    We have developed a web-enabled system called MuPlex that aids researchers in the design of multiplex PCR assays. Multiplex PCR is a key technology for a wide range of applications, including detecting infectious microorganisms, whole-genome sequencing and closure, forensic analysis, and flexible yet low-cost genotyping. However, the design of multiplex PCR assays is computationally challenging because it involves tradeoffs among competing objectives, and extensive computational analysis is required to screen out primer-pair cross interactions. With MuPlex, users specify a set of DNA sequences along with primer selection criteria, interaction parameters and the target multiplexing level. MuPlex then produces a set of multiplex PCR assays that cover as many of the input sequences as possible, and provides multiple solution alternatives that reveal tradeoffs among competing objectives. MuPlex is uniquely designed for large-scale multiplex PCR assay design in an automated high-throughput environment, where high coverage of potentially thousands of single nucleotide polymorphisms is required. The server is available at

    High-Performance Reachability Query Processing under Index Size Restrictions

    In this paper, we propose a scalable and highly efficient index structure for the reachability problem over graphs. We build on the well-known node interval labeling scheme, where the set of vertices reachable from a particular node is compactly encoded as a collection of node identifier ranges. We impose an explicit bound on the size of the index and flexibly assign approximate reachability ranges to nodes of the graph such that the number of index probes needed to answer a query is minimized. The resulting tunable index structure generates a better range labeling if the space budget is increased, thus providing direct control over the trade-off between index size and query processing performance. By using a fast recursive querying method in conjunction with our index structure, we show that in practice, reachability queries can be answered in the order of microseconds on an off-the-shelf computer - even for massive-scale real-world graphs. Our claims are supported by an extensive set of experimental results using a multitude of benchmark and real-world web-scale graph datasets. Comment: 30 pages
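
    The node interval labeling scheme mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a directed acyclic graph stored as an adjacency dictionary and builds exact labels only; the size-bounded, approximate ranges and the recursive querying method that the paper proposes are not reproduced here. The function names (`merge_intervals`, `build_interval_labels`, `reachable`) are hypothetical and chosen for illustration.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent integer ranges into a sorted minimal list."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1] + 1:
            # Extend the previous range instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def build_interval_labels(dag):
    """Number nodes in post-order and label each node with id ranges it reaches.

    `dag` maps a node to a list of its successors (assumed acyclic).
    """
    post_id = {}
    labels = {}

    def visit(node):
        if node in post_id:
            return
        intervals = []
        for child in dag.get(node, []):
            visit(child)
            intervals.extend(labels[child])
        post_id[node] = len(post_id)
        # A node trivially reaches itself.
        intervals.append((post_id[node], post_id[node]))
        labels[node] = merge_intervals(intervals)

    for node in dag:
        visit(node)
    return post_id, labels


def reachable(post_id, labels, source, target):
    """Answer a reachability query by probing the ranges stored at `source`."""
    t = post_id[target]
    return any(lo <= t <= hi for lo, hi in labels[source])
```

    In this sketch a node whose successors received contiguous identifiers ends up with a single range, while a node with scattered successors keeps several; the number of ranges per node is what an index size bound would have to limit.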

    Efficient Subgraph Matching on Billion Node Graphs

    The ability to handle large-scale graph data is crucial to an increasing number of applications. Much work has been dedicated to supporting basic graph operations such as subgraph matching, reachability, regular expression matching, etc. In many cases, graph indices are employed to speed up query processing. Typically, most indices require either super-linear indexing time or super-linear indexing space. Unfortunately, for very large graphs, super-linear approaches are almost always infeasible. In this paper, we study the problem of subgraph matching on billion-node graphs. We present a novel algorithm that supports efficient subgraph matching for graphs deployed on a distributed memory store. Instead of relying on super-linear indices, we use efficient graph exploration and massive parallel computing for query processing. Our experimental results demonstrate the feasibility of performing subgraph matching on web-scale graph data. Comment: VLDB201

    Ontology selection: ontology evaluation on the real Semantic Web

    The increasing number of ontologies on the Web and the appearance of large-scale ontology repositories have brought the topic of ontology selection into the focus of the semantic web research agenda. Our view is that ontology evaluation is core to ontology selection and that, because ontology selection is performed in an open Web environment, it brings new challenges to ontology evaluation. Unfortunately, current research regards ontology selection and evaluation as two separate topics. Our goal in this paper is to explore how these two tasks relate. In particular, we are interested in gaining a better understanding of the ontology selection task and in identifying the challenges that it brings to ontology evaluation. We discuss requirements posed by the open Web environment on ontology selection, give an overview of existing work on selection, and point out future directions. Our major conclusion is that, even if selection methods still need further development, they have already brought novel approaches to ontology evaluation

    Fast Shortest Path Distance Estimation in Large Networks

    We study the problem of preprocessing a large graph so that point-to-point shortest-path queries can be answered very fast. Computing shortest paths is a well-studied problem, but exact algorithms do not scale to the huge graphs encountered on the web, in social networks, and in other applications. In this paper we focus on approximate methods for distance estimation, in particular using landmark-based distance indexing. This approach involves selecting a subset of nodes as landmarks and computing (offline) the distances from each node in the graph to those landmarks. At runtime, when the distance between a pair of nodes is needed, we can estimate it quickly by combining the precomputed distances of the two nodes to the landmarks. We prove that selecting the optimal set of landmarks is an NP-hard problem, and thus heuristic solutions need to be employed. Given a budget of memory for the index, which translates directly into a budget of landmarks, different landmark selection strategies can yield dramatically different results in terms of accuracy. A number of simple methods that scale well to large graphs are therefore developed and experimentally compared. The simplest methods choose central nodes of the graph, while the more elaborate ones select central nodes that are also far away from one another. The efficiency of the suggested techniques is tested experimentally using five different real-world graphs with millions of edges; for a given accuracy, they require as much as 250 times less space than the current approach in the literature, which selects landmarks at random. Finally, we study applications of our method in two problems arising naturally in large-scale networks, namely, social search and community detection. Yahoo! Research (internship
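
    The landmark-based estimation described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is undirected and unweighted, the estimate is the triangle-inequality upper bound (the smallest sum of the two landmark distances), and landmarks are picked by highest degree as one simple stand-in for "central nodes". All function names are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop distances from `source` to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def select_by_degree(graph, k):
    """Assumed heuristic: take the k highest-degree nodes as landmarks."""
    return sorted(graph, key=lambda n: len(graph[n]), reverse=True)[:k]


def build_landmark_index(graph, landmarks):
    """Offline step: one BFS per landmark, stored as landmark -> distance map."""
    return {lm: bfs_distances(graph, lm) for lm in landmarks}


def estimate_distance(index, u, v):
    """Online step: upper bound min over landmarks l of d(u, l) + d(l, v)."""
    sums = [d[u] + d[v] for d in index.values() if u in d and v in d]
    return min(sums) if sums else float("inf")


# Small undirected example graph as an adjacency dictionary.
graph = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
index = build_landmark_index(graph, select_by_degree(graph, 1))
```

    With the single landmark chosen here (node 2), `estimate_distance(index, 0, 4)` is exact because the landmark lies on the shortest path, whereas `estimate_distance(index, 3, 4)` overestimates the true distance. This illustrates why the choice of landmarks affects accuracy for a fixed index budget.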

    Controlling edge dynamics in complex networks

    The interaction of distinct units in physical, social, biological and technological systems naturally gives rise to complex network structures. Networks have been a constant focus of research over the last decade, with considerable advances in the description of their structural and dynamical properties. However, much less effort has been devoted to studying the controllability of the dynamics taking place on them. Here we introduce and evaluate a dynamical process defined on the edges of a network, and demonstrate that the controllability properties of this process differ significantly from those of simple nodal dynamics. Evaluation of real-world networks indicates that most of them are more controllable than their randomized counterparts. We also find that transcriptional regulatory networks are particularly easy to control. Analytic calculations show that networks with scale-free degree distributions have better controllability properties than uncorrelated networks, and that positively correlated in- and out-degrees enhance the controllability of the proposed dynamics. Comment: Preprint. 24 pages, 4 figures, 2 tables. Source code available at http://github.com/ntamas/netctr