7,970 research outputs found

    Packing a Knapsack of Unknown Capacity

    We study the problem of packing a knapsack without knowing its capacity. Whenever we attempt to pack an item that does not fit, the item is discarded; if the item fits, we have to include it in the packing. We show that there is always a policy that packs a value within a factor of 2 of the optimum packing, irrespective of the actual capacity. If all items have unit density, we achieve a factor equal to the golden ratio. Both factors are shown to be best possible. In fact, we obtain the above factors using packing policies that are universal in the sense that they fix a particular order of the items and try to pack the items in this order, independent of the observations made while packing. We give efficient algorithms computing these policies. On the other hand, we show that, for any alpha > 1, the problem of deciding whether a given universal policy achieves a factor of alpha is coNP-complete. If alpha is part of the input, the same problem is shown to be coNP-complete for items with unit densities. Finally, we show that it is coNP-hard to decide, for given alpha, whether a set of items admits a universal policy with factor alpha, even if all items have unit densities.
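
    To make the notion of a universal policy concrete, the sketch below (in Python) executes a fixed item order against several possible capacities; the value-based ordering used here is only a stand-in assumption, not the paper's provably 2-competitive construction.

        # Execute a universal (fixed-order) packing policy against an unknown capacity.
        # The ordering below (by value) is an illustrative assumption, not the
        # paper's 2-competitive or golden-ratio-competitive order.

        def pack_with_universal_policy(items, order, capacity):
            """Try items in the fixed 'order': an item that fits must be packed,
            an item that does not fit is discarded."""
            remaining, value = capacity, 0.0
            for i in order:
                size, val = items[i]
                if size <= remaining:          # fits, so we are forced to include it
                    remaining -= size
                    value += val
            return value

        items = [(4, 6.0), (3, 4.0), (2, 2.5), (1, 1.0)]       # (size, value) pairs
        order = sorted(range(len(items)), key=lambda i: -items[i][1])
        for capacity in (3, 5, 7):                             # capacity is unknown in advance
            print(capacity, pack_with_universal_policy(items, order, capacity))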

    Merging DNA metabarcoding and ecological network analysis to understand and build resilient terrestrial ecosystems

    Summary:
    1. Significant advances in both mathematical and molecular approaches in ecology offer unprecedented opportunities to describe and understand ecosystem functioning. Ecological networks describe interactions between species, the underlying structure of communities, and the function and stability of ecosystems. They provide the ability to assess the robustness of complex ecological communities to species loss, as well as a novel way of guiding restoration. However, empirically quantifying the interactions between entire communities remains a significant challenge.
    2. Concomitantly, advances in DNA sequencing technologies are resolving previously intractable questions in functional and taxonomic biodiversity and provide enormous potential to determine hitherto difficult-to-observe species interactions. Combining DNA metabarcoding approaches with ecological network analysis presents important new opportunities for understanding large-scale ecological and evolutionary processes, as well as providing powerful tools for building ecosystems that are resilient to environmental change.
    3. We propose a novel ‘nested tagging’ metabarcoding approach for the rapid construction of large, phylogenetically structured species-interaction networks. Taking tree–insect–parasitoid ecological networks as an illustration, we show how measures of network robustness, constructed using DNA metabarcoding, can be used to determine the consequences of tree species loss within forests, and of forest habitat loss within wider landscapes. By determining which species and habitats are important to network integrity, we propose new directions for forest management.
    4. Merging metabarcoding with ecological network analysis provides a revolutionary opportunity to construct some of the largest phylogenetically structured species-interaction networks to date, providing new ways to: (i) monitor biodiversity and ecosystem functioning; (ii) assess the robustness of interacting communities to species loss; and (iii) build ecosystems that are more resilient to environmental change.
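
    As a toy illustration of the network-robustness measure mentioned above (not the authors' metabarcoding pipeline), the Python sketch below removes host species from a small bipartite host-parasitoid network in random order and reports the average fraction of parasitoid species that retain at least one host; the example network is invented for illustration.

        import random

        links = {                         # host species -> parasitoid species reared from it
            "oak":   {"p1", "p2"},        # toy data, invented for illustration
            "birch": {"p2", "p3"},
            "pine":  {"p4"},
        }
        all_parasitoids = set().union(*links.values())

        def robustness(links, trials=1000):
            """Average fraction of parasitoid species retaining at least one host,
            averaged over random orders of host removal (area under the
            attack-tolerance curve)."""
            total, hosts = 0.0, list(links)
            for _ in range(trials):
                random.shuffle(hosts)
                remaining = {h: set(ps) for h, ps in links.items()}
                for h in hosts:
                    del remaining[h]
                    alive = set().union(*remaining.values())
                    total += len(alive) / len(all_parasitoids)
            return total / (trials * len(hosts))

        print(round(robustness(links), 3))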

    A model of large-scale proteome evolution

    After the determination of complete sequences, the next step in understanding genome organization involves proteomics. The proteome includes the whole set of protein-protein interactions, and two recent independent studies have shown that its topology displays a number of surprising features shared by other complex networks, both natural and artificial. To understand the origins of this topology and its evolutionary implications, we present a simple model of proteome evolution that is able to reproduce many of the observed statistical regularities reported from the analysis of the yeast proteome. Our results suggest that the observed patterns can be explained by a process of gene duplication and diversification that would evolve proteome networks under a selection pressure favoring robustness against the failure of individual components.
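
    For intuition, here is a minimal sketch (in Python) of a gene duplication-and-divergence growth process of the kind described above; the update rules and parameter values are illustrative assumptions rather than the paper's calibrated model.

        import random

        def duplication_divergence(n_final, delta=0.5, alpha=0.1, seed=1):
            """Grow a protein-interaction graph: duplicate a random node, drop each
            inherited link with probability delta, and occasionally (probability
            alpha) attach the copy to another random node."""
            rng = random.Random(seed)
            graph = {0: {1}, 1: {0}}                            # start from a single interaction
            while len(graph) < n_final:
                new = len(graph)
                template = rng.choice(list(graph))
                graph[new] = set()
                for neighbor in list(graph[template]):          # inherit, then diverge
                    if rng.random() > delta:
                        graph[new].add(neighbor)
                        graph[neighbor].add(new)
                if rng.random() < alpha:                        # rare new interaction
                    other = rng.choice([v for v in graph if v != new])
                    graph[new].add(other)
                    graph[other].add(new)
            return graph

        g = duplication_divergence(500)
        degrees = sorted((len(nbrs) for nbrs in g.values()), reverse=True)
        print(degrees[:10])                                     # inspect the largest degrees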

    Universal Sequencing on an Unreliable Machine

    We consider scheduling on an unreliable machine that may experience unexpected changes in processing speed or even full breakdowns. Our objective is to minimize ∑_j w_j f(C_j) for any nondecreasing, nonnegative, differentiable cost function f. We aim for a universal solution that performs well without adaptation for all cost functions and for any possible machine behavior. We design a deterministic algorithm that finds a universal scheduling sequence with a solution value within 4 times the value of an optimal clairvoyant algorithm that knows the machine behavior in advance. A randomized version of this algorithm attains, in expectation, a ratio of e. We also show that both performance guarantees are best possible for any unbounded cost function. Our algorithms can be adapted to run in polynomial time with slightly increased cost. When jobs have individual release dates, the situation changes drastically: even if all weights are equal, there are instances for which any universal solution is a factor of Ω(log n / log log n) worse than an optimal sequence, for any unbounded cost function. Motivated by this hardness, we study the special case in which the processing time of each job is proportional to its weight. We present a nontrivial algorithm with a small constant performance guarantee.
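
    The sketch below (in Python) only illustrates how the objective from the abstract is evaluated: it computes ∑_j w_j f(C_j) for a fixed job sequence under an arbitrary piecewise-constant speed profile with breakdowns. The Smith-ratio ordering and the profile are illustrative assumptions, not the paper's 4-approximate construction.

        def finish_time(work, profile):
            """Earliest time by which the machine has processed 'work' units,
            given (duration, speed) segments; the last segment is assumed to
            extend forever with positive speed so every job completes."""
            t = done = 0.0
            for i, (duration, speed) in enumerate(profile):
                if i == len(profile) - 1:
                    return t + (work - done) / speed
                if speed > 0 and done + speed * duration >= work:
                    return t + (work - done) / speed
                t += duration
                done += speed * duration

        def objective(order, w, p, profile, f):
            """Sum of w_j * f(C_j) when jobs run in 'order' on the given machine."""
            total_work, obj = 0.0, 0.0
            for j in order:
                total_work += p[j]
                obj += w[j] * f(finish_time(total_work, profile))
            return obj

        w = [3.0, 1.0, 2.0]                                       # job weights
        p = [3.0, 2.0, 4.0]                                       # processing times
        order = sorted(range(3), key=lambda j: p[j] / w[j])       # Smith-like ratio order
        profile = [(4.0, 1.0), (2.0, 0.0), (1.0, 1.0)]            # breakdown, then recovery
        print(objective(order, w, p, profile, lambda c: c ** 2))  # f(C) = C^2 is admissible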

    PhylOTU: a high-throughput procedure quantifies microbial community diversity and resolves novel taxa from metagenomic data.

    Microbial diversity is typically characterized by clustering small subunit ribosomal RNA (SSU-rRNA) sequences into operational taxonomic units (OTUs). Targeted sequencing of environmental SSU-rRNA markers via PCR may fail to detect OTUs due to biases in priming and amplification. Analysis of shotgun-sequenced environmental DNA, known as metagenomics, avoids amplification bias but generates fragmentary, non-overlapping sequence reads that cannot be clustered by existing OTU-finding methods. To circumvent these limitations, we developed PhylOTU, a computational workflow that identifies OTUs from metagenomic SSU-rRNA sequence data through the use of phylogenetic principles and probabilistic sequence profiles. Using simulated metagenomic data, we quantified the accuracy with which PhylOTU clusters reads into OTUs. Comparisons of PCR and shotgun-sequenced SSU-rRNA markers derived from the global open ocean revealed that while PCR libraries identify more OTUs per sequenced residue, metagenomic libraries recover a greater taxonomic diversity of OTUs. In addition, we discover novel species, genera and families in the metagenomic libraries, including OTUs from phyla missed by analysis of PCR sequences. Taken together, these results suggest that PhylOTU enables characterization of a part of the biosphere currently hidden from PCR-based surveys of diversity.
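
    For context, the sketch below (in Python) shows the generic OTU-clustering idea, single-linkage grouping of reads at a fixed distance cutoff; it is not PhylOTU's phylogenetically informed workflow, and the read names and distance matrix are invented placeholders.

        def cluster_otus(names, dist, cutoff=0.03):
            """Union-find single-linkage clustering: reads closer than 'cutoff'
            end up in the same OTU."""
            parent = {n: n for n in names}
            def find(x):
                while parent[x] != x:
                    parent[x] = parent[parent[x]]        # path halving
                    x = parent[x]
                return x
            for (a, b), d in dist.items():
                if d <= cutoff:
                    parent[find(a)] = find(b)
            otus = {}
            for n in names:
                otus.setdefault(find(n), []).append(n)
            return list(otus.values())

        reads = ["r1", "r2", "r3", "r4"]
        distances = {("r1", "r2"): 0.01, ("r1", "r3"): 0.12, ("r1", "r4"): 0.15,
                     ("r2", "r3"): 0.10, ("r2", "r4"): 0.14, ("r3", "r4"): 0.02}
        print(cluster_otus(reads, distances))            # [['r1', 'r2'], ['r3', 'r4']]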

    An Improved Algorithm for Generating Database Transactions from Relational Algebra Specifications

    Alloy is a lightweight modeling formalism based on relational algebra. In prior work with Fisler, Giannakopoulos, Krishnamurthi, and Yoo, we presented a tool, Alchemy, that compiles Alloy specifications into implementations that execute against persistent databases. The foundation of Alchemy is an algorithm for rewriting relational algebra formulas into code for database transactions. In this paper, we report on recent progress in improving the robustness and efficiency of this transformation.
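
    As a generic illustration of mapping relational-algebra operators onto SQL (this is not Alchemy's rewriting algorithm, and the tiny AST encoding below is an assumption), consider the Python sketch:

        def to_sql(expr):
            """expr is a nested tuple: ('rel', name) | ('select', cond, e) |
            ('project', cols, e) | ('join', e1, e2)."""
            def src(e, alias):
                return e[1] if e[0] == "rel" else f"({to_sql(e)}) {alias}"
            op = expr[0]
            if op == "rel":
                return f"SELECT * FROM {expr[1]}"
            if op == "select":
                return f"SELECT * FROM {src(expr[2], 't')} WHERE {expr[1]}"
            if op == "project":
                return f"SELECT {', '.join(expr[1])} FROM {src(expr[2], 't')}"
            if op == "join":
                return f"SELECT * FROM {src(expr[1], 'a')} NATURAL JOIN {src(expr[2], 'b')}"
            raise ValueError(f"unknown operator: {op}")

        query = ("project", ["name"], ("select", "age > 30", ("rel", "Person")))
        print(to_sql(query))
        # SELECT name FROM (SELECT * FROM Person WHERE age > 30) t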

    A computational method for estimating the PCR duplication rate in DNA and RNA-seq experiments.

    Background: PCR amplification is an important step in the preparation of DNA sequencing libraries prior to high-throughput sequencing. PCR amplification introduces redundant reads in the sequence data, and estimating the PCR duplication rate is important to assess the frequency of such reads. Existing computational methods do not distinguish PCR duplicates from "natural" read duplicates that represent independent DNA fragments and therefore over-estimate the PCR duplication rate for DNA-seq and RNA-seq experiments.
    Results: In this paper, we present a computational method to estimate the average PCR duplication rate of high-throughput sequence datasets that accounts for natural read duplicates by leveraging heterozygous variants in an individual genome. Analysis of simulated data and exome sequence data from the 1000 Genomes project demonstrated that our method can accurately estimate the PCR duplication rate on paired-end as well as single-end read datasets which contain a high proportion of natural read duplicates. Further, analysis of exome datasets prepared using the Nextera library preparation method indicated that 45-50% of read duplicates correspond to natural read duplicates, likely due to fragmentation bias. Finally, analysis of RNA-seq datasets from individuals in the 1000 Genomes project demonstrated that 70-95% of read duplicates observed in such datasets correspond to natural duplicates sampled from genes with high expression, and identified outlier samples with a 2-fold greater PCR duplication rate than other samples.
    Conclusions: The method described here is a useful tool for estimating the PCR duplication rate of high-throughput sequence datasets and for assessing the fraction of read duplicates that correspond to natural read duplicates. An implementation of the method is available at https://github.com/vibansal/PCRduplicates.
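
    The key idea described above can be illustrated with a back-of-envelope calculation (in Python); the 50% allele-agreement assumption for natural duplicates and the counts are illustrative, not the paper's exact estimator or its output.

        def estimate_pcr_duplicate_fraction(dup_pairs_at_het, mismatched_pairs):
            """Among read pairs flagged as duplicates that overlap a heterozygous
            variant, pairs carrying different alleles must come from distinct DNA
            fragments (natural duplicates); roughly an equal number of natural
            duplicates carry the same allele by chance, hence the factor of 2."""
            natural = min(2 * mismatched_pairs, dup_pairs_at_het)
            return 1.0 - natural / dup_pairs_at_het

        # e.g. 10,000 duplicate pairs overlapping heterozygous sites,
        # 2,300 of them with discordant alleles
        print(round(estimate_pcr_duplicate_fraction(10_000, 2_300), 3))    # 0.54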