Dagstuhl Reports : Volume 1, Issue 2, February 2011
Online Privacy: Towards Informational Self-Determination on the Internet (Dagstuhl Perspectives Workshop 11061): Simone Fischer-Hübner, Chris Hoofnagle, Kai Rannenberg, Michael Waidner, Ioannis Krontiris and Michael Marhöfer
Self-Repairing Programs (Dagstuhl Seminar 11062): Mauro Pezzé, Martin C. Rinard, Westley Weimer and Andreas Zeller
Theory and Applications of Graph Searching Problems (Dagstuhl Seminar 11071): Fedor V. Fomin, Pierre Fraigniaud, Stephan Kreutzer and Dimitrios M. Thilikos
Combinatorial and Algorithmic Aspects of Sequence Processing (Dagstuhl Seminar 11081): Maxime Crochemore, Lila Kari, Mehryar Mohri and Dirk Nowotka
Packing and Scheduling Algorithms for Information and Communication Services (Dagstuhl Seminar 11091): Klaus Jansen, Claire Mathieu, Hadas Shachnai and Neal E. Youn
An Introduction to Programming for Bioscientists: A Python-based Primer
Computing has revolutionized the biological sciences over the past several
decades, such that virtually all contemporary research in the biosciences
utilizes computer programs. The computational advances have come on many
fronts, spurred by fundamental developments in hardware, software, and
algorithms. These advances have influenced, and even engendered, a phenomenal
array of bioscience fields, including molecular evolution and bioinformatics;
genome-, proteome-, transcriptome- and metabolome-wide experimental studies;
structural genomics; and atomistic simulations of cellular-scale molecular
assemblies as large as ribosomes and intact viruses. In short, much of
post-genomic biology is increasingly becoming a form of computational biology.
The ability to design and write computer programs is among the most
indispensable skills that a modern researcher can cultivate. Python has become
a popular programming language in the biosciences, largely because (i) its
straightforward semantics and clean syntax make it a readily accessible first
language; (ii) it is expressive and well-suited to object-oriented programming,
as well as other modern paradigms; and (iii) the many available libraries and
third-party toolkits extend the functionality of the core language into
virtually every biological domain (sequence and structure analyses,
phylogenomics, workflow management systems, etc.). This primer offers a basic
introduction to coding, via Python, and it includes concrete examples and
exercises to illustrate the language's usage and capabilities; the main text
culminates with a final project in structural bioinformatics. A suite of
Supplemental Chapters is also provided. Starting with basic concepts, such as
that of a 'variable', the Chapters methodically advance the reader to the point
of writing a graphical user interface to compute the Hamming distance between
two DNA sequences.
Comment: 65 pages total, including 45 pages text, 3 figures, 4 tables,
numerous exercises, and 19 pages of Supporting Information; currently in
press at PLOS Computational Biolog
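The Hamming distance mentioned in the abstract can be illustrated with a minimal Python sketch. This is not code from the primer itself; the function name `hamming_distance` and the case-insensitive comparison are assumptions made for illustration.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ.

    Hypothetical helper for illustration; comparing case-insensitively
    is an assumption, not something the abstract specifies.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("Hamming distance requires sequences of equal length")
    # zip pairs up the characters position by position; each True counts as 1.
    return sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))


if __name__ == "__main__":
    print(hamming_distance("GATTACA", "GACTATA"))
```

Raising an error on unequal lengths, rather than silently truncating to the shorter sequence, keeps the result consistent with the usual definition of the Hamming distance.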
GROTESQUE: Noisy Group Testing (Quick and Efficient)
Group testing refers to the problem of identifying (with high probability) a
(small) subset of $D$ defectives from a (large) set of $N$ items via a "small"
number of "pooled" tests. For ease of presentation, in this work we focus on the
regime where $D = \mathcal{O}(N^{1-\gap})$ for some $\gap > 0$. The tests may be
noiseless or noisy, and the testing procedure may be adaptive (the pool
defining a test may depend on the outcome of a previous test) or non-adaptive
(each test is performed independently of the outcomes of other tests). A rich body
of literature demonstrates that $\Theta(D\log(N))$ tests are
information-theoretically necessary and sufficient for the group-testing
problem, and provides algorithms that achieve this performance. However, it is
only recently that reconstruction algorithms with computational complexity that
is sub-linear in $N$ have started being investigated (recent work by
\cite{GurI:04,IndN:10, NgoP:11} gave some of the first such algorithms). In the
scenario with adaptive tests with noisy outcomes, we present the first scheme
that is simultaneously order-optimal (up to small constant factors) in both the
number of tests and the decoding complexity ($\mathcal{O}(D\log(N))$ in both the
performance metrics). The total number of stages of our adaptive algorithm is
"small" ($\mathcal{O}(\log(D))$). Similarly, in the scenario with non-adaptive tests
with noisy outcomes, we present the first scheme that is simultaneously
near-optimal in both the number of tests and the decoding complexity (via an
algorithm that requires $\mathcal{O}(D\log(D)\log(N))$ tests and has a decoding
complexity of {}). Finally, we present an
adaptive algorithm that only requires 2 stages, and for which both the number
of tests and the decoding complexity scale as {}. For all three settings the probability of error of our
algorithms scales as $\mathcal{O}(1/\mathrm{poly}(D))$.
Comment: 26 pages, 5 figure
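To make the notions of a "pooled" test and an adaptive procedure concrete, here is a minimal Python sketch of noiseless adaptive group testing by binary splitting. It is a textbook-style illustration, not the GROTESQUE scheme described in the abstract: it ignores noise, uses many stages, and the names `pooled_test` and `adaptive_binary_splitting` are hypothetical.

```python
def pooled_test(pool, defectives):
    """Noiseless pooled test: positive iff the pool contains a defective."""
    return any(item in defectives for item in pool)


def adaptive_binary_splitting(items, defectives):
    """Identify all defectives by recursively halving positive pools.

    Returns (sorted list of identified defectives, number of tests used).
    Assumes noiseless test outcomes; this is an illustrative baseline only.
    """
    found = []
    tests = 0
    stack = [list(items)]  # pools still waiting to be tested
    while stack:
        pool = stack.pop()
        tests += 1
        if not pooled_test(pool, defectives):
            continue  # a negative pool clears every item in it at once
        if len(pool) == 1:
            found.append(pool[0])  # a positive singleton is a defective
            continue
        mid = len(pool) // 2
        # Adaptivity: the next pools depend on this test's outcome.
        stack.append(pool[:mid])
        stack.append(pool[mid:])
    return sorted(found), tests


if __name__ == "__main__":
    hidden = {3, 500, 777}  # hypothetical defective set, D = 3
    found, tests = adaptive_binary_splitting(range(1024), hidden)
    print(found, tests)
```

With $N = 1024$ items and $D = 3$ defectives, the procedure tests at most $1 + 2D\log_2 N = 61$ pools, far fewer than testing each item individually, which matches the $D\log(N)$ scaling discussed in the abstract for the noiseless adaptive case.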