Inference of single-cell phylogenies from lineage tracing data using Cassiopeia.
The pairing of CRISPR/Cas9-based gene editing with massively parallel single-cell readouts now enables large-scale lineage tracing. However, the rapid growth in complexity of data from these assays has outpaced our ability to accurately infer phylogenetic relationships. First, we introduce Cassiopeia, a suite of scalable maximum parsimony approaches for tree reconstruction. Second, we provide a simulation framework for evaluating algorithms and exploring lineage tracer design principles. Finally, we generate the most complex experimental lineage tracing dataset to date, 34,557 human cells continuously traced over 15 generations, and use it for benchmarking phylogenetic inference approaches. We show that Cassiopeia outperforms traditional methods by several metrics and under a wide variety of parameter regimes, and provide insight into the principles for the design of improved Cas9-enabled recorders. Together, these should broadly enable large-scale mammalian lineage tracing efforts. Cassiopeia and its benchmarking resources are publicly available at www.github.com/YosefLab/Cassiopeia
Concurrent Geometric Multicasting
We present MCFR, a multicasting concurrent face routing algorithm that uses
geometric routing to deliver a message from source to multiple targets. We
describe the algorithm's operation, prove it correct, estimate its performance
bounds and evaluate its performance using simulation. Our estimate shows that
MCFR is the first geometric multicast routing algorithm whose message delivery
latency is independent of network size and only proportional to the distance
between the source and the targets. Our simulation indicates that MCFR has
significantly better reliability than existing algorithms.
Robust and MaxMin Optimization under Matroid and Knapsack Uncertainty Sets
Consider the following problem: given a set system (U,I) and an edge-weighted
graph G = (U, E) on the same universe U, find the set A in I such that the
Steiner tree cost with terminals A is as large as possible: "which set in I is
the most difficult to connect up?" This is an example of a max-min problem:
find the set A in I such that the value of some minimization (covering) problem
is as large as possible.
In this paper, we show that for certain covering problems which admit good
deterministic online algorithms, we can give good algorithms for max-min
optimization when the set system I is given by a p-system or q-knapsacks or
both. This result is similar to results for constrained maximization of
submodular functions. Although many natural covering problems are not even
approximately submodular, we show that one can use properties of the online
algorithm as a surrogate for submodularity.
Moreover, we give stronger connections between max-min optimization and
two-stage robust optimization, and hence give improved algorithms for robust
versions of various covering problems, for cases where the uncertainty sets are
given by p-systems and q-knapsacks.
Comment: 17 pages. Preliminary version combining this paper and
http://arxiv.org/abs/0912.1045 appeared in ICALP 201
A note on the data-driven capacity of P2P networks
We consider two capacity problems in P2P networks. In the first one, the
nodes have an infinite amount of data to send and the goal is to optimally
allocate their uplink bandwidths such that the demands of every peer in terms
of receiving data rate are met. We solve this problem through a mapping from a
node-weighted graph featuring two labels per node to a max flow problem on an
edge-weighted bipartite graph. In the second problem under consideration, the
resource allocation is driven by the availability of the data resource that the
peers are interested in sharing. That is, a node cannot allocate its uplink
resources unless it has data to transmit first. The problem of uplink bandwidth
allocation is then equivalent to constructing a set of directed trees in the
overlay such that the number of nodes receiving the data is maximized while the
uplink capacities of the peers are not exceeded. We show that the problem is
NP-complete, and provide a linear programming decomposition decoupling it into
a master problem and multiple slave subproblems that can be solved in
polynomial time. We also design a heuristic algorithm to compute a
suboptimal solution in reasonable time. This algorithm requires only local
knowledge from nodes, so it should support distributed implementations.
We analyze both problems through a series of simulation experiments featuring
different network sizes and network densities. On large networks, we compare
our heuristic and its variants with a genetic algorithm and show that our
heuristic computes a better resource allocation. On smaller networks, we
compare this performance with that of the exact algorithm and show that a
resource allocation satisfying a large part of the peers can be found, even for
hard configurations where no resources are in excess.
Comment: 10 pages, technical report assisting a submissio
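The first capacity problem in the abstract above (infinite data, uplink allocation to meet every peer's receiving-rate demand) can be illustrated with a small max-flow sketch. This is not the authors' construction: it assumes the two labels per node are an uplink capacity and a download demand, and that the bipartite graph has one "sender" copy and one "receiver" copy of each peer. The names `max_flow`, `demands_satisfiable`, `uplink`, `demand`, and `overlay_edges` are hypothetical, and the solver is a plain Edmonds-Karp implementation chosen for brevity.

```python
from collections import deque

INF = float("inf")


def max_flow(capacity, source, sink):
    """Edmonds-Karp max flow on a dict-of-dicts capacity graph."""
    # Build the residual graph, adding zero-capacity reverse edges.
    residual = {}
    for u, nbrs in capacity.items():
        residual.setdefault(u, {})
        for v, c in nbrs.items():
            residual[u][v] = residual[u].get(v, 0) + c
            residual.setdefault(v, {}).setdefault(u, 0)
    residual.setdefault(source, {})
    residual.setdefault(sink, {})

    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow

        # Find the bottleneck along the path, then push flow through it.
        bottleneck = INF
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def demands_satisfiable(uplink, demand, overlay_edges):
    """Check whether uplink capacities can meet every peer's demand.

    Assumed model (not taken from the paper): source -> sender copy of
    peer i with capacity uplink[i]; sender i -> receiver j with unbounded
    capacity for each overlay edge (i, j); receiver j -> sink with
    capacity demand[j]. Demands are met iff the max flow saturates
    every receiver-to-sink edge.
    """
    capacity = {"s": {("out", p): cap for p, cap in uplink.items()}}
    for i, j in overlay_edges:
        capacity.setdefault(("out", i), {})[("in", j)] = INF
    for j, d in demand.items():
        capacity.setdefault(("in", j), {})["t"] = d
    return max_flow(capacity, "s", "t") == sum(demand.values())


# Toy overlay with three peers; capacities and demands are made up.
feasible = demands_satisfiable(
    uplink={"a": 3, "b": 1, "c": 0},
    demand={"a": 0, "b": 2, "c": 2},
    overlay_edges=[("a", "b"), ("a", "c"), ("b", "c")],
)
```

Under this assumed model, a peer's upload is limited only by its uplink capacity, which matches the infinite-data setting; it does not capture the second, data-driven problem, where a node must first receive data before it can forward it.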