An Efficient Algorithm For Chinese Postman Walk on Bi-directed de Bruijn Graphs
Sequence assembly from short reads is an important problem in biology. It is
known that solving the sequence assembly problem exactly on a bi-directed de
Bruijn graph or a string graph is intractable. However, finding a Shortest
Double stranded DNA string (SDDNA) containing all the k-long words in the reads
seems to be a good heuristic to get close to the original genome. This problem
is equivalent to finding a cyclic Chinese Postman (CP) walk on the underlying
un-weighted bi-directed de Bruijn graph built from the reads. The Chinese
Postman walk Problem (CPP) is solved by reducing it to a general bi-directed
flow on this graph which runs in $O(|E|^2 \log^2(|V|))$ time. In this paper we show
that the cyclic CPP on bi-directed graphs can be solved without reducing it to
bi-directed flow. We present a $\Theta(p(|V| + |E|) \log(|V|) + (d_{max} p)^3)$ time
algorithm to solve the cyclic CPP on a weighted bi-directed de Bruijn graph,
where $p = \max\{|\{v \mid d_{in}(v) - d_{out}(v) > 0\}|, |\{v \mid d_{in}(v) - d_{out}(v) < 0\}|\}$ and
$d_{max} = \max\{|d_{in}(v) - d_{out}(v)|\}$. Our algorithm performs asymptotically better than the
bi-directed flow algorithm when the number of imbalanced nodes p is much smaller
than the number of nodes in the bi-directed graph. From our experimental results on
various datasets, we have noticed that the value of p/|V | lies between 0.08%
and 0.13% with 95% probability
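
As a rough illustration of the two parameters in the running-time bound, the sketch below computes p and dmax from an edge list. It assumes an ordinary directed multigraph rather than a bi-directed de Bruijn graph (where each edge end carries its own orientation), and the function name `imbalance_stats` is hypothetical, not taken from the paper.

```python
from collections import defaultdict

def imbalance_stats(edges):
    """Return (p, d_max) for a directed multigraph given as (u, v) pairs.

    Assumption: plain directed edges; a bi-directed graph would need
    per-endpoint orientations, which this sketch does not model.
    """
    diff = defaultdict(int)  # diff[v] = d_in(v) - d_out(v)
    for u, v in edges:
        diff[u] -= 1  # edge leaves u
        diff[v] += 1  # edge enters v
    surplus = sum(1 for d in diff.values() if d > 0)
    deficit = sum(1 for d in diff.values() if d < 0)
    p = max(surplus, deficit)
    d_max = max((abs(d) for d in diff.values()), default=0)
    return p, d_max

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
print(imbalance_stats(edges))  # (1, 1)
```

When every node is balanced, both values are zero and an Eulerian circuit already covers each edge exactly once, so no edge has to be repeated.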
Determining the Most Vital Arcs within a Multi-Mode Communication Network Using Set-Based Measures
Technology has dramatically changed the way the military has disseminated information over the last fifty years. The Air Force has adapted to the change by operating a network with various ways to disseminate information. The Air Operating Center (AOC) is a large contributor to disseminating information in the Air Force. When the standard mode of sending information is disrupted, the AOC seeks both alternative ways available to send information and long-term approaches to decrease vulnerability of its standard procedures. In this thesis, we seek to identify and quantify the most vital components within a multi-mode communications network via a combination of set-based efficiency and set-based cost-efficiency measures that utilize the all-pairs shortest path (APSP) problem and minimum cost flow (MCF) problem. We capture the phenomenon that network components must work together to provide flow by examining how the network performs when sets of arcs are disrupted. We run 125 different computational experiments examining varying degrees of damage experienced by the network. From these results, we deduce insights into the characteristics of the most vital arcs in a multi-mode communication network which can inform future fortification decisions
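
One possible reading of a set-based efficiency measure can be sketched as follows: score the network by the mean of 1/d(i, j) over ordered node pairs, with d taken from an all-pairs shortest path computation, then rank sets of arcs by how much the score drops when the whole set is removed. This definition of efficiency, the function names, and the toy network are assumptions for illustration; the thesis may define its measures differently, and the MCF-based cost measure is not covered here.

```python
from itertools import combinations

INF = float("inf")

def efficiency(nodes, arcs):
    """Mean of 1/d(i, j) over ordered pairs, d from Floyd-Warshall.

    Assumes at least two nodes and strictly positive arc weights.
    Unreachable pairs contribute 0 because 1/inf == 0.0.
    """
    dist = {i: {j: (0 if i == j else INF) for j in nodes} for i in nodes}
    for (u, v), w in arcs.items():
        dist[u][v] = min(dist[u][v], w)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    n = len(nodes)
    total = sum(1.0 / dist[i][j] for i in nodes for j in nodes if i != j)
    return total / (n * (n - 1))

def most_vital_sets(nodes, arcs, size):
    """Rank every arc set of the given size by the efficiency lost on removal."""
    base = efficiency(nodes, arcs)
    drops = []
    for removed in combinations(sorted(arcs), size):
        kept = {a: w for a, w in arcs.items() if a not in removed}
        drops.append((base - efficiency(nodes, kept), removed))
    return sorted(drops, reverse=True)

nodes = ["a", "b", "c"]
arcs = {("a", "b"): 1, ("b", "c"): 1}  # toy directed path, made-up weights
for drop, removed in most_vital_sets(nodes, arcs, 1):
    print(round(drop, 3), removed)
```

Enumerating all sets grows combinatorially with the set size, so this brute-force ranking is only practical for small networks or small sets.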
Empirical Evaluation of Mutation-based Test Prioritization Techniques
We propose a new test case prioritization technique that combines both
mutation-based and diversity-based approaches. Our diversity-aware
mutation-based technique relies on the notion of mutant distinguishment, which
aims to distinguish one mutant's behavior from another, rather than from the
original program. We empirically investigate the relative cost and
effectiveness of the mutation-based prioritization techniques (i.e., using both
the traditional mutant kill and the proposed mutant distinguishment) with 352
real faults and 553,477 developer-written test cases. The empirical evaluation
considers both the traditional and the diversity-aware mutation criteria in
various settings: single-objective greedy, hybrid, and multi-objective
optimization. The results show that there is no single dominant technique
across all the studied faults. To this end, we show when and why each of the
mutation-based prioritization criteria performs poorly,
using a graphical model called Mutant Distinguishment Graph (MDG) that
demonstrates the distribution of the fault detecting test cases with respect to
mutant kills and distinguishment
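
The difference between the two criteria can be sketched with a greedy ordering over a toy kill matrix. This is a simplification under stated assumptions: mutant behavior is reduced to killed or not killed per test, a test "distinguishes" a pair of mutants when it kills exactly one of them, and all names and data below are hypothetical rather than taken from the paper.

```python
from itertools import combinations

def killed(kills, test):
    """Items under the traditional criterion: mutants the test kills."""
    return set(kills[test])

def distinguished(kills, test, mutants):
    """Items under distinguishment: mutant pairs the test tells apart."""
    k = kills[test]
    return {(a, b) for a, b in combinations(sorted(mutants), 2)
            if (a in k) != (b in k)}

def greedy_order(tests, items_of):
    """Additional-greedy: repeatedly pick the test covering the most new items."""
    remaining, covered, order = list(tests), set(), []
    while remaining:
        best = max(remaining, key=lambda t: len(items_of(t) - covered))
        order.append(best)
        covered |= items_of(best)
        remaining.remove(best)
    return order

# Made-up kill matrix: test -> set of mutants it kills.
kills = {"t1": {"m1", "m2", "m3"}, "t2": {"m1"}, "t3": {"m4"}}
mutants = {"m1", "m2", "m3", "m4"}
tests = ["t1", "t2", "t3"]

print(greedy_order(tests, lambda t: killed(kills, t)))
# ['t1', 't3', 't2']
print(greedy_order(tests, lambda t: distinguished(kills, t, mutants)))
# ['t1', 't2', 't3']
```

On this toy input the two criteria rank t2 and t3 differently: t3 kills a new mutant, while t2 separates m1 from m2 and m3, which the kill criterion does not reward.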
Quantifying the benefits of vehicle pooling with shareability networks
Taxi services are a vital part of urban transportation, and a considerable
contributor to traffic congestion and air pollution causing substantial adverse
effects on human health. Sharing taxi trips is a possible way of reducing the
negative impact of taxi services on cities, but this comes at the expense of
passenger discomfort quantifiable in terms of a longer travel time. Due to
computational challenges, taxi sharing has traditionally been approached on
small scales, such as within airport perimeters, or with dynamical ad-hoc
heuristics. However, a mathematical framework for the systematic understanding
of the tradeoff between collective benefits of sharing and individual passenger
discomfort is lacking. Here we introduce the notion of shareability network
which allows us to model the collective benefits of sharing as a function of
passenger inconvenience, and to efficiently compute optimal sharing strategies
on massive datasets. We apply this framework to a dataset of millions of taxi
trips taken in New York City, showing that with increasing but still relatively
low passenger discomfort, cumulative trip length can be cut by 40% or more.
This benefit comes with reductions in service cost, emissions, and with split
fares, hinting towards a wide passenger acceptance of such a shared service.
Simulation of a realistic online system demonstrates the feasibility of a
shareable taxi service in New York City. Shareability as a function of trip
density saturates fast, suggesting effectiveness of the taxi sharing system
also in cities with much sparser taxi fleets or when willingness to share is
low.
Comment: Main text: 6 pages, 3 figures, SI: 24 pages
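
A shareability network can be sketched as a graph whose nodes are trips and whose edges link pairs of trips that could be combined without delaying any passenger beyond a tolerance. The sketch below assumes the pairwise delays and savings are already known (the values are made up), and it swaps in a simple greedy matching in place of an optimal matching algorithm, so it illustrates the structure of the problem rather than the paper's method.

```python
def shareability_edges(pairs, max_delay):
    """Keep trip pairs whose combined route stays within max_delay."""
    return [(i, j, saved) for i, j, delay, saved in pairs if delay <= max_delay]

def greedy_matching(edges):
    """Greedy (not optimal) matching: take the largest savings first."""
    matched, chosen = set(), []
    for i, j, saved in sorted(edges, key=lambda e: -e[2]):
        if i not in matched and j not in matched:
            matched.update((i, j))
            chosen.append((i, j, saved))
    return chosen

# (trip, trip, worst-case delay in minutes, km saved) -- made-up values
pairs = [
    ("A", "B", 2, 3.0),
    ("B", "C", 1, 4.0),
    ("C", "D", 4, 2.5),
    ("A", "D", 6, 5.0),
]

for max_delay in (1, 4, 6):
    matching = greedy_matching(shareability_edges(pairs, max_delay))
    print(max_delay, sum(saved for _, _, saved in matching))
# 1 4.0
# 4 4.0
# 6 9.0
```

At a tolerance of 4 minutes the greedy choice saves 4.0 km, whereas pairing A with B and C with D would save 5.5 km; that gap is the reason an optimal matching is preferable when it is computationally affordable.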
The Merits of Sharing a Ride
The culture of sharing instead of ownership is growing sharply in
individuals' behavior. Particularly in transportation, concepts of sharing a
ride in either carpooling or ridesharing have been recently adopted. An
efficient optimization approach to match passengers in real-time is the core of
any ridesharing system. In this paper, we model ridesharing as an online
matching problem on general graphs such that passengers do not drive private
cars and use shared taxis. We propose an optimization algorithm to solve it.
The outlined algorithm calculates the optimal waiting time when a passenger
arrives. This leads to a matching with minimal overall overheads while
maximizing the number of partnerships. To evaluate the behavior of our
algorithm, we used a real-life NYC taxi data set. The results show a substantial
reduction in overall overheads
Spatial Ecology of Great Barracuda (Sphyraena barracuda) around Buck Island Reef National Monument, St. Croix, U.S.V.I.
Marine protected areas (MPAs) are increasing in popularity as a tool to manage fish stocks through conservation of entire habitats and fish assemblages. Quantifying the habitat use, site fidelity, and movement patterns of marine species is vital to this method of marine spatial planning. The success of these protected areas requires that sufficient habitat is guarded against fishing pressure. For large animals, which often have correspondingly large home range areas, protecting an entire home range can be logistically challenging. For MPAs to successfully protect large top predator species, it is important to understand what areas of a home range are especially important, such as breeding and feeding grounds. New technologies, such as acoustic telemetry, have made it possible to track marine animal movements at finer spatial and temporal scales than previously possible, better illuminating these spatial use patterns. This study focused on the movement patterns of great barracuda (n=35), an ecologically important top predator, around Buck Island Reef National Monument, a no-take MPA in St. Croix, U.S.V.I. managed by the National Park Service. As developing standardized methods for acoustic telemetry is still a work in progress, the first half of this study focuses on determining appropriate tools for generating home range size estimates for great barracuda and analyzing ecological parameters driving these results. The second half of this study focused on the use of network analysis to look at spatial divisions within individual home ranges and to compare individual to population level spatial patterns, as well as to generate a relative estimate of population density within the park. Barracuda within the park demonstrated high site fidelity to individual territories, but at the population level they consistently used all habitats within the array. 
Core use areas within home ranges were evenly distributed throughout all habitats monitored by the acoustic array, although movement corridors were detected along high rugosity reef structures. Greater population densities within the park indicate that density dependent behaviors may be influencing habitat use within the park, and suggest that barracuda are contributing high levels of top down pressure through predation within the park boundaries