4,086 research outputs found

    An Efficient Algorithm For Chinese Postman Walk on Bi-directed de Bruijn Graphs

    Full text link
    Sequence assembly from short reads is an important problem in biology. It is known that solving the sequence assembly problem exactly on a bi-directed de Bruijn graph or a string graph is intractable. However finding a Shortest Double stranded DNA string (SDDNA) containing all the k-long words in the reads seems to be a good heuristic to get close to the original genome. This problem is equivalent to finding a cyclic Chinese Postman (CP) walk on the underlying un-weighted bi-directed de Bruijn graph built from the reads. The Chinese Postman walk Problem (CPP) is solved by reducing it to a general bi-directed flow on this graph which runs in O(|E|2 log2(|V |)) time. In this paper we show that the cyclic CPP on bi-directed graphs can be solved without reducing it to bi-directed flow. We present a ?(p(|V | + |E|) log(|V |) + (dmaxp)3) time algorithm to solve the cyclic CPP on a weighted bi-directed de Bruijn graph, where p = max{|{v|din(v) - dout(v) > 0}|, |{v|din(v) - dout(v) < 0}|} and dmax = max{|din(v) - dout(v)}. Our algorithm performs asymptotically better than the bidirected flow algorithm when the number of imbalanced nodes p is much less than the nodes in the bi-directed graph. From our experimental results on various datasets, we have noticed that the value of p/|V | lies between 0.08% and 0.13% with 95% probability

    Determining the Most Vital Arcs within a Multi-Mode Communication Network Using Set-Based Measures

    Get PDF
    Technology has dramatically changed the way the military has disseminated information over the last fifty years. The Air Force has adapted to the change by operating a network with various ways to disseminate information. The Air Operating Center (AOC) is a large contributor to disseminating information in the Air Force. When the standard mode of sending information is disrupted, the AOC seeks both alternative ways available to send information and long term approaches to decrease vulnerability of its standard procedures. In this thesis, we seek to identify and quantify the most vital components within a multi-mode communications network via a combination of a set-based efficiency and set-based cost efficiency measures that utilize the all pairs shortest path (APSP) problem and minimum cost flow (MCF) problem. We capture the phenomenon that network components must work together to provide flow by examining how the network performs when sets of arcs are disrupted. We run 125 different computational experiments examining varying degrees of damage experienced by the network. From these results, we deduce insights into the characteristics of the most vital arcs in a multi-mode communication network which can inform future fortification decisions

    Empirical Evaluation of Mutation-based Test Prioritization Techniques

    Full text link
    We propose a new test case prioritization technique that combines both mutation-based and diversity-based approaches. Our diversity-aware mutation-based technique relies on the notion of mutant distinguishment, which aims to distinguish one mutant's behavior from another, rather than from the original program. We empirically investigate the relative cost and effectiveness of the mutation-based prioritization techniques (i.e., using both the traditional mutant kill and the proposed mutant distinguishment) with 352 real faults and 553,477 developer-written test cases. The empirical evaluation considers both the traditional and the diversity-aware mutation criteria in various settings: single-objective greedy, hybrid, and multi-objective optimization. The results show that there is no single dominant technique across all the studied faults. To this end, \rev{we we show when and the reason why each one of the mutation-based prioritization criteria performs poorly, using a graphical model called Mutant Distinguishment Graph (MDG) that demonstrates the distribution of the fault detecting test cases with respect to mutant kills and distinguishment

    Quantifying the benefits of vehicle pooling with shareability networks

    Get PDF
    Taxi services are a vital part of urban transportation, and a considerable contributor to traffic congestion and air pollution causing substantial adverse effects on human health. Sharing taxi trips is a possible way of reducing the negative impact of taxi services on cities, but this comes at the expense of passenger discomfort quantifiable in terms of a longer travel time. Due to computational challenges, taxi sharing has traditionally been approached on small scales, such as within airport perimeters, or with dynamical ad-hoc heuristics. However, a mathematical framework for the systematic understanding of the tradeoff between collective benefits of sharing and individual passenger discomfort is lacking. Here we introduce the notion of shareability network which allows us to model the collective benefits of sharing as a function of passenger inconvenience, and to efficiently compute optimal sharing strategies on massive datasets. We apply this framework to a dataset of millions of taxi trips taken in New York City, showing that with increasing but still relatively low passenger discomfort, cumulative trip length can be cut by 40% or more. This benefit comes with reductions in service cost, emissions, and with split fares, hinting towards a wide passenger acceptance of such a shared service. Simulation of a realistic online system demonstrates the feasibility of a shareable taxi service in New York City. Shareability as a function of trip density saturates fast, suggesting effectiveness of the taxi sharing system also in cities with much sparser taxi fleets or when willingness to share is low.Comment: Main text: 6 pages, 3 figures, SI: 24 page

    The Merits of Sharing a Ride

    Full text link
    The culture of sharing instead of ownership is sharply increasing in individuals behaviors. Particularly in transportation, concepts of sharing a ride in either carpooling or ridesharing have been recently adopted. An efficient optimization approach to match passengers in real-time is the core of any ridesharing system. In this paper, we model ridesharing as an online matching problem on general graphs such that passengers do not drive private cars and use shared taxis. We propose an optimization algorithm to solve it. The outlined algorithm calculates the optimal waiting time when a passenger arrives. This leads to a matching with minimal overall overheads while maximizing the number of partnerships. To evaluate the behavior of our algorithm, we used NYC taxi real-life data set. Results represent a substantial reduction in overall overheads
    • …
    corecore