Towards the k-server conjecture: A unifying potential, pushing the frontier to the circle
The k-server conjecture, first posed by Manasse, McGeoch and Sleator in 1988, states that a k-competitive deterministic algorithm for the k-server problem exists. It is conjectured that the work function algorithm (WFA), a multi-purpose algorithm with applications to various online problems, achieves this guarantee. This has been shown for several special cases: k = 2, (k+1)-point metrics, (k+2)-point metrics, the line metric, weighted star metrics, and k = 3 in the Manhattan plane. The known proofs of these results are based on potential functions tied to each particular special case, thus requiring six different potential functions for the six cases. We present a single potential function proving k-competitiveness of WFA for all these cases. We also use this potential to show k-competitiveness of WFA on multiray spaces and for k = 3 on trees. While the Double Coverage algorithm was known to be k-competitive in these latter cases, this had remained open for WFA. Our potential captures a type of lazy adversary, and thus shows that in all settled cases the worst-case adversary is lazy. Chrobak and Larmore conjectured in 1992 that a potential capturing the lazy adversary would resolve the k-server conjecture. To our surprise, this is not the case: we show (using connections to the k-taxi problem) that our potential fails for three servers on the circle. Thus, our potential highlights laziness of the adversary as a fundamental property that is shared by all settled cases but violated in general. On the one hand, this weakens our confidence in the validity of the k-server conjecture. On the other hand, if the k-server conjecture holds, then we believe it can be proved by a variant of our potential.
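For readers less familiar with WFA, the work function and the algorithm's decision rule are standard and can be stated compactly; a minimal rendering in LaTeX, with $X_{t-1}$ the algorithm's current configuration, $r_t$ the new request, $X_0$ the initial configuration, and $d$ the metric:

```latex
% Work function: cheapest offline cost of serving r_1, ..., r_t and ending in
% configuration X; the minimum ranges over configurations X' containing r_t.
w_t(X) = \min_{X' \ni r_t} \bigl( w_{t-1}(X') + d(X', X) \bigr),
\qquad w_0(X) = d(X_0, X).
% WFA serves r_t with the server s \in X_{t-1} minimizing
w_t\bigl( (X_{t-1} \setminus \{s\}) \cup \{r_t\} \bigr) + d(s, r_t).
```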
Quantifying the benefits of vehicle pooling with shareability networks
Taxi services are a vital part of urban transportation, and a considerable
contributor to traffic congestion and air pollution, causing substantial adverse
effects on human health. Sharing taxi trips is a possible way of reducing the
negative impact of taxi services on cities, but this comes at the expense of
passenger discomfort quantifiable in terms of a longer travel time. Due to
computational challenges, taxi sharing has traditionally been approached on
small scales, such as within airport perimeters, or with dynamical ad-hoc
heuristics. However, a mathematical framework for the systematic understanding
of the tradeoff between collective benefits of sharing and individual passenger
discomfort is lacking. Here we introduce the notion of a shareability network,
which allows us to model the collective benefits of sharing as a function of
passenger inconvenience, and to efficiently compute optimal sharing strategies
on massive datasets. We apply this framework to a dataset of millions of taxi
trips taken in New York City, showing that with increasing but still relatively
low passenger discomfort, cumulative trip length can be cut by 40% or more.
This benefit comes with reductions in service cost and emissions and, with
split fares, hints at wide passenger acceptance of such a shared service.
Simulation of a realistic online system demonstrates the feasibility of a
shareable taxi service in New York City. Shareability as a function of trip
density saturates quickly, suggesting that the taxi sharing system remains
effective in cities with much sparser taxi fleets or when willingness to share
is low.
Comment: Main text: 6 pages, 3 figures; SI: 24 pages
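The optimal sharing computation can be read as a maximum-weight matching on the shareability network: nodes are trips, edges connect trips that can be combined within the passengers' delay tolerance, and edge weights encode the benefit of merging. A minimal sketch of that pairing step in Python, with toy trip data and a stand-in compatibility test (both hypothetical; only networkx's max_weight_matching is a real API call):

```python
import networkx as nx

# Toy data: trip id -> start time in minutes (hypothetical, for illustration).
trips = {1: 0.0, 2: 2.0, 3: 15.0, 4: 16.5}

def shareable(t1, t2, delay_tolerance=5.0):
    """Stand-in for the real spatio-temporal compatibility test."""
    return abs(trips[t1] - trips[t2]) <= delay_tolerance

# Shareability network: an edge means the two trips can be merged within the
# delay tolerance; the weight is the benefit of merging them.
G = nx.Graph()
G.add_nodes_from(trips)
for u in trips:
    for v in trips:
        if u < v and shareable(u, v):
            G.add_edge(u, v, weight=1.0)  # e.g., kilometers saved

# Maximum-weight matching pairs trips so that total benefit is maximized,
# with each trip shared at most once (pairwise sharing).
print(nx.max_weight_matching(G))  # e.g., {(2, 1), (4, 3)}
```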
Long-Term Average Cost in Featured Transition Systems
A software product line is a family of software products that share a common
set of mandatory features and whose individual products are differentiated by
their variable (optional or alternative) features. Family-based analysis of
software product lines takes as input a single model of a complete product line
and analyzes all its products at the same time. As the number of products in a
software product line may be large, this is generally preferable to analyzing
each product on its own. Family-based analysis, however, requires that standard
algorithms be adapted to accommodate variability.
In this paper we adapt the standard algorithm for computing limit average
cost of a weighted transition system to software product lines. Limit average
is a useful and popular measure for the long-term average behavior of a quality
attribute such as performance or energy consumption, but has hitherto not been
available for family-based analysis of software product lines. Our algorithm
operates on weighted featured transition systems, at a symbolic level, and
computes limit average cost for all products in a software product line at the
same time. We have implemented the algorithm and evaluated it on several
examples.
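For a single weighted transition system, the optimal limit-average cost reduces to the minimum cycle mean of the underlying graph, classically computed with Karp's algorithm; the family-based, symbolic lifting is the paper's contribution and is not reproduced here. A sketch of the single-system baseline, assuming a strongly connected graph:

```python
import math

def min_mean_cycle(n, edges):
    """Karp's algorithm: minimum mean weight over all cycles.

    n     -- number of vertices (0..n-1), graph assumed strongly connected
    edges -- list of (u, v, weight) transitions
    """
    # d[k][v] = minimum weight of a walk with exactly k edges from vertex 0 to v.
    d = [[math.inf] * n for _ in range(n + 1)]
    d[0][0] = 0.0
    for k in range(1, n + 1):
        for u, v, w in edges:
            if d[k - 1][u] + w < d[k][v]:
                d[k][v] = d[k - 1][u] + w

    # Karp: min cycle mean = min over v of max over k of (d[n][v] - d[k][v]) / (n - k).
    best = math.inf
    for v in range(n):
        if d[n][v] < math.inf:
            worst = max((d[n][v] - d[k][v]) / (n - k)
                        for k in range(n) if d[k][v] < math.inf)
            best = min(best, worst)
    return best

# Two states: a cheap self-loop (mean 1) vs. an expensive round trip (mean 3).
print(min_mean_cycle(2, [(0, 0, 1.0), (0, 1, 5.0), (1, 0, 1.0)]))  # -> 1.0
```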
Detecting Outliers in Data with Correlated Measures
Advances in sensor technology have enabled the collection of large-scale
datasets. Such datasets can be extremely noisy and often contain a significant
amount of outliers that result from sensor malfunction or human operation
faults. In order to utilize such data for real-world applications, it is
critical to detect outliers so that models built from these datasets will not
be skewed by outliers.
In this paper, we propose a new outlier detection method that utilizes the
correlations in the data (e.g., taxi trip distance vs. trip time). Unlike
existing outlier detection methods, we build a robust regression model that
explicitly models the outliers and detects them simultaneously with the model
fitting.
We validate our approach on real-world datasets against methods specifically
designed for each dataset as well as state-of-the-art outlier detectors. Our
outlier detection method achieves better performance, demonstrating the
robustness and generality of our method. Finally, we report interesting case
studies on some outliers that result from atypical events.
Comment: 10 pages
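The abstract does not fix the model, but one standard way to "explicitly model the outliers" in a regression is a sparse mean-shift term, y = Xβ + γ + ε, where a nonzero entry of γ flags an outlier, fit by alternating least squares with soft-thresholding of the residuals. A minimal sketch under that assumption, on hypothetical distance-vs-time data:

```python
import numpy as np

def robust_fit(X, y, lam=2.0, iters=50):
    """Mean-shift outlier model: y = X @ beta + gamma + noise, gamma sparse.

    Alternates (1) least squares for beta on outlier-corrected targets and
    (2) soft-thresholding of residuals to update gamma; nonzero entries of
    gamma flag the detected outliers.
    """
    gamma = np.zeros(len(y))
    for _ in range(iters):
        beta, *_ = np.linalg.lstsq(X, y - gamma, rcond=None)
        r = y - X @ beta
        gamma = np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)  # soft-threshold
    return beta, gamma

# Toy correlated measures (trip distance vs. trip time) with two corrupted rows.
rng = np.random.default_rng(0)
dist = rng.uniform(1, 10, 100)
time = 3.0 * dist + rng.normal(0, 0.5, 100)
time[[5, 42]] += 25.0                      # injected outliers
X = np.column_stack([np.ones(100), dist])  # intercept + distance
beta, gamma = robust_fit(X, time)
print(np.nonzero(gamma)[0])  # indices flagged as outliers, e.g. [ 5 42]
```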
Supersampling and network reconstruction of urban mobility
Understanding human mobility is of vital importance for urban planning,
epidemiology, and many other fields that aim to draw policies from the
activities of humans in space. Despite recent availability of large scale data
sets related to human mobility such as GPS traces, mobile phone data, etc., it
is still true that such data sets represent a subsample of the population of
interest, and thus might give an incomplete picture of the entire population in
question. Notwithstanding the abundant usage of such inherently limited data
sets, the impact of sampling biases on mobility patterns is unclear -- we do
not have methods available to reliably infer mobility information from a
limited data set. Here, we investigate the effects of sampling using a data set
of millions of taxi movements in New York City. On the one hand, we show that
mobility patterns are highly stable once an appropriate simple rescaling is
applied to the data, implying negligible loss of information due to subsampling
over long time scales. On the other hand, contrasting an appropriate null model
on the weighted network of vehicle flows reveals distinctive features which
need to be accounted for. Accordingly, we formulate a "supersampling"
methodology which allows us to reliably extrapolate mobility data from a
reduced sample and propose a number of network-based metrics to reliably assess
its quality (and that of other human mobility models). Our approach provides a
well founded way to exploit temporal patterns to save effort in recording
mobility data, and opens the possibility to scale up data from limited records
when information on the full system is needed.
Comment: 14 pages, 4 figures
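The rescaling step has a simple core: if each trip is recorded independently with a known probability p, observed origin-destination counts shrink by a factor p in expectation, so dividing by p gives an unbiased estimate of the full flow matrix. A toy numpy illustration of that idea (the paper's network-based quality metrics are not reproduced):

```python
import numpy as np

rng = np.random.default_rng(1)

# Ground-truth origin-destination flows for a hypothetical 6-zone city.
zones = 6
full = rng.poisson(lam=200, size=(zones, zones))

# Subsampling: keep each individual trip independently with probability p.
p = 0.1
sample = rng.binomial(full, p)

# "Supersampling": rescale observed counts by 1/p to estimate the full flows.
estimate = sample / p

# Crude quality checks: relative error of total flow and typical link error.
print(abs(estimate.sum() - full.sum()) / full.sum())
print(np.median(np.abs(estimate - full) / np.maximum(full, 1)))
```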
Penalized estimation in large-scale generalized linear array models
Large-scale generalized linear array models (GLAMs) can be challenging to
fit. Computation and storage of the tensor product design matrix can be
impossible due to time and memory constraints, and previously considered
design-matrix-free algorithms do not scale well with the dimension of the
parameter vector. A new design-matrix-free algorithm is proposed for computing the
penalized maximum likelihood estimate for GLAMs, which, in particular, handles
nondifferentiable penalty functions. The proposed algorithm is implemented and
available via the R package glamlasso. It combines several ideas --
previously considered separately -- to obtain sparse estimates while at the
same time efficiently exploiting the GLAM structure. In this paper the
convergence of the algorithm is treated and the performance of its
implementation is investigated and compared to that of glmnet on
simulated as well as real data. It is shown that the computation time fo
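The central trick behind design-matrix-free GLAM fitting is that a Kronecker-structured matrix-vector product never requires the full design matrix: (A kron B) vec(V) = vec(B V A^T), with vec taken column-major. A quick numpy check of that identity; this is the standard linear-algebra fact exploited by GLAM-style algorithms, not code from glamlasso:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 4))  # marginal design matrix, dimension 1
B = rng.normal(size=(5, 6))  # marginal design matrix, dimension 2
V = rng.normal(size=(6, 4))  # coefficient array

# Naive: materialize the Kronecker product (infeasible for large GLAMs).
naive = np.kron(A, B) @ V.flatten(order="F")

# Matrix-free: (A kron B) vec(V) = vec(B V A^T), column-major vec.
fast = (B @ V @ A.T).flatten(order="F")

print(np.allclose(naive, fast))  # True
```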
- …