Finding The Lazy Programmer's Bugs
Traditionally, developers and testers created huge numbers of explicit tests, enumerating interesting cases, perhaps
biased by what they believed to be the current boundary conditions of the function being tested. Or at
least, they were supposed to.
A major step forward was the development of property testing. Property testing requires the user to write a few
functional properties that are used to generate tests, and requires an external library or tool to create test data
for the tests. As a result, many thousands of tests can be created for a single property. For the purely functional
programming language Haskell there are several such libraries; for example, QuickCheck [CH00], SmallCheck
and Lazy SmallCheck [RNL08].
Unfortunately, property testing still requires the user to write explicit tests. Fortunately, we note there are
already many implicit tests present in programs. Developers may throw assertion errors, or the compiler may
silently insert runtime exceptions for incomplete pattern matches.
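The idea of an implicit test can be sketched as follows. This is a minimal illustration in Python rather than Haskell, and it is not the tool's algorithm: the names `head`, `small_lists`, and `find_failures` are hypothetical, and a Python `IndexError` stands in for a Haskell pattern-match failure. It enumerates small inputs systematically and reports the ones on which the function raises.

```python
from itertools import product

def head(xs):
    # Implicit test: indexing an empty list raises IndexError,
    # loosely analogous to an incomplete pattern match in Haskell.
    return xs[0]

def small_lists(max_len, values=(0, 1)):
    """Enumerate every list over `values` up to length `max_len`, shortest first."""
    for n in range(max_len + 1):
        for combo in product(values, repeat=n):
            yield list(combo)

def find_failures(func, inputs):
    """Run `func` on each input and collect the inputs that raise an exception."""
    failures = []
    for arg in inputs:
        try:
            func(arg)
        except Exception as exc:
            failures.append((arg, type(exc).__name__))
    return failures

print(find_failures(head, small_lists(2)))  # prints [([], 'IndexError')]
```

No explicit property is written here; the only oracle is whether the function under test throws.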
We attempt to automate the testing process using these implicit tests. Our contributions are in four main
areas: (1) We have developed algorithms to automatically infer appropriate constructors and functions needed
to generate test data without requiring additional programmer work or annotations. (2) To combine the
constructors and functions into test expressions we take advantage of Haskell's lazy evaluation semantics by
applying the techniques of needed narrowing and lazy instantiation to guide generation. (3) We keep the type
of test data at its most general, in order to prevent committing too early to monomorphic types that cause
needless wasted tests. (4) We have developed novel ways of creating Haskell case expressions to inspect elements
inside returned data structures, in order to discover exceptions that may be hidden by laziness, and to make
our test data generation algorithm more expressive.
In order to validate our claims, we have implemented these techniques in Irulan, a fully automatic tool for
generating systematic black-box unit tests for Haskell library code. We have designed Irulan to generate high
coverage test suites and detect common programming errors in the process.
Basis Token Consistency: A Practical Mechanism for Strong Web Cache Consistency
With web caching and cache-related services like CDNs and edge services playing an increasingly significant role in the modern internet, the problem of the weak consistency and coherence provisions in current web protocols is becoming increasingly significant and drawing the attention of the standards community [LCD01]. Toward this end, we present definitions of consistency and coherence for web-like environments, that is, distributed client-server information systems where the semantics of interactions with resources are more general than the read/write operations found in memory hierarchies and distributed file systems. We then present a brief review of proposed mechanisms which strengthen the consistency of caches in the web, focusing upon their conceptual contributions and their weaknesses in real-world practice. These insights motivate a new mechanism, which we call "Basis Token Consistency" or BTC; when implemented at the server, this mechanism allows any client (independent of the presence and conformity of any intermediaries) to maintain a self-consistent view of the server's state. This is accomplished by annotating responses with additional per-resource application information which allows client caches to recognize the obsolescence of currently cached entities and identify responses from other caches which are already stale in light of what has already been seen. The mechanism requires no deviation from the existing client-server communication model, and does not require servers to maintain any additional per-client state. We discuss how our mechanism could be integrated into a fragment-assembling Content Management System (CMS), and present a simulation-driven performance comparison between the BTC algorithm and the use of the Time-To-Live (TTL) heuristic.
National Science Foundation (ANI-9986397, ANI-0095988)
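A client-side cache of the kind described above might be sketched as follows. The abstract does not give the annotation format, so everything concrete here is an assumption made for illustration: responses are assumed to carry a mapping from token name to an integer version, and the class `TokenAwareCache` and its methods are hypothetical names, not the BTC protocol itself.

```python
class TokenAwareCache:
    """Toy client cache that uses per-response tokens to detect stale entries."""

    def __init__(self):
        self.entries = {}  # url -> (body, {token: version})
        self.latest = {}   # token -> highest version seen so far

    def _is_stale(self, tokens):
        # A response is stale if any of its tokens is older than one already seen.
        return any(version < self.latest.get(tok, version)
                   for tok, version in tokens.items())

    def store(self, url, body, tokens):
        """Record a response; reject it if it is already stale."""
        if self._is_stale(tokens):
            return False
        for tok, version in tokens.items():
            self.latest[tok] = max(version, self.latest.get(tok, version))
        self.entries[url] = (body, dict(tokens))
        return True

    def lookup(self, url):
        """Return the cached body, or None if absent or made obsolete."""
        entry = self.entries.get(url)
        if entry is None or self._is_stale(entry[1]):
            self.entries.pop(url, None)
            return None
        return entry[0]
```

In this sketch the cache needs no cooperation from intermediaries and the server keeps no per-client state: a newer token version arriving on any response is enough to invalidate older entries that depend on the same token.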
On-Line Paging against Adversarially Biased Random Inputs
In evaluating an algorithm, worst-case analysis can be overly pessimistic.
Average-case analysis can be overly optimistic. An intermediate approach is to
show that an algorithm does well on a broad class of input distributions.
Koutsoupias and Papadimitriou recently analyzed the least-recently-used (LRU)
paging strategy in this manner, analyzing its performance on an input sequence
generated by a so-called diffuse adversary -- one that must choose each request
probabilistically so that no page is chosen with probability more than some
fixed epsilon>0. They showed that LRU achieves the optimal competitive ratio
(for deterministic on-line algorithms), but they didn't determine the actual
ratio.
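As a rough illustration of the setting, the sketch below counts LRU page faults on a request sequence drawn uniformly from n pages, which is one input that a diffuse adversary with epsilon >= 1/n is allowed to produce. The function names and the choice of the uniform distribution are assumptions made for this example; it does not compute a competitive ratio.

```python
import random
from collections import OrderedDict

def lru_faults(requests, k):
    """Count page faults of LRU with a cache of k pages."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    faults = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)        # hit: mark as most recently used
        else:
            faults += 1
            if len(cache) >= k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return faults

def uniform_requests(num_pages, length, seed=0):
    """Each page is requested with probability exactly 1/num_pages."""
    rng = random.Random(seed)
    return [rng.randrange(num_pages) for _ in range(length)]

print(lru_faults([1, 2, 3, 1, 4, 2], k=3))  # prints 5
```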
In this paper we estimate the optimal ratios within roughly a factor of two
for both deterministic strategies (e.g. least-recently-used and
first-in-first-out) and randomized strategies. Around the threshold epsilon ~
1/k (where k is the cache size), the optimal ratios are both Theta(ln k). Below
the threshold the ratios tend rapidly to O(1). Above the threshold the ratio is
unchanged for randomized strategies but tends rapidly to Theta(k) for
deterministic ones.
We also give an alternate proof of the optimality of LRU.
Comment: Conference version appeared in SODA '98 as "Bounding the Diffuse Adversary"
The Cost of Address Translation
Modern computers are not random access machines (RAMs). They have a memory
hierarchy, multiple cores, and virtual memory. In this paper, we address the
computational cost of address translation in virtual memory. The starting point for
our work is the observation that the analysis of some simple algorithms (random
scan of an array, binary search, heapsort) in either the RAM model or the EM
model (external memory model) does not correctly predict growth rates of actual
running times. We propose the VAT model (virtual address translation) to
account for the cost of address translations and analyze the algorithms
mentioned above and others in the model. The predictions agree with the
measurements. We also analyze the VAT-cost of cache-oblivious algorithms.
Comment: An extended abstract of this paper was published in the proceedings of
ALENEX13, New Orleans, US
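To see why access order matters for translation cost, the toy sketch below counts misses in a small LRU-managed translation cache for a sequential scan and a random scan of the same array. This is not the VAT model from the paper: the function `tlb_misses`, the page size (measured in array elements), and the number of cache entries are all assumptions chosen for illustration.

```python
import random
from collections import OrderedDict

def tlb_misses(addresses, page_size, tlb_entries):
    """Count translation-cache misses for a sequence of element addresses."""
    tlb = OrderedDict()  # cached page numbers, least recently used first
    misses = 0
    for addr in addresses:
        page = addr // page_size
        if page in tlb:
            tlb.move_to_end(page)
        else:
            misses += 1
            if len(tlb) >= tlb_entries:
                tlb.popitem(last=False)
            tlb[page] = True
    return misses

n = 1 << 16
sequential = list(range(n))
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)  # random scan: same addresses, random order

seq_misses = tlb_misses(sequential, page_size=4096, tlb_entries=8)
rnd_misses = tlb_misses(shuffled, page_size=4096, tlb_entries=8)
print(seq_misses, rnd_misses)
```

Both scans perform the same number of memory accesses, so a model that charges one unit per access cannot tell them apart, whereas the miss counts differ widely.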
Faster linearizability checking via -compositionality
Linearizability is a well-established consistency and correctness criterion
for concurrent data types. An important feature of linearizability is Herlihy
and Wing's locality principle, which says that a concurrent system is
linearizable if and only if all of its constituent parts (so-called objects)
are linearizable. This paper presents -compositionality, which generalizes
the idea behind the locality principle to operations on the same concurrent
data type. We implement -compositionality in a novel linearizability
checker. Our experiments with over nine implementations of concurrent sets,
including Intel's TBB library, show that our linearizability checker is one
order of magnitude faster and/or more space efficient than the state-of-the-art
algorithm.
Comment: 15 pages, 2 figures
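For orientation, a brute-force linearizability check for a concurrent set can be sketched as below. It is not the checker described in the paper. The `Op` record, the function names, and in particular the per-element split in `linearizable_per_element` are assumptions for illustration: the split relies on the observation that set operations on different elements do not affect one another's results.

```python
from itertools import permutations
from collections import namedtuple

# One completed operation: call/return timestamps, name, argument, observed result.
Op = namedtuple("Op", "call ret name arg result")

def valid_sequential(ops):
    """Replay ops in order against a sequential set and compare results."""
    s = set()
    for op in ops:
        if op.name == "add":
            expected = op.arg not in s
            s.add(op.arg)
        elif op.name == "remove":
            expected = op.arg in s
            s.discard(op.arg)
        else:  # "contains"
            expected = op.arg in s
        if expected != op.result:
            return False
    return True

def respects_real_time(order):
    """An op that returned before another was called must come first."""
    for i, a in enumerate(order):
        for b in order[i + 1:]:
            if b.ret < a.call:
                return False
    return True

def linearizable(history):
    """Try every ordering; exponential, so only usable on tiny histories."""
    return any(respects_real_time(p) and valid_sequential(p)
               for p in permutations(history))

def linearizable_per_element(history):
    """Check each element's operations separately (assumed decomposition)."""
    groups = {}
    for op in history:
        groups.setdefault(op.arg, []).append(op)
    return all(linearizable(g) for g in groups.values())
```

Splitting a history of m operations into small groups replaces one search over m! orderings with several much smaller searches, which is the kind of saving a compositional checker aims for.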
Fully Dynamic Single-Source Reachability in Practice: An Experimental Study
Given a directed graph and a source vertex, the fully dynamic single-source
reachability problem is to maintain the set of vertices that are reachable from
the given vertex, subject to edge deletions and insertions. It is one of the
most fundamental problems on graphs and appears directly or indirectly in many
and varied applications. While there has been theoretical work on this problem,
showing both linear conditional lower bounds for the fully dynamic problem and
insertions-only and deletions-only upper bounds beating these conditional lower
bounds, there has been no experimental study that compares the performance of
fully dynamic reachability algorithms in practice. Previous experimental
studies in this area concentrated only on the more general all-pairs
reachability or transitive closure problem and did not use real-world dynamic
graphs.
In this paper, we bridge this gap by empirically studying an extensive set of
algorithms for the single-source reachability problem in the fully dynamic
setting. In particular, we design several fully dynamic variants of well-known
approaches to obtain and maintain reachability information with respect to a
distinguished source. Moreover, we extend the existing insertions-only or
deletions-only upper bounds into fully dynamic algorithms. Even though the
worst-case time per operation of all the fully dynamic algorithms we evaluate
is at least linear in the number of edges in the graph (as is to be expected
given the conditional lower bounds) we show in our extensive experimental
evaluation that their performance differs greatly, both on generated and on
real-world instances.
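The simplest fully dynamic approach, recomputing reachability from scratch after every update, can be sketched as follows. The class name and interface are hypothetical, and this is only a naive baseline for illustration, not necessarily one of the algorithms evaluated in the study. Each update costs time linear in the size of the graph, and each query is a set lookup.

```python
from collections import defaultdict, deque

class RecomputeReachability:
    """Maintain the set of vertices reachable from a fixed source vertex."""

    def __init__(self, source):
        self.source = source
        self.adj = defaultdict(set)  # vertex -> set of out-neighbours
        self.reachable = {source}

    def _recompute(self):
        # Plain breadth-first search from the source.
        seen = {self.source}
        queue = deque([self.source])
        while queue:
            u = queue.popleft()
            for v in self.adj.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        self.reachable = seen

    def insert_edge(self, u, v):
        self.adj[u].add(v)
        self._recompute()

    def delete_edge(self, u, v):
        self.adj[u].discard(v)
        self._recompute()

    def is_reachable(self, v):
        return v in self.reachable
```

A short usage example: after inserting the edges s -> a and a -> b, `is_reachable("b")` is true; deleting s -> a makes it false again.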