Grafalgo - A Library of Graph Algorithms and Supporting Data Structures (revised)
This report provides an (updated) overview of Grafalgo, an open-source
library of graph algorithms and the data structures used to implement them. The
programs in this library were originally written to support a graduate class in
advanced data structures and algorithms at Washington University. Because the
code's primary purpose was pedagogical, it was written to be as straightforward
as possible, while still being highly efficient. Grafalgo is implemented in C++
and incorporates some features of C++11.
The library is available on an open-source basis and may be downloaded from
https://code.google.com/p/grafalgo/. Source code documentation is at
www.arl.wustl.edu/~jst/doc/grafalgo. While not designed as
production code, the library is suitable for use in larger systems, so long as
its limitations are understood. The readability of the code also makes it
relatively straightforward to extend it for other purposes.
Feedback Generation for Performance Problems in Introductory Programming Assignments
Providing feedback on programming assignments manually is a tedious,
error-prone, and time-consuming task. In this paper, we motivate and address the
problem of generating feedback on performance aspects in introductory
programming assignments. We studied a large number of functionally correct
student solutions to introductory programming assignments and observed: (1)
There are different algorithmic strategies, with varying levels of efficiency,
for solving a given problem. These different strategies merit different
feedback. (2) The same algorithmic strategy can be implemented in countless
different ways, which are not relevant for reporting feedback on the student
program.
We propose a light-weight programming language extension that allows a
teacher to define an algorithmic strategy by specifying certain key values that
should occur during the execution of an implementation. We describe an approach
based on dynamic analysis to test whether a student's program matches a teacher's
specification. Our experimental results illustrate the effectiveness of both
our specification language and our dynamic analysis. On one of our benchmarks
consisting of 2316 functionally correct implementations of 3 programming
problems, we identified 16 strategies that we were able to describe using our
specification language (in 95 minutes after inspecting 66, i.e., around 3%, of the
implementations). Our dynamic analysis correctly matched each implementation
with its corresponding specification, thereby automatically producing the
intended feedback.
Comment: Tech report/extended version of FSE 2014 paper
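The idea of matching an implementation against a strategy by its key values can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the paper's specification language or analysis: the `observe` hook, the anagram task, and all function names are hypothetical, and the student code calls the hook explicitly, whereas the paper's dynamic analysis observes values in the student's program itself.

```python
from typing import Callable, List, Tuple

# Hypothetical program shape: takes two strings and an `observe` hook
# through which it reports the key values it computes.
Program = Callable[[str, str, Callable[[object], None]], bool]


def run_with_trace(program: Program, a: str, b: str) -> List[object]:
    """Run a program and collect the key values it reports."""
    trace: List[object] = []
    program(a, b, trace.append)
    return trace


def spec_sorting(a: str, b: str, observe) -> bool:
    """Teacher's specification of the 'sort both strings' anagram strategy."""
    observe(sorted(a))
    observe(sorted(b))
    return sorted(a) == sorted(b)


def student_sorting(a: str, b: str, observe) -> bool:
    """A student solution that sorts in place; same strategy, different code."""
    x = list(a)
    x.sort()
    observe(x)
    y = list(b)
    y.sort()
    observe(y)
    return x == y


def student_counting(a: str, b: str, observe) -> bool:
    """A student solution that counts characters; a different strategy."""
    counts = {}
    for ch in a:
        counts[ch] = counts.get(ch, 0) + 1
    for ch in b:
        counts[ch] = counts.get(ch, 0) - 1
    observe(sorted(counts.items()))
    return all(v == 0 for v in counts.values())


def matches(spec: Program, student: Program,
            inputs: List[Tuple[str, str]]) -> bool:
    """A student matches a specification if the traces agree on every input."""
    return all(run_with_trace(spec, a, b) == run_with_trace(student, a, b)
               for a, b in inputs)


INPUTS = [("listen", "silent"), ("abc", "abd")]
print(matches(spec_sorting, student_sorting, INPUTS))
print(matches(spec_sorting, student_counting, INPUTS))
```

Both student functions are functionally correct, but only the first produces the key values of the sorting strategy, so only it would receive the feedback attached to that strategy.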
Explain3D: Explaining Disagreements in Disjoint Datasets
Data plays an important role in applications, analytic processes, and many
aspects of human activity. As data grows in size and complexity, we are met
with an imperative need for tools that promote understanding and explanations
over data-related operations. Data management research on explanations has
focused on the assumption that data resides in a single dataset, under one
common schema. But the reality of today's data is that it is frequently
un-integrated, coming from different sources with different schemas. When
different datasets provide different answers to semantically similar questions,
understanding the reasons for the discrepancies is challenging and cannot be
handled by the existing single-dataset solutions.
In this paper, we propose Explain3D, a framework for explaining the
disagreements across disjoint datasets (3D). Explain3D focuses on identifying
the reasons for the differences in the results of two semantically similar
queries operating on two datasets with potentially different schemas. Our
framework leverages the queries to perform a semantic mapping across the
relevant parts of their provenance; discrepancies in this mapping point to
causes of the queries' differences. Exploiting the queries gives Explain3D an
edge over traditional schema matching and record linkage techniques, which are
query-agnostic. Our work makes the following contributions: (1) We formalize
the problem of deriving optimal explanations for the differences of the results
of semantically similar queries over disjoint datasets. (2) We design a 3-stage
framework for solving the optimal explanation problem. (3) We develop a
smart-partitioning optimizer that improves the efficiency of the framework by
orders of magnitude. (4) We experiment with real-world and synthetic data to
demonstrate that Explain3D can derive precise explanations efficiently.
Optimum matchings in weighted bipartite graphs
Given an integer-weighted bipartite graph, we consider the problems of finding all the edges that occur in
some minimum weight matching of maximum cardinality and enumerating all the
minimum weight perfect matchings. Moreover, we construct a subgraph of
the graph, which depends on an ε-optimal solution of the dual linear program
associated to the assignment problem on the graph, that allows us to reduce
these problems to their unweighted variants on the subgraph. For instance, when
the graph has a perfect matching and we have an ε-optimal solution of the dual
linear program associated to the assignment problem on the graph, we solve the
problem of finding all the edges that occur in some minimum weight perfect
matching in time linear in the number of edges. Therefore, starting from
scratch we get an algorithm that solves this problem in time
, where , , and .
Comment: 11 pages
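To make the two problems concrete, the following Python sketch enumerates all minimum weight perfect matchings of a tiny bipartite graph by brute force and collects the edges that occur in at least one of them. This is not the algorithm of the paper: it tries every permutation and is only usable on very small instances. The matrix representation and the function names are assumptions made for illustration.

```python
from itertools import permutations


def min_weight_perfect_matchings(weights):
    """Brute-force enumeration of minimum weight perfect matchings.

    weights[i][j] is the integer weight of edge (u_i, v_j), or None if the
    edge is absent. The matrix is assumed square. A matching is returned as
    a tuple `perm`, meaning u_i is matched to v_perm[i].
    """
    n = len(weights)
    best, best_matchings = None, []
    for perm in permutations(range(n)):
        # Skip assignments that use a missing edge.
        if any(weights[i][perm[i]] is None for i in range(n)):
            continue
        total = sum(weights[i][perm[i]] for i in range(n))
        if best is None or total < best:
            best, best_matchings = total, [perm]
        elif total == best:
            best_matchings.append(perm)
    return best, best_matchings


def edges_in_some_optimum(weights):
    """Edges (i, j) that occur in at least one minimum weight perfect matching."""
    _, matchings = min_weight_perfect_matchings(weights)
    return {(i, j) for perm in matchings for i, j in enumerate(perm)}


# A hypothetical 3x3 instance with two optimal matchings of weight 3.
W = [
    [1, 1, None],
    [1, 1, 3],
    [None, 3, 1],
]
best, matchings = min_weight_perfect_matchings(W)
print(best, matchings)
print(sorted(edges_in_some_optimum(W)))
```

In this instance the edges of weight 3 never occur in an optimal matching, which is the kind of information the first problem asks for.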
Stochastic Model Predictive Control for Autonomous Mobility on Demand
This paper presents a stochastic, model predictive control (MPC) algorithm
that leverages short-term probabilistic forecasts for dispatching and
rebalancing Autonomous Mobility-on-Demand systems (AMoD, i.e. fleets of
self-driving vehicles). We first present the core stochastic optimization
problem in terms of a time-expanded network flow model. Then, to improve its
tractability, we present two key relaxations. First, we replace the original
stochastic problem with a Sample Average Approximation (SAA), and characterize
the performance guarantees. Second, we split the controller into two
parts so that the task of assigning vehicles to outstanding
customers is handled separately from that of rebalancing. This enables the problem to be
solved as two totally unimodular linear programs, and thus to scale easily to
large problem sizes. Finally, we test the proposed algorithm in two scenarios
based on real data and show that it outperforms prior state-of-the-art
algorithms. In particular, in a simulation using customer data from DiDi
Chuxing, the algorithm presented here exhibits a 62.3 percent reduction in
customer waiting time compared to state-of-the-art non-stochastic algorithms.
Comment: Submitting to the IEEE International Conference on Intelligent
Transportation Systems 201
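The Sample Average Approximation step can be illustrated on a toy problem. The sketch below is not the paper's time-expanded network flow model: it assumes a single zone, a hypothetical per-vehicle moving cost and per-customer waiting cost, and picks the number of vehicles to pre-position by minimizing the average cost over sampled demands instead of the true expectation.

```python
from typing import List, Tuple


def sample_average_cost(x: int, demand_samples: List[int],
                        move_cost: float, wait_cost: float) -> float:
    """Average cost of sending x vehicles, taken over the demand samples.

    Cost for one sample d: move_cost * x for the vehicles sent, plus
    wait_cost for each customer left unserved, i.e. max(d - x, 0).
    """
    unserved = sum(max(d - x, 0) for d in demand_samples)
    return move_cost * x + wait_cost * unserved / len(demand_samples)


def saa_best_decision(demand_samples: List[int], move_cost: float,
                      wait_cost: float, max_vehicles: int) -> Tuple[int, float]:
    """Minimize the sample average cost over x = 0..max_vehicles."""
    candidates = range(max_vehicles + 1)
    best_x = min(candidates,
                 key=lambda x: sample_average_cost(
                     x, demand_samples, move_cost, wait_cost))
    return best_x, sample_average_cost(
        best_x, demand_samples, move_cost, wait_cost)


# Hypothetical demand samples, e.g. drawn from a short-term forecast.
samples = [2, 4, 4, 6]
print(saa_best_decision(samples, move_cost=1.0, wait_cost=3.0, max_vehicles=8))
# prints (4, 5.5)
```

With more samples the sample average approaches the expected cost, which is what makes performance guarantees for the approximation possible.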
Efficient Pattern Matching in Python
Pattern matching is a powerful tool for symbolic computations. Applications
include term rewriting systems, as well as the manipulation of symbolic
expressions, abstract syntax trees, and XML and JSON data. It also allows for
an intuitive description of algorithms in the form of rewrite rules. We present
the open source Python module MatchPy, which offers functionality and
expressiveness similar to the pattern matching in Mathematica. In particular,
it includes syntactic pattern matching, as well as matching for commutative
and/or associative functions, sequence variables, and matching with
constraints. MatchPy uses new and improved algorithms to efficiently find
matches for large pattern sets by exploiting similarities between patterns. The
performance of MatchPy is investigated on several real-world problems.
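As an illustration of syntactic pattern matching, here is a minimal standard-library sketch. It does not use MatchPy and does not reflect its API or algorithms: the tuple representation of terms, the trailing-underscore convention for pattern variables (borrowed loosely from Mathematica's `x_`), and the `match` function are all assumptions. It handles neither commutative or associative functions nor sequence variables or constraints.

```python
def match(pattern, subject, subst=None):
    """Syntactic matching of a pattern against a subject term.

    Terms are nested tuples such as ("f", "a", ("g", "b")), where the first
    element is the function symbol. A string ending in "_" is a pattern
    variable. Returns a substitution dict, or None if there is no match.
    """
    subst = dict(subst or {})
    # Pattern variable: bind it, or check consistency with an earlier binding.
    if isinstance(pattern, str) and pattern.endswith("_"):
        name = pattern[:-1]
        if name in subst:
            return subst if subst[name] == subject else None
        subst[name] = subject
        return subst
    # Compound term: same length, and all components must match in order.
    if isinstance(pattern, tuple) and isinstance(subject, tuple):
        if len(pattern) != len(subject):
            return None
        for p, s in zip(pattern, subject):
            subst = match(p, s, subst)
            if subst is None:
                return None
        return subst
    # Constant symbol: must be identical.
    return subst if pattern == subject else None


print(match(("f", "x_", "y_"), ("f", "a", ("g", "b"))))
# prints {'x': 'a', 'y': ('g', 'b')}
print(match(("f", "x_", "x_"), ("f", "a", "b")))
# prints None
```

The second call fails because the variable `x` would have to be bound to both `a` and `b`; repeated variables are what distinguish pattern matching from simple structural comparison.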