Compressed Representations of Conjunctive Query Results
Relational queries, and in particular join queries, often generate large
output results when executed over a huge dataset. In such cases, it is often
infeasible to store the whole materialized output if we plan to reuse it
further down a data processing pipeline. Motivated by this problem, we study
the construction of space-efficient compressed representations of the output of
conjunctive queries, with the goal of supporting the efficient access of the
intermediate compressed result for a given access pattern. In particular, we
initiate the study of an important tradeoff: minimizing the space necessary to
store the compressed result, versus minimizing the answer time and delay for an
access request over the result. Our main contribution is a novel parameterized
data structure, which can be tuned to trade off space for answer time. The
tradeoff allows us to control the space requirement of the data structure
precisely, and depends both on the structure of the query and the access
pattern. We show how we can use the data structure in conjunction with query
decomposition techniques, in order to efficiently represent the outputs for
several classes of conjunctive queries.
Comment: To appear in PODS'18; 35 pages; comments welcome
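The space versus answer-time tradeoff described in this abstract can be illustrated with a minimal sketch. It is not the paper's data structure: it assumes a two-relation join R(x, y) ⋈ S(y, z), an access pattern of the form "given x, return all z", and a hypothetical threshold parameter `tau`. The class name `TunableJoinView` is likewise invented for illustration. Answers whose on-the-fly cost exceeds `tau` are materialized; all others are computed at request time from the base indexes.

```python
from collections import defaultdict


class TunableJoinView:
    """Illustrative heavy/light tradeoff for R(x, y) JOIN S(y, z).

    Hypothetical sketch: a larger `tau` stores fewer precomputed
    answers (less space) but allows slower access requests.
    """

    def __init__(self, R, S, tau):
        # Base indexes: x -> list of y, and y -> list of z.
        self.r_by_x = defaultdict(list)
        for x, y in R:
            self.r_by_x[x].append(y)
        self.s_by_y = defaultdict(list)
        for y, z in S:
            self.s_by_y[y].append(z)
        # Materialize only the answers that are expensive to recompute.
        self.materialized = {}
        for x in self.r_by_x:
            if self._cost(x) > tau:
                self.materialized[x] = self._compute(x)

    def _cost(self, x):
        # Number of (y, z) pairs scanned when answering x on the fly.
        return sum(len(self.s_by_y.get(y, ())) for y in self.r_by_x[x])

    def _compute(self, x):
        return {z for y in self.r_by_x.get(x, ()) for z in self.s_by_y.get(y, ())}

    def answer(self, x):
        # Stored answer if present, otherwise at most `tau` pairs scanned.
        if x in self.materialized:
            return self.materialized[x]
        return self._compute(x)


R = [(1, "a"), (1, "b"), (2, "a")]
S = [("a", 10), ("a", 11), ("b", 12)]
view = TunableJoinView(R, S, tau=2)
```

With `tau=2`, only the answer for `x = 1` (cost 3) is stored; the answer for `x = 2` (cost 2) is recomputed on each request.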
Fast Witness Extraction Using a Decision Oracle
The gist of many (NP-)hard combinatorial problems is to decide whether a
universe of n elements contains a witness consisting of k elements that
match some prescribed pattern. For some of these problems there are known
advanced algebra-based FPT algorithms which solve the decision problem but do
not return the witness. We investigate techniques for turning such a
YES/NO-decision oracle into an algorithm for extracting a single witness, with
an objective to obtain practical scalability for large values of n. By
relying on techniques from combinatorial group testing, we demonstrate that a
witness may be extracted with O(k log n) queries to either a deterministic or
a randomized set inclusion oracle with one-sided probability of error.
Furthermore, we demonstrate through implementation and experiments that the
algebra-based FPT algorithms are practical, in particular in the setting of the
k-path problem. Also discussed are engineering issues such as optimizing
finite field arithmetic.
Comment: Journal version, 16 pages. Extended abstract presented at ESA'1
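The idea of turning a YES/NO oracle into a witness extractor can be sketched with plain repeated binary search. This is a simple stand-in, not necessarily the group-testing procedure of the paper. It assumes a deterministic, monotone set inclusion oracle (a hypothetical callable `contains_witness` that returns True exactly when the given elements include a full witness). Each round locates one witness element with about log2(n) queries.

```python
from typing import Callable, Hashable, List, Optional, Sequence


def extract_witness(
    universe: Sequence[Hashable],
    contains_witness: Callable[[List[Hashable]], bool],
) -> Optional[List[Hashable]]:
    """Return a minimal witness, or None if the universe holds none.

    Assumes the oracle is monotone: adding elements never turns a
    YES answer into a NO answer.
    """
    candidates = list(universe)
    if not contains_witness(candidates):
        return None
    found: List[Hashable] = []
    while not contains_witness(found):
        # Binary search for the shortest prefix of `candidates` that,
        # together with `found`, still contains a witness.
        lo, hi = 1, len(candidates)
        while lo < hi:
            mid = (lo + hi) // 2
            if contains_witness(found + candidates[:mid]):
                hi = mid
            else:
                lo = mid + 1
        # The last element of that prefix is indispensable: keep it and
        # discard everything after it.
        found.append(candidates[lo - 1])
        candidates = candidates[: lo - 1]
    return found


# Toy oracle with a single hidden witness of k = 3 elements.
hidden = {3, 7, 12}
queries = 0


def oracle(elements):
    global queries
    queries += 1
    return hidden <= set(elements)


witness = extract_witness(range(16), oracle)
```

Every element kept is one whose removal destroys all witnesses among the remaining candidates, so the returned list is a minimal witness, and the query count grows roughly as k log n.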
New Variants of Pattern Matching with Constants and Variables
Given a text and a pattern over two types of symbols called constants and
variables, the parameterized pattern matching problem is to find all
occurrences of substrings of the text that the pattern matches by substituting
a variable in the text for each variable in the pattern, where the substitution
should be injective. The function matching problem is a variant of it that
lifts the injection constraint. In this paper, we discuss variants of those
problems, where one can substitute a constant or a variable for each variable
of the pattern. We give two kinds of algorithms for both problems, a
convolution-based method and an extended KMP-based method, and analyze their
complexity.
Comment: 15 pages, 2 figures
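The problem definitions above can be made concrete with a naive quadratic-time checker. It implements neither the convolution-based nor the KMP-based algorithm of the paper. It assumes, purely for illustration, that uppercase letters are variables and all other symbols are constants. The `allow_constants` flag is a hypothetical reading of the new variants, in which a pattern variable may also be replaced by a constant; applying the injectivity check to constant images is likewise an assumption.

```python
from typing import Callable, List


def window_matches(
    pattern: str,
    window: str,
    is_variable: Callable[[str], bool],
    injective: bool,
    allow_constants: bool,
) -> bool:
    forward = {}   # pattern variable -> text symbol
    backward = {}  # text symbol -> pattern variable (injectivity check)
    for p, t in zip(pattern, window):
        if not is_variable(p):
            if p != t:  # constants must match literally
                return False
            continue
        if not allow_constants and not is_variable(t):
            return False  # classic setting: variables map to variables
        if forward.setdefault(p, t) != t:
            return False  # substitution must be consistent
        if injective and backward.setdefault(t, p) != p:
            return False  # two variables may not share an image
    return True


def find_occurrences(
    text: str,
    pattern: str,
    injective: bool = True,
    allow_constants: bool = False,
    is_variable: Callable[[str], bool] = str.isupper,
) -> List[int]:
    """Start positions where the pattern matches a substring of the text.

    injective=True gives parameterized matching; injective=False gives
    function matching.
    """
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if window_matches(pattern, text[i:i + m], is_variable,
                          injective, allow_constants)
    ]


parameterized = find_occurrences("aXbYaXbX", "aAbB")
function = find_occurrences("aXbYaXbX", "aAbB", injective=False)
```

In this example the window `aXbX` at position 4 maps both `A` and `B` to `X`, so it is accepted by function matching but rejected by parameterized matching.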
Query-Focused Video Summarization: Dataset, Evaluation, and A Memory Network Based Approach
Recent years have witnessed a resurgence of interest in video summarization.
However, one of the main obstacles to the research on video summarization is
the user subjectivity - users have various preferences over the summaries. The
subjectiveness causes at least two problems. First, no single video summarizer
fits all users unless it interacts with and adapts to the individual users.
Second, it is very challenging to evaluate the performance of a video
summarizer.
To tackle the first problem, we explore the recently proposed query-focused
video summarization which introduces user preferences in the form of text
queries about the video into the summarization process. We propose a memory
network parameterized sequential determinantal point process in order to attend
the user query onto different video frames and shots. To address the second
challenge, we contend that a good evaluation metric for video summarization
should focus on the semantic information that humans can perceive rather than
the visual features or temporal overlaps. To this end, we collect dense
per-video-shot concept annotations, compile a new dataset, and suggest an
efficient evaluation method defined upon the concept annotations. We conduct
extensive experiments contrasting our video summarizer to existing ones and
present detailed analyses about the dataset and the new evaluation method.
Causal Confusion in Imitation Learning
Behavioral cloning reduces policy learning to supervised learning by training
a discriminative model to predict expert actions given observations. Such
discriminative models are non-causal: the training procedure is unaware of the
causal structure of the interaction between the expert and the environment. We
point out that ignoring causality is particularly damaging because of the
distributional shift in imitation learning. In particular, it leads to a
counter-intuitive "causal misidentification" phenomenon: access to more
information can yield worse performance. We investigate how this problem
arises, and propose a solution to combat it through targeted
interventions---either environment interaction or expert queries---to determine
the correct causal model. We show that causal misidentification occurs in
several benchmark control domains as well as realistic driving settings, and
validate our solution against DAgger and other baselines and ablations.
Comment: Published at NeurIPS 2019; 9 pages, plus references and appendices
Conditional Lower Bounds for Space/Time Tradeoffs
In recent years much effort has been concentrated towards achieving
polynomial time lower bounds on algorithms for solving various well-known
problems. A useful technique for showing such lower bounds is to prove them
conditionally based on well-studied hardness assumptions such as 3SUM, APSP,
SETH, etc. This line of research helps to obtain a better understanding of the
complexity inside P.
A related question asks to prove conditional space lower bounds on data
structures that are constructed to solve certain algorithmic tasks after an
initial preprocessing stage. This question has received little attention in
previous research, even though its potential impact is strong.
In this paper we address this question and show that surprisingly many of the
well-studied hard problems that are known to have conditional polynomial time
lower bounds are also hard with respect to space. This hardness is shown as a
tradeoff between the space consumed by the data structure and the time needed
to answer queries. The tradeoff may be either smooth or admit one or more
singularity points.
We reveal interesting connections between different space hardness
conjectures and present matching upper bounds. We also apply these hardness
conjectures to both static and dynamic problems and prove their conditional
space hardness.
We believe that this novel framework of polynomial space conjectures can play
an important role in expressing polynomial space lower bounds of many important
algorithmic problems. Moreover, it seems that it can also help in achieving a
better understanding of the hardness of their corresponding problems in terms
of time.