Task-Driven Dictionary Learning
Modeling data with linear combinations of a few elements from a learned
dictionary has been the focus of much recent research in machine learning,
neuroscience and signal processing. For signals such as natural images that
admit such sparse representations, it is now well established that these models
are well suited to restoration tasks. In this context, learning the dictionary
amounts to solving a large-scale matrix factorization problem, which can be
done efficiently with classical optimization tools. The same approach has also
been used for learning features from data for other purposes, e.g., image
classification, but tuning the dictionary in a supervised way for these tasks
has proven to be more difficult. In this paper, we present a general
formulation for supervised dictionary learning adapted to a wide variety of
tasks, and present an efficient algorithm for solving the corresponding
optimization problem. Experiments on handwritten digit classification, digital
art identification, nonlinear inverse image problems, and compressed sensing
demonstrate that our approach is effective in large-scale settings, and is well
suited to supervised and semi-supervised classification, as well as regression
tasks for data that admit sparse representations. Comment: final draft post-refereein
Dynamic Ordered Sets with Exponential Search Trees
We introduce exponential search trees as a novel technique for converting
static polynomial space search structures for ordered sets into fully-dynamic
linear space data structures.
This leads to an optimal bound of O(sqrt(log n/loglog n)) for searching and
updating a dynamic set of n integer keys in linear space. Here searching an
integer y means finding the maximum key in the set which is smaller than or
equal to y. This problem is equivalent to the standard text book problem of
maintaining an ordered set (see, e.g., Cormen, Leiserson, Rivest, and Stein:
Introduction to Algorithms, 2nd ed., MIT Press, 2001).
The best previous deterministic linear space bound was O(log n/loglog n), due
to Fredman and Willard from STOC 1990. No better deterministic search bound was
known using polynomial space.
We also get the following worst-case linear space trade-offs between the
number n, the word length w, and the maximal key U < 2^w: O(min{loglog n+log
n/log w, (loglog n)(loglog U)/(logloglog U)}). These trade-offs are, however,
not likely to be optimal.
Our results are generalized to finger searching and string searching,
providing optimal results for both in terms of n. Comment: Revision corrects some typos and states things better for
applications in subsequent paper
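The search operation defined in the abstract above (find the maximum key that is smaller than or equal to y) can be illustrated with a minimal Python sketch. This is only a baseline over a sorted list, not an exponential search tree: lookups take O(log n) by binary search and updates take O(n), so it does not reach the bounds the abstract reports. The class name `SortedKeySet` and its method names are hypothetical, chosen for illustration.

```python
from bisect import bisect_right


class SortedKeySet:
    """Baseline ordered set of integer keys kept in a sorted list.

    Assumption: this is NOT the exponential search tree from the abstract;
    it only demonstrates the semantics of the search operation.
    """

    def __init__(self):
        self._keys = []

    def insert(self, key):
        # Position just past any existing copy of key.
        i = bisect_right(self._keys, key)
        # Skip duplicates so the structure behaves like a set.
        if i == 0 or self._keys[i - 1] != key:
            self._keys.insert(i, key)

    def delete(self, key):
        i = bisect_right(self._keys, key)
        if i > 0 and self._keys[i - 1] == key:
            del self._keys[i - 1]

    def predecessor(self, y):
        """Return the maximum key <= y, or None if no such key exists."""
        i = bisect_right(self._keys, y)
        return self._keys[i - 1] if i > 0 else None
```

With keys 5, 17, and 42 inserted, `predecessor(20)` returns 17 and `predecessor(4)` returns `None`.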
Sublinear Space Algorithms for the Longest Common Substring Problem
Given $m$ documents of total length $n$, we consider the problem of finding a
longest string common to at least $d \geq 2$ of the documents. This problem is
known as the \emph{longest common substring (LCS) problem} and has a classic
$O(n)$ space and $O(n)$ time solution (Weiner [FOCS'73], Hui [CPM'92]).
However, the use of linear space is impractical in many applications. In this
paper we show that for any trade-off parameter $1 \leq \tau \leq n$, the LCS
problem can be solved in $O(\tau)$ space and $O(n^2/\tau)$ time, thus providing
the first smooth deterministic time-space trade-off from constant to linear
space. The result uses a new and very simple algorithm, which computes a
$\tau$-additive approximation to the LCS in $O(n^2/\tau)$ time and $O(1)$
space. We also show a time-space trade-off lower bound for deterministic
branching programs, which implies that any deterministic RAM algorithm solving
the LCS problem on documents from a sufficiently large alphabet in $O(\tau)$
space must use $\Omega(n\sqrt{\log(n/(\tau\log n))/\log\log(n/(\tau\log n))})$
time. Comment: Accepted to 22nd European Symposium on Algorithm
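To make the constant-space end of the trade-off concrete, here is a minimal Python sketch for the special case of two documents. It is a textbook diagonal scan, not the algorithm from the paper: it compares the two strings along every alignment (diagonal) and tracks the longest run of matching characters, using quadratic time and a constant number of counters beyond the input and the returned string. The function name `longest_common_substring` is an assumption made for illustration.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return a longest string occurring in both a and b.

    Assumption: a simple diagonal scan for two strings, shown only as an
    illustration; it is not the paper's approximation algorithm.
    """
    best_len = 0  # length of the best run found so far
    best_end = 0  # exclusive end index of that run in a

    # Each diagonal is identified by shift = i - j, where i indexes a
    # and j indexes b.
    for shift in range(-(len(b) - 1), len(a)):
        i = max(shift, 0)
        j = i - shift
        run = 0  # length of the current run of matches on this diagonal
        while i < len(a) and j < len(b):
            if a[i] == b[j]:
                run += 1
                if run > best_len:
                    best_len = run
                    best_end = i + 1
            else:
                run = 0
            i += 1
            j += 1

    return a[best_end - best_len:best_end]
```

For example, `longest_common_substring("banana", "ananas")` returns `"anana"`. When several common substrings share the maximum length, the sketch returns whichever one it meets first in diagonal order.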
DDoS Attacks with Randomized Traffic Innovation: Botnet Identification Challenges and Strategies
Distributed Denial-of-Service (DDoS) attacks are usually launched through the
botnet, an "army" of compromised nodes hidden in the network. Inferential
tools for DDoS mitigation should accordingly enable an early and reliable
discrimination of the normal users from the compromised ones. Unfortunately,
the recent emergence of attacks performed at the application layer has
multiplied the number of possibilities that a botnet can exploit to conceal its
malicious activities. New challenges arise, which cannot be addressed by simply
borrowing the tools that have been successfully applied so far to earlier DDoS
paradigms. In this work, we offer three main contributions: (i) we
introduce an abstract model for the aforementioned class of attacks, where the
botnet emulates normal traffic by continually learning admissible patterns from
the environment; (ii) we devise an inference algorithm that is shown to
provide a consistent (i.e., converging to the true solution as time progresses)
estimate of the botnet possibly hidden in the network; and (iii) we verify the
validity of the proposed inferential strategy over network traces. Comment: Submitted for publicatio