The Sample Complexity of Dictionary Learning
A large set of signals can sometimes be described sparsely using a
dictionary, that is, every element can be represented as a linear combination
of a few elements from the dictionary. Algorithms for various signal processing
applications, including classification, denoising and signal separation, learn
a dictionary from a set of signals to be represented. Can we expect that the
representation found by such a dictionary for a previously unseen example from
the same source will have L_2 error of the same magnitude as those for the
given examples? We assume signals are generated from a fixed distribution, and
study this question from a statistical learning theory perspective.
We develop generalization bounds on the quality of the learned dictionary for
two types of constraints on the coefficient selection, as measured by the
expected L_2 error in representation when the dictionary is used. For the case
of l_1 regularized coefficient selection we provide a generalization bound of
the order of O(sqrt(np log(m lambda)/m)), where n is the dimension, p is the
number of elements in the dictionary, lambda is a bound on the l_1 norm of the
coefficient vector and m is the number of samples, which complements existing
results. For the case of representing a new signal as a combination of at most
k dictionary elements, we provide a bound of the order O(sqrt(np log(m k)/m))
under an assumption on the level of orthogonality of the dictionary (low Babel
function). We further show that this assumption holds for most dictionaries in
high dimensions in a strong probabilistic sense. Our results further yield fast
rates of order 1/m as opposed to 1/sqrt(m) using localized Rademacher
complexity. We provide similar results in a general setting using kernels with
weak smoothness requirements
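The l_1-regularized coefficient selection analyzed above can be made concrete with a small sketch. The paper proves generalization bounds and does not prescribe a particular solver, so the snippet below is illustrative only: it solves the sparse coding problem min_a 0.5*||x - Da||_2^2 + lambda*||a||_1 for a fixed dictionary D using iterative soft-thresholding (ISTA), with all names, sizes, and parameter values invented for the example.

```python
import numpy as np

def sparse_code_l1(D, x, lam=0.1, n_iter=200):
    """l_1-regularized coefficient selection via ISTA (illustrative sketch).

    Minimizes 0.5*||x - D a||_2^2 + lam*||a||_1 for a fixed
    dictionary D of shape (n, p).
    """
    n, p = D.shape
    a = np.zeros(p)
    # Step size 1/L, where L = ||D||_2^2 is the gradient's Lipschitz constant.
    L = np.linalg.norm(D, 2) ** 2
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        z = a - grad / L
        # Soft-thresholding = proximal operator of the l_1 norm.
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return a

# Hypothetical data: a random unit-norm dictionary and a 2-sparse signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)           # unit-norm dictionary elements
a_true = np.zeros(50)
a_true[[3, 17]] = [1.5, -2.0]
x = D @ a_true
a_hat = sparse_code_l1(D, x, lam=0.05)
err = np.linalg.norm(x - D @ a_hat)      # the representation L_2 error the bound controls
```

Here lam plays the role of the bound lambda on the l_1 norm of the coefficient vector; the quantity `err` is the per-example L_2 representation error whose expectation the generalization bound relates to the empirical error on the m training signals.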
SASA Dictionary as the Gold Standard for Good Dictionary Examples for Serbian
In this paper we present a model for selection of good dictionary examples for Serbian and the
development of initial model components. The method used is based on a thorough analysis of
various lexical and syntactic features in a corpus compiled of examples from the five digitized
volumes of the Serbian Academy of Sciences and Arts (SASA) dictionary. The initial set of
features was inspired by a similar approach for other languages. The feature distribution of
examples from this corpus is compared with the feature distribution of sentence samples
extracted from corpora comprising various texts. The analysis showed that there is a group of
features which are strong indicators that a sentence should not be used as an example. The
remaining features, including detection of non-standard and other marked lexis from the SASA
dictionary, are used for ranking. The selected candidate examples, represented as feature vectors,
are used with the GDEX ranking tool for Serbian candidate examples and with a supervised
machine learning model that classifies sentences as standard or non-standard Serbian, for
further integration into a solution for present and future dictionary production projects
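The two-stage scheme described above (blocking features that disqualify a sentence, remaining features used for ranking) can be sketched as follows. This is a minimal illustration of the GDEX-style approach, not the SASA model itself: the specific features, thresholds, and weights below are invented for the example.

```python
# Illustrative GDEX-style example selection: sentences failing any
# "blocking" feature are discarded outright; survivors are ranked by a
# weighted feature score. Features and weights here are hypothetical.

def is_blocked(sentence):
    """Strong indicators that a sentence should NOT be used as an example."""
    tokens = sentence.split()
    return (len(tokens) > 25                        # overly long
            or len(tokens) < 4                      # likely a fragment
            or sentence.rstrip()[-1] not in ".!?")  # no sentence-final punctuation

def score(sentence, headword):
    """Ranking score over the remaining (non-blocking) features."""
    tokens = sentence.lower().split()
    s = 0.0
    s += 1.0 if headword.lower() in tokens else -5.0    # example must show the headword
    s += max(0.0, 1.0 - abs(len(tokens) - 12) / 12)     # prefer moderate length
    return s

def rank_examples(candidates, headword):
    kept = [c for c in candidates if not is_blocked(c)]
    return sorted(kept, key=lambda c: score(c, headword), reverse=True)

candidates = [
    "short one.",
    "The friendly dog greeted every visitor at the front gate today.",
    "no punctuation here at all",
]
ranked = rank_examples(candidates, "dog")
```

In the paper's setting the blocking and ranking features are derived from the corpus analysis (including detection of non-standard and marked lexis), rather than hand-picked as here.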
Robust Baseline Subtraction for Ultrasonic Full Wavefield Analysis
Full wavefield analysis can be effectively used to study and characterize the interaction between waves and structural damage. Wavefields are sequentially measured as damage evolves over time, and differences between each wavefield are then analyzed. Yet, as wavefields are measured and as damage evolves, environmental and operational variations can significantly affect wave propagation properties. As a result, wavefields are sensitive to variations in temperature, stress, sensor coupling, and other sources that can significantly distort data. Several approaches, including time-stretching and optimal baseline selection, can remove environmental variations, but these methods are often limited to removing specific effects, are ineffective for large environmental variations, and can require an unrealistic number of prior baseline measurements.
This paper presents a robust methodology for subtracting wavefields and isolating wave-damage interactions. The method is based on dictionary learning, is robust to multiple environmental and operational variations, and requires only one initial baseline wavefield. For this application, the dictionary represents a matrix of basis vectors that generally describe wave propagation for a particular wavefield. We learn or train the dictionary using multiple frequencies from the single baseline wavefield. We then statistically fit new measurements with the dictionary through sparse regression techniques. This effectively creates a new baseline with propagation properties (for example, velocities) according to the new data. The new baseline is then compared with the measured data
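The pipeline described above (learn a dictionary from one baseline wavefield, sparsely re-fit each new measurement to synthesize a matched baseline, then subtract) can be sketched with off-the-shelf tools. This is not the paper's implementation: the signals, shapes, and sparsity settings below are invented, and scikit-learn's generic dictionary learning stands in for whatever training procedure the authors use.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning, sparse_encode

# Hypothetical baseline: rows are wavefield snapshots at several
# excitation frequencies (stand-ins for "multiple frequencies from the
# single baseline wavefield").
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 128)
baseline = np.stack([np.sin(2 * np.pi * f * t) for f in range(5, 15)])

# 1. Learn a dictionary of propagation "atoms" from the single baseline.
dl = DictionaryLearning(n_components=10, random_state=0)
dl.fit(baseline)

# 2. A new measurement: the baseline distorted by an environmental/
#    operational change, plus a localized wave-damage interaction.
measured = 1.05 * np.sin(2 * np.pi * 9 * t)   # e.g. amplitude/coupling drift
measured[60:68] += 0.8                        # simulated damage signature

# 3. Sparse regression against the learned dictionary synthesizes a
#    matched baseline with the new propagation properties...
codes = sparse_encode(measured[None, :], dl.components_,
                      algorithm="omp", n_nonzero_coefs=3)
matched_baseline = (codes @ dl.components_)[0]

# ...which is then subtracted to isolate the damage term.
residual = measured - matched_baseline
```

The point of the sparse fit is that the smooth dictionary atoms absorb the environmentally shifted propagating waves, while the localized damage signature, being poorly correlated with any atom, survives in the residual.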
Investigating the Selection of Example Sentences for Unknown Target Words in ICALL Reading Texts for L2 German
Institute for Communicating and Collaborative Systems
This thesis considers possible criteria for the selection of example
sentences for difficult or unknown words in reading texts for students of German
as a Second Language (GSL). The examples are intended to be provided
within the context of an Intelligent Computer-Aided Language Learning (ICALL) Vocabulary
Learning System, where students can choose
among several explanation options for difficult words. Some of these options (e.g. glosses)
have received a good deal of attention in the ICALL/Second Language (L2) Acquisition
literature; in contrast, literature on examples has been the near exclusive province
of lexicographers.
The selection of examples is explored from an educational,
L2 teaching point of view: the thesis is intended as a first
exploration of the question of what makes an example helpful to the
L2 student from the perspective of L2 teachers. An important motivation for this work is that
selecting examples from a dictionary or randomly from a corpus has
several drawbacks: first, the number of available dictionary
examples is limited; second, the examples fail to take into account the context
in which the word was encountered; and third, the rationale
and precise principles behind the selection of dictionary examples are usually
less than clear.
Central to this thesis is the hypothesis that a random selection of example
sentences from a suitable corpus can be improved by a guided selection process that takes
into account characteristics of helpful examples.
This is investigated by an empirical study conducted with teachers of L2 German.
The teacher data show that four dimensions are significant criteria amenable to analysis:
(a) reduced syntactic complexity, (b) sentence similarity,
provision of (c) significant co-occurrences and (d) semantically related words.
Models based on these dimensions are developed using logistic regression analysis,
and evaluated through two further empirical studies with teachers and students of L2 German.
The results of the teacher evaluation are encouraging: they indicate
that, for one of the models, the top-ranked selections perform on the same level as dictionary
examples. In addition, the model provides a ranking of potential examples that roughly
corresponds to that of experienced teachers of L2 German. The student evaluation confirms
and notably improves on the teacher evaluation in that the best-performing model of the
teacher evaluation significantly outperforms both random corpus selections
and dictionary examples (when a penalty for missing entries is included)
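The model-building step described above (logistic regression over the four feature dimensions, used to rank candidate examples) can be sketched as follows. Everything below is illustrative: the feature values and the "helpful"/"not helpful" labels are synthetic, whereas the thesis derives them from corpus analysis and teacher judgements.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative features per candidate sentence, one column per dimension:
# (a) syntactic complexity (lower is better), (b) similarity to the
# sentence the word was encountered in, (c) significant co-occurrences,
# (d) semantically related words. Labels are synthetic stand-ins for
# teacher judgements of helpfulness.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 4))
y = ((X[:, 0] < 0.5) & (X[:, 3] > 0.4)).astype(int)  # invented ground truth

model = LogisticRegression().fit(X, y)

def rank_candidates(model, features):
    """Order candidate examples by predicted probability of being helpful."""
    p = model.predict_proba(features)[:, 1]
    return np.argsort(-p)

candidates = np.array([
    [0.9, 0.5, 0.5, 0.1],   # syntactically complex, few related words
    [0.2, 0.5, 0.5, 0.8],   # simple, many related words
])
order = rank_candidates(model, candidates)
```

The ranking produced this way is what the evaluations compare against random corpus selections, dictionary examples, and the orderings of experienced L2 teachers.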
An MDL framework for sparse coding and dictionary learning
The power of sparse signal modeling with learned over-complete dictionaries
has been demonstrated in a variety of applications and fields, from signal
processing to statistical inference and machine learning. However, the
statistical properties of these models, such as under-fitting or over-fitting
given sets of data, are still not well characterized in the literature. As a
result, the success of sparse modeling depends on hand-tuning critical
parameters for each dataset and application. This work aims to address this by
providing a practical and objective characterization of sparse models by means
of the Minimum Description Length (MDL) principle -- a well established
information-theoretic approach to model selection in statistical inference. The
resulting framework derives a family of efficient sparse coding and dictionary
learning algorithms which, by virtue of the MDL principle, are completely
parameter free. Furthermore, the framework makes it possible to incorporate
additional prior information, such as Markovian dependencies, into existing
models, or to define completely new problem formulations, including in the matrix analysis
area, in a natural way. These virtues will be demonstrated with parameter-free
algorithms for the classic image denoising and classification problems, and for
low-rank matrix recovery in video applications
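The parameter-free idea above can be illustrated with a toy two-part MDL selection of the sparsity level. The codelength below is a common textbook form (Gaussian residual bits plus per-coefficient and index bits), not the paper's exact construction, and the greedy OMP encoder, data, and bit budgets are all invented for the sketch.

```python
import numpy as np

def omp(D, x, k):
    """Greedy orthogonal matching pursuit: select k atoms, refit by
    least squares each round (illustrative encoder)."""
    idx, r, a = [], x.copy(), None
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ r)))
        if j not in idx:
            idx.append(j)
        a, *_ = np.linalg.lstsq(D[:, idx], x, rcond=None)
        r = x - D[:, idx] @ a
    return idx, a, r

def description_length(n, p, k, rss, bits_per_coef=32):
    """Two-part MDL code: bits for a Gaussian-coded residual plus bits
    for the k nonzero coefficients and their atom indices."""
    rss = max(rss, 1e-12)
    data_bits = 0.5 * n * np.log2(rss / n)
    model_bits = k * (bits_per_coef + np.log2(p))
    return data_bits + model_bits

# Hypothetical 3-sparse signal in a random unit-norm dictionary.
rng = np.random.default_rng(1)
n, p = 64, 128
D = rng.standard_normal((n, p))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(p)
a_true[[5, 40, 77]] = [2.0, -1.5, 1.0]
x = D @ a_true + 0.01 * rng.standard_normal(n)

# MDL picks the sparsity level by minimizing total codelength -- no
# hand-tuned regularization parameter.
best_k = min(range(1, 10),
             key=lambda k: description_length(n, p, k,
                                              float(np.sum(omp(D, x, k)[2] ** 2))))
```

Underfitting (k too small) is penalized through the residual bits and overfitting (k too large) through the model bits, which is the trade-off the MDL framework makes explicit for sparse models.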