MULTIPLE DICTIONARY FOR SPARSE MODELING
Much of the progress made in image processing in recent decades can be attributed to better modeling of image content, and a wise deployment of these models in relevant applications. In this paper, we review the role of this recent model in image processing, its rationale, and models related to it. As it turns out, the field of image processing is one of the main beneficiaries of the recent progress made in the theory and practice of sparse and redundant representations. Sparse coding is a key principle that underlies wavelet representation of images. Sparse-representation-based classification has led to interesting image recognition results, and the dictionary used for sparse coding plays a key role in it. In general, a proper dictionary can be chosen in one of two ways: i) building a sparsifying dictionary based on a mathematical model of the data, or ii) learning a dictionary to perform best on a training set.
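The two routes to a dictionary mentioned above can be sketched in a few lines with scikit-learn. This is a minimal illustration on toy random data, not code from the paper: route (i) uses an analytic DCT basis built from a mathematical model, route (ii) learns the atoms from the training set itself; both then sparse-code each signal with at most 3 atoms via orthogonal matching pursuit.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning, SparseCoder

rng = np.random.RandomState(0)
X = rng.randn(100, 8)  # toy data: 100 signals of dimension 8

n_atoms, n_features = 8, 8

# (i) Analytic dictionary: a DCT basis derived from a mathematical model
t = np.arange(n_features)
dct = np.array([np.cos(np.pi * k * (2 * t + 1) / (2 * n_features))
                for k in range(n_atoms)])
dct /= np.linalg.norm(dct, axis=1, keepdims=True)  # unit-norm atoms for OMP
coder = SparseCoder(dictionary=dct, transform_algorithm="omp",
                    transform_n_nonzero_coefs=3)
codes_analytic = coder.transform(X)

# (ii) Learned dictionary: atoms fitted to the training set
dl = DictionaryLearning(n_components=n_atoms, transform_algorithm="omp",
                        transform_n_nonzero_coefs=3, max_iter=20,
                        random_state=0)
codes_learned = dl.fit_transform(X)

# In both cases every code row uses at most 3 atoms (the sparsity constraint)
assert (codes_analytic != 0).sum(axis=1).max() <= 3
assert (codes_learned != 0).sum(axis=1).max() <= 3
```

The trade-off the abstract hints at: the analytic dictionary needs no training and comes with theoretical guarantees, while the learned one typically sparsifies the actual data better.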
A plan for the lexical documentation of modern Bokmål
The lexicon of the modern Norwegian bokmål standard needs better description and documentation than is available today. The article presents a plan for building a modern lexical database based on a balanced corpus of 40 million words of modern bokmål. This base should serve as a source for a traditional scientific dictionary as well as for a dictionary for language-technology applications.
Going organic: Building an experimental bottom-up dictionary of verbs in science
Choosing which headwords to enter in a dictionary has always been a major question in lexicographical practice. Corpora have greatly eased both the choice of words to add and of those to remove, by resorting to frequency counts to monitor usage over time. This has been particularly valuable in the building of learners' dictionaries: however good earlier word lists may have been, they were built largely on intuition, whereas corpora allow the consultation of large reference collections for a better picture of current realities. In specialised dictionaries dealing with terminological issues, pure frequency is not a feasible solution for headword extraction. However, linked with extraction patterns and statistical tools, corpora still play a major role in supplying information on terms in use. In this research we aim to tackle a situation that lies in between the needs of an advanced learners' dictionary and those of a terminological dictionary, in attempting to build a pattern dictionary for verbs used in scientific research papers. In order to select verbs for this dictionary and put them into classes, we propose to use collocational relationships as a tool for both the selection and the analysis of patterns. The principle here is that a series of high-frequency verbs can provide the seeds from which prototypical patterns can be extracted. By moving backwards and forwards from verb to argument and back, patterns are revealed whose statistical selection highlights verbs lower in the frequency list that would otherwise be overlooked. Patterns thus naturally enlarge the word list by selecting what is statistically significant within a textual environment. These patterns not only illustrate typical usage in a specialised environment, but also group verbs according to textual functions such as authorial positioning and the description of processes.
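The seed-and-expand idea described in the abstract can be illustrated with a toy sketch: start from a high-frequency seed verb, collect its arguments, then move back from those arguments to surface lower-frequency verbs sharing the same collocational environment. All data below are hypothetical stand-ins for parsed corpus output; the function name is an assumption for illustration.

```python
from collections import Counter

# Hypothetical (verb, object) pairs standing in for parsed corpus data
pairs = [
    ("analyse", "data"), ("analyse", "results"), ("examine", "data"),
    ("examine", "results"), ("scrutinise", "data"), ("analyse", "sample"),
    ("measure", "sample"), ("measure", "temperature"),
]

verb_freq = Counter(v for v, _ in pairs)

def expand_from_seeds(seeds, pairs):
    """Move from seed verbs to their arguments, then back to other
    verbs sharing those arguments, surfacing lower-frequency candidates."""
    args = {obj for verb, obj in pairs if verb in seeds}
    return {verb for verb, obj in pairs if obj in args} - set(seeds)

# Start from the single most frequent verb as the seed
seed = [verb_freq.most_common(1)[0][0]]          # 'analyse'
candidates = expand_from_seeds(seed, pairs)
print(sorted(candidates))  # ['examine', 'measure', 'scrutinise']
```

Note how the rare verb "scrutinise" (frequency 1) is recovered because it shares an argument with the seed; pure frequency ranking would have left it below the cut-off.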
Computational Pronunciation Analysis in Sung Utterances
Recent automatic lyrics transcription (ALT) approaches focus on building stronger acoustic models or in-domain language models, while the pronunciation aspect is seldom touched upon. This paper applies a novel computational analysis to the pronunciation variances in sung utterances and further proposes a new pronunciation model adapted for singing. The singing-adapted model is tested on multiple public datasets via word recognition experiments. It performs better than the standard speech dictionary in all settings, reporting the best results on ALT in a cappella recordings using n-gram language models. For reproducibility, we share the sentence-level annotations used in testing, providing a new benchmark evaluation set for ALT.
An analysis of The Oxford Guide to practical lexicography (Atkins and Rundell 2008)
For at least a decade, the lexicographic community at large has been demanding that a modern textbook be designed - one that would place corpora at the centre of the lexicographic enterprise. Written by two of the most respected practising lexicographers, this book has finally arrived, and delivers on very many levels. This review article presents a critical analysis of its features.