SimpleTrack: Adaptive Trajectory Compression with Deterministic Projection Matrix for Mobile Sensor Networks

Author: Chou Chun Tung
Hu Wen
Rana Rajib
Wark Tim
Yang Mingrui
Publication date: 23/04/2014

Some mobile sensor network applications require the sensor nodes to transfer their trajectories to a data sink. This paper proposes an adaptive, lossy trajectory compression algorithm based on compressive sensing. The algorithm has two innovative elements. First, we propose a method to compute a deterministic projection matrix from a learnt dictionary. Second, we propose a method for the mobile nodes to adaptively predict the number of projections needed based on the speed of the mobile nodes. Extensive evaluation of the proposed algorithm using six datasets shows that it can achieve sub-metre accuracy. In addition, our method of computing projection matrices outperforms two existing methods. Finally, a comparison of our algorithm against a state-of-the-art trajectory compression algorithm shows that our algorithm can reduce the error by 10-60 cm for the same compression ratio.

arXiv.org e-Print Archive

University of Southern Queensland ePrints

Fifty years of spellchecking

Author: Blair CR
Brooks G
Carlson AJ
Cucerzan S
Damerau FJ
Golding AR
Leech G
Levenshtein VI
McIlroy MD
Mihov S
Mitton R
Morris R
Oflazer K
Pedler J
Peterson JL
Pollock JL
Roger Mitton
Savary A
Sterling CM
Veronis J
Wagner RA
Wing AM
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2010

A short history of spellchecking from the late 1950s to the present day, describing its development through dictionary lookup, affix stripping, correction, confusion sets, and edit distance to the use of gigantic databases.

Crossref

Birkbeck Institutional Research Online

Towards a collocation writing assistant for learners of Spanish

Author: Alonso Ramos Margarita
García Salido Marcos
Vincze Orsolya
Publication date: 16/01/1932

This paper describes the process followed in creating a tool aimed at helping learners produce collocations in Spanish. First, we present the Diccionario de colocaciones del español (DiCE), an online collocation dictionary, which represents the first stage of this process. The following section focuses on the potential user of a collocation learning tool: we examine the usability problems DiCE presents in this respect, and explore the actual learner needs through a learner corpus study of collocation errors. Next, we review how collocation production problems of English language learners can be solved using a variety of electronic tools devised for that language. Finally, taking all the above into account, we present a new tool aimed at assisting learners of Spanish in writing texts, with particular attention paid to the use of collocations in this language.

Biblioteca Virtual de Prensa Histórica (Virtual Library of Historical Newspapers)

University of Hildesheim

Role of homeostasis in learning sparse representations

Author: Baudot P.
Brüderle D.
Hebb D. O.
Laughlin S. B.
Laurent U. Perrinet
Lee H.
Mallat S.
Olshausen B. A.
Perrinet L.
Ranzato M. A.
Saito N.
Publication venue: 'MIT Press - Journals'
Publication date: 01/07/2010

Neurons in the input layer of primary visual cortex in primates develop edge-like receptive fields. One approach to understanding the emergence of this response is to state that neural activity has to efficiently represent sensory data with respect to the statistics of natural scenes. Furthermore, it is believed that such efficient coding is achieved using a competition across neurons so as to generate a sparse representation, that is, where a relatively small number of neurons are simultaneously active. Indeed, different models of sparse coding, coupled with Hebbian learning and homeostasis, have been proposed that successfully match the observed emergent response. However, the specific role of homeostasis in learning such sparse representations is still largely unknown. By quantitatively assessing the efficiency of the neural representation during learning, we derive a cooperative homeostasis mechanism that optimally tunes the competition between neurons within the sparse coding algorithm. We apply this homeostasis while learning small patches taken from natural images and compare its efficiency with state-of-the-art algorithms. Results show that while different sparse coding algorithms give similar coding results, the homeostasis provides an optimal balance for the representation of natural images within the population of neurons. Competition in sparse coding is optimized when it is fair. By contributing to optimizing statistical competition across neurons, homeostasis is crucial in providing a more efficient solution to the emergence of independent components.

arXiv.org e-Print Archive

Ordering the suggestions of a spellchecker without using context

Author: Mitton Roger
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2009

Having located a misspelling, a spellchecker generally offers some suggestions for the intended word. Even without using context, a spellchecker can draw on various types of information in ordering its suggestions. A series of experiments is described, beginning with a basic corrector that implements a well-known algorithm for reversing single simple errors, followed by successive enhancements that take account of substring matches, pronunciation, known error patterns, syllable structure and word frequency. The improvement in the ordering produced by each enhancement is measured on a large corpus of misspellings. The final version is tested on other corpora against a widely used commercial spellchecker and a research prototype.
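To make the idea of a basic corrector concrete, here is a minimal sketch that assumes "single simple errors" means one deleted, inserted, substituted or transposed letter, and that orders candidates by word frequency alone. The function names and the toy frequency table are hypothetical and are not taken from the paper, which layers further enhancements on top of this kind of baseline.

```python
from string import ascii_lowercase


def single_edit_candidates(word):
    """Return every string that is one simple edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletions = {a + b[1:] for a, b in splits if b}
    transpositions = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    substitutions = {a + c + b[1:] for a, b in splits if b for c in ascii_lowercase}
    insertions = {a + c + b for a, b in splits for c in ascii_lowercase}
    return deletions | transpositions | substitutions | insertions


def suggest(misspelling, frequencies):
    """Order dictionary words one edit away, most frequent first."""
    candidates = {
        w for w in single_edit_candidates(misspelling)
        if w in frequencies and w != misspelling
    }
    # Ties on frequency are broken alphabetically to keep the order stable.
    return sorted(candidates, key=lambda w: (-frequencies[w], w))


# Hypothetical toy frequency table, for illustration only.
FREQ = {"the": 500, "then": 120, "they": 200, "tea": 30, "he": 300}
print(suggest("teh", FREQ))
```

Here "the" is reached by a transposition and "tea" by a substitution, while "then", "they" and "he" are more than one edit away and are never proposed.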

Birkbeck Institutional Research Online

A practical index for approximate dictionary matching with few mismatches

Author: Cisłak Aleksander
Grabowski Szymon
Publication date: 11/02/2016

Approximate dictionary matching is a classic string matching problem (checking if a query string occurs in a collection of strings) with applications in, e.g., spellchecking, online catalogs, geolocation, and web searchers. We present a surprisingly simple solution called a split index, which is based on the Dirichlet principle, for matching a keyword with few mismatches, and experimentally show that it offers competitive space-time tradeoffs. Our implementation in the C++ language is focused mostly on data compaction, which is beneficial for the search speed (e.g., by being cache friendly). We compare our solution with other algorithms and we show that it performs better for the Hamming distance. Query times in the order of 1 microsecond were reported for one mismatch for the dictionary size of a few megabytes on a medium-end PC. We also demonstrate that a basic compression technique consisting in q-gram substitution can significantly reduce the index size (up to 50% of the input text size for the DNA), while still keeping the query time relatively low.
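The Dirichlet (pigeonhole) principle behind a split index can be sketched as follows: if a word is cut into two pieces and at most one character differs, at least one piece must match exactly. The Python sketch below is an assumption-laden illustration of that idea for a single mismatch. It is not the authors' compact C++ implementation, and all names in it are hypothetical.

```python
from collections import defaultdict


def halves(word):
    """Split a word into a left and a right piece at its midpoint."""
    mid = len(word) // 2
    return word[:mid], word[mid:]


def build_split_index(words):
    """Index every word under each of its two pieces."""
    index = defaultdict(set)
    for word in words:
        left, right = halves(word)
        # The key records length and piece position so pieces align on lookup.
        index[(len(word), 0, left)].add(word)
        index[(len(word), 1, right)].add(word)
    return index


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def query(index, pattern):
    """Return indexed words within Hamming distance 1 of `pattern`."""
    left, right = halves(pattern)
    candidates = (index.get((len(pattern), 0, left), set())
                  | index.get((len(pattern), 1, right), set()))
    # Candidates share one exact piece; verify the whole word.
    return sorted(w for w in candidates if hamming(w, pattern) <= 1)


idx = build_split_index(["cat", "cot", "cut", "dog", "coat"])
print(query(idx, "dot"))
```

Only words that share an exact piece with the query are ever verified, which is what keeps the candidate set small. Allowing k mismatches would generalise this to k + 1 pieces.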

arXiv.org e-Print Archive

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

The Latent Structure of Dictionaries

Author: Blondin Massé Alexandre
Harnad Stevan
Lopes Marcos
Lord Mélanie
Marcotte Odile
Vincent-Lamarre Philippe
Publication venue: 'Wiley'
Publication date: 22/01/2016

How many words (and which ones) are sufficient to define all other words? When dictionaries are analyzed as directed graphs with links from defining words to defined words, they reveal a latent structure. Recursively removing all words that are reachable by definition but that do not define any further words reduces the dictionary to a Kernel of about 10%. This is still not the smallest number of words that can define all the rest. About 75% of the Kernel turns out to be its Core, a Strongly Connected Subset of words with a definitional path to and from any pair of its words and no word’s definition depending on a word outside the set. But the Core cannot define all the rest of the dictionary. The 25% of the Kernel surrounding the Core consists of small strongly connected subsets of words: the Satellites. The size of the smallest set of words that can define all the rest (the graph’s Minimum Feedback Vertex Set or MinSet) is about 1% of the dictionary, 15% of the Kernel, and half-Core, half-Satellite. But every dictionary has a huge number of MinSets. The Core words are learned earlier, more frequent, and less concrete than the Satellites, which in turn are learned earlier and more frequent but more concrete than the rest of the Dictionary. In principle, only one MinSet’s words would need to be grounded through the sensorimotor capacity to recognize and categorize their referents. In a dual-code sensorimotor-symbolic model of the mental lexicon, the symbolic code could do all the rest via re-combinatory definition.
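The recursive pruning that yields the Kernel can be sketched in a few lines. The sketch assumes the dictionary is given as a mapping from each word to the set of words used in its definition. The function name and the five-word toy dictionary are hypothetical and only illustrate the removal rule described above.

```python
def kernel(definitions):
    """Repeatedly drop words that no remaining definition uses."""
    remaining = set(definitions)
    while True:
        # Words that still appear in the definition of some remaining word.
        used = set()
        for word in remaining:
            used |= definitions[word] & remaining
        removable = remaining - used
        if not removable:
            return remaining
        remaining -= removable


# Hypothetical toy dictionary: word -> words appearing in its definition.
TOY = {
    "good": {"not", "bad"},
    "bad": {"not", "good"},
    "not": {"bad"},
    "evil": {"bad"},
    "villain": {"evil"},
}
print(sorted(kernel(TOY)))
```

In this toy case "villain" is removed first because nothing is defined with it. That leaves "evil" unused, so it goes next, and the three mutually defining words remain. Finding the Core or a MinSet needs further graph analysis (strongly connected components, feedback vertex sets) that this sketch does not attempt.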

arXiv.org e-Print Archive

Southampton (e-Prints Soton)