A practical index for approximate dictionary matching with few mismatches
Approximate dictionary matching is a classic string matching problem
(checking if a query string occurs in a collection of strings) with
applications in, e.g., spellchecking, online catalogs, geolocation, and web
searchers. We present a surprisingly simple solution called a split index,
which is based on the Dirichlet principle, for matching a keyword with few
mismatches, and experimentally show that it offers competitive space-time
tradeoffs. Our implementation in the C++ language is focused mostly on data
compaction, which is beneficial for the search speed (e.g., by being cache
friendly). We compare our solution with other algorithms and we show that it
performs better for the Hamming distance. Query times on the order of 1
microsecond were reported for one mismatch with a dictionary size of a few
megabytes on a medium-end PC. We also demonstrate that a basic compression
technique consisting in q-gram substitution can significantly reduce the
index size (up to 50% of the input text size for the DNA), while still keeping
the query time relatively low.
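
The idea behind the split index can be illustrated with a small sketch. It assumes the simplest case of at most one mismatch and a split into two parts: by the Dirichlet (pigeonhole) principle, at least one of the two parts of a matching word must then equal the corresponding part of the query exactly, so only candidates sharing an exact half need to be verified. The class name `SplitIndex`, the helper `hamming_at_most`, and the hash-map layout are hypothetical choices for illustration; the data compaction that the abstract emphasizes is not shown.

```python
from collections import defaultdict


def hamming_at_most(a, b, k):
    """Return True if equal-length strings a and b differ in at most k positions."""
    if len(a) != len(b):
        return False
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > k:
                return False
    return True


class SplitIndex:
    """Toy split index for Hamming distance <= 1 (illustrative, not the authors' code)."""

    def __init__(self, words):
        # Key: (word length, one half matched exactly); value: the other halves.
        self.by_prefix = defaultdict(list)
        self.by_suffix = defaultdict(list)
        for w in set(words):
            mid = len(w) // 2
            prefix, suffix = w[:mid], w[mid:]
            self.by_prefix[(len(w), prefix)].append(suffix)
            self.by_suffix[(len(w), suffix)].append(prefix)

    def search(self, query):
        """Return the set of indexed words within Hamming distance 1 of query."""
        mid = len(query) // 2
        prefix, suffix = query[:mid], query[mid:]
        hits = set()
        # Case 1: the prefix matches exactly, so any mismatch lies in the suffix.
        for cand in self.by_prefix.get((len(query), prefix), []):
            if hamming_at_most(cand, suffix, 1):
                hits.add(prefix + cand)
        # Case 2: the suffix matches exactly, so any mismatch lies in the prefix.
        for cand in self.by_suffix.get((len(query), suffix), []):
            if hamming_at_most(cand, prefix, 1):
                hits.add(cand + suffix)
        return hits


index = SplitIndex(["table", "cable", "tablet", "fable", "tabla", "stale"])
matches = index.search("table")
```

Here `matches` contains the words of the same length as the query that differ from it in at most one position. Keying on the word length as well as the exact half keeps words of other lengths out of the candidate lists, since the Hamming distance is only defined for strings of equal length.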
A New Linear-Time Dynamic Dictionary Matching Algorithm
This research presents inverted lists as a new data structure for dynamic dictionary matching. The inverted-lists structure, which derives from the inverted index, is implemented with a perfect hash table. The dictionary is constructed in optimal time, and individual patterns can be updated in minimal time. The searching phase scans the given text in a single pass, even in the worst case. In the experimental results, inverted lists used less time and space than the traditional structures, and searches ran in linear time.
Sparse representation based hyperspectral image compression and classification
Abstract
This thesis presents research on applying sparse representation to lossy hyperspectral image
compression and hyperspectral image classification. The proposed lossy hyperspectral image
compression framework introduces two types of dictionaries: the sparse
representation spectral dictionary (SRSD) and the multi-scale spectral dictionary (MSSD).
The former is learnt in the spectral domain to exploit the spectral correlations, and the
latter in the wavelet multi-scale spectral domain to exploit both spatial and spectral correlations in
hyperspectral images. To alleviate the computational demand of dictionary learning, either a
base dictionary trained offline or an update of the base dictionary is employed in the compression
framework. The proposed compression method is evaluated in terms of different objective
metrics, and compared to selected state-of-the-art hyperspectral image compression schemes, including
JPEG 2000. The numerical results demonstrate the effectiveness and competitiveness of
both SRSD and MSSD approaches.
For the proposed hyperspectral image classification method, we utilize the sparse coefficients
for training support vector machine (SVM) and k-nearest neighbour (kNN) classifiers. In particular,
the discriminative character of the sparse coefficients is enhanced by incorporating contextual
information using local mean filters. The classification performance is evaluated and compared
to a number of similar or representative methods. The results show that our approach could outperform
other approaches based on SVM or sparse representation.
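
The contextual step described above can be sketched as follows. This is a minimal illustration, assuming that each pixel already has a sparse coefficient vector and that the local mean filter is a plain box average over a square window clipped at the image borders. The function name `local_mean_filter`, the `radius` parameter, and the nested-list layout are assumptions made for this example; the thesis's actual window size and filter details are not given in the abstract.

```python
def local_mean_filter(coeff_map, radius=1):
    """Average each pixel's coefficient vector over a (2*radius+1)^2 window.

    coeff_map is a list of rows; each row is a list of equal-length
    coefficient vectors (lists of floats). Windows are clipped at the borders.
    """
    rows, cols = len(coeff_map), len(coeff_map[0])
    dim = len(coeff_map[0][0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            acc = [0.0] * dim
            count = 0
            # Sum the coefficient vectors of all pixels inside the clipped window.
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    for i, v in enumerate(coeff_map[rr][cc]):
                        acc[i] += v
                    count += 1
            out_row.append([a / count for a in acc])
        out.append(out_row)
    return out


# A 2x2 image with 2-dimensional coefficient vectors (made-up values).
coeffs = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [1.0, 0.0]],
]
smoothed = local_mean_filter(coeffs, radius=1)
```

The smoothed vectors, rather than the raw coefficients, would then serve as the features passed to a classifier such as SVM or kNN, so that each pixel's feature also reflects its spatial neighbours.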
This thesis makes the following contributions. It provides a relatively thorough investigation
of applying sparse representation to lossy hyperspectral image compression. Specifically,
it reveals the effectiveness of sparse representation for the exploitation of spectral correlations
in hyperspectral images. In addition, we have shown that the discriminative character of sparse
coefficients can lead to superior performance in hyperspectral image classification.