74,254 research outputs found
String Matching with Multicore CPUs: Performing Better with the Aho-Corasick Algorithm
Multiple string matching is known as locating all the occurrences of a given
number of patterns in an arbitrary string. It is used in bio-computing
applications where the algorithms are commonly used for retrieval of
information such as sequence analysis and gene/protein identification.
Extremely large amount of data in the form of strings has to be processed in
such bio-computing applications. Therefore, improving the performance of
multiple string matching algorithms is always desirable. Multicore
architectures are capable of providing better performance by parallelizing the
multiple string matching algorithms. The Aho-Corasick algorithm is the one that
is commonly used in exact multiple string matching algorithms. The focus of
this paper is the acceleration of Aho-Corasick algorithm through a multicore
CPU based software implementation. Through our implementation and evaluation of
results, we prove that our method performs better compared to the state of the
art
Searching by approximate personal-name matching
We discuss the design, building and evaluation of a method to access theinformation of a person, using his name as a search key, even if it has deformations. We present a similarity function, the DEA function, based
on the probabilities of the edit operations accordingly to the involved
letters and their position, and using a variable threshold. The efficacy
of DEA is quantitatively evaluated, without human relevance judgments,
very superior to the efficacy of known methods. A very efficient
approximate search technique for the DEA function is also presented
based on a compacted trie-tree structure.Postprint (published version
Generating collaborative systems for digital libraries: A model-driven approach
This is an open access article shared under a Creative Commons Attribution 3.0 Licence (http://creativecommons.org/licenses/by/3.0/). Copyright @ 2010 The Authors.The design and development of a digital library involves different stakeholders, such as: information architects, librarians, and domain experts, who need to agree on a common language to describe, discuss, and negotiate the services the library has to offer. To this end, high-level, language-neutral models have to be devised. Metamodeling techniques favor the definition of domainspecific visual languages through which stakeholders can share their views and directly manipulate representations of the domain entities. This paper describes CRADLE (Cooperative-Relational Approach to Digital Library Environments), a metamodel-based framework and visual language for the definition of notions and services related to the development of digital libraries. A collection of tools allows the automatic generation of several services, defined with the CRADLE visual language, and of the graphical user interfaces providing access to them for the final user. The effectiveness of the approach is illustrated by presenting digital libraries generated with CRADLE, while the CRADLE environment has been evaluated by using the cognitive dimensions framework
- âŠ