34,118 research outputs found
Improved Approximate String Matching and Regular Expression Matching on Ziv-Lempel Compressed Texts
We study the approximate string matching and regular expression matching
problem for the case when the text to be searched is compressed with the
Ziv-Lempel adaptive dictionary compression schemes. We present a time-space
trade-off that leads to algorithms improving the previously known complexities
for both problems. In particular, we significantly improve the space bounds,
which in practical applications are likely to be a bottleneck
An introduction to Graph Data Management
A graph database is a database where the data structures for the schema
and/or instances are modeled as a (labeled)(directed) graph or generalizations
of it, and where querying is expressed by graph-oriented operations and type
constructors. In this article we present the basic notions of graph databases,
give an historical overview of its main development, and study the main current
systems that implement them
Recommended from our members
The scheduling of sparse matrix-vector multiplication on a massively parallel dap computer
An efficient data structure is presented which supports general unstructured sparse matrix-vector multiplications on a Distributed Array of Processors (DAP). This approach seeks to reduce the inter-processor data movements and organises the operations in batches of massively parallel steps by a heuristic scheduling procedure performed on the host computer.
The resulting data structure is of particular relevance to iterative schemes for solving linear systems. Performance results for matrices taken from well known Linear Programming (LP) test problems are presented and analysed
RDF Querying
Reactive Web systems, Web services, and Web-based publish/
subscribe systems communicate events as XML messages, and in
many cases require composite event detection: it is not sufficient to react
to single event messages, but events have to be considered in relation to
other events that are received over time.
Emphasizing language design and formal semantics, we describe the
rule-based query language XChangeEQ for detecting composite events.
XChangeEQ is designed to completely cover and integrate the four complementary
querying dimensions: event data, event composition, temporal
relationships, and event accumulation. Semantics are provided as
model and fixpoint theories; while this is an established approach for rule
languages, it has not been applied for event queries before
Hardware acceleration for power efficient deep packet inspection
The rapid growth of the Internet leads to a massive spread of malicious attacks like viruses and malwares, making the safety of online activity a major concern. The use of Network Intrusion Detection Systems (NIDS) is an effective method to safeguard the Internet. One key procedure in NIDS is Deep Packet Inspection (DPI). DPI can examine the contents of a packet and take actions on the packets based on predefined rules. In this thesis, DPI is mainly discussed in the context of security applications. However, DPI can also be used for bandwidth management and network surveillance.
DPI inspects the whole packet payload, and due to this and the complexity of the inspection rules, DPI algorithms consume significant amounts of resources including time, memory and energy. The aim of this thesis is to design hardware accelerated methods for memory and energy efficient high-speed DPI.
The patterns in packet payloads, especially complex patterns, can be efficiently represented by regular expressions, which can be translated by the use of Deterministic Finite Automata (DFA). DFA algorithms are fast but consume very large amounts of memory with certain kinds of regular expressions. In this thesis, memory efficient algorithms are proposed based on the transition compressions of the DFAs.
In this work, Bloom filters are used to implement DPI on an FPGA for hardware acceleration with the design of a parallel architecture. Furthermore, devoted at a balance of power and performance, an energy efficient adaptive Bloom filter is designed with the capability of adjusting the number of active hash functions according to current workload. In addition, a method is given for implementation on both two-stage and multi-stage platforms. Nevertheless, false positive rates still prevents the Bloom filter from extensive utilization; a cache-based counting Bloom filter is presented in this work to get rid of the false positives for fast and precise matching.
Finally, in future work, in order to estimate the effect of power savings, models will be built for routers and DPI, which will also analyze the latency impact of dynamic frequency adaption to current traffic. Besides, a low power DPI system will be designed with a single or multiple DPI engines. Results and evaluation of the low power DPI model and system will be produced in future
Pattern matching in compilers
In this thesis we develop tools for effective and flexible pattern matching.
We introduce a new pattern matching system called amethyst. Amethyst is not
only a generator of parsers of programming languages, but can also serve as an
alternative to tools for matching regular expressions.
Our framework also produces dynamic parsers. Its intended use is in the
context of IDE (accurate syntax highlighting and error detection on the fly).
Amethyst offers pattern matching of general data structures. This makes it a
useful tool for implementing compiler optimizations such as constant folding,
instruction scheduling, and dataflow analysis in general.
The parsers produced are essentially top-down parsers. Linear time complexity
is obtained by introducing the novel notion of structured grammars and
regularized regular expressions. Amethyst uses techniques known from compiler
optimizations to produce effective parsers.Comment: master thesi
- …