2,807 research outputs found
Faster Algorithms for the Maximum Common Subtree Isomorphism Problem
The maximum common subtree isomorphism problem asks for the largest possible
isomorphism between subtrees of two given input trees. This problem is a
natural restriction of the maximum common subgraph problem, which is -hard in general graphs. Confining to trees renders polynomial time
algorithms possible and is of fundamental importance for approaches on more
general graph classes. Various variants of this problem in trees have been
intensively studied. We consider the general case, where trees are neither
rooted nor ordered and the isomorphism is maximum w.r.t. a weight function on
the mapped vertices and edges. For trees of order and maximum degree
our algorithm achieves a running time of by
exploiting the structure of the matching instances arising as subproblems. Thus
our algorithm outperforms the best previously known approaches. No faster
algorithm is possible for trees of bounded degree and for trees of unbounded
degree we show that a further reduction of the running time would directly
improve the best known approach to the assignment problem. Combining a
polynomial-delay algorithm for the enumeration of all maximum common subtree
isomorphisms with central ideas of our new algorithm leads to an improvement of
its running time from to ,
where is the order of the larger tree, is the number of different
solutions, and is the minimum of the maximum degrees of the input
trees. Our theoretical results are supplemented by an experimental evaluation
on synthetic and real-world instances
Computerized Analysis of Magnetic Resonance Images to Study Cerebral Anatomy in Developing Neonates
The study of cerebral anatomy in developing neonates is of great importance for
the understanding of brain development during the early period of life. This
dissertation therefore focuses on three challenges in the modelling of cerebral
anatomy in neonates during brain development. The methods that have been
developed all use Magnetic Resonance Images (MRI) as source data.
To facilitate study of vascular development in the neonatal period, a set of image
analysis algorithms are developed to automatically extract and model cerebral
vessel trees. The whole process consists of cerebral vessel tracking from
automatically placed seed points, vessel tree generation, and vasculature
registration and matching. These algorithms have been tested on clinical Time-of-
Flight (TOF) MR angiographic datasets.
To facilitate study of the neonatal cortex a complete cerebral cortex segmentation
and reconstruction pipeline has been developed. Segmentation of the neonatal
cortex is not effectively done by existing algorithms designed for the adult brain
because the contrast between grey and white matter is reversed. This causes pixels
containing tissue mixtures to be incorrectly labelled by conventional methods. The
neonatal cortical segmentation method that has been developed is based on a novel
expectation-maximization (EM) method with explicit correction for mislabelled
partial volume voxels. Based on the resulting cortical segmentation, an implicit
surface evolution technique is adopted for the reconstruction of the cortex in
neonates. The performance of the method is investigated by performing a detailed
landmark study.
To facilitate study of cortical development, a cortical surface registration algorithm
for aligning the cortical surface is developed. The method first inflates extracted
cortical surfaces and then performs a non-rigid surface registration using free-form
deformations (FFDs) to remove residual alignment. Validation experiments using
data labelled by an expert observer demonstrate that the method can capture local
changes and follow the growth of specific sulcus
Evaluation of the NLP Components of the OVIS2 Spoken Dialogue System
The NWO Priority Programme Language and Speech Technology is a 5-year
research programme aiming at the development of spoken language information
systems. In the Programme, two alternative natural language processing (NLP)
modules are developed in parallel: a grammar-based (conventional, rule-based)
module and a data-oriented (memory-based, stochastic, DOP) module. In order to
compare the NLP modules, a formal evaluation has been carried out three years
after the start of the Programme. This paper describes the evaluation procedure
and the evaluation results. The grammar-based component performs much better
than the data-oriented one in this comparison.Comment: Proceedings of CLIN 9
Recommended from our members
Applying an abstract data structure description approach to parallelizing scientific pointer programs
Even though impressive progress has been made in the area of parallelizing scientific programs with arrays, the application of similar techniques to programs with pointer data structures has remained difficult. Unlike arrays which have a small number of well-defined properties that can be utilized by a parallelizing compiler, pointer data structures are used to implement a wide variety of structures that exhibit a much more diverse set of properties. The complexity and diversity of such properties means that, in general, scientific programs with pointer data structures cannot be effectively analyzed by an optimizing and parallelizing compiler.In order to provide a system in which the compiler can fully utilize the properties of different types of pointer data structures, we have developed a mechanism for the Abstract Description of Data Structures (ADDS). With our approach, the programmer can explicitly describe important properties such as dimensionality of the pointer data structure, independence of dimensions, and direction of traversal. These abstract descriptions of pointer data structures are then used by the compiler to guide analysis, optimization, and parallelization.In this paper we summarize the ADDS approach through the use of numerous examples of data structures used in scientific computations, we illustrate how such declarations are natural and non-tedious to specify, and we show how the ADDS declarations can be used to improve compile-time analysis. In order to demonstrate the viability of our approach, we show how such techniques can be used to parallelize an important class of scientific codes which naturally use recursive pointer data structures. In particular, we use our approach to develop the parallelization of an N-body simulation that is based on a relatively complicated pointer data structure, and we report the speedup results for a Sequent multiprocessor
Parallel Recursive State Compression for Free
This paper focuses on reducing memory usage in enumerative model checking,
while maintaining the multi-core scalability obtained in earlier work. We
present a tree-based multi-core compression method, which works by leveraging
sharing among sub-vectors of state vectors.
An algorithmic analysis of both worst-case and optimal compression ratios
shows the potential to compress even large states to a small constant on
average (8 bytes). Our experiments demonstrate that this holds up in practice:
the median compression ratio of 279 measured experiments is within 17% of the
optimum for tree compression, and five times better than the median compression
ratio of SPIN's COLLAPSE compression.
Our algorithms are implemented in the LTSmin tool, and our experiments show
that for model checking, multi-core tree compression pays its own way: it comes
virtually without overhead compared to the fastest hash table-based methods.Comment: 19 page
Combining semantic and syntactic structure for language modeling
Structured language models for speech recognition have been shown to remedy
the weaknesses of n-gram models. All current structured language models are,
however, limited in that they do not take into account dependencies between
non-headwords. We show that non-headword dependencies contribute to
significantly improved word error rate, and that a data-oriented parsing model
trained on semantically and syntactically annotated data can exploit these
dependencies. This paper also contains the first DOP model trained by means of
a maximum likelihood reestimation procedure, which solves some of the
theoretical shortcomings of previous DOP models.Comment: 4 page
Trying again to fail-first
For constraint satisfaction problems (CSPs), Haralick and Elliott [1] introduced the Fail-First Principle and defined in it terms of minimizing branch depth. By devising a range of variable ordering heuristics, each in turn trying harder to fail first, Smith and Grant [2] showed that adherence to this strategy does not guarantee reduction in search effort. The present work builds on Smith and Grant. It benefits from the development of a new framework for characterizing heuristic performance that defines two policies, one concerned with enhancing the likelihood of correctly extending a partial solution, the other with minimizing the effort to prove insolubility. The Fail-First Principle can be restated as calling for adherence to the second, fail-first policy, while discounting the other, promise policy. Our work corrects some deficiencies in the work of Smith and Grant, and goes on to confirm their finding that the Fail-First Principle, as originally defined, is insufficient. We then show that adherence to the fail-first policy must be measured in terms of size of insoluble subtrees, not branch depth. We also show that for soluble problems, both policies must be considered in evaluating heuristic performance. Hence, even in its proper form the Fail-First Principle is insufficient. We also show that the “FF” series of heuristics devised by Smith and Grant is a powerful tool for evaluating heuristic performance, including the subtle relations between heuristic features and adherence to a policy
- …