Extended path-indexing
The performance of a theorem prover depends crucially on the speed of its basic retrieval operations, such as finding terms that are unifiable with (instances of, or more general than) some query term. Among the known indexing methods for term retrieval in deduction systems, Path-Indexing exhibits good performance in general. However, as Path-Indexing is not a perfect filter, the candidates it finds must still be subjected to a unification algorithm in order to detect occur-check failures and indirect clashes. As perfect filters, discrimination trees and abstraction trees thus outperform Path-Indexing in some cases. We present an improved version of Path-Indexing that provides both the query trees and the path index with indirect-clash and occur-check information. Compared to the standard method, we can thus dismiss many more terms as candidates.
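To make the imperfect-filter behaviour concrete, here is a minimal path index sketched in Python. The term representation (nested tuples, `?`-prefixed variables) and all names are my own simplifying assumptions, and the sketch implements only the standard method, not the paper's extended version:

```python
# A minimal path-index sketch. Terms are nested tuples such as ('f', t1, t2);
# strings starting with '?' are variables. The index maps each (path, symbol)
# pair to the set of stored terms carrying that symbol at that position.

from collections import defaultdict

def paths(term, prefix=()):
    """Yield (path, symbol) pairs for every non-variable position of a term.
    A path is a tuple of (head, argument-index) steps from the root."""
    if isinstance(term, str):
        if not term.startswith('?'):          # constant
            yield prefix, term
        return
    head, *args = term
    yield prefix, head
    for i, arg in enumerate(args):
        yield from paths(arg, prefix + ((head, i),))

class PathIndex:
    def __init__(self):
        self.index = defaultdict(set)         # (path, symbol) -> term ids
        self.terms = []

    def insert(self, term):
        tid = len(self.terms)
        self.terms.append(term)
        for path, sym in paths(term):
            self.index[(path, sym)].add(tid)
        return tid

    def candidates(self, query):
        """Ids of stored terms compatible with the query at every non-variable
        query position. This is an imperfect filter: survivors must still be
        unified to rule out indirect clashes and occur-check failures."""
        result = set(range(len(self.terms)))
        for path, sym in paths(query):
            # compatible at this path: same symbol there, or no symbol at all
            # (the stored term has a variable at or above this position)
            ids_with_some_symbol, matching = set(), set()
            for (p, s), ids in self.index.items():
                if p == path:
                    ids_with_some_symbol |= ids
                    if s == sym:
                        matching |= ids
            result &= matching | (result - ids_with_some_symbol)
        return result
```

Querying `('f', 'a', '?x')` against stored terms `f(a,b)`, `f(b,b)`, and `f(?y,b)` keeps the first and third: the filter rejects on symbol clashes at shared paths but lets stored variables match anything.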
Thesaurus-based index term extraction for agricultural documents
This paper describes a new algorithm for automatically extracting index terms from documents relating to the domain of agriculture. The domain-specific Agrovoc thesaurus developed by the FAO is used both as a controlled vocabulary and as a knowledge base for semantic matching. The automatically assigned terms are evaluated against a manually indexed 200-item sample of the FAO’s document repository, and the performance of the new algorithm is compared with a state-of-the-art system for keyphrase extraction.
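The controlled-vocabulary matching step can be sketched as a longest-match scan of the text against thesaurus surface forms. The toy thesaurus below is a stand-in for Agrovoc, and the surface-form-to-descriptor mapping is an assumed simplification of what a real thesaurus provides (Agrovoc also carries broader/narrower relations used for semantic matching, which this sketch omits):

```python
# Longest-match extraction of controlled-vocabulary index terms.
# `thesaurus` maps a surface form to its preferred (descriptor) term,
# so synonyms collapse onto one index term.

import re
from collections import Counter

def extract_index_terms(text, thesaurus, top_k=5):
    """Rank thesaurus descriptors by how often their surface forms occur."""
    tokens = re.findall(r"[a-z]+", text.lower())
    max_len = max(len(t.split()) for t in thesaurus)
    counts = Counter()
    i = 0
    while i < len(tokens):
        # try the longest phrase first, so 'soil erosion' beats 'soil'
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in thesaurus:
                counts[thesaurus[phrase]] += 1
                i += n
                break
        else:
            i += 1
    return [term for term, _ in counts.most_common(top_k)]
```

With `{"soil erosion": "soil erosion", "maize": "maize", "corn": "maize", "soil": "soils"}`, the synonym "corn" counts toward the descriptor "maize", and "soil erosion" is preferred over the shorter match "soil".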
Storing and Indexing Plan Derivations through Explanation-based Analysis of Retrieval Failures
Case-Based Planning (CBP) provides a way of scaling up domain-independent planning to solve large problems in complex domains. It replaces the detailed and lengthy search for a solution with the retrieval and adaptation of previous planning experiences. In general, CBP has been demonstrated to improve performance over generative (from-scratch) planning. However, the performance improvements it provides depend on adequate judgements of problem similarity. In particular, although CBP may substantially reduce planning effort overall, it is subject to a mis-retrieval problem, and its success depends on such retrieval errors being relatively rare. This paper describes the design and implementation of a replay framework for the case-based planner DERSNLP+EBL. DERSNLP+EBL extends current CBP methodology by incorporating explanation-based learning techniques that allow it to explain and learn from the retrieval failures it encounters. These techniques are used to refine judgements about case similarity in response to feedback when a wrong decision has been made. The same failure analysis is used in building the case library, through the addition of repairing cases. Large problems are split and stored as single-goal subproblems; multi-goal problems are stored only when these smaller cases fail to be merged into a full solution. An empirical evaluation of this approach demonstrates the advantage of learning from experienced retrieval failure.
Comment: See http://www.jair.org/ for any accompanying file
Heaviest Induced Ancestors and Longest Common Substrings
Suppose we have two trees on the same set of leaves, in which nodes are weighted such that children are heavier than their parents. We say a node from the first tree and a node from the second tree are induced together if they have a common leaf descendant. In this paper we describe data structures that efficiently support the following heaviest-induced-ancestor query: given a node from the first tree and a node from the second tree, find an induced pair of their ancestors with maximum combined weight. Our solutions are based on a geometric interpretation that enables us to find heaviest induced ancestors using range queries. We then show how to use these results to build an LZ-compressed index with which we can quickly find, with high probability, a longest substring common to the indexed string and a given pattern.
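A brute-force baseline makes the query precise. This quadratic-per-query sketch (the `Tree` representation and all names are assumptions of mine) is what the paper's range-query data structures are designed to beat:

```python
# Naive heaviest-induced-ancestor queries: enumerate all ancestor pairs and
# keep the induced pair (sharing a leaf descendant) of maximum total weight.

class Tree:
    def __init__(self, parent, weight, leaves_below):
        self.parent = parent          # node -> parent node (root maps to None)
        self.weight = weight          # node -> weight (children heavier than parents)
        self.leaves = leaves_below    # node -> set of leaf labels below it

    def ancestors(self, v):
        """Yield v and all of its ancestors up to the root."""
        while v is not None:
            yield v
            v = self.parent[v]

def heaviest_induced_ancestors(t1, u, t2, v):
    """Return (weight, a, b): an induced ancestor pair of u and v with
    maximum combined weight. O(depth(u) * depth(v)) set intersections."""
    best = None
    for a in t1.ancestors(u):
        for b in t2.ancestors(v):
            if t1.leaves[a] & t2.leaves[b]:        # induced together
                w = t1.weight[a] + t2.weight[b]
                if best is None or w > best[0]:
                    best = (w, a, b)
    return best
```

Note the answer is always defined: the two roots cover every leaf, so the root pair is induced, and the children-heavier-than-parents condition is what makes deeper induced pairs preferable.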
Word matching using single closed contours for indexing handwritten historical documents
Effective indexing is crucial for providing convenient access to scanned versions of large collections of historically valuable handwritten manuscripts. Since traditional handwriting recognizers based on optical character recognition (OCR) do not perform well on historical documents, holistic word recognition has recently gained in popularity as an attractive and more straightforward alternative (Lavrenko et al. in Proc. Document Image Analysis for Libraries (DIAL’04), pp. 278–287, 2004). Such techniques attempt to recognize words based on scalar and profile-based features extracted from whole word images. In this paper, we propose a new approach to holistic word recognition for historical handwritten manuscripts based on matching word contours instead of whole images or word profiles. The new method consists of robust extraction of closed word contours and the application of an elastic contour-matching technique proposed originally for general shapes (Adamek and O’Connor in IEEE Trans Circuits Syst Video Technol 5:2004). We demonstrate that multiscale contour-based descriptors can effectively capture intrinsic word features while avoiding any segmentation of words into smaller subunits. Our experiments show a recognition accuracy of 83%, which considerably exceeds the performance of other systems reported in the literature.
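The elastic-matching idea can be illustrated with a generic dynamic-time-warping sketch over one-dimensional descriptor sequences. The actual system uses Adamek and O'Connor's multiscale curvature descriptors extracted from closed contours; the scalar sequences and function names below are simplifying assumptions standing in for those descriptors:

```python
# Elastic matching of contour-descriptor sequences via dynamic time warping:
# a generic stand-in for the multiscale contour matching used by the system.

def dtw_distance(a, b):
    """Minimum cumulative |a[i]-b[j]| cost over monotone alignments of the
    two sequences, allowing local stretching and compression."""
    INF = float("inf")
    n, m = len(a), len(b)
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # stretch a
                                 d[i][j - 1],      # stretch b
                                 d[i - 1][j - 1])  # advance both
    return d[n][m]

def best_match(query, library):
    """Return the library word whose descriptor sequence is closest."""
    return min(library, key=lambda word: dtw_distance(query, library[word]))
```

Because DTW tolerates local stretching, two instances of the same handwritten word with slightly different contour lengths can still align at low cost, which is the point of matching elastically rather than position-by-position.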
Models of Type Theory Based on Moore Paths
This paper introduces a new family of models of intensional Martin-Löf type theory. We use constructive ordered algebra in toposes. Identity types in the models are given by a notion of Moore path. By considering a particular gros topos, we show that there is such a model that is non-truncated, i.e. contains non-trivial structure at all dimensions. In other words, in this model a type in a nested sequence of identity types can contain more than one element, no matter how great the degree of nesting. Although inspired by existing non-truncated models of type theory based on simplicial and cubical sets, the notion of model presented here is notable for avoiding any form of Kan filling condition in the semantics of types.
Comment: This is a revised and expanded version of a paper with the same name that appeared in the proceedings of the 2nd International Conference on Formal Structures for Computation and Deduction (FSCD 2017).
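The classical topological notion motivating the abstract can be stated concretely. A Moore path carries an explicit length, which makes composition strictly associative and unital, unlike unit-interval paths, where these laws hold only up to homotopy; the paper itself works with an abstract ordered-algebra version of this in a topos rather than with topological spaces:

```latex
% Moore paths in a topological space X: a length together with a map that is
% constant past that length
M X \;=\; \bigl\{\, (r, f) \;\bigm|\; r \in [0, \infty),\ f : [0, \infty) \to X,\ f(t) = f(r) \text{ for all } t \ge r \,\bigr\}

% composition, defined when f(r) = g(0): lengths add, the second path is shifted
(r, f) \cdot (s, g) \;=\; \Bigl( r + s,\ t \mapsto \begin{cases} f(t) & t \le r \\ g(t - r) & t > r \end{cases} \Bigr)

% this composition is associative and unital on the nose, with unit (0, \mathrm{const}_x)
```

Strict associativity is what lets such paths interpret identity types without imposing a Kan filling condition.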