347 research outputs found

    Proceedings of the Workshop Semantic Content Acquisition and Representation (SCAR) 2007

    This is the proceedings of the Workshop on Semantic Content Acquisition and Representation, held in conjunction with NODALIDA 2007, on May 24, 2007, in Tartu, Estonia.

    Symmetry, Compact Closure and Dagger Compactness for Categories of Convex Operational Models

    In the categorical approach to the foundations of quantum theory, one begins with a symmetric monoidal category, the objects of which represent physical systems, and the morphisms of which represent physical processes. Usually, this category is taken to be at least compact closed, and more often, dagger compact, enforcing a certain self-duality, whereby preparation processes (roughly, states) are inter-convertible with processes of registration (roughly, measurement outcomes). This is in contrast to the more concrete "operational" approach, in which the states and measurement outcomes associated with a physical system are represented in terms of what we here call a "convex operational model": a certain dual pair of ordered linear spaces that are, generally, not isomorphic to one another. On the other hand, state spaces for which there is such an isomorphism, which we term weakly self-dual, play an important role in reconstructions of various quantum-information theoretic protocols, including teleportation and ensemble steering. In this paper, we characterize compact closure of symmetric monoidal categories of convex operational models in two ways: as a statement about the existence of teleportation protocols, and as the principle that every process allowed by that theory can be realized as an instance of a remote evaluation protocol, and hence as a form of classical probabilistic conditioning. In a large class of cases, which includes both the classical and quantum cases, the relevant compact closed categories are degenerate, in the weak sense that every object is its own dual. We characterize the dagger-compactness of such a category (with respect to the natural adjoint) in terms of the existence, for each system, of a symmetric bipartite state, the associated conditioning map of which is an isomorphism.

    Optimal-constraint lexicons for requirements specifications

    Constrained Natural Languages (CNLs) are becoming an increasingly popular way of writing technical documents such as requirements specifications. This is because CNLs aim to reduce the ambiguity inherent within natural languages, whilst maintaining their readability and expressiveness. The design of existing CNLs appears to be unfocused towards achieving specific quality outcomes, in that the majority of lexical selections have been based upon lexicographer preferences rather than an optimum trade-off between quality factors such as ambiguity, readability, expressiveness, and lexical magnitude. In this paper we introduce the concept of 'replaceability' as a way of identifying the lexical redundancy inherent within a sample of requirements. Our novel and practical approach uses Natural Language Processing (NLP) techniques to enable us to make dynamic trade-offs between quality factors to optimise the resultant CNL. We also challenge the concept of a CNL being a one-dimensional static language, and demonstrate that our optimal-constraint process results in a CNL that can adapt to a changing domain while maintaining its expressiveness. © Springer-Verlag Berlin Heidelberg 2007

    Knowledge-based methods for automatic extraction of domain-specific ontologies

    Semantic web technology aims at developing methodologies for representing large amounts of knowledge in web-accessible form. The semantics of knowledge should be easy for computer programs to interpret and understand, so that sharing and utilizing knowledge across the Web becomes possible. Domain-specific ontologies form the basis for knowledge representation in the semantic web. Research on automated development of ontologies from texts has become increasingly important because manual construction of ontologies is labor intensive and costly, and, at the same time, large amounts of text for individual domains are already available in electronic form. However, automatic extraction of domain-specific ontologies is challenging due to the unstructured nature of texts and the inherent semantic ambiguities of natural language. Moreover, the large size of the texts to be processed renders full-fledged natural language processing methods infeasible. In this dissertation, we develop a set of knowledge-based techniques for automatic extraction of ontological components (concepts, taxonomic and non-taxonomic relations) from domain texts. The proposed methods combine information retrieval metrics, a lexical knowledge base (like WordNet), machine learning techniques, heuristics, and statistical approaches to meet the challenge of the task. These methods are domain-independent and automatic. For extraction of concepts, the proposed WNSCA+{PE, POP} method utilizes the lexical knowledge base WordNet to improve precision and recall over the traditional information retrieval metrics. A WordNet-based approach, the compound term heuristic, and a supervised learning approach are developed for taxonomy extraction. We also developed a weighted word-sense disambiguation method for use with the WordNet-based approach. An unsupervised approach using log-likelihood ratios is proposed for extracting non-taxonomic relations. Furthermore, a supervised approach is investigated to learn the semantic constraints for identifying relations from prepositional phrases. The proposed methods are validated by experiments with the Electronic Voting and the Tender Offers, Mergers, and Acquisitions domain corpus. Experimental results and comparisons with some existing approaches clearly indicate the superiority of our methods. In summary, combining information retrieval, a lexical knowledge base, statistics, and machine learning methods in this study has led to efficient and effective techniques for extracting ontological components automatically.
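The abstract above mentions log-likelihood ratios for finding non-taxonomic relations but does not give the formula. As a rough illustration only, the sketch below computes the widely used G² log-likelihood ratio statistic for a 2x2 table of co-occurrence counts. The function name, the table layout, and the idea of scoring a concept pair by its co-occurrence counts are assumptions made for this example and may differ from the dissertation's actual method.

```python
import math


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table (hypothetical helper).

    k11: contexts containing both terms
    k12: contexts containing the first term only
    k21: contexts containing the second term only
    k22: contexts containing neither term
    """
    def xlogx_sum(*counts):
        # Sum of k * ln(k), treating 0 * ln(0) as 0.
        return sum(k * math.log(k) for k in counts if k > 0)

    n = k11 + k12 + k21 + k22
    # Equivalent to 2 * sum(O * ln(O / E)) with E = row_total * col_total / n.
    return 2.0 * (
        xlogx_sum(k11, k12, k21, k22)
        - xlogx_sum(k11 + k12, k21 + k22)   # row totals
        - xlogx_sum(k11 + k21, k12 + k22)   # column totals
        + xlogx_sum(n)
    )


# Made-up counts: the first table is perfectly independent,
# the second is strongly associated.
independent = log_likelihood_ratio(10, 10, 10, 10)
associated = log_likelihood_ratio(20, 0, 0, 20)
```

A higher score suggests that two terms co-occur more often than chance would predict, which is one plausible signal that a relation holds between them.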

    A Machine learning approach to POS tagging

    We have applied inductive learning of statistical decision trees and relaxation labelling to the Natural Language Processing (NLP) task of morphosyntactic disambiguation (Part Of Speech Tagging). The learning process is supervised and obtains a language model oriented to resolve POS ambiguities. This model consists of a set of statistical decision trees expressing distribution of tags and words in some relevant contexts. The acquired language models are complete enough to be directly used as sets of POS disambiguation rules, and include more complex contextual information than simple collections of n-grams usually used in statistical taggers. We have implemented a quite simple and fast tagger that has been tested and evaluated on the Wall Street Journal (WSJ) corpus with a remarkable accuracy. However, better results can be obtained by translating the trees into rules to feed a flexible relaxation labelling based tagger. In this direction we describe a tagger which is able to use information of any kind (n-grams, automatically acquired constraints, linguistically motivated manually written constraints, etc.), and in particular to incorporate the machine learned decision trees. Simultaneously, we address the problem of tagging when only small training material is available, which is crucial in any process of constructing, from scratch, an annotated corpus. We show that quite high accuracy can be achieved with our system in this situation.

    POS Tagging Using Relaxation Labelling

    Relaxation labelling is an optimization technique used in many fields to solve constraint satisfaction problems. The algorithm finds a combination of values for a set of variables that satisfies, to the maximum possible degree, a set of given constraints. This paper describes some experiments performed applying it to POS tagging, and the results obtained. It also ponders the possibility of applying it to word sense disambiguation. Comment: compressed & uuencoded postscript file. Paper length: 39 pages.
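To make the idea of relaxation labelling concrete, here is a minimal sketch of one common textbook formulation, not the implementation from the paper. Each variable (a word) holds a probability distribution over labels (tags), and pairwise constraints with a compatibility in [-1, 1] raise or lower the weight of a label according to the current beliefs about the other variable. The function name, the constraint encoding, and the toy tags and weights are all assumptions made for this example.

```python
def relaxation_labelling(probs, compat, iterations=50):
    """Iteratively re-weight label probabilities by constraint support.

    probs:  list of dicts, one per variable, mapping label -> probability.
    compat: dict mapping ((i, label_i), (j, label_j)) -> compatibility in
            [-1, 1]; it contributes support to label_i of variable i in
            proportion to the current probability of label_j at variable j.
    """
    probs = [dict(p) for p in probs]  # work on a copy
    for _ in range(iterations):
        updated = []
        for i, dist in enumerate(probs):
            weighted = {}
            for label, p in dist.items():
                support = 0.0
                for ((vi, li), (vj, lj)), c in compat.items():
                    if vi == i and li == label:
                        support += c * probs[vj].get(lj, 0.0)
                # Clamp so that (1 + support) never becomes negative.
                support = max(-1.0, min(1.0, support))
                weighted[label] = p * (1.0 + support)
            total = sum(weighted.values())
            if total > 0:
                updated.append({l: w / total for l, w in weighted.items()})
            else:
                updated.append(dict(dist))  # no usable support: keep as is
        probs = updated
    return probs


# Toy example with invented weights: in "the can", a noun reading (NN) of
# "can" is favoured after a determiner (DT), a modal reading (MD) is not.
initial = [{"DT": 1.0}, {"MD": 0.6, "NN": 0.4}]
constraints = {
    ((1, "NN"), (0, "DT")): 0.5,
    ((1, "MD"), (0, "DT")): -0.5,
}
result = relaxation_labelling(initial, constraints)
best_tag = max(result[1], key=result[1].get)
```

In this toy run the noun reading overtakes the initially more probable modal reading, because the only constraints present reward NN and penalise MD after a determiner.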