Proceedings of the Workshop Semantic Content Acquisition and Representation (SCAR) 2007
This is the proceedings of the Workshop on Semantic Content Acquisition and Representation, held in conjunction with NODALIDA 2007 on May 24, 2007, in Tartu, Estonia.
Symmetry, Compact Closure and Dagger Compactness for Categories of Convex Operational Models
In the categorical approach to the foundations of quantum theory, one begins
with a symmetric monoidal category, the objects of which represent physical
systems, and the morphisms of which represent physical processes. Usually, this
category is taken to be at least compact closed, and more often, dagger
compact, enforcing a certain self-duality, whereby preparation processes
(roughly, states) are inter-convertible with processes of registration
(roughly, measurement outcomes). This is in contrast to the more concrete
"operational" approach, in which the states and measurement outcomes associated
with a physical system are represented in terms of what we here call a "convex
operational model": a certain dual pair of ordered linear spaces -- generally,
not isomorphic to one another. On the other hand, state spaces for which
there is such an isomorphism, which we term weakly self-dual, play an
important role in reconstructions of various quantum-information theoretic
protocols, including teleportation and ensemble steering. In this paper, we
characterize compact closure of symmetric monoidal categories of convex
operational models in two ways: as a statement about the existence of
teleportation protocols, and as the principle that every process allowed by
that theory can be realized as an instance of a remote evaluation protocol ---
hence, as a form of classical probabilistic conditioning. In a large class of
cases, which includes both the classical and quantum cases, the relevant
compact closed categories are degenerate, in the weak sense that every object
is its own dual. We characterize the dagger-compactness of such a category
(with respect to the natural adjoint) in terms of the existence, for each
system, of a symmetric bipartite state, the associated conditioning map
of which is an isomorphism.
Optimal-constraint lexicons for requirements specifications
Constrained Natural Languages (CNLs) are becoming an increasingly popular way of writing technical documents such as requirements specifications. This is because CNLs aim to reduce the ambiguity inherent within natural languages, whilst maintaining their readability and expressiveness. The design of existing CNLs appears to be unfocused towards achieving specific quality outcomes, in that the majority of lexical selections have been based upon lexicographer preferences rather than an optimum trade-off between quality factors such as ambiguity, readability, expressiveness, and lexical magnitude. In this paper we introduce the concept of 'replaceability' as a way of identifying the lexical redundancy inherent within a sample of requirements. Our novel and practical approach uses Natural Language Processing (NLP) techniques to enable us to make dynamic trade-offs between quality factors to optimise the resultant CNL. We also challenge the concept of a CNL being a one-dimensional static language, and demonstrate that our optimal-constraint process results in a CNL that can adapt to a changing domain while maintaining its expressiveness. © Springer-Verlag Berlin Heidelberg 2007
Knowledge-based methods for automatic extraction of domain-specific ontologies
Semantic web technology aims at developing methodologies for representing large amounts of knowledge in web-accessible form. The semantics of knowledge should be easy for computer programs to interpret and understand, so that sharing and utilizing knowledge across the Web becomes possible. Domain-specific ontologies form the basis for knowledge representation in the semantic web. Research on automated development of ontologies from texts has become increasingly important because manual construction of ontologies is labor intensive and costly, and, at the same time, large amounts of text for individual domains are already available in electronic form. However, automatic extraction of domain-specific ontologies is challenging due to the unstructured nature of texts and the inherent semantic ambiguities of natural language. Moreover, the large size of the texts to be processed renders full-fledged natural language processing methods infeasible. In this dissertation, we develop a set of knowledge-based techniques for automatic extraction of ontological components (concepts, taxonomic and non-taxonomic relations) from domain texts. The proposed methods combine information retrieval metrics, lexical knowledge bases (like WordNet), machine learning techniques, heuristics, and statistical approaches to meet the challenge of the task. These methods are domain-independent and automatic. For extraction of concepts, the proposed WNSCA+{PE, POP} method utilizes the lexical knowledge base WordNet to improve precision and recall over the traditional information retrieval metrics. A WordNet-based approach, the compound term heuristic, and a supervised learning approach are developed for taxonomy extraction. We also developed a weighted word-sense disambiguation method for use with the WordNet-based approach. An unsupervised approach using log-likelihood ratios is proposed for extracting non-taxonomic relations. Furthermore, a supervised approach is investigated to learn the semantic constraints for identifying relations from prepositional phrases. The proposed methods are validated by experiments with the Electronic Voting and the Tender Offers, Mergers, and Acquisitions domain corpora. Experimental results and comparisons with some existing approaches indicate that our methods perform better. In summary, a good combination of information retrieval, lexical knowledge base, statistics and machine learning methods in this study has led to techniques that are efficient and effective for extracting ontological components automatically.
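The abstract mentions log-likelihood ratios for scoring candidate non-taxonomic relations but does not give the formulation. As a rough illustration only, the sketch below computes the standard log-likelihood ratio statistic (G²) for a 2x2 contingency table of co-occurrence counts. The function name, the table layout, and the idea of scoring a concept pair by its co-occurrence counts are assumptions made here for illustration, not the dissertation's actual method.

```python
import math


def log_likelihood_ratio(k11: int, k12: int, k21: int, k22: int) -> float:
    """G^2 statistic for a 2x2 contingency table (hypothetical helper).

    Assumed layout, chosen for this sketch:
      k11: contexts containing both concept A and concept B
      k12: contexts containing A but not B
      k21: contexts containing B but not A
      k22: contexts containing neither
    """
    total = k11 + k12 + k21 + k22
    row_sums = (k11 + k12, k21 + k22)
    col_sums = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))

    g2 = 0.0
    for r in range(2):
        for c in range(2):
            o = observed[r][c]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = row_sums[r] * col_sums[c] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


# Counts that match independence exactly score zero;
# a strongly associated pair scores higher.
independent = log_likelihood_ratio(10, 10, 10, 10)
associated = log_likelihood_ratio(30, 5, 5, 60)
```

A higher score suggests the two concepts co-occur more often than chance would predict, which is one plausible signal that a relation holds between them.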
A Machine learning approach to POS tagging
We have applied inductive learning of statistical decision trees
and relaxation labelling to the Natural Language Processing (NLP)
task of morphosyntactic disambiguation (Part Of Speech Tagging).
The learning process is supervised and obtains a language
model oriented to resolve POS ambiguities. This model consists
of a set of statistical decision trees expressing distribution of
tags and words in some relevant contexts.
The acquired language models are complete enough to be directly
used as sets of POS disambiguation rules, and include more complex
contextual information than simple collections of n-grams usually
used in statistical taggers.
We have implemented a quite simple and fast tagger that has been
tested and evaluated on the Wall Street Journal (WSJ) corpus with
remarkable accuracy.
However, better results can be obtained by translating the trees
into rules to feed a flexible relaxation labelling based tagger.
In this direction we describe a tagger which is able to use
information of any kind (n-grams, automatically acquired constraints,
linguistically motivated manually written constraints, etc.), and in
particular to incorporate the machine learned decision trees.
Simultaneously, we address the problem of tagging when only
small training material is available, which is crucial in any process
of constructing, from scratch, an annotated corpus. We show that quite
high accuracy can be achieved with our system in this situation.
POS Tagging Using Relaxation Labelling
Relaxation labelling is an optimization technique used in many fields to
solve constraint satisfaction problems. The algorithm finds a combination of
values for a set of variables that satisfies, to the maximum possible
degree, a set of given constraints. This paper describes some experiments
performed applying it to POS tagging, and the results obtained. It also ponders
the possibility of applying it to word sense disambiguation.
Comment: compressed & uuencoded postscript file. Paper length: 39 pages.
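To make the idea of relaxation labelling concrete, here is a minimal sketch applied to a two-word tagging toy problem. It is not the paper's implementation. The constraint format, the compatibility values, the clamping of support to [-1, 1], and the update rule (each weight is multiplied by one plus its support and the weights are then renormalized) are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical constraint format for this sketch:
# (position, tag, neighbour position, neighbour tag, compatibility in [-1, 1])
Constraint = Tuple[int, str, int, str, float]


def relax(
    weights: List[Dict[str, float]],
    constraints: List[Constraint],
    iterations: int = 20,
) -> List[Dict[str, float]]:
    """Iteratively re-weight the candidate tags of each word."""
    for _ in range(iterations):
        updated = []
        for i, dist in enumerate(weights):
            # Support for a tag: sum of constraint compatibilities, each
            # scaled by the current weight of the neighbouring tag.
            support = {tag: 0.0 for tag in dist}
            for pos, tag, n_pos, n_tag, compat in constraints:
                if pos == i and tag in support:
                    support[tag] += compat * weights[n_pos].get(n_tag, 0.0)
            # Clamp support to [-1, 1] so that weights stay non-negative.
            scaled = {
                tag: w * (1.0 + max(-1.0, min(1.0, support[tag])))
                for tag, w in dist.items()
            }
            total = sum(scaled.values())
            if total > 0:
                updated.append({t: v / total for t, v in scaled.items()})
            else:
                updated.append(dict(dist))  # keep old weights if all vanish
        weights = updated
    return weights


# Toy input "the can": "the" is unambiguous, "can" is noun or modal verb.
initial = [{"DT": 1.0}, {"NN": 0.5, "MD": 0.5}]
toy_constraints = [
    (1, "NN", 0, "DT", 0.8),   # a noun after a determiner is compatible
    (1, "MD", 0, "DT", -0.8),  # a modal after a determiner is not
]
result = relax(initial, toy_constraints)
best_tag = max(result[1], key=result[1].get)
```

In this toy run the weight of NN for "can" grows at every iteration while MD shrinks, so the noun reading wins. Real systems would draw the constraints from n-grams, learned decision trees, or hand-written rules.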