Ontology learning for Semantic Web Services
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University, 18/10/2010. The expansion of Semantic Web Services is restricted by traditional ontology engineering methods. Manual ontology development is a time-consuming, expensive and resource-intensive task. Consequently, it is important to support ontology engineers by
automating the ontology acquisition process to help deliver the Semantic Web vision.
Existing Web Services offer a rich source of domain knowledge for ontology
engineers. Ontology learning can be seen as a plug-in in the Web Service ontology
development process, which can be used by ontology engineers to develop and maintain
an ontology that evolves with current Web Services. Supporting the domain engineer
with an automated tool whilst building an ontological domain model serves the purpose
of reducing time and effort in acquiring the domain concepts and relations from Web
Service artefacts, whilst effectively speeding up the adoption of Semantic Web Services, thereby allowing current Web Services to accomplish their full potential. With that in mind, a Service Ontology Learning Framework (SOLF) is developed and
applied to a real set of Web Services. The research contributes a rigorous method that
effectively extracts domain concepts, and relations between these concepts, from Web
Services and automatically builds the domain ontology. The method applies pattern-based
information extraction techniques to automatically learn domain concepts and
relations between those concepts. The framework is automated by a tool that implements these techniques. Applying the SOLF and the tool to different sets of services results in an automatically built domain ontology model that represents semantic knowledge in the underlying domain. The framework's effectiveness in extracting domain concepts and relations is evaluated
by applying it to varying sets of commercial Web Services, including services from the financial domain. The standard evaluation metrics, precision and recall, are employed to determine both the accuracy and coverage of the learned ontology models. Both the
lexical and structural dimensions of the models are evaluated thoroughly. The evaluation results are encouraging, providing concrete outcomes in an area that is little researched.
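To make the lexical side of this evaluation concrete, the following is a minimal sketch of precision and recall computed over concept labels. It assumes labels are matched as case-normalised strings; the function name and the example labels are hypothetical and are not taken from the thesis, whose exact matching procedure is not described in the abstract.

```python
def lexical_precision_recall(learned, reference):
    """Return (precision, recall) of learned concept labels against a reference set.

    Assumption: two labels match when they are equal after trimming and lowercasing.
    """
    learned_norm = {label.strip().lower() for label in learned}
    reference_norm = {label.strip().lower() for label in reference}
    matched = learned_norm & reference_norm

    # Precision: share of learned concepts that appear in the reference ontology.
    precision = len(matched) / len(learned_norm) if learned_norm else 0.0
    # Recall: share of reference concepts that the learner recovered.
    recall = len(matched) / len(reference_norm) if reference_norm else 0.0
    return precision, recall


# Hypothetical labels, invented purely for illustration.
learned = ["Account", "Payment", "Currency", "Invoice"]
reference = ["account", "payment", "currency", "exchange rate", "loan"]
precision, recall = lexical_precision_recall(learned, reference)
```

Here precision reflects the accuracy of the learned model and recall its coverage, which mirrors how the abstract uses the two metrics.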
Semantic enrichment of knowledge sources supported by domain ontologies
This thesis introduces a novel conceptual framework to support the creation of knowledge representations based on enriched Semantic Vectors, using the classical vector space model approach extended with ontological support. One of the primary research challenges addressed here relates to the process of formalization and representation of document contents, where most existing approaches are limited and only take into account the explicit, word-based information in the document. This research explores how traditional knowledge representations can be enriched through incorporation of implicit information derived from the complex relationships (semantic associations) modelled by domain ontologies, in addition to the information presented in documents. The relevant achievements pursued by this thesis are the following: (i) conceptualization of a model that enables the semantic enrichment of knowledge sources supported by domain experts; (ii) development of a method for extending the traditional vector space, using domain ontologies; (iii) development of a method to support ontology learning, based on the discovery of new ontological relations expressed in non-structured information sources; (iv) development of a process to evaluate the semantic enrichment; (v) implementation of a proof-of-concept, named SENSE (Semantic Enrichment kNowledge SourcEs), which makes it possible to validate the ideas established under the scope of this thesis; (vi) publication of several scientific articles and support for four master's dissertations carried out in the Department of Electrical and Computer Engineering at FCT/UNL.
It is worth mentioning that the work developed under the semantic referential covered by this thesis has reused relevant achievements from European research projects, in order to adopt approaches which are considered scientifically sound and coherent and to avoid “reinventing the wheel”.
European research projects: CoSpaces (IST-5-034245), CRESCENDO (FP7-234344) and MobiS (FP7-318452)
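The idea of extending a word-based vector with ontology-derived information can be sketched as follows. This is only an illustration under stated assumptions, not the SENSE implementation: the function `enrich_vector`, the `damping` parameter, the relation strengths and the example terms are all hypothetical.

```python
def enrich_vector(term_weights, related, damping=0.5):
    """Add weight to ontology concepts related to the terms found in a document.

    term_weights: explicit term -> weight (for example, a tf-idf score).
    related: term -> {related concept -> relation strength in [0, 1]} (assumed format).
    damping: assumed factor that keeps implicit concepts below explicit terms.
    """
    enriched = dict(term_weights)  # start from the explicit, word-based vector
    for term, weight in term_weights.items():
        for concept, strength in related.get(term, {}).items():
            # Each related concept receives a share of the source term's weight.
            enriched[concept] = enriched.get(concept, 0.0) + damping * strength * weight
    return enriched


# Hypothetical document vector and ontology relations, invented for illustration.
doc = {"beam": 0.8, "concrete": 0.6}
ontology = {
    "beam": {"structural element": 1.0},
    "concrete": {"building material": 1.0, "structural element": 0.5},
}
enriched = enrich_vector(doc, ontology)
```

In this sketch, "structural element" never occurs in the document yet ends up in its vector, which is the kind of implicit information the abstract describes.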
Knowledge Selection in Category-Based Inductive Reasoning
Current theories of category-based inductive reasoning can be distinguished by the emphasis they place on structured and unstructured knowledge. Theories which draw on unstructured knowledge focus on associative strength, or temporal and spatial contiguity between categories. In contrast, accounts which draw on structured knowledge make reference to the underlying theoretical frameworks which relate categories to one another, such as causal or taxonomic relationships. In this thesis, it is argued that this apparent dichotomy can be resolved if one ascribes different processing characteristics to these two types of knowledge. That is, unstructured knowledge influences inductive reasoning effortlessly and relatively automatically, whereas the use of structured knowledge requires effort and the availability of cognitive resources. Understanding these diverging processes illuminates how background knowledge is selected during the inference process.
The thesis demonstrates that structured and unstructured knowledge are dissociable and influence reasoning in line with their unique processing characteristics. Using secondary task and speeded response paradigms, it shows that unstructured knowledge is most influential when people are cognitively burdened or forced to respond quickly, whereas they can draw on more elaborate structured knowledge if they are not cognitively compromised. This is especially evident for the causal asymmetry effect, in which people make stronger inferences from cause to effect categories than vice versa. This Bayesian normative effect disappears when people have to contend with a secondary task or respond under time pressure.
The next experiments demonstrate that this dissociation between structured and unstructured knowledge is also evident for a more naturalistic inductive reasoning paradigm in which people generate their own inferences.
In the final experiments, it is shown how the selection of appropriate knowledge ties in with more domain-general processes, and especially inhibitory control. When responses based on structured and unstructured knowledge conflict, people’s ability to reason based on appropriate structured knowledge depends upon having relevant background knowledge and on their ability to inhibit the lure from inappropriate unstructured knowledge.
The thesis concludes with a discussion of how the concepts of structured and unstructured knowledge illuminate the processes underlying knowledge selection for category-based inductive reasoning. It also looks at the implications the findings have for different theories of category-based induction, and for our understanding of human reasoning processes more generally.
Exploiting Wikipedia Semantics for Computing Word Associations
Semantic association computation is the process of automatically quantifying the strength of a semantic connection between two textual units based on various lexical and semantic relations such as hyponymy (car and vehicle) and functional associations (bank and manager). Humans can infer implicit relationships between two textual units based on their knowledge about the world and their ability to reason about that knowledge. Automatically imitating this behavior is limited by restricted knowledge and poor ability to infer hidden relations.
Various factors affect the performance of automated approaches to computing semantic association strength. One critical factor is the selection of a suitable knowledge source for extracting knowledge about the implicit semantic relations. In the past few years, semantic association computation approaches have started to exploit web-originated resources as substitutes for conventional lexical semantic resources such as thesauri, machine readable dictionaries and lexical databases. These conventional knowledge sources suffer from limitations such as coverage issues, high construction and maintenance costs and limited availability. To overcome these issues one solution is to use the wisdom of crowds in the form of collaboratively constructed knowledge sources. An excellent example of such knowledge sources is Wikipedia which stores detailed information not only about the concepts themselves but also about various aspects of the relations among concepts.
The overall goal of this thesis is to demonstrate that using Wikipedia for computing word association strength yields better estimates of humans' associations than the approaches based on other structured and unstructured knowledge sources. There are two key challenges to achieve this goal: first, to exploit various semantic association models based on different aspects of Wikipedia in developing new measures of semantic associations; and second, to evaluate these measures compared to human performance in a range of tasks. The focus of the thesis is on exploring two aspects of Wikipedia: as a formal knowledge source, and as an informal text corpus.
The first contribution of the work included in the thesis is that it effectively exploited the knowledge source aspect of Wikipedia by developing new measures of semantic associations based on Wikipedia hyperlink structure, informative-content of articles and combinations of both elements. It was found that Wikipedia can be effectively used for computing noun-noun similarity. It was also found that a model based on hybrid combinations of Wikipedia structure and informative-content based features performs better than those based on individual features. It was also found that the structure based measures outperformed the informative content based measures on both semantic similarity and semantic relatedness computation tasks.
The second contribution of the research work in the thesis is that it effectively exploited the corpus aspect of Wikipedia by developing a new measure of semantic association based on asymmetric word associations. The thesis introduced an asymmetric-association-based measure using the idea of directional context inspired by the free word association task. The underlying assumption was that the association strength can change with the changing context. It was found that the asymmetric-association-based measure performed better than the symmetric measures on semantic association computation, relatedness-based word choice and causality detection tasks. However, asymmetric-association-based measures have no advantage for synonymy-based word choice tasks. It was also found that Wikipedia is not a good knowledge source for capturing verb relations due to its focus on encyclopedic concepts, especially nouns.
It is hoped that future research will build on the experiments and discussions presented in this thesis to explore new avenues using Wikipedia for finding deeper and semantically more meaningful associations in a wide range of application areas based on humans' estimates of word associations.
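The notion of an asymmetric, direction-dependent association can be illustrated with a small sketch. It assumes association strength is estimated as the conditional probability of seeing the target in a context that contains the cue; this is one common way to obtain asymmetry and is not claimed to be the measure developed in the thesis. The function name and the toy contexts are hypothetical.

```python
from collections import Counter
from itertools import permutations


def directional_association(contexts):
    """Build assoc(cue, target), an estimate of P(target in context | cue in context)."""
    term_count = Counter()
    pair_count = Counter()
    for context in contexts:
        terms = set(context)  # count each term once per context
        term_count.update(terms)
        pair_count.update(permutations(terms, 2))  # ordered (cue, target) pairs

    def assoc(cue, target):
        if term_count[cue] == 0:
            return 0.0
        return pair_count[(cue, target)] / term_count[cue]

    return assoc


# Toy contexts, invented for illustration only.
contexts = [
    ["bank", "manager"],
    ["bank", "money"],
    ["bank", "river"],
    ["manager", "bank"],
]
assoc = directional_association(contexts)
```

With these toy contexts, `assoc("manager", "bank")` is larger than `assoc("bank", "manager")`, because "manager" always appears with "bank" while "bank" appears in many other contexts. A symmetric measure would assign the pair a single score and lose this directional information.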
Corporate Smart Content Evaluation
Nowadays, a wide range of information sources is available due to the
evolution of the web and the collection of data. Much of this information is
consumable and usable by humans but not understandable and processable by
machines. Some data may be directly accessible in web pages or via data feeds,
but most of the meaningful existing data is hidden within deep web databases
and enterprise information systems. Besides the inability to access a wide
range of data, manual processing by humans is laborious, error-prone and no
longer adequate. Semantic web technologies deliver capabilities for
machine-readable, exchangeable content and metadata for automatic processing
of content. The enrichment of heterogeneous data with background knowledge
described in ontologies induces re-usability and supports automatic processing
of data. The establishment of “Corporate Smart Content” (CSC) - semantically
enriched data with high information content with sufficient benefits in
economic areas - is the main focus of this study. We describe three current
research areas in the field of CSC concerning scenarios and datasets
applicable for corporate applications, algorithms and research. Aspect-
oriented Ontology Development advances modular ontology development and
partial reuse of existing ontological knowledge. Complex Entity Recognition
enhances traditional entity recognition techniques to recognize clusters of
related textual information about entities. Semantic Pattern Mining combines
semantic web technologies with pattern learning to mine for complex models by
attaching background knowledge. This study introduces the aforementioned
topics by analyzing applicable scenarios with economic and industrial focus,
as well as research emphasis. Furthermore, a collection of existing datasets
for the given areas of interest is presented and evaluated. The target
audience includes researchers and developers of CSC technologies - people
interested in semantic web features, ontology development, automation,
extracting and mining valuable information in corporate environments. The aim
of this study is to provide a comprehensive and broad overview of the three
topics, and to assist with decision making in interesting scenarios and with
choosing practical datasets for evaluating custom problem statements. Detailed
descriptions of the attributes and metadata of the datasets should serve as a
starting point for individual ideas and approaches.