9,056 research outputs found
The Semantic Grid: A future e-Science infrastructure
e-Science offers a promising vision of how computer and communication technology can support and enhance the scientific process. It does this by enabling scientists to generate, analyse, share and discuss their insights, experiments and results in an effective manner. The underlying computer infrastructure that provides these facilities is commonly referred to as the Grid. At this time, there are a number of grid applications being developed and there is a whole raft of computer technologies that provide fragments of the necessary functionality. However there is currently a major gap between these endeavours and the vision of e-Science in which there is a high degree of easy-to-use and seamless automation and in which there are flexible collaborations and computations on a global scale. To bridge this practice–aspiration divide, this paper presents a research agenda whose aim is to move from the current state of the art in e-Science infrastructure, to the future infrastructure that is needed to support the full richness of the e-Science vision. Here the future e-Science research infrastructure is termed the Semantic Grid (Semantic Grid to Grid is meant to connote a similar relationship to the one that exists between the Semantic Web and the Web). In particular, we present a conceptual architecture for the Semantic Grid. This architecture adopts a service-oriented perspective in which distinct stakeholders in the scientific process, represented as software agents, provide services to one another, under various service level agreements, in various forms of marketplace. We then focus predominantly on the issues concerned with the way that knowledge is acquired and used in such environments since we believe this is the key differentiator between current grid endeavours and those envisioned for the Semantic Grid
Automatic annotation of bioinformatics workflows with biomedical ontologies
Legacy scientific workflows, and the services within them, often present
scarce and unstructured (i.e. textual) descriptions. This makes it difficult to
find, share and reuse them, thus dramatically reducing their value to the
community. This paper presents an approach to annotating workflows and their
subcomponents with ontology terms, in an attempt to describe these artifacts in
a structured way. Despite a dearth of even textual descriptions, we
automatically annotated 530 myExperiment bioinformatics-related workflows,
including more than 2600 workflow-associated services, with relevant
ontological terms. Quantitative evaluation of the Information Content of these
terms suggests that, in cases where annotation was possible at all, the
annotation quality was comparable to manually curated bioinformatics resources.Comment: 6th International Symposium on Leveraging Applications (ISoLA 2014
conference), 15 pages, 4 figure
An ontology-aided, natural language-based approach for multi-constraint BIM model querying
Being able to efficiently retrieve the required building information is
critical for construction project stakeholders to carry out their engineering
and management activities. Natural language interface (NLI) systems are
emerging as a time and cost-effective way to query Building Information Models
(BIMs). However, the existing methods cannot logically combine different
constraints to perform fine-grained queries, dampening the usability of natural
language (NL)-based BIM queries. This paper presents a novel ontology-aided
semantic parser to automatically map natural language queries (NLQs) that
contain different attribute and relational constraints into computer-readable
codes for querying complex BIM models. First, a modular ontology was developed
to represent NL expressions of Industry Foundation Classes (IFC) concepts and
relationships, and was then populated with entities from target BIM models to
assimilate project-specific information. Hereafter, the ontology-aided semantic
parser progressively extracts concepts, relationships, and value restrictions
from NLQs to fully identify constraint conditions, resulting in standard SPARQL
queries with reasoning rules to successfully retrieve IFC-based BIM models. The
approach was evaluated based on 225 NLQs collected from BIM users, with a 91%
accuracy rate. Finally, a case study about the design-checking of a real-world
residential building demonstrates the practical value of the proposed approach
in the construction industry
Augmented trading:From news articles to stock price predictions using semantics
This thesis tries to answer the question how to predict the reaction of the stock market to news articles using the latest suitable developments in Natural Language Processing. This is done using text classiffication where a new article is matched to a category of articles which have a certain influence on the stock price. The thesis first discusses why analysis of news articles is a feasible approach to predicting the stock market and why analysis of past prices should not be build upon. From related work in this domain two main design choices are extracted; what to take as features for news articles and how to couple them with the changes in stock price. This thesis then suggests which different features are possible to extract from articles resulting in a template for features which can deal with negation, favorability, abstracts from companies and uses domain knowledge and synonyms for generalization. To couple the features to changes in stock price a survey is given of several text classiffication techniques from which it is concluded that Support Vector Machines are very suitable for the domain of stock prices and extensive features. The system has been tested with a unique data set of news articles for which results are reported that are signifficantly better than random. The results improve even more when only headlines of news articles are taken into account. Because the system is only tested with closing prices it cannot concluded that it will work in practice but this can be easily tested if stock prices during the days are available. The main suggestions for feature work are to test the system with this data and to improve the filling of the template so it can also be used in other areas of favorability analysis or maybe even to extract interesting information out of texts
Automatic taxonomy evaluation
This thesis would not be made possible without the generous support of IATA.Les taxonomies sont une représentation essentielle des connaissances, jouant un rôle central dans de nombreuses applications riches en connaissances. Malgré cela, leur construction est laborieuse que ce soit manuellement ou automatiquement, et l'évaluation quantitative de taxonomies est un sujet négligé. Lorsque les chercheurs se concentrent sur la construction d'une taxonomie à partir de grands corpus non structurés, l'évaluation est faite souvent manuellement, ce qui implique des biais et se traduit souvent par une reproductibilité limitée. Les entreprises qui souhaitent améliorer leur taxonomie manquent souvent d'étalon ou de référence, une sorte de taxonomie bien optimisée pouvant service de référence.
Par conséquent, des connaissances et des efforts spécialisés sont nécessaires pour évaluer une taxonomie.
Dans ce travail, nous soutenons que l'évaluation d'une taxonomie effectuée automatiquement et de manière reproductible est aussi importante que la génération automatique de telles taxonomies. Nous proposons deux nouvelles méthodes d'évaluation qui produisent des scores moins biaisés: un modèle de classification de la taxonomie extraite d'un corpus étiqueté, et un modèle de langue non supervisé qui sert de source de connaissances pour évaluer les relations hyperonymiques. Nous constatons que nos substituts d'évaluation corrèlent avec les jugements humains et que les modèles de langue pourraient imiter les experts humains dans les tâches riches en connaissances.Taxonomies are an essential knowledge representation and play an important role in classification and numerous knowledge-rich applications, yet quantitative taxonomy evaluation remains to be overlooked and left much to be desired. While studies focus on automatic taxonomy construction (ATC) for extracting meaningful structures and semantics from large corpora, their evaluation is usually manual and subject to bias and low reproducibility. Companies wishing to improve their domain-focused taxonomies also suffer from lacking ground-truths. In fact, manual taxonomy evaluation requires substantial labour and expert knowledge.
As a result, we argue in this thesis that automatic taxonomy evaluation (ATE) is just as important as taxonomy construction. We propose two novel taxonomy evaluation methods for automatic taxonomy scoring, leveraging supervised classification for labelled corpora and unsupervised language modelling as a knowledge source for unlabelled data. We show that our evaluation proxies can exert similar effects and correlate well with human judgments and that language models can imitate human experts on knowledge-rich tasks
Facilitating design learning through faceted classification of in-service information
The maintenance and service records collected and maintained by engineering companies are a useful
resource for the ongoing support of products. Such records are typically semi-structured and contain
key information such as a description of the issue and the product affected. It is suggested that further
value can be realised from the collection of these records for indicating recurrent and systemic issues
which may not have been apparent previously. This paper presents a faceted classification approach to
organise the information collection that might enhance retrieval and also facilitate learning from in-service
experiences. The faceted classification may help to expedite responses to urgent in-service issues as
well as to allow for patterns and trends in the records to be analysed, either automatically using suitable
data mining algorithms or by manually browsing the classification tree. The paper describes the application
of the approach to aerospace in-service records, where the potential for knowledge discovery is
demonstrated
Context Aware Computing for The Internet of Things: A Survey
As we are moving towards the Internet of Things (IoT), the number of sensors
deployed around the world is growing at a rapid pace. Market research has shown
a significant growth of sensor deployments over the past decade and has
predicted a significant increment of the growth rate in the future. These
sensors continuously generate enormous amounts of data. However, in order to
add value to raw sensor data we need to understand it. Collection, modelling,
reasoning, and distribution of context in relation to sensor data plays
critical role in this challenge. Context-aware computing has proven to be
successful in understanding sensor data. In this paper, we survey context
awareness from an IoT perspective. We present the necessary background by
introducing the IoT paradigm and context-aware fundamentals at the beginning.
Then we provide an in-depth analysis of context life cycle. We evaluate a
subset of projects (50) which represent the majority of research and commercial
solutions proposed in the field of context-aware computing conducted over the
last decade (2001-2011) based on our own taxonomy. Finally, based on our
evaluation, we highlight the lessons to be learnt from the past and some
possible directions for future research. The survey addresses a broad range of
techniques, methods, models, functionalities, systems, applications, and
middleware solutions related to context awareness and IoT. Our goal is not only
to analyse, compare and consolidate past research work but also to appreciate
their findings and discuss their applicability towards the IoT.Comment: IEEE Communications Surveys & Tutorials Journal, 201
- …