Word add-in for ontology recognition: semantic enrichment of scientific literature
Background: In the current era of scientific research, efficient communication of information is paramount. As such, the nature of scholarly and scientific communication is changing; cyberinfrastructure is now absolutely necessary and new media are allowing information and knowledge to be more interactive and immediate. One approach to making knowledge more accessible is the addition of machine-readable semantic data to scholarly articles.
Results: The Word add-in presented here will assist authors in this effort by automatically recognizing and highlighting words or phrases that are likely information-rich, allowing authors to associate semantic data with those words or phrases, and to embed that data in the document as XML. The add-in and source code are publicly available at http://www.codeplex.com/UCSDBioLit.
Conclusions: The Word add-in for ontology term recognition makes it possible for an author to add semantic data to a document as it is being written and it encodes these data using XML tags that are effectively a standard in life sciences literature. Allowing authors to mark up their own work will help increase the amount and quality of machine-readable literature metadata.
An Ontology-Driven Methodology To Derive Cases From Structured And Unstructured Sources
The problem-solving capability of a Case-Based Reasoning (CBR) system largely depends on the richness of its knowledge stored in the form of cases, i.e. the CaseBase (CB). Populating and subsequently maintaining a critical mass of cases in a CB is a tedious manual activity demanding vast human and operational resources. The need for human involvement in populating a CB can be drastically reduced because case-like knowledge already exists in the form of databases and documents and can be harnessed and transformed into cases that can be operationalized. Nevertheless, the transformation process poses many hurdles due to the disparate structures and the heterogeneous coding standards used. The featured work aims to address knowledge creation from heterogeneous sources and structures. To this end, this thesis presents a Multi-Source Case Acquisition and Transformation Info-Structure (MUSCATI). MUSCATI has been implemented as a multi-layer architecture using state-of-the-practice tools and can be perceived as a functional extension to traditional CBR systems. In principle, MUSCATI can be applied in any domain, but in this thesis healthcare was chosen. Thus, Electronic Medical Records (EMRs) were used as the source to generate the knowledge. The results from the experiments showed that the volume and diversity of cases improve the reasoning outcome of the CBR engine. The experiments showed that knowledge found in medical records (regardless of structure) can be leveraged and standardized to enhance the (medical) knowledge of traditional medical CBR systems. In addition, the Google search engine proved critical in “fixing” and enriching the domain ontology on-the-fly.
Ontology matching: state of the art and future challenges
After years of research on ontology matching, it is reasonable to consider several questions: is the field of ontology matching still making progress? Is this progress significant enough to pursue further research? If so, what are the particularly promising directions? To answer these questions, we review the state of the art of ontology matching and analyze the results of recent ontology matching evaluations. These results show a measurable improvement in the field, albeit at a slowing pace. We conjecture that significant improvements can be obtained only by addressing important challenges for ontology matching. We present such challenges with insights on how to approach them, thereby aiming to direct research into the most promising tracks and to facilitate the progress of the field.
FAIR Metadata Standards for Low Carbon Energy Research—A Review of Practices and How to Advance
The principles of Findability, Accessibility, Interoperability, and Reusability (FAIR) have been put forward to guide optimal sharing of data. The potential for industrial and social innovation is vast. Domain-specific metadata standards are crucial in this context, but are widely missing in the energy sector. This report provides a collaborative response from the low carbon energy research community for addressing the necessity of advancing FAIR metadata standards. We review and test existing metadata practices in the domain based on a series of community workshops. We reflect the perspectives of energy data stakeholders. The outcome is reported in terms of challenges and elicits recommendations for advancing FAIR metadata standards in the energy domain across a broad spectrum of stakeholders.
MapReduce based RDF assisted distributed SVM for high throughput spam filtering
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. Electronic mail has become firmly embedded in our everyday lives. Billions of legitimate emails are sent on a daily basis. The widely established underlying infrastructure, its widespread availability and its ease of use have all acted as catalysts for such pervasive proliferation. Unfortunately, the same can be said of unsolicited bulk email, or spam. Various methods and enabling architectures are available to try to mitigate spam permeation. In this respect, this dissertation complements existing survey work in this area by contributing an extensive literature review of traditional and emerging spam filtering approaches. Techniques, approaches and architectures employed for spam filtering are appraised, critically assessing their respective strengths and weaknesses.
Velocity, volume and variety are key characteristics of the spam challenge. MapReduce (M/R) has become increasingly popular as an Internet-scale, data-intensive processing platform. In the context of machine learning based spam filter training, support vector machine (SVM) based techniques have been proven effective. SVM training is, however, a computationally intensive process. In this dissertation, an M/R based distributed SVM algorithm for scalable spam filter training, designated MRSMO, is presented. By distributing and processing subsets of the training data across multiple participating computing nodes, the distributed SVM reduces spam filter training time significantly. To mitigate the accuracy degradation introduced by the adopted approach, a Resource Description Framework (RDF) based feedback loop is evaluated. Experimental results demonstrate that this improves the accuracy levels of the distributed SVM beyond the original sequential counterpart.
Effectively exploiting large-scale, ‘Cloud’ based, heterogeneous processing capabilities for M/R in what can be considered a non-deterministic environment requires the consideration of a number of perspectives. In this work, gSched, a Hadoop M/R based, heterogeneity-aware task-to-node matching and allocation scheme, is designed. Using MRSMO as a baseline, experimental evaluation indicates that gSched improves on the performance of the out-of-the-box Hadoop counterpart in a typical Cloud based infrastructure.
The focal contribution to knowledge is a scalable, heterogeneous infrastructure and machine learning based spam filtering scheme, able to capitalize on collaborative accuracy improvements through RDF based, end user feedback.
Multi-stakeholder development of a serious game to explore the water-energy-food-land-climate nexus: The SIM4NEXUS approach
This is the final version of the article. Available from the publisher via the DOI in this record.
Water, energy, food, land and climate form a tightly connected nexus in which actions in one sector impact other sectors, creating feedbacks and unanticipated consequences. This is especially so because, at present, much scientific research and many policies are constrained to single discipline/sector silos that often do not interact (e.g., water-related research/policy). However, experimenting with the interaction and determining how a change in one sector could impact another may require unreasonable time frames, be very difficult in practice and may be potentially dangerous, triggering any one of a number of unanticipated side-effects. Current modelling often neglects knowledge from practice. Therefore, a safe environment is required to test the potential cross-sectoral implications of policy decisions in one sector on other sectors. Serious games offer such an environment by creating realistic 'simulations', where long-term impacts of policies may be tested and rated. This paper describes how the ongoing (2016-2020) Horizon2020 project SIM4NEXUS will develop serious games investigating potential plausible cross-nexus implications and synergies due to policy interventions for 12 multi-scale case studies ranging from regional to global. What sets these games apart is that stakeholders and partners are involved in all aspects of the modelling definition and process, from case study conceptualisation and quantitative model development to the implementation and validation of each serious game. Learning from playing a serious game is justified by adopting a proof-of-concept for a specific regional case study in Sardinia (Italy). The value of multi-stakeholder involvement is demonstrated, and critical lessons learned for serious game development in general are presented.
The work described in this paper has been conducted within the project SIM4NEXUS. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 689150 SIM4NEXUS.
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art related to multimedia content search from a technical and a socio-economic perspective.
The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective, we take stock of the impact and legal consequences of these technical advances and point out future directions of research.
Framework for the semantic alignment of enterprise’s domain knowledge
Nowadays, the consumption of goods and services on the Internet is steadily increasing. Small and Medium Enterprises (SMEs), mostly from the traditional industry sectors, usually do business in weak and fragile market sectors, where customized products and services prevail. To survive and compete in current markets, they have to readjust their business strategies by creating new manufacturing processes and establishing new business networks through new technological approaches. In order to compete with big enterprises, these partnerships aim to share resources, knowledge and strategies to boost the sector’s business consolidation through the creation of dynamic manufacturing networks.
To meet this demand, the development of a centralized information system is proposed, which allows enterprises to select and create dynamic manufacturing networks that can monitor the entire manufacturing process, including the assembly, packaging and distribution phases. Even networking partners that come from the same area have multiple, heterogeneous representations of the same knowledge, each denoting its own view of the domain. Thus, conceptually, semantically and, consequently, lexically diverse knowledge representations may occur in the network, causing non-transparent sharing of information and interoperability inconsistencies. A framework is therefore required, supported by a tool that flexibly enables the identification, classification and resolution of such semantic heterogeneities. This tool will support the network in establishing semantic mappings, to facilitate the integration of the various enterprises' information systems.
DESIGN AND EXPLORATION OF NEW MODELS FOR SECURITY AND PRIVACY-SENSITIVE COLLABORATION SYSTEMS
Collaboration has been an area of interest in many domains, including education, research, healthcare supply chain, Internet of things, and music. It enhances problem solving through expertise sharing, idea sharing, learning and resource sharing, and improved decision making.
To address the limitations in the existing literature, this dissertation presents a design science artifact and a conceptual model for collaborative environments. The artifact is a blockchain-based collaborative information exchange system that utilizes blockchain technology and semi-automated ontology mappings to enable secure and interoperable health information exchange among different health care institutions. The conceptual model proposed in this dissertation explores the factors that influence professionals' continued use of video-conferencing (VC) applications. The conceptual model investigates the role that perceived risks and benefits play in influencing professionals’ attitudes towards VC apps and, consequently, their active and automatic use.