22,655 research outputs found
Dealing with uncertain entities in ontology alignment using rough sets
This is the author's accepted manuscript. The final published article is available from the link below. Copyright @ 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.Ontology alignment facilitates exchange of knowledge among heterogeneous data sources. Many approaches to ontology alignment use multiple similarity measures to map entities between ontologies. However, it remains a key challenge in dealing with uncertain entities for which the employed ontology alignment measures produce conflicting results on similarity of the mapped entities. This paper presents OARS, a rough-set based approach to ontology alignment which achieves a high degree of accuracy in situations where uncertainty arises because of the conflicting results generated by different similarity measures. OARS employs a combinational approach and considers both lexical and structural similarity measures. OARS is extensively evaluated with the benchmark ontologies of the ontology alignment evaluation initiative (OAEI) 2010, and performs best in the aspect of recall in comparison with a number of alignment systems while generating a comparable performance in precision
Recommended from our members
Evaluating the resilience and security of boundaryless, evolving socio-technical Systems of Systems
Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline
From medical charts to national census, healthcare has traditionally operated
under a paper-based paradigm. However, the past decade has marked a long and
arduous transformation bringing healthcare into the digital age. Ranging from
electronic health records, to digitized imaging and laboratory reports, to
public health datasets, today, healthcare now generates an incredible amount of
digital information. Such a wealth of data presents an exciting opportunity for
integrated machine learning solutions to address problems across multiple
facets of healthcare practice and administration. Unfortunately, the ability to
derive accurate and informative insights requires more than the ability to
execute machine learning models. Rather, a deeper understanding of the data on
which the models are run is imperative for their success. While a significant
effort has been undertaken to develop models able to process the volume of data
obtained during the analysis of millions of digitalized patient records, it is
important to remember that volume represents only one aspect of the data. In
fact, drawing on data from an increasingly diverse set of sources, healthcare
data presents an incredibly complex set of attributes that must be accounted
for throughout the machine learning pipeline. This chapter focuses on
highlighting such challenges, and is broken down into three distinct
components, each representing a phase of the pipeline. We begin with attributes
of the data accounted for during preprocessing, then move to considerations
during model building, and end with challenges to the interpretation of model
output. For each component, we present a discussion around data as it relates
to the healthcare domain and offer insight into the challenges each may impose
on the efficiency of machine learning techniques.Comment: Healthcare Informatics, Machine Learning, Knowledge Discovery: 20
Pages, 1 Figur
Expert Elicitation for Reliable System Design
This paper reviews the role of expert judgement to support reliability
assessments within the systems engineering design process. Generic design
processes are described to give the context and a discussion is given about the
nature of the reliability assessments required in the different systems
engineering phases. It is argued that, as far as meeting reliability
requirements is concerned, the whole design process is more akin to a
statistical control process than to a straightforward statistical problem of
assessing an unknown distribution. This leads to features of the expert
judgement problem in the design context which are substantially different from
those seen, for example, in risk assessment. In particular, the role of experts
in problem structuring and in developing failure mitigation options is much
more prominent, and there is a need to take into account the reliability
potential for future mitigation measures downstream in the system life cycle.
An overview is given of the stakeholders typically involved in large scale
systems engineering design projects, and this is used to argue the need for
methods that expose potential judgemental biases in order to generate analyses
that can be said to provide rational consensus about uncertainties. Finally, a
number of key points are developed with the aim of moving toward a framework
that provides a holistic method for tracking reliability assessment through the
design process.Comment: This paper commented in: [arXiv:0708.0285], [arXiv:0708.0287],
[arXiv:0708.0288]. Rejoinder in [arXiv:0708.0293]. Published at
http://dx.doi.org/10.1214/088342306000000510 in the Statistical Science
(http://www.imstat.org/sts/) by the Institute of Mathematical Statistics
(http://www.imstat.org
Formal and Informal Methods for Multi-Core Design Space Exploration
We propose a tool-supported methodology for design-space exploration for
embedded systems. It provides means to define high-level models of applications
and multi-processor architectures and evaluate the performance of different
deployment (mapping, scheduling) strategies while taking uncertainty into
account. We argue that this extension of the scope of formal verification is
important for the viability of the domain.Comment: In Proceedings QAPL 2014, arXiv:1406.156
Technology Readiness Levels for Machine Learning Systems
The development and deployment of machine learning (ML) systems can be
executed easily with modern tools, but the process is typically rushed and
means-to-an-end. The lack of diligence can lead to technical debt, scope creep
and misaligned objectives, model misuse and failures, and expensive
consequences. Engineering systems, on the other hand, follow well-defined
processes and testing standards to streamline development for high-quality,
reliable results. The extreme is spacecraft systems, where mission critical
measures and robustness are ingrained in the development process. Drawing on
experience in both spacecraft engineering and ML (from research through product
across domain areas), we have developed a proven systems engineering approach
for machine learning development and deployment. Our "Machine Learning
Technology Readiness Levels" (MLTRL) framework defines a principled process to
ensure robust, reliable, and responsible systems while being streamlined for ML
workflows, including key distinctions from traditional software engineering.
Even more, MLTRL defines a lingua franca for people across teams and
organizations to work collaboratively on artificial intelligence and machine
learning technologies. Here we describe the framework and elucidate it with
several real world use-cases of developing ML methods from basic research
through productization and deployment, in areas such as medical diagnostics,
consumer computer vision, satellite imagery, and particle physics
Distributed Random Process for a Large-Scale Peer-to-Peer Lottery
Most online lotteries today fail to ensure the verifiability of the random
process and rely on a trusted third party. This issue has received little
attention since the emergence of distributed protocols like Bitcoin that
demonstrated the potential of protocols with no trusted third party. We argue
that the security requirements of online lotteries are similar to those of
online voting, and propose a novel distributed online lottery protocol that
applies techniques developed for voting applications to an existing lottery
protocol. As a result, the protocol is scalable, provides efficient
verification of the random process and does not rely on a trusted third party
nor on assumptions of bounded computational resources. An early prototype
confirms the feasibility of our approach
- …