
    Dealing with uncertain entities in ontology alignment using rough sets

    Ontology alignment facilitates the exchange of knowledge among heterogeneous data sources. Many approaches to ontology alignment use multiple similarity measures to map entities between ontologies. However, dealing with uncertain entities, for which the employed similarity measures produce conflicting results, remains a key challenge. This paper presents OARS, a rough-set based approach to ontology alignment which achieves a high degree of accuracy in situations where uncertainty arises from the conflicting results generated by different similarity measures. OARS employs a combinational approach and considers both lexical and structural similarity measures. OARS is extensively evaluated against the benchmark ontologies of the Ontology Alignment Evaluation Initiative (OAEI) 2010, and achieves the best recall in comparison with a number of alignment systems while delivering comparable precision.
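    The abstract describes combining lexical and structural similarity under a rough-set view but does not give the algorithm. The sketch below is only an illustration of how two conflicting similarity measures naturally induce a rough-set style boundary region of uncertain mappings; the names and thresholds are hypothetical, and this is not the OARS implementation.

    # Illustrative sketch only, not OARS. Assumes one lexical and one
    # structural similarity score per entity pair and a single threshold.
    def classify_pair(lexical_sim, structural_sim, threshold=0.7):
        """Place an entity pair into a rough-set style region.

        Lower approximation ("aligned"): both measures agree the pair matches.
        Negative region ("not aligned"): both measures agree it does not.
        Boundary region ("uncertain"): the measures conflict, which is the
        situation the paper targets with additional evidence.
        """
        lex_match = lexical_sim >= threshold
        struct_match = structural_sim >= threshold
        if lex_match and struct_match:
            return "aligned"
        if not lex_match and not struct_match:
            return "not aligned"
        return "uncertain"

    pairs = {
        ("Author", "Writer"): (0.85, 0.90),    # both agree -> aligned
        ("Paper", "Vehicle"): (0.10, 0.05),    # both agree -> not aligned
        ("Journal", "Magazine"): (0.80, 0.40), # conflict -> boundary region
    }
    for pair, (lex, struct) in pairs.items():
        print(pair, classify_pair(lex, struct))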

    Beyond Volume: The Impact of Complex Healthcare Data on the Machine Learning Pipeline

    From medical charts to national census, healthcare has traditionally operated under a paper-based paradigm. However, the past decade has marked a long and arduous transformation bringing healthcare into the digital age. Ranging from electronic health records, to digitized imaging and laboratory reports, to public health datasets, healthcare today generates an incredible amount of digital information. Such a wealth of data presents an exciting opportunity for integrated machine learning solutions to address problems across multiple facets of healthcare practice and administration. Unfortunately, the ability to derive accurate and informative insights requires more than the ability to execute machine learning models. Rather, a deeper understanding of the data on which the models are run is imperative for their success. While a significant effort has been undertaken to develop models able to process the volume of data obtained during the analysis of millions of digitized patient records, it is important to remember that volume represents only one aspect of the data. In fact, drawing on data from an increasingly diverse set of sources, healthcare data presents an incredibly complex set of attributes that must be accounted for throughout the machine learning pipeline. This chapter focuses on highlighting such challenges, and is broken down into three distinct components, each representing a phase of the pipeline. We begin with attributes of the data accounted for during preprocessing, then move to considerations during model building, and end with challenges to the interpretation of model output. For each component, we present a discussion around data as it relates to the healthcare domain and offer insight into the challenges each may impose on the efficiency of machine learning techniques. Comment: Healthcare Informatics, Machine Learning, Knowledge Discovery; 20 pages, 1 figure

    Expert Elicitation for Reliable System Design

    This paper reviews the role of expert judgement to support reliability assessments within the systems engineering design process. Generic design processes are described to give the context, and a discussion is given about the nature of the reliability assessments required in the different systems engineering phases. It is argued that, as far as meeting reliability requirements is concerned, the whole design process is more akin to a statistical control process than to a straightforward statistical problem of assessing an unknown distribution. This leads to features of the expert judgement problem in the design context which are substantially different from those seen, for example, in risk assessment. In particular, the role of experts in problem structuring and in developing failure mitigation options is much more prominent, and there is a need to take into account the reliability potential for future mitigation measures downstream in the system life cycle. An overview is given of the stakeholders typically involved in large-scale systems engineering design projects, and this is used to argue the need for methods that expose potential judgemental biases in order to generate analyses that can be said to provide rational consensus about uncertainties. Finally, a number of key points are developed with the aim of moving toward a framework that provides a holistic method for tracking reliability assessment through the design process. Comment: This paper is commented on in [arXiv:0708.0285], [arXiv:0708.0287], [arXiv:0708.0288]; rejoinder in [arXiv:0708.0293]. Published at http://dx.doi.org/10.1214/088342306000000510 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)
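    The paper discusses expert judgement in general rather than prescribing an aggregation rule. Purely as a minimal, hypothetical illustration of pooling expert reliability assessments, the sketch below combines expert probability estimates with a linear opinion pool; the weights and figures are invented and this is not a method from the paper.

    # Hypothetical illustration: a linear opinion pool for combining expert
    # estimates of a failure probability. All numbers are invented.
    def linear_opinion_pool(estimates, weights):
        """Weighted average of expert probability estimates."""
        assert len(estimates) == len(weights)
        total = sum(weights)
        return sum(w * p for w, p in zip(weights, estimates)) / total

    expert_estimates = [0.02, 0.05, 0.01]  # each expert's failure probability
    expert_weights = [1.0, 2.0, 1.0]       # e.g. from calibration exercises
    print(linear_opinion_pool(expert_estimates, expert_weights))  # 0.0325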

    Formal and Informal Methods for Multi-Core Design Space Exploration

    We propose a tool-supported methodology for design-space exploration for embedded systems. It provides means to define high-level models of applications and multi-processor architectures and to evaluate the performance of different deployment (mapping, scheduling) strategies while taking uncertainty into account. We argue that this extension of the scope of formal verification is important for the viability of the domain. Comment: In Proceedings QAPL 2014, arXiv:1406.156
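    The abstract describes tool-supported exploration of mapping and scheduling choices without giving the underlying models. Purely as a toy sketch (not the authors' tool or formalism), the code below brute-forces task-to-core mappings and ranks them by a naive makespan estimate; the task set, core count, and cost model are hypothetical.

    # Toy design-space exploration sketch, not the tool described in the paper.
    from itertools import product

    tasks = {"decode": 4, "filter": 3, "encode": 5, "log": 1}  # task -> cost
    cores = [0, 1]  # two identical cores

    def makespan(mapping):
        """Naive estimate: the busiest core determines the finish time."""
        load = {c: 0 for c in cores}
        for task, core in mapping.items():
            load[core] += tasks[task]
        return max(load.values())

    best = None
    for assignment in product(cores, repeat=len(tasks)):
        mapping = dict(zip(tasks, assignment))
        cost = makespan(mapping)
        if best is None or cost < best[1]:
            best = (mapping, cost)

    print("best mapping:", best[0], "makespan:", best[1])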

    Technology Readiness Levels for Machine Learning Systems

    The development and deployment of machine learning (ML) systems can be executed easily with modern tools, but the process is typically rushed and treated as a means to an end. The lack of diligence can lead to technical debt, scope creep and misaligned objectives, model misuse and failures, and expensive consequences. Engineering systems, on the other hand, follow well-defined processes and testing standards to streamline development for high-quality, reliable results. The extreme case is spacecraft systems, where mission-critical measures and robustness are ingrained in the development process. Drawing on experience in both spacecraft engineering and ML (from research through product across domain areas), we have developed a proven systems engineering approach for machine learning development and deployment. Our "Machine Learning Technology Readiness Levels" (MLTRL) framework defines a principled process to ensure robust, reliable, and responsible systems while being streamlined for ML workflows, including key distinctions from traditional software engineering. Moreover, MLTRL defines a lingua franca for people across teams and organizations to work collaboratively on artificial intelligence and machine learning technologies. Here we describe the framework and elucidate it with several real-world use cases of developing ML methods from basic research through productization and deployment, in areas such as medical diagnostics, consumer computer vision, satellite imagery, and particle physics.
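    The abstract names the MLTRL framework but does not enumerate its levels or gate criteria here. Assuming a 0-9 scale in the spirit of NASA technology readiness levels, the sketch below shows how a gated review might be encoded; the level thresholds and required artifacts are invented for illustration and are not the published MLTRL definitions.

    # Hypothetical sketch of gated readiness reviews; requirements are invented.
    from dataclasses import dataclass, field

    @dataclass
    class GateReview:
        level: int                        # current readiness level (0-9 assumed)
        evidence: set = field(default_factory=set)

        # Minimal artifacts required to advance past each level (invented).
        REQUIREMENTS = {
            2: {"research code", "reported metrics"},
            4: {"unit tests", "data validation"},
            6: {"integration tests", "model card"},
            8: {"monitoring plan", "rollback plan"},
        }

        def can_advance(self) -> bool:
            required = self.REQUIREMENTS.get(self.level, set())
            return required <= self.evidence

    review = GateReview(level=4, evidence={"unit tests"})
    print(review.can_advance())  # False: data validation evidence is missing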

    Distributed Random Process for a Large-Scale Peer-to-Peer Lottery

    Most online lotteries today fail to ensure the verifiability of the random process and rely on a trusted third party. This issue has received little attention since the emergence of distributed protocols like Bitcoin that demonstrated the potential of protocols with no trusted third party. We argue that the security requirements of online lotteries are similar to those of online voting, and propose a novel distributed online lottery protocol that applies techniques developed for voting applications to an existing lottery protocol. As a result, the protocol is scalable, provides efficient verification of the random process, and relies neither on a trusted third party nor on assumptions of bounded computational resources. An early prototype confirms the feasibility of our approach.
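    The abstract does not spell out the random process itself. A commit-reveal scheme is one standard building block for verifiable distributed randomness, so the sketch below illustrates that idea only; it is not the proposed protocol, which draws on e-voting techniques.

    # Illustrative commit-reveal randomness sketch; NOT the paper's protocol.
    import hashlib
    import secrets

    def commit(value: bytes) -> str:
        """Publish a hash commitment before the draw."""
        return hashlib.sha256(value).hexdigest()

    def verify(value: bytes, commitment: str) -> bool:
        return commit(value) == commitment

    # Each participant commits to a secret contribution...
    contributions = [secrets.token_bytes(16) for _ in range(3)]
    commitments = [commit(c) for c in contributions]

    # ...and later reveals it; anyone can check the reveals against the
    # commitments and recompute the shared random output.
    assert all(verify(c, h) for c, h in zip(contributions, commitments))
    combined = hashlib.sha256(b"".join(contributions)).hexdigest()
    print("shared random value:", combined)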