
    Reasoning with Contextual Knowledge and Influence Diagrams

    Influence diagrams (IDs) are well-known formalisms extending Bayesian networks to model decision situations under uncertainty. Although they are convenient as a decision-theoretic tool, their knowledge representation ability is limited in capturing other crucial notions such as logical consistency. We complement IDs with the lightweight description logic (DL) EL to overcome such limitations. We consider a setup where DL axioms hold in some contexts, yet the actual context is uncertain. The framework benefits from the convenience of using DL as a domain knowledge representation language and the modelling strength of IDs to deal with decisions over contexts in the presence of contextual uncertainty. We define related reasoning problems and study their computational complexity.
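    A minimal sketch of the decision-theoretic side only, leaving the DL reasoning out: a decision is scored by its expected utility over uncertain contexts. The context names, decisions, and numbers below are hypothetical, not the paper's formalism.

```python
# Toy expected-utility choice over uncertain contexts; illustrates deciding
# under contextual uncertainty, not the paper's actual ID/EL framework.
contexts = {"c1": 0.7, "c2": 0.3}            # contexts with prior probabilities

utility = {                                   # utility of each decision per context
    ("treat", "c1"): 10, ("treat", "c2"): -5,
    ("wait",  "c1"): 0,  ("wait",  "c2"): 2,
}

def expected_utility(decision):
    return sum(p * utility[(decision, c)] for c, p in contexts.items())

best = max(("treat", "wait"), key=expected_utility)
print(best, expected_utility(best))           # "treat": 0.7*10 + 0.3*(-5) = 5.5
```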

    Expertise-based peer selection in Peer-to-Peer networks

    Peer-to-Peer systems have proven to be an effective way of sharing data. Modern protocols are able to efficiently route a message to a given peer. However, determining the destination peer in the first place is not always trivial. We propose a model in which peers advertise their expertise in the Peer-to-Peer network. The knowledge about the expertise of other peers forms a semantic topology. Based on the semantic similarity between the subject of a query and the expertise of other peers, a peer can select appropriate peers to forward queries to, instead of broadcasting the query or sending it to a random set of peers. To calculate our semantic similarity measure, we make the simplifying assumption that the peers share the same ontology. We evaluate the model in a bibliographic scenario, where peers share bibliographic descriptions of publications among each other. In simulation experiments complemented with a real-world field experiment, we show how expertise-based peer selection improves the performance of a Peer-to-Peer system with respect to precision, recall and the number of messages.
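    The selection step lends itself to a short sketch: rank peers by the similarity between a query's subject and each peer's advertised expertise, and forward only to the top-k. The vector representation and cosine measure below are illustration-only assumptions; the paper defines its own ontology-based similarity.

```python
# Sketch of expertise-based peer selection with an assumed vector encoding
# of topics from a shared ontology (hypothetical peers and values).
from math import sqrt

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

peers = {                          # advertised expertise per topic dimension
    "peer_a": [0.9, 0.1, 0.0],
    "peer_b": [0.1, 0.8, 0.1],
    "peer_c": [0.3, 0.3, 0.4],
}

def select_peers(query_vec, k=2):
    # Forward to the k most similar peers instead of broadcasting.
    ranked = sorted(peers, key=lambda p: cosine(query_vec, peers[p]), reverse=True)
    return ranked[:k]

print(select_peers([1.0, 0.2, 0.0]))
```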

    Combining Prior Knowledge and Data: Beyond the Bayesian Framework

    For many tasks such as text categorization and control of robotic systems, state-of-the-art learning systems can produce results comparable in accuracy to those of human subjects. However, the amount of training data needed for such systems can be prohibitively large for many practical problems. A text categorization system, for example, may need to see many text postings manually tagged with their subjects before it learns to predict the subject of the next posting with high accuracy. A reinforcement learning (RL) system learning how to drive a car needs a lot of experimentation with the actual car before acquiring the optimal policy. An optimizing compiler targeting a certain platform has to construct, compile, and execute many versions of the same code with different optimization parameters to determine which optimizations work best. Such extensive sampling can be time-consuming, expensive (in terms of both the human expertise needed to label data and the wear and tear on robotic equipment used for exploration in the case of RL), and sometimes dangerous (e.g., an RL agent driving the car off a cliff to see if it survives the crash). The goal of this work is to reduce the amount of training data an agent needs in order to learn how to perform a task successfully. This is done by providing the system with prior knowledge about its domain. The knowledge is used to bias the agent towards useful solutions and limit the amount of training needed. We explore this task in three contexts: classification (determining the subject of a newsgroup posting), control (learning to perform tasks such as driving a car up the mountain in simulation), and optimization (optimizing performance of linear algebra operations on different hardware platforms). For the text categorization problem, we introduce a novel algorithm which efficiently integrates prior knowledge into large margin classification. We show that prior knowledge simplifies the problem by reducing the size of the hypothesis space. We also provide formal convergence guarantees for our algorithm. For reinforcement learning, we introduce a novel framework for defining planning problems in terms of qualitative statements about the world (e.g., "the faster the car is going, the more likely it is to reach the top of the mountain"). We present an algorithm based on policy iteration for solving such qualitative problems and prove its convergence. We also present an alternative framework which allows the user to specify prior knowledge quantitatively in the form of a Markov Decision Process (MDP). This prior is used to focus exploration on those regions of the world in which the optimal policy is most sensitive to perturbations in transition probabilities and rewards. Finally, in the compiler optimization problem, the prior is based on an analytic model which determines good optimization parameters for a given platform. This model defines a Bayesian prior which, combined with empirical samples (obtained by measuring the performance of optimized code segments), determines the maximum-a-posteriori estimate of the optimization parameters.
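    The closing idea can be made concrete with a standard conjugate update (a hedged sketch, not the dissertation's model): a Gaussian prior from an analytic model is combined with noisy empirical measurements to give a MAP estimate of an optimization parameter. All numbers are hypothetical.

```python
# Normal-normal MAP update: analytic-model prior + empirical samples.
prior_mean, prior_var = 64.0, 16.0   # analytic model's guess (e.g. a tile size)
samples = [58.0, 61.0, 57.0]         # empirically best values measured
noise_var = 9.0                      # assumed measurement variance

n = len(samples)
post_var = 1.0 / (1.0 / prior_var + n / noise_var)
post_mean = post_var * (prior_mean / prior_var + sum(samples) / noise_var)
print(f"MAP estimate: {post_mean:.1f}")  # pulled from the prior toward the data
```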

    Evaluation of statistical correlation and validation methods for construction of gene co-expression networks

    High-throughput technologies such as microarrays have led to the rapid accumulation of large-scale genomic data, providing opportunities to systematically infer gene function and co-expression networks. Typical steps of co-expression network analysis using microarray data consist of estimation of pairwise gene co-expression using some similarity measure, construction of co-expression networks, identification of clusters of co-expressed genes, and post-cluster analyses such as cluster validation. This dissertation is primarily concerned with development and evaluation of approaches for the first and the last steps – estimation of gene co-expression matrices and validation of network clusters. Since clustering methods are not a focus, only a paraclique clustering algorithm is used in this evaluation. First, a novel Bayesian approach is presented for combining the Pearson correlation with prior biological information from Gene Ontology, yielding a biologically relevant estimate of gene co-expression. The addition of biological information by the Bayesian approach reduced noise in the paraclique gene clusters, as indicated by high silhouette values and increased homogeneity of clusters in terms of molecular function. Standard similarity measures were evaluated, including Pearson, Spearman, Kendall's tau, shrinkage, and partial correlation coefficients, mutual information, and Euclidean and Manhattan distances. Based on quality metrics such as cluster homogeneity and stability with respect to ontological categories, clusters resulting from partial correlation and mutual information were more biologically relevant than those from any other correlation measures. Second, statistical quality of clusters was evaluated using approaches based on permutation tests and Mantel correlation to identify significant and informative clusters that capture most of the covariance in the dataset. Third, the utility of statistical contrasts was studied for classification of temporal patterns of gene expression. Specifically, polynomial and Helmert contrast analyses were shown to provide a means of labeling the co-expressed gene sets because they showed similar temporal profiles.
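    The first pipeline step is easy to illustrate generically (plain Pearson correlation here, not the dissertation's Bayesian estimator): compute a gene-by-gene correlation matrix from an expression matrix and threshold it into network edges. Data and threshold below are toy assumptions.

```python
# Generic co-expression network construction: correlate, then threshold.
import numpy as np

rng = np.random.default_rng(0)
expr = rng.normal(size=(5, 12))      # 5 genes x 12 samples (toy data)

corr = np.corrcoef(expr)             # pairwise Pearson correlations
threshold = 0.5                      # assumed edge cutoff
edges = [(i, j) for i in range(len(corr))
         for j in range(i + 1, len(corr))
         if abs(corr[i, j]) >= threshold]
print(edges)                         # gene pairs joined in the network
```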

    Semantic multimedia analysis using knowledge and context

    The difficulty of semantic multimedia analysis can be attributed to the extended diversity in form and appearance exhibited by the majority of semantic concepts and the difficulty of expressing them using a finite number of patterns. In meeting this challenge there has been a scientific debate on whether the problem should be addressed from the perspective of using overwhelming amounts of training data to capture all possible instantiations of a concept, or from the perspective of using explicit knowledge about the concepts' relations to infer their presence. In this thesis we address three problems of pattern recognition and propose solutions that combine the knowledge extracted implicitly from training data with the knowledge provided explicitly in structured form. First, we propose a Bayesian network (BN) modeling approach that defines a conceptual space where both domain-related evidence and evidence derived from content analysis can be jointly considered to support or disprove a hypothesis. The use of this space leads to significant gains in performance compared to analysis methods that cannot handle combined knowledge. Then, we present an unsupervised method that exploits the collective nature of social media to automatically obtain large amounts of annotated image regions. By proving that the quality of the obtained samples can be almost as good as manually annotated images when working with large datasets, we significantly contribute towards scalable object detection. Finally, we introduce a method that treats images, visual features and tags as the three observable variables of an aspect model and extracts a set of latent topics that incorporates the semantics of both visual and tag information space. By showing that the cross-modal dependencies of tagged images can be exploited to increase the semantic capacity of the resulting space, we advocate the use of all existing information facets in the semantic analysis of social media.
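    As a rough illustration of the first contribution's flavor only, a single Bayes'-rule fusion of a visual detector's output with a domain prior; the concept name and all probabilities are hypothetical, and the thesis's BN model is considerably richer.

```python
# Fusing content-analysis evidence with explicit domain knowledge via Bayes' rule.
prior_beach = 0.2                 # domain knowledge: base rate of "beach" scenes
p_fire_given_beach = 0.8          # detector sensitivity (assumed)
p_fire_given_not = 0.1            # detector false-positive rate (assumed)

# Detector fired; combine it with the prior.
evidence = (p_fire_given_beach * prior_beach
            + p_fire_given_not * (1 - prior_beach))
posterior = p_fire_given_beach * prior_beach / evidence
print(f"P(beach | detector fired) = {posterior:.2f}")
```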

    The Gene Ontology Handbook

    bioinformatics; biotechnology

    Knowledge-driven Artificial Intelligence in Steelmaking: Towards Industry 4.0

    With the ongoing emergence of the Fourth Industrial Revolution, often referred to as Industry 4.0, new innovations, concepts, and standards are reshaping manufacturing processes and production, leading to intelligent cyber-physical systems and smart factories. Steel production is one important manufacturing process that is undergoing this digital transformation. Realising this vision in steel production comes with unique challenges, including the seamless interoperability between diverse and complex systems, the uniformity of heterogeneous data, and a need for standardised human-to-machine and machine-to-machine communication protocols. To address these challenges, international standards have been developed, and new technologies have been introduced and studied in both industry and academia. However, due to the vast quantity, scale, and heterogeneous nature of industrial data and systems, achieving interoperability among components within the context of Industry 4.0 remains a challenge, requiring formal knowledge representation capabilities to enhance the understanding of data and information. In response, semantic-based technologies have been proposed as a method to capture knowledge from data and resolve incompatibility conflicts within Industry 4.0 scenarios. We propose utilising fundamental Semantic Web concepts, such as ontologies and knowledge graphs, specifically to enhance semantic interoperability, improve data integration, and standardise data across heterogeneous systems within the context of steelmaking. Additionally, we investigate ongoing trends that involve the integration of Machine Learning (ML) techniques with semantic technologies, resulting in the creation of hybrid models. These models capitalise on the strengths derived from the intersection of these two AI approaches. Furthermore, we explore the need for continuous reasoning over data streams, presenting preliminary research that combines ML and semantic technologies in the context of data streams. In this thesis, we make four main contributions: (1) We discover that a clear understanding of semantic-based asset administration shells, an international standard within the RAMI 4.0 model, was lacking, and provide an extensive survey on semantic-based implementations of asset administration shells. We focus on literature that utilises semantic technologies to enhance the representation, integration, and exchange of information in an industrial setting. (2) The creation of an ontology, a semantic knowledge base, which specifically captures the cold rolling processes in steelmaking. We demonstrate use cases that leverage these semantic methodologies with real-world industrial data for data access, data integration, data querying, and condition-based maintenance purposes. (3) A framework demonstrating one approach for integrating machine learning models with semantic technologies to aid decision-making in the domain of steelmaking. We showcase a novel approach of applying random forest classification using rule-based reasoning, incorporating both metadata and external domain expert knowledge into the model, resulting in improved knowledge-guided assistance for the human-in-the-loop during steelmaking processes. (4) The groundwork for a continuous data stream reasoning framework, where both domain expert knowledge and random forest classification can be dynamically applied to data streams on the fly. This approach opens up possibilities for real-time condition-based monitoring and real-time decision support for predictive maintenance applications. We demonstrate the adaptability of the framework in the context of dynamic steel production processes. Our contributions have been validated on real-world data sets through peer-reviewed conferences and journals, as well as through collaboration with domain experts from our industrial partners at Tata Steel.
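    A hedged sketch of how contribution (3) might look in miniature: a random forest's prediction combined with an explicit expert rule that can override it. The features, labels, threshold, and rule below are hypothetical illustrations, not the thesis's framework or Tata Steel data.

```python
# Knowledge-guided prediction: ML model output gated by a domain rule.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))                  # e.g. temperature, speed, tension
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # toy fault labels

clf = RandomForestClassifier(n_estimators=50, random_state=1).fit(X, y)

def knowledge_guided_predict(x):
    # Expert rule (hypothetical): an extreme temperature reading always
    # flags a fault, regardless of what the learned model says.
    if x[0] > 2.5:
        return 1
    return int(clf.predict([x])[0])

print(knowledge_guided_predict([3.0, 0.0, 0.0]))  # rule fires
print(knowledge_guided_predict([0.1, 0.2, 0.0]))  # model decides
```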

    Kernel Methods for Knowledge Structures
