1,865 research outputs found

    Automated Extraction of Fragments of Bayesian Networks from Textual Sources

    Get PDF
    Mining large amounts of unstructured data for extracting meaningful, accurate, and actionable information, is at the core of a variety of research disciplines including computer science, mathematical and statistical modelling, as well as knowledge engineering. In particular, the ability to model complex scenarios based on unstructured datasets is an important step towards an integrated and accurate knowledge extraction approach. This would provide a significant insight in any decision making process driven by Big Data analysis activities. However, there are multiple challenges that need to be fully addressed in order to achieve this, especially when large and unstructured data sets are considered. In this article we propose and analyse a novel method to extract and build fragments of Bayesian networks (BNs) from unstructured large data sources. The results of our analysis show the potential of our approach, and highlight its accuracy and efficiency. More specifically, when compared with existing approaches, our method addresses specific challenges posed by the automated extraction of BNs with extensive applications to unstructured and highly dynamic data sources. The aim of this work is to advance the current state-of-the-art approaches to the automated extraction of BNs from unstructured datasets, which provide a versatile and powerful modelling framework to facilitate knowledge discovery in complex decision scenarios

    Bayesian Information Extraction Network

    Full text link
    Dynamic Bayesian networks (DBNs) offer an elegant way to integrate various aspects of language in one model. Many existing algorithms developed for learning and inference in DBNs are applicable to probabilistic language modeling. To demonstrate the potential of DBNs for natural language processing, we employ a DBN in an information extraction task. We show how to assemble wealth of emerging linguistic instruments for shallow parsing, syntactic and semantic tagging, morphological decomposition, named entity recognition etc. in order to incrementally build a robust information extraction system. Our method outperforms previously published results on an established benchmark domain.Comment: 6 page

    A Four Layer Bayesian Network for Product Model Based Information Mining

    Get PDF
    Business and engineering knowledge in AEC/FM is captured mainly implicitly in project and corporate document repositories. Even with the increasing integration of model-based systems with project information spaces, a large percentage of the information exchange will further on rely on isolated and rather poorly structured text documents. In this paper we propose an approach enabling the use of product model data as a primary source of engineering knowledge to support information externalisation from relevant construction documents, to provide for domain-specific information retrieval, and to help in re-organising and re-contextualising documents in accordance to the user’s discipline-specific tasks and information needs. Suggested is a retrieval and mining framework combining methods for analysing text documents, filtering product models and reasoning on Bayesian networks to explicitly represent the content of text repositories in personalisable semantic content networks. We describe the proposed basic network that can be realised on short-term using minimal product model information as well as various extensions towards a full-fledged added value integration of document-based and model-based information

    Artificial Intuition for Automated Decision-Making

    Get PDF
    Automated decision-making techniques play a crucial role in data science, AI, and general machine learning. However, such techniques need to balance accuracy with computational complexity, as their solution requirements are likely to need exhaustive analysis of the potentially numerous events combinations, which constitute the corresponding scenarios. Intuition is an essential tool in the identification of solutions to problems. More specifically, it can be used to identify, combine and discover knowledge in a “parallel” manner, and therefore more efficiently. As a consequence, the embedding of artificial intuition within data science is likely to provide novel ways to identify and process information. There is extensive research on this topic mainly based on qualitative approaches. However, due to the complexity of this field, limited quantitative models and implementations are available. In this article, the authors have extended the evaluation to include a real-world, multi-disciplinary area in order to provide a more comprehensive assessment. The results demonstrate the value of artificial intuition, when embedded in decision-making and information extraction models and frameworks. In fact, the output produced by the approach discussed in their article was compared with a similar task carried out by a group of experts in the field. This demonstrates comparable results further showing the potential of this framework, as well as artificial intuition as a tool for decision-making and information extraction

    SoccerNet: A Scalable Dataset for Action Spotting in Soccer Videos

    Full text link
    In this paper, we introduce SoccerNet, a benchmark for action spotting in soccer videos. The dataset is composed of 500 complete soccer games from six main European leagues, covering three seasons from 2014 to 2017 and a total duration of 764 hours. A total of 6,637 temporal annotations are automatically parsed from online match reports at a one minute resolution for three main classes of events (Goal, Yellow/Red Card, and Substitution). As such, the dataset is easily scalable. These annotations are manually refined to a one second resolution by anchoring them at a single timestamp following well-defined soccer rules. With an average of one event every 6.9 minutes, this dataset focuses on the problem of localizing very sparse events within long videos. We define the task of spotting as finding the anchors of soccer events in a video. Making use of recent developments in the realm of generic action recognition and detection in video, we provide strong baselines for detecting soccer events. We show that our best model for classifying temporal segments of length one minute reaches a mean Average Precision (mAP) of 67.8%. For the spotting task, our baseline reaches an Average-mAP of 49.7% for tolerances δ\delta ranging from 5 to 60 seconds. Our dataset and models are available at https://silviogiancola.github.io/SoccerNet.Comment: CVPR Workshop on Computer Vision in Sports 201

    Approximate model composition for explanation generation

    Get PDF
    This thesis presents a framework for the formulation of knowledge models to sup¬ port the generation of explanations for engineering systems that are represented by the resulting models. Such models are automatically assembled from instantiated generic component descriptions, known as modelfragments. The model fragments are of suffi¬ cient detail that generally satisfies the requirements of information content as identified by the user asking for explanations. Through a combination of fuzzy logic based evidence preparation, which exploits the history of prior user preferences, and an approximate reasoning inference engine, with a Bayesian evidence propagation mechanism, different uncertainty sources can be han¬ dled. Model fragments, each representing structural or behavioural aspects of a com¬ ponent of the domain system of interest, are organised in a library. Those fragments that represent the same domain system component, albeit with different representation detail, form parts of the same assumption class in the library. Selected fragments are assembled to form an overall system model, prior to extraction of any textual infor¬ mation upon which to base the explanations. The thesis proposes and examines the techniques that support the fragment selection mechanism and the assembly of these fragments into models. In particular, a Bayesian network-based model fragment selection mechanism is de¬ scribed that forms the core of the work. The network structure is manually determined prior to any inference, based on schematic information regarding the connectivity of the components present in the domain system under consideration. The elicitation of network probabilities, on the other hand is completely automated using probability elicitation heuristics. These heuristics aim to provide the information required to select fragments which are maximally compatible with the given evidence of the fragments preferred by the user. Given such initial evidence, an existing evidence propagation algorithm is employed. The preparation of the evidence for the selection of certain fragments, based on user preference, is performed by a fuzzy reasoning evidence fab¬ rication engine. This engine uses a set of fuzzy rules and standard fuzzy reasoning mechanisms, attempting to guess the information needs of the user and suggesting the selection of fragments of sufficient detail to satisfy such needs. Once the evidence is propagated, a single fragment is selected for each of the domain system compo¬ nents and hence, the final model of the entire system is constructed. Finally, a highly configurable XML-based mechanism is employed to extract explanation content from the newly formulated model and to structure the explanatory sentences for the final explanation that will be communicated to the user. The framework is illustratively applied to a number of domain systems and is compared qualitatively to existing compositional modelling methodologies. A further empirical assessment of the performance of the evidence propagation algorithm is carried out to determine its performance limits. Performance is measured against the number of frag¬ ments that represent each of the components of a large domain system, and the amount of connectivity permitted in the Bayesian network between the nodes that stand for the selection or rejection of these fragments. Based on this assessment recommenda¬ tions are made as to how the framework may be optimised to cope with real world applications

    Unsupervised Extraction of Representative Concepts from Scientific Literature

    Full text link
    This paper studies the automated categorization and extraction of scientific concepts from titles of scientific articles, in order to gain a deeper understanding of their key contributions and facilitate the construction of a generic academic knowledgebase. Towards this goal, we propose an unsupervised, domain-independent, and scalable two-phase algorithm to type and extract key concept mentions into aspects of interest (e.g., Techniques, Applications, etc.). In the first phase of our algorithm we propose PhraseType, a probabilistic generative model which exploits textual features and limited POS tags to broadly segment text snippets into aspect-typed phrases. We extend this model to simultaneously learn aspect-specific features and identify academic domains in multi-domain corpora, since the two tasks mutually enhance each other. In the second phase, we propose an approach based on adaptor grammars to extract fine grained concept mentions from the aspect-typed phrases without the need for any external resources or human effort, in a purely data-driven manner. We apply our technique to study literature from diverse scientific domains and show significant gains over state-of-the-art concept extraction techniques. We also present a qualitative analysis of the results obtained.Comment: Published as a conference paper at CIKM 201
    corecore