152 research outputs found

    Exact Inference Techniques for the Analysis of Bayesian Attack Graphs

    Get PDF
    Attack graphs are a powerful tool for security risk assessment by analysing network vulnerabilities and the paths attackers can use to compromise network resources. The uncertainty about the attacker's behaviour makes Bayesian networks suitable to model attack graphs to perform static and dynamic analysis. Previous approaches have focused on the formalization of attack graphs into a Bayesian model rather than proposing mechanisms for their analysis. In this paper we propose to use efficient algorithms to make exact inference in Bayesian attack graphs, enabling the static and dynamic network risk assessments. To support the validity of our approach we have performed an extensive experimental evaluation on synthetic Bayesian attack graphs with different topologies, showing the computational advantages in terms of time and memory use of the proposed techniques when compared to existing approaches.Comment: 14 pages, 15 figure

    Learning tractable multidimensional Bayesian network classifiers

    Get PDF
    Multidimensional classification has become one of the most relevant topics in view of the many domains that require a vector of class values to be assigned to a vector of given features. The popularity of multidimensional Bayesian network classifiers has increased in the last few years due to their expressive power and the existence of methods for learning different families of these models. The problem with this approach is that the computational cost of using the learned models is usually high, especially if there are a lot of class variables. Class-bridge decomposability means that the multidimensional classification problem can be divided into multiple subproblems for these models. In this paper, we prove that class-bridge decomposability can also be used to guarantee the tractability of the models. We also propose a strategy for efficiently bounding their inference complexity, providing a simple learning method with an order-based search that obtains tractable multidimensional Bayesian network classifiers. Experimental results show that our approach is competitive with other methods in the state of the art and ensures the tractability of the learned models

    Probabilistic Independence Networks for Hidden Markov Probability Models

    Get PDF
    Graphical techniques for modeling the dependencies of randomvariables have been explored in a variety of different areas includingstatistics, statistical physics, artificial intelligence, speech recognition, image processing, and genetics.Formalisms for manipulating these models have been developedrelatively independently in these research communities. In this paper weexplore hidden Markov models (HMMs) and related structures within the general framework of probabilistic independencenetworks (PINs). The paper contains a self-contained review of the basic principles of PINs.It is shown that the well-known forward-backward (F-B) and Viterbialgorithms for HMMs are special cases of more general inference algorithms forarbitrary PINs. Furthermore, the existence of inference and estimationalgorithms for more general graphical models provides a set of analysistools for HMM practitioners who wish to explore a richer class of HMMstructures.Examples of relatively complex models to handle sensorfusion and coarticulationin speech recognitionare introduced and treated within the graphical model framework toillustrate the advantages of the general approach

    Graphical models for social behavior modeling in face-to face interaction

    No full text
    International audienceThe goal of this paper is to model the coverbal behavior of a subject involved in face-to-face social interactions. For this end, we present a multimodal behavioral model based on a Dynamic Bayesian Network (DBN). The model was inferred from multimodal data of interacting dyads in a specific scenario designed to foster mutual attention and multimodal deixis of objects and places in a collaborative task. The challenge for this behavioral model is to generate coverbal actions (gaze, hand gestures) for the subject given his verbal productions, the current phase of the interaction and the perceived actions of the partner. In our work, the structure of the DBN was learned from data, which revealed an interesting causality graph describing precisely how verbal and coverbal human behaviors are coordinated during the studied interactions. Using this structure, DBN exhibits better performances compared to classical baseline models such as Hidden Markov Models (HMMs) and Hidden Semi-Markov Models (HSMMs). We outperform the baseline in both measures of performance, i.e. interaction unit recognition and behavior generation. DBN also reproduces more faithfully the coordination patterns between modalities observed in ground truth compared to the baseline models

    Scalable Statistical Modeling and Query Processing over Large Scale Uncertain Databases

    Get PDF
    The past decade has witnessed a large number of novel applications that generate imprecise, uncertain and incomplete data. Examples include monitoring infrastructures such as RFIDs, sensor networks and web-based applications such as information extraction, data integration, social networking and so on. In my dissertation, I addressed several challenges in managing such data and developed algorithms for efficiently executing queries over large volumes of such data. Specifically, I focused on the following challenges. First, for meaningful analysis of such data, we need the ability to remove noise and infer useful information from uncertain data. To address this challenge, I first developed a declarative system for applying dynamic probabilistic models to databases and data streams. The output of such probabilistic modeling is probabilistic data, i.e., data annotated with probabilities of correctness/existence. Often, the data also exhibits strong correlations. Although there is prior work in managing and querying such probabilistic data using probabilistic databases, those approaches largely assume independence and cannot handle probabilistic data with rich correlation structures. Hence, I built a probabilistic database system that can manage large-scale correlations and developed algorithms for efficient query evaluation. Our system allows users to provide uncertain data as input and to specify arbitrary correlations among the entries in the database. In the back end, we represent correlations as a forest of junction trees, an alternative representation for probabilistic graphical models (PGM). We execute queries over the probabilistic database by transforming them into message passing algorithms (inference) over the junction tree. However, traditional algorithms over junction trees typically require accessing the entire tree, even for small queries. Hence, I developed an index data structure over the junction tree called INDSEP that allows us to circumvent this process and thereby scalably evaluate inference queries, aggregation queries and SQL queries over the probabilistic database. Finally, query evaluation in probabilistic databases typically returns output tuples along with their probability values. However, the existing query evaluation model provides very little intuition to the users: for instance, a user might want to know Why is this tuple in my result? or Why does this output tuple have such high probability? or Which are the most influential input tuples for my query ?'' Hence, I designed a query evaluation model, and a suite of algorithms, that provide users with explanations for query results, and enable users to perform sensitivity analysis to better understand the query results

    Understanding the Scalability of Bayesian Network Inference Using Clique Tree Growth Curves

    Get PDF
    One of the main approaches to performing computation in Bayesian networks (BNs) is clique tree clustering and propagation. The clique tree approach consists of propagation in a clique tree compiled from a Bayesian network, and while it was introduced in the 1980s, there is still a lack of understanding of how clique tree computation time depends on variations in BN size and structure. In this article, we improve this understanding by developing an approach to characterizing clique tree growth as a function of parameters that can be computed in polynomial time from BNs, specifically: (i) the ratio of the number of a BN s non-root nodes to the number of root nodes, and (ii) the expected number of moral edges in their moral graphs. Analytically, we partition the set of cliques in a clique tree into different sets, and introduce a growth curve for the total size of each set. For the special case of bipartite BNs, there are two sets and two growth curves, a mixed clique growth curve and a root clique growth curve. In experiments, where random bipartite BNs generated using the BPART algorithm are studied, we systematically increase the out-degree of the root nodes in bipartite Bayesian networks, by increasing the number of leaf nodes. Surprisingly, root clique growth is well-approximated by Gompertz growth curves, an S-shaped family of curves that has previously been used to describe growth processes in biology, medicine, and neuroscience. We believe that this research improves the understanding of the scaling behavior of clique tree clustering for a certain class of Bayesian networks; presents an aid for trade-off studies of clique tree clustering using growth curves; and ultimately provides a foundation for benchmarking and developing improved BN inference and machine learning algorithms

    Statistical and image processing techniques for remote sensing in agricultural monitoring and mapping

    Get PDF
    Throughout most of history, increasing agricultural production has been largely driven by expanded land use, and – especially in the 19th and 20th century – by technological innovation in breeding, genetics and agrochemistry as well as intensification through mechanization and industrialization. More recently, information technology, digitalization and automation have started to play a more significant role in achieving higher productivity with lower environmental impact and reduced use of resources. This includes two trends on opposite scales: precision farming applying detailed observations on sub-field level to support local management, and large-scale agricultural monitoring observing regional patterns in plant health and crop productivity to help manage macroeconomic and environmental trends. In both contexts, remote sensing imagery plays a crucial role that is growing due to decreasing costs and increasing accessibility of both data and means of processing and analysis. The large archives of free imagery with global coverage, can be expected to further increase adoption of remote sensing techniques in coming years. This thesis addresses multiple aspects of remote sensing in agriculture by presenting new techniques in three distinct research topics: (1) remote sensing data assimilation in dynamic crop models; (2) agricultural field boundary detection from remote sensing observations; and (3) contour extraction and field polygon creation from remote sensing imagery. These key objectives are achieved through combining methods of probability analysis, uncertainty quantification, evolutionary learning and swarm intelligence, graph theory, image processing, deep learning and feature extraction. Four new techniques have been developed. Firstly, a new data assimilation technique based on statistical distance metrics and probability distribution analysis to achieve a flexible representation of model- and measurement-related uncertainties. Secondly, a method for detecting boundaries of agricultural fields based on remote sensing observations designed to only rely on image-based information in multi-temporal imagery. Thirdly, an improved boundary detection approach based on deep learning techniques and a variety of image features. Fourthly, a new active contours method called Graph-based Growing Contours (GGC) that allows automatized extractionof complex boundary networks from imagery. The new approaches are tested and evaluated on multiple study areas in the states of Schleswig-Holstein, Niedersachsen and Sachsen-Anhalt, Germany, based on combine harvester measurements, cadastral data and manual mappings. All methods were designed with flexibility and applicability in mind. They proved to perform similarly or better than other existing methods and showed potential for large-scale application and their synergetic use. Thanks to low data requirements and flexible use of inputs, their application is neither constrained to the specific applications presented here nor the use of a specific type of sensor or imagery. This flexibility, in theory, enables their use even outside of the field of remote sensing.Landwirtschaftliche Produktivitätssteigerung wurde historisch hauptsächlich durch Erschließung neuer Anbauflächen und später, insbesondere im 19. und 20. Jahrhundert, durch technologische Innovation in Züchtung, Genetik und Agrarchemie sowie Intensivierung in Form von Mechanisierung und Industrialisierung erreicht. In jüngerer Vergangenheit spielen jedoch Informationstechnologie, Digitalisierung und Automatisierung zunehmend eine größere Rolle, um die Produktivität bei reduziertem Umwelteinfluss und Ressourcennutzung weiter zu steigern. Daraus folgen zwei entgegengesetzte Trends: Zum einen Precision Farming, das mithilfe von Detailbeobachtungen die lokale Feldarbeit unterstützt, und zum anderen großskalige landwirtschaftliche Beobachtung von Bestands- und Ertragsmustern zur Analyse makroökonomischer und ökologischer Trends. In beiden Fällen spielen Fernerkundungsdaten eine entscheidende Rolle und gewinnen dank sinkender Kosten und zunehmender Verfügbarkeit, sowohl der Daten als auch der Möglichkeiten zu ihrer Verarbeitung und Analyse, weiter an Bedeutung. Die Verfügbarkeit großer, freier Archive von globaler Abdeckung werden in den kommenden Jahren voraussichtlich zu einer zunehmenden Verwendung führen. Diese Dissertation behandelt mehrere Aspekte der Fernerkundungsanwendung in der Landwirtschaft und präsentiert neue Methoden zu drei Themenbereichen: (1) Assimilation von Fernerkundungsdaten in dynamischen Agrarmodellen; (2) Erkennung von landwirtschaftlichen Feldgrenzen auf Basis von Fernerkundungsbeobachtungen; und (3) Konturextraktion und Erstellung von Polygonen aus Fernerkundungsaufnahmen. Zur Bearbeitung dieser Zielsetzungen werden verschiedene Techniken aus der Wahrscheinlichkeitsanalyse, Unsicherheitsquantifizierung, dem evolutionären Lernen und der Schwarmintelligenz, der Graphentheorie, dem Bereich der Bildverarbeitung, Deep Learning und Feature-Extraktion kombiniert. Es werden vier neue Methoden vorgestellt. Erstens, eine neue Methode zur Datenassimilation basierend auf statistischen Distanzmaßen und Wahrscheinlichkeitsverteilungen zur flexiblen Abbildung von Modell- und Messungenauigkeiten. Zweitens, eine neue Technik zur Erkennung von Feldgrenzen, ausschließlich auf Basis von Bildinformationen aus multi-temporalen Fernerkundungsdaten. Drittens, eine verbesserte Feldgrenzenerkennung basierend auf Deep Learning Methoden und verschiedener Bildmerkmale. Viertens, eine neue Aktive Kontur Methode namens Graph-based Growing Contours (GGC), die es erlaubt, komplexe Netzwerke von Konturen aus Bildern zu extrahieren. Alle neuen Ansätze werden getestet und evaluiert anhand von Mähdreschermessungen, Katasterdaten und manuellen Kartierungen in verschiedenen Testregionen in den Bundesländern Schleswig-Holstein, Niedersachsen und Sachsen-Anhalt. Alle vorgestellten Methoden sind auf Flexibilität und Anwendbarkeit ausgelegt. Im Vergleich zu anderen Methoden zeigten sie vergleichbare oder bessere Ergebnisse und verdeutlichten das Potenzial zur großskaligen Anwendung sowie kombinierter Verwendung. Dank der geringen Anforderungen und der flexiblen Verwendung verschiedener Eingangsdaten ist die Nutzung nicht nur auf die hier beschriebenen Anwendungen oder bestimmte Sensoren und Bilddaten beschränkt. Diese Flexibilität erlaubt theoretisch eine breite Anwendung, auch außerhalb der Fernerkundung
    • …
    corecore