70 research outputs found

    Learning Topologies of Acyclic Networks with Tree Structures

    Network topology identification is the process of revealing the interconnections of a network in which each node represents an atomic entity in a complex system. It is an important topic in the study of dynamic networks, with broad applications across scientific fields. The study of tree-structured networks is significant in its own right, since a large body of scientific work is devoted to them and techniques targeting trees can often be extended to more general structures. This dissertation considers the problem of learning the unknown structure of a network when the underlying topology is a directed tree, that is, when it contains no cycles. The first result of this dissertation is an algorithm that consistently learns a tree structure when only a subset of the nodes is observed, given that the unobserved nodes satisfy certain degree conditions. This method uses an additive metric and statistics of the observed data only up to the second order. As shown, an additive metric can always be defined for networks with special dynamics, for example when the dynamics are linear. In generic networks, however, additive metrics cannot always be defined. Thus, we derive a second result that solves the same problem but requires statistics of the observed data up to the third order, as well as stronger degree conditions on the unobserved nodes. Moreover, for both cases, the same degree conditions are shown to be necessary for a consistent reconstruction, so the methods achieve the fundamental limitations. The third result of this dissertation provides a technique to approximate a complex network by a simpler one when the assumption of linearity is exploited. The goal of this approximation is to highlight the most significant connections, which could reveal more information about the network. 
To show the reliability of this method, we consider high-frequency financial data and show how well businesses are clustered together according to their sector.
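
As a rough illustration of what an additive metric built from second-order statistics can look like, the sketch below uses the negative log of the absolute correlation on a static linear chain x1 -> x2 -> x3. This choice of distance, the chain, and the names `chain_statistics` and `log_correlation_distance` are assumptions made for illustration; the abstract does not specify the metric used in the dissertation, which addresses dynamic networks.

```python
import math

def chain_statistics(a, b):
    """Exact variances and covariances for the hypothetical chain
    x1 = e1, x2 = a*x1 + e2, x3 = b*x2 + e3,
    where e1, e2, e3 are independent with unit variance."""
    var1 = 1.0
    var2 = a * a * var1 + 1.0
    var3 = b * b * var2 + 1.0
    var = {1: var1, 2: var2, 3: var3}
    cov = {(1, 2): a * var1, (2, 3): b * var2, (1, 3): a * b * var1}
    return var, cov

def log_correlation_distance(var, cov, i, j):
    """Assumed distance: d(i, j) = -log|corr(x_i, x_j)|."""
    rho = cov[(i, j)] / math.sqrt(var[i] * var[j])
    return -math.log(abs(rho))

var, cov = chain_statistics(a=0.8, b=-0.5)
d12 = log_correlation_distance(var, cov, 1, 2)
d23 = log_correlation_distance(var, cov, 2, 3)
d13 = log_correlation_distance(var, cov, 1, 3)

# Additivity along the path 1 -> 2 -> 3: d13 equals d12 + d23.
print(d13, d12 + d23)
```

In this toy case the correlation multiplies along the path, so its negative log adds, which is the property that lets path lengths between observed nodes constrain where hidden nodes can sit.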

    Automatic Segmentation of Cells of Different Types in Fluorescence Microscopy Images

    Recognition of different cell compartments, types of cells, and their interactions is a critical aspect of quantitative cell biology. It provides valuable insight into cellular and subcellular interactions and the mechanisms of biological processes such as cancer cell dissemination, organ development, and wound healing. Quantitative analysis of cell images is also the mainstay of numerous clinical diagnostic and grading procedures, for example in cancer and in immunological, infectious, heart, and lung disease. Automating the quantification of cellular biological samples requires segmenting different cellular and subcellular structures in microscopy images. However, automating this problem has proven to be non-trivial and requires solving multi-class image segmentation tasks that are challenging owing to the high similarity of objects from different classes and irregularly shaped structures. This thesis focuses on the development and application of probabilistic graphical models to multi-class cell segmentation. Graphical models can improve segmentation accuracy through their ability to exploit prior knowledge and model inter-class dependencies. Directed acyclic graphs, such as trees, have been widely used to model top-down statistical dependencies as a prior for improved image segmentation. However, trees can capture only a few inter-class constraints. To overcome this limitation, this thesis proposes polytree graphical models, which capture label proximity relations more naturally than tree-based approaches. Polytrees can effectively impose prior knowledge on the inclusion of different classes by capturing both same-level and across-level dependencies. A novel recursive mechanism based on two-pass message passing is developed to efficiently calculate closed-form posteriors of graph nodes on polytrees. 
Furthermore, since an accurate and sufficiently large ground truth is not always available for training segmentation algorithms, a weakly supervised framework is developed that employs polytrees for multi-class segmentation and reduces the need for training by modeling prior knowledge during segmentation. After a hierarchical graph is generated for the superpixels in the image, the labels of nodes are inferred through a novel, efficient message-passing algorithm, and the model parameters are optimized with Expectation Maximization (EM). Evaluation on the segmentation of simulated data and multiple publicly available fluorescence microscopy datasets indicates that the proposed method outperforms the state of the art. The proposed method has also been assessed in predicting possible segmentation error and has been shown to outperform trees. This can pave the way to calculating uncertainty measures on the resulting segmentation and guiding subsequent segmentation refinement, which can be useful in the development of an interactive segmentation framework.

    Contributions to Vine-Copula Modeling

    144 p. Regular vine-copula models (R-vines) are a powerful statistical tool for modeling the dependence structure of multivariate distribution functions. In particular, they allow modeling different types of dependencies among random variables independently of their marginal distributions, which is deemed the most valued characteristic of these models. In this thesis, we investigate the theoretical properties of R-vines for representing dependencies and extend their use to solve supervised classification problems. We focus on three research directions. In the first line of research, the relationship between the graphical representations of R-vines and Bayesian polytree networks is analyzed in terms of how conditional pairwise independence relationships are represented by both models. To do that, we use an extended graphical representation of R-vines in which the R-vine graph is endowed with further expressiveness, making it possible to distinguish between edges representing independence and dependence relationships. Using this representation, a separation criterion in the R-vine graph, called R-separation, is defined. The proposed criterion is used in designing methods for building the graphical structure of polytrees from that of R-vines, and vice versa. Moreover, possible correspondences between the R-vine graph and the associated R-vine copula, as well as different properties of R-separation, are analyzed. In the second research line, we design methods for learning the graphical structure of R-vines from dependence lists. The main challenge of this task lies in the extremely large size of the search space of all possible R-vine structures. We provide two strategies to solve the problem of learning R-vines that represent the largest number of dependencies in a list. The first approach is a 0-1 linear programming formulation for building truncated R-vines with only two trees. 
The second approach is an evolutionary algorithm, which is able to learn complete and truncated R-vines. Experimental results show the success of this strategy in solving the optimization problem posed. In the third research line, we introduce a supervised classification approach where the dependence structure of the problem features is modeled through R-vines. The efficacy of these classifiers is validated in a mental decoding problem and in an image recognition task. While R-vines have been extensively applied in fields such as economics, finance, and statistics, only recently have they found their place in classification tasks. This contribution represents a step forward in understanding R-vines and the prospect of extending their use to other machine learning tasks.

    Exact Inference Techniques for the Analysis of Bayesian Attack Graphs

    Attack graphs are a powerful tool for security risk assessment: they analyse network vulnerabilities and the paths attackers can use to compromise network resources. The uncertainty about the attacker's behaviour makes Bayesian networks suitable for modelling attack graphs and performing static and dynamic analysis. Previous approaches have focused on formalizing attack graphs as a Bayesian model rather than on proposing mechanisms for their analysis. In this paper we propose to use efficient algorithms for exact inference in Bayesian attack graphs, enabling static and dynamic network risk assessment. To support the validity of our approach, we have performed an extensive experimental evaluation on synthetic Bayesian attack graphs with different topologies, showing the computational advantages of the proposed techniques in time and memory use compared to existing approaches. Comment: 14 pages, 15 figures

    Advanced correlation-based character recognition applied to the Archimedes Palimpsest

    The Archimedes Palimpsest is a manuscript containing the partial text of seven treatises by Archimedes that were copied onto parchment and bound in the tenth century AD. This work is aimed at providing tools that allow scholars of ancient Greek mathematics to retrieve as much information as possible from images of the remaining degraded text. A correlation pattern recognition (CPR) system has been developed to recognize distorted versions of Greek characters in problematic regions of the palimpsest imagery, which have been obscured by damage from mold and fire, overtext, and natural aging. Feature vectors for each class of characters are constructed using a series of spatial correlation algorithms and corresponding performance metrics. Principal components analysis (PCA) is employed prior to classification to remove features corresponding to filtering schemes that performed poorly for the spatial characteristics of the selected region of interest (ROI). A probability is then assigned to each class, forming a character probability distribution based on relative distances from the class feature vectors to the ROI feature vector in principal component (PC) space. However, the current CPR system does not produce a single classification decision, as is common in most target detection problems, but instead has been designed to provide intermediate results that allow the user to apply his or her own decisions (or evidence) to arrive at a conclusion. To achieve this result, a probabilistic network has been incorporated into the recognition system. A probabilistic network is a method for modeling the uncertainty in a system, and for this application, it allows information from the existing partial transcription and contextual knowledge from the user to be an integral part of the decision-making process. 
The CPR system was designed to provide a framework for future research in the area of spatial pattern recognition by accommodating a broad range of applications and the development of new filtering methods. For example, during preliminary testing, the CPR system was used to confirm the publication date of a fifteenth-century Hebrew colophon, and demonstrated success in the detection of registration markers in three-dimensional MRI breast imaging. In addition, a new correlation algorithm that exploits the benefits of linear discriminant analysis (LDA) and the inherent shift invariance of spatial correlation has been derived, implemented, and tested. Results show that this composite filtering method provides a high level of class discrimination while maintaining tolerance to within-class distortions. With the integration of this algorithm into the existing filter library, this work completes each stage of a cyclic workflow using the developed CPR system, and provides the necessary tools for continued experimentation.

    Graphical models and message-passing algorithms for network-constrained decision problems

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. [201]-210). Inference problems, typically posed as the computation of summarizing statistics (e.g., marginals, modes, means, likelihoods), arise in a variety of scientific fields and engineering applications. Probabilistic graphical models provide a scalable framework for developing efficient inference methods, such as message-passing algorithms that exploit the conditional independencies encoded by the given graph. Conceptually, this framework extends naturally to a distributed network setting: by associating to each node and edge in the graph a distinct sensor and communication link, respectively, the iterative message-passing algorithms are equivalent to a sequence of purely local computations and nearest-neighbor communications. Practically, modern sensor networks can also involve distributed resource constraints beyond those satisfied by existing message-passing algorithms, including, e.g., a fixed small number of iterations, the presence of low-rate or unreliable links, or a communication topology that differs from the probabilistic graph. The principal focus of this thesis is to augment the optimization problems from which existing message-passing algorithms are derived, explicitly taking into account that there may be decision-driven processing objectives as well as constraints or costs on available network resources. The resulting problems continue to be NP-hard, in general, but under certain conditions become amenable to an established team-theoretic relaxation technique by which a new class of efficient message-passing algorithms can be derived. 
From the academic perspective, this thesis marks the intersection of two lines of active research, namely approximate inference methods for graphical models and decentralized Bayesian methods for multi-sensor detection. The respective primary contributions are new message-passing algorithms for (i) "online" measurement processing in which global decision performance degrades gracefully as network constraints become arbitrarily severe and for (ii) "offline" strategy optimization that remains tractable in a larger class of detection objectives and network constraints than previously considered. From the engineering perspective, the analysis and results of this thesis both expose fundamental issues in distributed sensor systems and advance the development of so-called "self-organizing fusion-layer" protocols compatible with emerging concepts in ad-hoc wireless networking. by O. Patrick Kreidl. Ph.D.

    Bayesian Polytrees With Learned Deep Features for Multi-Class Cell Segmentation.

    The recognition of different cell compartments, the types of cells, and their interactions is a critical aspect of quantitative cell biology. However, automating this problem has proven to be non-trivial and requires solving multi-class image segmentation tasks that are challenging owing to the high similarity of objects from different classes and irregularly shaped structures. To alleviate this, graphical models are useful due to their ability to make use of prior knowledge and model inter-class dependences. Directed acyclic graphs, such as trees, have been widely used to model top-down statistical dependences as a prior for improved image segmentation. However, trees can capture only a few inter-class constraints. To overcome this limitation, we propose polytree graphical models that capture label proximity relations more naturally than tree-based approaches. A novel recursive mechanism based on two-pass message passing was developed to efficiently calculate closed-form posteriors of graph nodes on polytrees. The algorithm is evaluated on simulated data and on two publicly available fluorescence microscopy datasets, outperforming directed trees and three state-of-the-art convolutional neural networks, namely, SegNet, DeepLab, and PSPNet. Polytrees are shown to outperform directed trees in predicting segmentation error by highlighting areas in the segmented image that do not comply with prior knowledge. This paves the way to uncertainty measures on the resulting segmentation and can guide subsequent segmentation refinement.
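
For readers unfamiliar with two-pass message passing, the sketch below runs the standard sum-product scheme on a small directed tree, which is the baseline the abstract compares against, not the polytree mechanism the paper proposes. The three-node tree, the binary labels, and every number in `PRIOR`, `CPT`, and `EVIDENCE` are hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical toy model: root 0 with children 1 and 2, binary labels.
PRIOR = [0.6, 0.4]                         # P(x0)
CPT = [[0.9, 0.1], [0.2, 0.8]]             # CPT[p][c] = P(child = c | parent = p)
EVIDENCE = {0: [1.0, 1.0], 1: [0.7, 0.3], 2: [0.1, 0.9]}  # local likelihoods
CHILDREN = {0: [1, 2], 1: [], 2: []}
STATES = range(2)

def normalize(values):
    total = sum(values)
    return [v / total for v in values]

def two_pass_posteriors():
    lam, msg_up, pi, post = {}, {}, {0: PRIOR}, {}

    def upward(node):
        # lam[node][s]: likelihood of all evidence in node's subtree given node = s.
        belief = list(EVIDENCE[node])
        for child in CHILDREN[node]:
            upward(child)
            msg_up[child] = [sum(CPT[p][s] * lam[child][s] for s in STATES)
                             for p in STATES]
            belief = [belief[p] * msg_up[child][p] for p in STATES]
        lam[node] = belief

    def downward(node):
        post[node] = normalize([pi[node][s] * lam[node][s] for s in STATES])
        for child in CHILDREN[node]:
            # Parent belief with the child's own upward message divided out.
            rest = [pi[node][p] * lam[node][p] / msg_up[child][p] for p in STATES]
            pi[child] = [sum(rest[p] * CPT[p][s] for p in STATES) for s in STATES]
            downward(child)

    upward(0)
    downward(0)
    return post

def brute_force_posteriors():
    # Enumerate all joint labelings as a reference for the message-passing result.
    joint = {}
    for x in product(STATES, repeat=3):
        p = PRIOR[x[0]] * CPT[x[0]][x[1]] * CPT[x[0]][x[2]]
        for node in range(3):
            p *= EVIDENCE[node][x[node]]
        joint[x] = p
    z = sum(joint.values())
    return {n: [sum(p for x, p in joint.items() if x[n] == s) / z for s in STATES]
            for n in range(3)}

posteriors = two_pass_posteriors()
reference = brute_force_posteriors()
```

The upward pass collects evidence from the leaves toward the root and the downward pass distributes it back, so each node's posterior is obtained in two sweeps rather than by enumerating every joint labeling.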

    Fault propagation, detection and analysis in process systems

    Process systems are often complicated and liable to experience faults and their effects. Faults can adversely affect the safety of the plant, its environmental impact, and its economic operation. As such, fault diagnosis in process systems is an active area of research and development in both academia and industry. The work reported in this thesis contributes to fault diagnosis by exploring the modelling and analysis of fault propagation and detection in process systems. This is done by posing and answering three research questions. What are the necessary ingredients of a fault diagnosis model? What information should a fault diagnosis model yield? Finally, what types of model are appropriate to fault diagnosis? To answer these questions, the research assumes that the behaviour of a process system arises from its causal structure. On this basis, the research presented in this thesis develops a two-level approach to fault diagnosis based on detailed process information and on modelling and analysis techniques for representing causality. In the first instance, a qualitative approach called a level 1 fusion is developed. The level 1 fusion models the detailed causality of the system using digraphs; it is a causal map of the process. Such causal maps can be searched to discover and analyse fault propagation paths through the process. Building directly on the level 1 fusion, a quantitative level 2 fusion is developed, which uses a type of digraph called a Bayesian network. By associating process variables with fault variables, and using conditional probability theory, it is shown how measured effects can be used to calculate and rank the probability of candidate causes. The novel contribution is the development of a systematic approach to fault diagnosis based on modelling the chemistry, physics, and architecture of the process. 
It is also shown how the control and instrumentation system constrains the causality of the process. By demonstrating how digraph models can be reversed, it is shown how both cause-to-effect and effect-to-cause analysis can be carried out. In answering the three research questions, this research shows that it is feasible to gain detailed insights into fault propagation by qualitatively modelling the physical causality of the process system. It is also shown that a qualitative fault diagnosis model can be used as the basis for a quantitative fault diagnosis model. Open Access
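
The step of ranking candidate causes from a measured effect can be illustrated with a direct application of Bayes' rule. This is a minimal sketch under strong assumptions, not the thesis's level 2 fusion: it assumes exactly one candidate (including "no_fault") is true at a time, and the fault names, priors, and likelihoods are hypothetical.

```python
def rank_causes(priors, likelihoods):
    """Rank mutually exclusive candidate causes by posterior probability.

    priors[c]      -- P(cause = c), summing to 1 over all candidates
    likelihoods[c] -- P(observed effect | cause = c)
    """
    joint = {cause: priors[cause] * likelihoods[cause] for cause in priors}
    evidence = sum(joint.values())  # P(observed effect)
    posterior = {cause: value / evidence for cause, value in joint.items()}
    return sorted(posterior.items(), key=lambda item: item[1], reverse=True)

# Hypothetical numbers for a single observed effect, e.g. a low-flow alarm.
priors = {"valve_stuck": 0.02, "sensor_drift": 0.05,
          "pump_failure": 0.01, "no_fault": 0.92}
likelihoods = {"valve_stuck": 0.9, "sensor_drift": 0.3,
               "pump_failure": 0.6, "no_fault": 0.01}

ranking = rank_causes(priors, likelihoods)
for cause, probability in ranking:
    print(f"{cause}: {probability:.3f}")
```

A full Bayesian network would replace the single likelihood table with conditional probabilities attached to the edges of the causal digraph, but the effect-to-cause direction of the calculation is the same.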