52 research outputs found

    Advances in Learning Bayesian Networks of Bounded Treewidth

    This work presents novel algorithms for learning Bayesian network structures with bounded treewidth. Both exact and approximate methods are developed. The exact method combines mixed-integer linear programming formulations for structure learning and treewidth computation. The approximate method consists in uniformly sampling k-trees (maximal graphs of treewidth k), and subsequently selecting, exactly or approximately, the best structure whose moral graph is a subgraph of that k-tree. Some properties of these methods are discussed and proven. The approaches are empirically compared to each other and to a state-of-the-art method for learning bounded treewidth structures on a collection of public data sets with up to 100 variables. The experiments show that our exact algorithm outperforms the state of the art, and that the approximate approach is fairly accurate. Comment: 23 pages, 2 figures, 3 tables
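To make the notion of a k-tree concrete, here is a minimal Python sketch of the standard recursive construction: start from a (k+1)-clique and repeatedly attach a new vertex to an existing k-clique. The function name `random_k_tree` and its parameters are hypothetical. Note that picking the attachment clique at random, as done here, does not sample k-trees uniformly; the abstract's uniform sampling needs a dedicated scheme that is not reproduced here.

```python
import itertools
import random


def random_k_tree(n, k, seed=None):
    """Build a k-tree on vertices 0..n-1 (illustrative, NOT uniform sampling).

    Returns the edge set as tuples (u, v) with u < v.
    """
    if k < 1 or n < k + 1:
        raise ValueError("need k >= 1 and n >= k + 1")
    rng = random.Random(seed)
    # Start from a complete graph on the first k + 1 vertices.
    edges = set(itertools.combinations(range(k + 1), 2))
    # Every k-subset of that root clique is a k-clique we may attach to.
    k_cliques = [frozenset(c) for c in itertools.combinations(range(k + 1), k)]
    for v in range(k + 1, n):
        base = rng.choice(k_cliques)
        # Connect the new vertex to every vertex of the chosen k-clique.
        for u in base:
            edges.add((u, v))  # u < v because v is the newest vertex
        # Each (base minus one vertex) plus v is a new k-clique.
        for u in base:
            k_cliques.append((base - {u}) | {v})
    return edges
```

A k-tree on n vertices always has k(k+1)/2 + (n-k-1)k edges, which gives a quick sanity check on the output.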

    From 'tree' based Bayesian networks to mutual information classifiers : deriving a singly connected network classifier using an information theory based technique

    For reasoning under uncertainty the Bayesian network has become the representation of choice. However, except where models are considered 'simple', the tasks of construction and inference are provably NP-hard. For modelling larger 'real' world problems this computational complexity has been addressed by methods that approximate the model. The Naive Bayes classifier, which has strong assumptions of independence among features, is a common approach, whilst the class of trees is another less extreme example. In this thesis we propose the use of an information theory based technique as a mechanism for inference in Singly Connected Networks. We call this a Mutual Information Measure classifier, as it corresponds to the restricted class of trees built from mutual information. We show that the new approach provides for both an efficient and localised method of classification, with performance accuracies comparable with the less restricted general Bayesian networks. To improve the performance of the classifier, we additionally investigate the possibility of expanding the class Markov blanket by use of a Wrapper approach and further show that the performance can be improved by focusing on the class Markov blanket and that the improvement is not at the expense of increased complexity. Finally, the two methods are applied to the task of diagnosing the 'real' world medical domain, Acute Abdominal Pain. Known to be both a different and challenging domain to classify, the objective was to investigate the optimality claims, in respect of the Naive Bayes classifier, that some researchers have argued, for classifying in this domain. Despite some loss of representation capabilities we show that the Mutual Information Measure classifier can be effectively applied to the domain and also provides a recognisable qualitative structure without violating 'real' world assertions. In respect of its 'selective' variant we further show that the improvement achieves a comparable predictive accuracy to the Naive Bayes classifier and that the Naive Bayes classifier's 'overall' performance is largely due to the contribution of the majority group Non-Specific Abdominal Pain, a group of exclusion
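As an illustration of what "trees built from mutual information" can look like, the sketch below estimates pairwise mutual information from discrete data and keeps a maximum-weight spanning tree over the variables (a Chow-Liu-style construction). This is an assumed, generic technique, not the thesis's Mutual Information Measure classifier; the function names `mutual_information` and `max_spanning_tree` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log( p(x,y) / (p(x) p(y)) ), written with raw counts.
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def max_spanning_tree(columns):
    """Kruskal's algorithm on pairwise mutual information.

    columns: list of equally long discrete sequences, one per variable.
    Returns a list of undirected edges (i, j) with i < j.
    """
    m = len(columns)
    weighted = sorted(
        ((mutual_information(columns[i], columns[j]), i, j)
         for i in range(m) for j in range(i + 1, m)),
        reverse=True,
    )
    parent = list(range(m))  # union-find forest

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    tree = []
    for _, i, j in weighted:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge creates no cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

For classification, one would typically treat the class variable as one of the columns and orient the resulting tree away from it; that step is omitted here.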

    Methodological developments for probabilistic risk analyses of socio-technical systems

    Nowadays, the risk analysis of critical systems cannot be conducted from a technical point of view alone; several major accidents have changed this initial way of thinking. As a result, numerous methods exist for studying risks that consider the main system resources: the technical process, the operator constraining this process, and the organisation conditioning human actions. However, few works propose to use these different methods jointly to study risks in a global approach. This paper presents a methodology, under development between CRAN, EDF and INERIS, that integrates these different methods to estimate risks probabilistically. The integration is based on unifying and structuring knowledge concepts, and the quantitative aspect is achieved through the use of Bayesian Networks. An application of this methodology to an industrial case demonstrates its feasibility and illustrates the model's capacities: taking into account the whole set of causes when treating a system weakness, and ranking these contributors by their criticality for the system. This tool can thus be used to help decision makers prioritise their actions

    Event-network model of destructive processes for a risk-oriented real-time decision support system

    A formal plausible model of a destructive process in geoecotechnosystems is described; these are territorial systems discretized by a grid of equal-sized cells. The proposed formalism allows different plausibility estimates (fuzzy, probabilistic, or rough) to be combined within a single structure. Purpose. The purpose of the work is to develop a formal plausible model of destructive process propagation, suitable for countering natural emergencies in real-time decision-making systems. Method. The authors used an event-based approach to develop the plausible model of the destructive process. Fuzzy, probabilistic, and rough-set methods were used to assess the likelihood of cell transitions between states. Results. The formal plausible model of the destructive process in geoecotechnosystems is described. The destructive process is represented as a plausible tree network of events that model the transitions of cells from one state to another; the formalism allows assessing the likelihood of each transition and the time within which it is anticipated. Conclusion. The proposed model can be used in decision support systems for countering natural emergencies that are based on geoinformation technologies. Using the proposed model increases the efficiency of decision making in natural emergency conditions by means of informational support

    A hybrid method for intelligent diagnosis of destructive processes

    The paper proposes a method of automatic diagnosis for identifying the situation in a decision support system for disaster management, based on unmanned aerial vehicles, remote sensing, and image processing. The method identifies the problem using diagnostic criteria obtained through monitoring, and classifies the situation so that an appropriate response can follow. The method is hybrid: it finds a solution using a case-based approach but performs discrimination using a rule-based approach. Combining the diagnostic method with remote sensing methods and a fuzzy model of disaster propagation provides the required reliability and efficiency of disaster prediction and response

    Scalable Learning of Bayesian Networks Using Feedback Arc Set-Based Heuristics

    Bayesian networks form an important class of probabilistic graphical models. They consist of a structure (a directed acyclic graph) expressing conditional independencies among random variables, as well as parameters (local probability distributions). As such, Bayesian networks are generative models encoding joint probability distributions in a compact form. The main difficulty in learning a Bayesian network comes from the structure itself: owing to the combinatorial nature of the acyclicity property, it comes as no surprise that the structure learning problem is NP-hard in general. Exact algorithms solving this problem exist: dynamic programming and integer linear programming are prime contenders when one seeks to recover the structure of small-to-medium sized Bayesian networks from data. On the other hand, heuristics such as hill climbing variants are commonly used when attempting to approximately learn the structure of larger networks with thousands of variables, although these heuristics typically lack theoretical guarantees and their performance in practice may become unreliable when dealing with large scale learning. This thesis is concerned with the development of scalable methods tackling the Bayesian network structure learning problem, while attempting to maintain a level of theoretical control. This was achieved via the use of related combinatorial problems, namely the maximum acyclic subgraph problem and its dual problem, the minimum feedback arc set problem. Although these problems are NP-hard themselves, they exhibit significantly better tractability in practice. This thesis explores ways to map Bayesian network structure learning into maximum acyclic subgraph instances and extract approximate solutions for the first problem, based on the solutions obtained for the second. Our research suggests that although increased scalability can be achieved this way, maintaining theoretical understanding based on this approach is much more challenging. Furthermore, we found that learning the structure of Bayesian networks based on maximum acyclic subgraph/minimum feedback arc set may not be the go-to method in general, but we identified a setting, linear structural equation models, in which we could experimentally validate the benefits of this approach, leading to fast and scalable structure recovery with the ability to learn complex structures in a competitive way compared to state-of-the-art baselines. Doctoral thesis
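The duality between the maximum acyclic subgraph and the minimum feedback arc set can be illustrated with a small Python sketch: any vertex ordering splits the arcs into forward arcs (an acyclic subgraph) and backward arcs (a feedback arc set). The heuristic below, which orders vertices by net outgoing weight and falls back to the reversed order if that keeps more weight, is a generic textbook-style approach chosen for illustration; it is an assumption, not the thesis's algorithm, and `acyclic_subgraph` is a hypothetical name.

```python
def acyclic_subgraph(n, arcs):
    """Heuristic split of weighted arcs into an acyclic part and a feedback arc set.

    n: number of vertices (labelled 0..n-1).
    arcs: dict mapping (u, v) with u != v to a non-negative weight.
    Returns (order, kept, removed); kept contains only arcs that go
    forward in `order`, so it is acyclic, and removed is a feedback arc set.
    """
    # Net weight = outgoing minus incoming; sources should come early.
    net = [0.0] * n
    for (u, v), w in arcs.items():
        net[u] += w
        net[v] -= w
    order = sorted(range(n), key=lambda v: -net[v])
    pos = {v: i for i, v in enumerate(order)}
    kept = {a: w for a, w in arcs.items() if pos[a[0]] < pos[a[1]]}
    removed = {a: w for a, w in arcs.items() if pos[a[0]] > pos[a[1]]}
    # Reversing the order swaps the two sets, so the heavier one can
    # always be kept; this retains at least half of the total weight.
    if sum(removed.values()) > sum(kept.values()):
        kept, removed = removed, kept
        order.reverse()
    return order, kept, removed
```

In a structure learning setting, the arc weights would come from some score for including each candidate arc; how those weights are derived is outside the scope of this sketch.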

    ISIPTA'07: Proceedings of the Fifth International Symposium on Imprecise Probability: Theories and Applications

    B

    Probabilistic Modeling of Process Systems with Application to Risk Assessment and Fault Detection

    Three new methods of joint probability estimation (modeling) were developed: a maximum-likelihood maximum-entropy method, a constrained maximum-entropy method, and a copula-based method called the rolling pin (RP) method. Compared to many existing probabilistic modeling methods such as Bayesian networks and copulas, the developed methods yield models that perform better in terms of flexibility, interpretability and computational tractability. These methods can be used readily to model process systems and perform risk analysis and fault detection at steady state conditions, and can be coupled with appropriate mathematical tools to develop dynamic probabilistic models. Also, a method of performing probabilistic inference using RP-estimated joint probability distributions was introduced; this method is superior to Bayesian networks in several aspects. The RP method was also applied successfully to identify regression models that have a high level of flexibility and are appealing in terms of computational costs. Ph.D., Chemical Engineering -- Drexel University, 201