35 research outputs found

    Scale Up Bayesian Network Learning

    Bayesian networks are widely used graphical models which represent uncertain relations between the random variables in a domain compactly and intuitively. The first step of applying Bayesian networks to real-world problems is typically building the network structure. Optimal structure learning via score-and-search has become an active research topic in recent years. In this context, a scoring function is used to measure the goodness of fit of a structure to given data, and the goal is to find the structure which optimizes the scoring function. The problem has been viewed as a shortest path problem, and has been shown to be NP-hard. The complexity of structure learning limits the usage of Bayesian networks. Thus, we propose to leverage and model correlations among variables to improve the efficiency of finding optimal structures of Bayesian networks. In particular, the shortest path problem highlights the importance of two research issues: the quality of heuristic functions for guiding the search, and the complexity of the search space. This thesis introduces several techniques for addressing these issues. We present effective approaches to reducing the search space by extracting constraints directly from data. We also propose various methods to improve heuristic functions, so as to search over the most promising part of the solution space. Empirical results show that these methods significantly improve the efficiency and scalability of heuristic search-based structure learning.
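
    The shortest-path view mentioned in this abstract can be illustrated with a small sketch. It assumes one common formulation, the order graph, in which each node is a subset of variables and the edge from U to U ∪ {X} costs the best local score of X with parents drawn from U. The abstract itself does not spell out these details, so the formulation, the function names, and the made-up `toy_score` below are all assumptions for illustration. The sketch also assumes lower scores are better and local scores are non-negative, so plain Dijkstra applies.

```python
import heapq
from itertools import combinations


def best_score(var, candidates, local_score):
    """Cheapest local score for `var` over all parent sets within `candidates`."""
    cands = sorted(candidates)
    best = float("inf")
    for k in range(len(cands) + 1):
        for parents in combinations(cands, k):
            best = min(best, local_score(var, frozenset(parents)))
    return best


def optimal_order_cost(variables, local_score):
    """Dijkstra over the order graph, from the empty set to the full variable set."""
    start, goal = frozenset(), frozenset(variables)
    dist = {start: 0.0}
    heap = [(0.0, 0, start)]  # (cost so far, tie-breaker, subset)
    counter = 1
    while heap:
        d, _, u = heapq.heappop(heap)
        if u == goal:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for x in goal - u:
            v = u | {x}
            nd = d + best_score(x, u, local_score)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, counter, v))
                counter += 1
    return float("inf")


def toy_score(var, parents):
    """Hypothetical scores: cost 10 per variable, +1 per parent, B is cheaper given A."""
    cost = 10.0 + len(parents)
    if var == "B" and "A" in parents:
        cost -= 5.0
    return cost


total = optimal_order_cost(["A", "B", "C"], toy_score)
print(total)  # 26.0: A alone (10) + B with parent A (6) + C alone (10)
```

    The exhaustive parent-set enumeration in `best_score` is exponential and only workable for toy inputs; it is here to keep the sketch self-contained, not to suggest how the thesis computes scores.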

    The development characteristics and mechanisms of the Xigou debris flow in the Three Gorges Reservoir Region

    Debris flow is a common geological hazard in mountainous areas of China, often causing secondary disasters and seriously threatening residents and infrastructure. This paper takes the Xigou debris flow in the Three Gorges Reservoir Region (TGRR) as a case study and analyzes its development characteristics and initiation pattern based on field investigation. The disaster dynamics software DAN-W was then used to simulate the entire initiation-movement-accumulation process of the debris flow and conduct the debris flow dynamics analysis. The paper also simulated and predicted the movements of landslides in the formation area of a debris flow after its initiation. The results show that the movement duration of the Xigou debris flow was approximately 40 s, the maximum velocity was 37.1 m/s, the maximum thickness of the accumulation was 18.7 m, and the farthest movement distance was 930 m, which are consistent with the field investigation. When the volumes of landslide material transformed into new debris flow source material are 5 × 10⁴, 10 × 10⁴, 15 × 10⁴, 20 × 10⁴, and 26 × 10⁴ m³, the movement distances of the debris flows are 250, 280, 300, 340, and 375 m, respectively. When the volume of the source material exceeds 20 × 10⁴ m³, debris flow movement can seriously impact the residential houses at the entrance of the gully. This paper can provide a scientific basis for the prevention and mitigation of the Xigou debris flow.

    Risk assessment of the Xigou debris flow in the Three Gorges Reservoir Area

    On June 18, 2018, under the influence of heavy rainfall, a debris flow disaster broke out in Xigou village of the Three Gorges Reservoir Area in Chongqing, burying some residential houses and causing great economic losses. The on-site investigation found many loose solid material sources in the debris flow gully. Under conditions of heavy rainfall, debris flows are prone to occur again, which would seriously threaten the lives and property of nearby residents. In this paper, taking the Xigou debris flow as a research case, numerical simulation by rapid mass movements simulation (RAMMS) is used to invert the movement process of the 2018 debris flow event; the dynamic calculation parameters of the Xigou debris flow event are obtained; a quantitative hazard prediction of debris flows with different recurrence intervals (30, 50, and 100 years) is carried out in the study area; and risk assessment is conducted based on the vulnerability characteristics of the disaster-bearing bodies in the study area. The results show that the maximum accumulation thickness of debris flow in the 30-year, 50-year, and 100-year recurrence intervals is 6.54 m, 10.18 m, and 10.00 m, respectively, and the debris flow in the 100-year recurrence interval has the widest influence range and greatest hazard. The low-, medium-, and high-risk areas account for 75%, 23%, and 2%, respectively. The high-risk area mainly includes some buildings near the #1 and #2 gullies. This study provides support for the prevention and control of potential debris flow disasters in Xigou village and a scientific basis for disaster prevention and mitigation in the Three Gorges Reservoir Area.

    Dimensions of the Use of Volunteered Geographic Information in Mass Crisis Events

    Recent studies have suggested that catastrophic events that trigger mass evacuation require surrounding communities to be well-prepared to act as ingress or pass-through areas for potential evacuees; however, surrounding rural communities may have insufficient disaster-related logistical resources. In the response phase of disaster management, officials must be able to deploy resources to demand locations in types and quantities based on real-time requirements. Effective cross-jurisdictional disaster management needs real-time information, which is usually unavailable from official, authoritative sources. Conversely, VGI (volunteered geographic information) has the capability to provide real-time and local information in disaster management. This study investigates the possibility of utilizing real-time or near real-time VGI in mass evacuation scenarios. The study identifies a potential VGI data source, Tweets from Twitter, and describes how to search for, discover, and select relevant Tweets. The dissertation proposes research methods for harvesting and managing live Tweets and saving them to a distributed geodatabase for further spatio-temporal analysis and dissemination to users, such as responders and evacuees. The study implements a Web GIS application, which includes a Tweets discovery component, a geo-tagged Tweets mapping component, and an online geo-tagged Tweets operation component. The major research goals include designing an application programming interface (API) to harvest relevant Tweets and implementing a distributed geodatabase system for storage, analysis, and display of the harvested Tweets so that vital information can be distributed in near real-time. Two case studies, based on Super Storm Sandy in 2012 and a shooting at Kent State University in 2014, were used to evaluate the pros and cons of Tweets from Twitter for response in emergency management and offered prototypes for the development of the final online Web GIS.

    An Improved Lower Bound for Bayesian Network Structure Learning

    Several heuristic search algorithms such as A* and breadth-first branch and bound have been developed for learning Bayesian network structures that optimize a scoring function. These algorithms rely on a lower bound function called the k-cycle conflict heuristic to guide the search toward the most promising parts of the search space. The heuristic takes as input a partition of the random variables of a data set; the importance of the partition opens up opportunities for further research. This work introduces a new partition method based on information extracted from the potential optimal parent sets (POPS) of the variables. Empirical results show that the new partition can significantly improve the efficiency and scalability of heuristic search-based structure learning algorithms.

    Tightening Bounds for Bayesian Network Structure Learning

    A recent breadth-first branch and bound algorithm (BFBnB) for learning Bayesian network structures (Malone et al. 2011) uses two bounds to prune the search space for better efficiency; one is a lower bound calculated from pattern database heuristics, and the other is an upper bound obtained by a hill climbing search. Whenever the lower bound of a search path exceeds the upper bound, the path is guaranteed to lead to suboptimal solutions and is discarded immediately. This paper introduces methods for tightening the bounds. The lower bound is tightened by using more informed variable groupings when creating the pattern databases, and the upper bound is tightened using an anytime learning algorithm. Empirical results show that these bounds improve the efficiency of Bayesian network learning by two to three orders of magnitude.
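
    The pruning rule described in this abstract can be sketched in a few lines. This is only an illustration of the rule, not the BFBnB implementation: the node fields `g` (cost of the path so far) and `h` (lower bound on the remaining cost), the function names, and the numbers are all hypothetical.

```python
def should_prune(g, h, upper_bound):
    """True when the path's lower bound (g + h) exceeds the known upper bound."""
    return g + h > upper_bound


def prune_layer(nodes, upper_bound):
    """Keep only the nodes of one search layer that may still lead to an optimal solution."""
    return [n for n in nodes if not should_prune(n["g"], n["h"], upper_bound)]


# Hypothetical layer: the upper bound would come from a separate search
# (hill climbing in the abstract); here it is just a made-up number.
layer = [
    {"name": "path-1", "g": 10.0, "h": 8.0},   # lower bound 18 -> kept
    {"name": "path-2", "g": 12.0, "h": 8.0},   # lower bound 20 -> kept (does not exceed)
    {"name": "path-3", "g": 15.0, "h": 10.0},  # lower bound 25 -> discarded
]
kept = prune_layer(layer, upper_bound=20.0)
print([n["name"] for n in kept])
```

    The sketch also shows why tightening either bound helps: raising `h` or lowering `upper_bound` makes the comparison succeed for more nodes, so more of the layer is discarded.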

    Large-Scale Integration of Single-Cell RNA-Seq Data Reveals Astrocyte Diversity and Transcriptomic Modules across Six Central Nervous System Disorders

    The dysfunction of astrocytes in response to environmental factors contributes to many neurological diseases by impacting neuroinflammation responses, glutamate and ion homeostasis, and cholesterol and sphingolipid metabolism, which calls for comprehensive and high-resolution analysis. However, single-cell transcriptome analyses of astrocytes have been hampered by the sparseness of human brain specimens. Here, we demonstrate how large-scale integration of multi-omics data, including single-cell and spatial transcriptomic and proteomic data, overcomes these limitations. We created a single-cell transcriptomic dataset of human brains by integrating, consensus-annotating, and analyzing 302 publicly available single-cell RNA-sequencing (scRNA-seq) datasets, highlighting the power to resolve previously unidentifiable astrocyte subpopulations. The resulting dataset includes nearly one million cells that span a wide variety of diseases, including Alzheimer’s disease (AD), Parkinson’s disease (PD), Huntington’s disease (HD), multiple sclerosis (MS), epilepsy (Epi), and chronic traumatic encephalopathy (CTE). We profiled the astrocytes at three levels (subtype compositions, regulatory modules, and cell–cell communications) and comprehensively depicted the heterogeneity of pathological astrocytes. We constructed seven transcriptomic modules that are involved in the onset and progress of disease development, such as the M2 ECM and M4 stress modules. We validated that the M2 ECM module could furnish potential markers for AD early diagnosis at both the transcriptome and protein levels. In order to accomplish a high-resolution, local identification of astrocyte subtypes, we also carried out a spatial transcriptome analysis of mouse brains using the integrated dataset as a reference. We found that astrocyte subtypes are regionally heterogeneous. We identified dynamic cell–cell interactions in different disorders and found that astrocytes participate in key signaling pathways, such as NRG3-ERBB4, in epilepsy. Our work supports the utility of large-scale integration of single-cell transcriptomic data, which offers new insights into the mechanisms underlying multiple CNS diseases in which astrocytes are involved.