4 research outputs found

    Optimal State Estimation for Partially Observed Boolean Dynamical Systems in the Presence of Correlated Observation Noise

    Recently, state-space signal models have been proposed to characterize the behavior of discrete-time Boolean dynamical systems. In the current system model, the system is observed in the presence of noise, and existing algorithms rely on the assumption that the observation noise is an independent and identically distributed (i.i.d.) white process. The existing recursive MMSE estimator for a Boolean dynamical system under i.i.d. noise is the Boolean Kalman Filter (BKF). Here we address a different kind of noise: observation noise that is correlated in time, specifically through an AR(1) time-series process. In this thesis, we propose modifications to the state-space model that allow the existing Boolean Kalman filtering recursion to handle time-correlated noise. We also propose a modification of the Boolean particle filtering approximation to compensate for the same AR(1) correlated-noise process. In addition, this document describes a new software package written in the R programming language that gives the scientific community easier (and free) access to the algorithms created by the Genomic Signal Processing Lab at Texas A&M University. These algorithms are explained in this document, with results obtained using the package.
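
    To make the noise structure concrete, here is a minimal sketch of the general form such a model takes; the notation is illustrative rather than the thesis's exact formulation. The Boolean state evolves through a network function with Boolean process noise, while the observation noise follows an AR(1) process instead of being white:

    \[
    \mathbf{X}_k = \mathbf{f}(\mathbf{X}_{k-1}) \oplus \mathbf{n}_k, \qquad
    Y_k = h(\mathbf{X}_k) + v_k, \qquad
    v_k = \phi\, v_{k-1} + w_k,
    \]

    where \(\mathbf{n}_k\) is the Boolean process noise, \(\phi\) is the AR(1) coefficient, and \(w_k\) is an i.i.d. white noise sequence. Because \(v_k\) is no longer white, the recursion must carry information about the previous noise value, for example by augmenting the state with the noise term or by pre-whitening the observations, so that a BKF-style update remains applicable.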

    Fault Detection and Diagnosis in Gene Regulatory Networks and Optimal Bayesian Classification of Metagenomic Data

    It is well known that the molecular basis of many diseases, particularly cancer, resides in the loss of regulatory power in critical genomic pathways due to DNA mutations. We propose a methodology for model-based fault detection and diagnosis in stochastic Boolean dynamical systems indirectly observed through a single time series of transcriptomic measurements from next-generation sequencing (NGS) data. Fault detection consists of an innovations filter followed by a fault certification step and requires no knowledge of the system faults. The innovations filter uses the optimal Boolean state estimator, the Boolean Kalman Filter (BKF). We propose an additional fault-diagnosis step based on a multiple model adaptive estimation (MMAE) method consisting of a bank of BKFs running in parallel. The efficacy of the proposed methodology is demonstrated via numerical experiments on a p53-MDM2 negative-feedback-loop Boolean network. The results indicate that the proposed method is promising for monitoring biological changes at the transcriptomic level. Genomic applications in the life sciences have experienced explosive growth with the advent of high-throughput measurement technologies, which deliver fast and relatively inexpensive profiles of gene and protein activity on a genome-wide or proteome-wide scale. For microbial classification, we propose a Bayesian method for classifying 16S rRNA sequencing profiles of bacterial abundances, using a Dirichlet-Multinomial-Poisson model for microbial community samples. The proposed approach is compared to the kernel SVM, Random Forest, and MetaPhyl classification rules as a function of sample size and classification difficulty, using synthetic and real data sets. The proposed Bayesian classifier clearly displays the best performance over the different values of between-class and within-class variance that define the difficulty of the classification.
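
    As a rough illustration of the MMAE step described above (the code below is a hypothetical sketch, not taken from the thesis or any published package): each BKF in the bank assumes one candidate fault model, and after every measurement the posterior probability of each model is reweighted by the likelihood its filter assigns to the new observation.

        import numpy as np

        def mmae_update(model_probs, likelihoods):
            """One MMAE step: multiply each candidate fault model's prior
            probability by the likelihood its Boolean Kalman Filter assigns
            to the newest observation, then renormalize."""
            posterior = np.asarray(model_probs, float) * np.asarray(likelihoods, float)
            return posterior / posterior.sum()

        # Example: a nominal model plus two candidate faults, equal priors;
        # the filter for the first fault explains the latest observation best.
        print(mmae_update([1/3, 1/3, 1/3], [0.02, 0.10, 0.01]))

    The diagnosed fault is the model whose posterior probability dominates once enough observations have been processed.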

    Bayesian Optimization in Multi-Information Source and Large-Scale Systems

    The advancements in science and technology in recent years have extended the scale of engineering problems. Discovery of new materials with desirable properties, drug discovery for the treatment of disease, design of complex aerospace systems containing interacting subsystems, experimental design of complex manufacturing processes, and design of complex transportation systems are all examples of complex systems. The significant uncertainty and lack of knowledge about the underlying model, due to this complexity, necessitate the use of data for analyzing these systems. However, the large time and economic expense of the data-gathering process prevents acquiring large amounts of data. This dissertation focuses on enabling design and decision making in complex uncertain systems.

    Design problems are pervasive in scientific and industrial endeavors: scientists design experiments to gain insight into physical and social phenomena, engineers design machines to execute tasks more efficiently, pharmaceutical researchers design new drugs to fight disease, and environmentalists design sensor networks to monitor ecological systems. All of these design problems are fraught with choices that are often complex and high-dimensional, with interactions that make them difficult to reason about. Bayesian optimization techniques have been successfully employed for the experimental design of such complex systems.

    In many applications across computational science and engineering, engineers, scientists, and decision-makers may have access to a system of interest through several models. These models, often referred to as "information sources", may encompass different resolutions, physics, and modeling assumptions, resulting in different "fidelity" or "skill" with respect to the quantities of interest. Examples include different finite-element models in the design of complex mechanical structures and various tools for analyzing DNA and protein sequence data in bioinformatics. The large computational cost of the expensive models precludes exhaustive evaluation across the design space, while the less expensive models fail to represent the objective function accurately. Thus, it is highly desirable to determine which experiment from which model should be conducted at each point in time. We have developed a multi-information source Bayesian optimization framework capable of simultaneously selecting the design input and the information source, handling constraints, and balancing information gain against computational cost. The application of the proposed framework is demonstrated on two critical engineering problems: 1) optimization of dual-phase steel to maximize its strength-normalized strain-hardening rate in materials science, and 2) optimization of the NACA 0012 airfoil in aerospace (see the sketch after this abstract for the source-selection idea).

    Design problems are often defined over a large input space, demanding a large number of experiments to achieve good performance. This is not practical in many real-world problems because of budget limitations and data expenses. However, the objective function (i.e., the experiment's outcome) often does not change at the same rate in all directions. We have introduced an adaptive dimensionality-reduction Bayesian optimization framework that exponentially reduces the exploration region of existing techniques. The proposed framework identifies a small subset of linear combinations of the design inputs that matter most to the objective function and exploits the objective function's representation in this lower dimension, which carries richer information. A significant increase in the rate of the optimization process is demonstrated on an important aerospace problem: the aerostructural design of an aircraft wing modeled on the NASA Common Research Model (CRM).
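
    The source-selection step can be pictured with a generic cost-aware heuristic: score every (information source, design point) pair by a standard acquisition value divided by that source's query cost, and evaluate the winner. The sketch below uses expected improvement per unit cost and assumes each surrogate exposes a predict(X, return_std=True) interface (e.g., a fitted scikit-learn GaussianProcessRegressor); it illustrates the idea only, not the dissertation's actual acquisition function, and all names are hypothetical.

        import numpy as np
        from scipy.stats import norm

        def expected_improvement(mu, sigma, best):
            """Standard expected improvement for maximization, with a guard
            against zero predictive standard deviation."""
            sigma = np.maximum(sigma, 1e-12)
            z = (mu - best) / sigma
            return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

        def next_query(surrogates, costs, candidates, best):
            """Return the (source index, design point) pair with the highest
            expected improvement per unit query cost."""
            best_pair, best_score = None, -np.inf
            for s, gp in enumerate(surrogates):
                mu, sigma = gp.predict(candidates, return_std=True)
                scores = expected_improvement(mu, sigma, best) / costs[s]
                i = int(np.argmax(scores))
                if scores[i] > best_score:
                    best_pair, best_score = (s, candidates[i]), scores[i]
            return best_pair

    In a full optimization loop, the chosen pair would be evaluated on its information source, the result added to that source's data, and the surrogates refitted before the next selection.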