Towards faster numerical solution of Continuous Time Markov Chains stored by symbolic data structures
This work considers several aspects of model-based performance and dependability analysis, a research area that analyses systems (e.g. computer, telecommunication, or production systems) in order to quantify their performance and reliability. Such an analysis can be carried out as early as the planning phase, before a system physically exists. All aspects treated in this work assume finite state spaces (i.e. the models have only finitely many states) and a representation of the state graphs by Multi-Terminal Binary Decision Diagrams (MTBDDs). Many tools exist that transform high-level model specifications (e.g. process algebras or Petri nets) into low-level models (e.g. Markov chains), which can be represented by sparse matrices. For complex models, very large state spaces may occur (a phenomenon known in the literature as state space explosion) and, accordingly, very large matrices representing the state graphs. The problems of building the model from the specification and of storing the state graph can be regarded as solved: there are heuristics for compactly storing the state graph in MTBDD or Kronecker data structures, and there are efficient algorithms for model generation and functional analysis. For quantitative analysis, however, problems remain due to the size of the underlying state space. This work provides methods that alleviate these problems in the case of MTBDD-based storage of the state graph. The contribution is threefold:
1. For the generation of smaller state graphs in the model generation phase (which are usually easier to solve), a symbolic elimination algorithm is developed.
2. For the calculation of steady-state probabilities of Markov chains, a multilevel algorithm is developed that allows for faster solutions.
3. For calculating the most probable paths in a state graph, the mean time to the first failure of a system, and related measures, a path-based solver is developed.
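As a minimal illustration of the quantitative-analysis step that contribution 2 targets, the sketch below computes the steady-state probabilities of a small continuous-time Markov chain by uniformization and power iteration. The generator matrix and its rates are invented for the example, and no symbolic (MTBDD) storage or multilevel acceleration is involved:

```python
# Uniformization: P = I + Q/lam turns the CTMC generator Q into a
# stochastic matrix with the same stationary distribution, which a
# plain power iteration can then find.
def steady_state(Q, iters=2000):
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n)) * 1.1   # uniformization rate
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    pi = [1.0 / n] * n                            # uniform starting guess
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Illustrative 3-state generator: each row sums to zero.
Q = [[-3.0,  2.0,  1.0],
     [ 1.0, -4.0,  3.0],
     [ 2.0,  2.0, -4.0]]
pi = steady_state(Q)   # satisfies pi @ Q = 0 and sum(pi) = 1
```

Real models of interest are far too large for such dense iteration, which is exactly why the thesis stores the state graph symbolically and accelerates the solve with a multilevel method.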
On the Learning Behavior of Adaptive Networks - Part I: Transient Analysis
This work carries out a detailed transient analysis of the learning behavior
of multi-agent networks, and reveals interesting results about the learning
abilities of distributed strategies. Among other results, the analysis reveals
how combination policies influence the learning process of networked agents,
and how these policies can steer the convergence point towards any of many
possible Pareto optimal solutions. The results also establish that the learning
process of an adaptive network undergoes three (rather than two) well-defined
stages of evolution with distinctive convergence rates during the first two
stages, while attaining a finite mean-square-error (MSE) level in the last
stage. The analysis reveals what aspects of the network topology influence
performance directly and suggests design procedures that can optimize
performance by adjusting the relevant topology parameters. Interestingly, it is
further shown that, in the adaptation regime, each agent in a sparsely
connected network is able to achieve the same performance level as that of a
centralized stochastic-gradient strategy even for left-stochastic combination
strategies. These results lead to a deeper understanding of, and useful insights into,
the convergence behavior of coupled distributed learners. The results also lead
to effective design mechanisms to help diffuse information more thoroughly over
networks.
Comment: to appear in IEEE Transactions on Information Theory, 201
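The adapt-then-combine (ATC) diffusion logic that such transient analyses describe can be sketched in a few lines. The three-agent network, the left-stochastic combination matrix, the step-size, and the scalar data model below are all illustrative assumptions, not parameters from the paper:

```python
import random

# Adapt-then-Combine (ATC) diffusion LMS over a small network. Every
# quantity here (topology, weights, step-size, data model) is an
# illustrative assumption for the sketch.
random.seed(0)
N = 3                       # number of agents
w_true = 0.8                # scalar parameter every agent estimates
mu = 0.05                   # step-size
# Left-stochastic combination matrix: each COLUMN sums to one, so the
# weights feeding into agent k are A[0][k], A[1][k], A[2][k].
A = [[0.6, 0.2, 0.1],
     [0.2, 0.6, 0.3],
     [0.2, 0.2, 0.6]]

w = [0.0] * N
for _ in range(3000):
    # Adaptation: each agent takes a stochastic-gradient (LMS) step
    # using only its own streaming data.
    psi = []
    for k in range(N):
        u = random.gauss(0, 1)                  # regressor sample
        d = u * w_true + random.gauss(0, 0.01)  # noisy measurement
        psi.append(w[k] + mu * u * (d - u * w[k]))
    # Combination: each agent fuses its neighbours' intermediate estimates.
    w = [sum(A[l][k] * psi[l] for l in range(N)) for k in range(N)]
```

After the loop every agent's estimate sits close to `w_true`, illustrating the paper's point that local adaptation plus combination lets sparsely connected agents match centralized performance.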
Understanding and mitigating universal adversarial perturbations for computer vision neural networks
Deep neural networks (DNNs) have become the algorithm of choice for many computer vision applications. They are able to achieve human level performance in many computer vision tasks, and enable the automation and large-scale deployment of applications such as object tracking, autonomous vehicles, and medical imaging. However, DNNs expose software applications to systemic vulnerabilities in the form of Universal Adversarial Perturbations (UAPs): input perturbation attacks that can cause DNNs to make classification errors on large sets of inputs.
Our aim is to improve the robustness of computer vision DNNs to UAPs without sacrificing the models' predictive performance. To this end, we increase our understanding of these vulnerabilities by investigating the visual structures and patterns commonly appearing in UAPs. We demonstrate the efficacy and pervasiveness of UAPs by showing how Procedural Noise patterns can be used to generate efficient zero-knowledge attacks on different computer vision models and tasks at minimal cost to the attacker. We then evaluate the UAP robustness of various shape- and texture-biased models, and find that applying them in ensembles provides only marginal improvements in robustness.
To mitigate UAP attacks, we develop two novel approaches. First, we propose using the Jacobian of a DNN to measure the sensitivity of computer vision DNNs. We derive theoretical bounds and provide empirical evidence showing how a combination of Jacobian regularisation and ensemble methods allows for increased model robustness against UAPs without degrading the predictive performance of computer vision DNNs. Our results evince a robustness-accuracy trade-off against UAPs that is better than that of models trained in conventional ways. Finally, we design a detection method that analyses hidden-layer activation values to identify a variety of UAP attacks in real time with low latency. We show that our work outperforms existing defences under realistic time and computation constraints.
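The mechanism that makes a perturbation "universal" can be seen already in a linear toy model: a single, input-agnostic perturbation fools the classifier on a large fraction of inputs. Everything below (the classifier, the Gaussian data, the perturbation budget) is an invented stand-in for the illustration, not a DNN or an attack from the thesis:

```python
import random, math

# Toy illustration of a universal adversarial perturbation: one fixed
# perturbation, applied unchanged to every input, flips predictions on
# a large share of the data set.
random.seed(1)
w = [1.0, 2.0]                                   # linear classifier weights
norm_w = math.sqrt(sum(c * c for c in w))
xs = [[random.gauss(0.5, 1), random.gauss(0.5, 1)] for _ in range(200)]

def predict(x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1

# A single perturbation aligned against the weight vector shifts every
# input towards the decision boundary by the same amount.
eps = 1.5
v = [-eps * c / norm_w for c in w]

fooled = sum(predict(x) != predict([a + b for a, b in zip(x, v)])
             for x in xs)
fooling_rate = fooled / len(xs)   # fraction of inputs whose label flips
```

For a deep network the boundary is not a single hyperplane, but the same picture explains why UAPs exploit shared, input-independent directions of sensitivity, the quantity the Jacobian-based analysis above bounds.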
Sparse reduced-rank regression for imaging genetics studies: models and applications
We present a novel statistical technique, the sparse reduced-rank regression (sRRR) model,
which is a strategy for multivariate modelling of high-dimensional imaging responses and
genetic predictors. By adopting penalisation techniques, the model is able to enforce sparsity
in the regression coefficients, identifying subsets of genetic markers that best explain
the variability observed in subsets of the phenotypes. To properly exploit the rich structure
present in each of the imaging and genetics domains, we additionally propose the use of
several structured penalties within the sRRR model. Using simulation procedures that accurately
reflect realistic imaging genetics data, we present detailed evaluations of the sRRR
method in comparison with the more traditional univariate linear modelling approach. In
all settings considered, we show that sRRR possesses better power to detect the deleterious
genetic variants. Moreover, using a simple genetic model, we demonstrate the potential
benefits, in terms of statistical power, of carrying out voxel-wise searches as opposed to
extracting averages over regions of interest in the brain. Since this entails the use of phenotypic
vectors of enormous dimensionality, we suggest the use of a sparse classification
model as a de-noising step, prior to the imaging genetics study. Finally, we present the
application of a data re-sampling technique within the sRRR model for model selection.
Using this approach we are able to rank the genetic markers in order of importance of association
to the phenotypes, and similarly rank the phenotypes in order of importance to
the genetic markers. Finally, we illustrate the application of the proposed
statistical models on three real imaging genetics datasets and highlight some potential
associations.
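A rank-one, two-block sketch of the sRRR idea on synthetic data: alternate between a least-squares update for the response (phenotype) weights and an l1-penalised, soft-thresholded update for the predictor (genotype) weights, so that uninformative predictors are driven to zero. The dimensions, penalty level, data-generating model, and the plain ISTA optimiser are all illustrative choices, not the thesis's algorithm:

```python
import random

# Rank-one sparse reduced-rank regression sketch: fit Y ~ X u v' with a
# lasso-style penalty on u. Predictors 2..5 are pure noise and should be
# zeroed out by the soft-thresholding step.
random.seed(2)
n, p, q = 200, 6, 3
u_true = [1.0, -1.0, 0.0, 0.0, 0.0, 0.0]   # only predictors 0 and 1 matter
v_true = [1.0, 0.5, -0.5]
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
Y = [[sum(X[i][j] * u_true[j] for j in range(p)) * v_true[k]
      + random.gauss(0, 0.1) for k in range(q)] for i in range(n)]

def soft(z, t):
    """Soft-thresholding: the proximal operator of the l1 penalty."""
    return (abs(z) - t) * (1.0 if z > 0 else -1.0) if abs(z) > t else 0.0

u = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]          # crude initialisation
lam = 6.0                                    # l1 penalty strength on u
for _ in range(10):                          # alternate over v and u
    # v-step: least-squares fit of each response on the score s = X u.
    s = [sum(X[i][j] * u[j] for j in range(p)) for i in range(n)]
    ss = sum(t * t for t in s)
    v = [sum(s[i] * Y[i][k] for i in range(n)) / ss for k in range(q)]
    # u-step: proximal-gradient (ISTA) iterations on the penalised fit.
    vv = sum(t * t for t in v)
    z = [sum(Y[i][k] * v[k] for k in range(q)) for i in range(n)]
    eta = 1.0 / (2.0 * vv * n)               # conservative step-size
    for _ in range(100):
        Xu = [sum(X[i][j] * u[j] for j in range(p)) for i in range(n)]
        g = [vv * sum(X[i][j] * Xu[i] for i in range(n))
             - sum(X[i][j] * z[i] for i in range(n)) for j in range(p)]
        u = [soft(u[j] - eta * g[j], eta * lam) for j in range(p)]
```

The recovered `u` concentrates its weight on the two informative predictors with the correct opposite signs, mirroring how sRRR identifies subsets of genetic markers that explain the phenotype variability.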
Efficient operator-coarsening multigrid schemes for local discontinuous Galerkin methods
An efficient multigrid scheme is presented for local discontinuous
Galerkin (LDG) discretizations of elliptic problems, formulated around the idea
of separately coarsening the underlying discrete gradient and divergence
operators. We show that traditional multigrid coarsening of the primal
formulation leads to poor and suboptimal multigrid performance, whereas
coarsening of the flux formulation leads to optimal convergence and is
equivalent to a purely geometric multigrid method. The resulting
operator-coarsening schemes do not require the entire mesh hierarchy to be
explicitly built, thereby obviating the need to compute quadrature rules,
lifting operators, and other mesh-related quantities on coarse meshes. We show
that good multigrid convergence rates are achieved in a variety of numerical
tests on 2D and 3D uniform and adaptive Cartesian grids, as well as for curved
domains using implicitly defined meshes and for multi-phase elliptic interface
problems with complex geometry. Extension to non-LDG discretizations is briefly
discussed.
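For readers unfamiliar with the geometric multigrid machinery the operator-coarsening scheme reduces to, a standard V-cycle for the 1-D Poisson problem -u'' = f is sketched below. This is generic textbook multigrid on a nodal finite-difference grid with homogeneous Dirichlet conditions, not the LDG flux-formulation coarsening of the paper:

```python
# Geometric multigrid V-cycle for -u'' = f on [0,1], u(0) = u(1) = 0,
# on a uniform grid of n = 2^k + 1 points (boundaries included).

def smooth(u, f, h, sweeps=3):
    """Gauss-Seidel sweeps on the 3-point Laplacian stencil."""
    n = len(u)
    for _ in range(sweeps):
        for i in range(1, n - 1):
            u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return u

def residual(u, f, h):
    n = len(u)
    r = [0.0] * n
    for i in range(1, n - 1):
        r[i] = f[i] - (2 * u[i] - u[i - 1] - u[i + 1]) / (h * h)
    return r

def restrict(r):
    """Full-weighting restriction to the next coarser grid."""
    return [0.0] + [0.25 * r[2 * i - 1] + 0.5 * r[2 * i] + 0.25 * r[2 * i + 1]
                    for i in range(1, len(r) // 2)] + [0.0]

def prolong(e, n_fine):
    """Linear interpolation of the coarse correction to the fine grid."""
    u = [0.0] * n_fine
    for i in range(1, len(e) - 1):
        u[2 * i] += e[i]
        u[2 * i - 1] += 0.5 * e[i]
        u[2 * i + 1] += 0.5 * e[i]
    return u

def vcycle(u, f, h):
    n = len(u)
    if n <= 3:
        return smooth(u, f, h, sweeps=50)      # "exact" coarse solve
    u = smooth(u, f, h)                        # pre-smoothing
    r = restrict(residual(u, f, h))            # coarse-grid residual
    e = vcycle([0.0] * len(r), r, 2 * h)       # recursive correction
    u = [a + b for a, b in zip(u, prolong(e, n))]
    return smooth(u, f, h)                     # post-smoothing
```

The paper's point is that for LDG one must coarsen the discrete gradient and divergence operators (the flux formulation) to recover the mesh-independent convergence this kind of geometric hierarchy provides.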
Parameter Estimation with Maximal Updated Densities
A recently developed measure-theoretic framework solves a stochastic inverse
problem (SIP) for models where uncertainties in model output data are
predominantly due to aleatoric (i.e., irreducible) uncertainties in model
inputs (i.e., parameters). The subsequent inferential target is a distribution
on parameters. Another type of inverse problem is to quantify uncertainties in
estimates of "true" parameter values under the assumption that such
uncertainties should be reduced as more data are incorporated into the problem,
i.e., the uncertainty is considered epistemic. A major contribution of this
work is the formulation and solution of such a parameter identification problem
(PIP) within the measure-theoretic framework developed for the SIP. The
approach is novel in that it utilizes a solution to a stochastic forward
problem (SFP) to update an initial density only in the parameter directions
informed by the model output data. In other words, this method performs
"selective regularization" only in the parameter directions not informed by
data. The solution is defined by a maximal updated density (MUD) point where
the updated density defines the measure-theoretic solution to the PIP. Another
significant contribution of this work is the full theory of existence and
uniqueness of MUD points for linear maps with Gaussian distributions.
Data-constructed Quantity of Interest (QoI) maps are also presented and
analyzed for solving the PIP within this measure-theoretic framework as a means
of reducing uncertainties in the MUD estimate. We conclude with a demonstration
of the general applicability of the method on two problems involving either
spatial or temporal data for estimating uncertain model parameters.
Comment: Code: github.com/mathematicalmichael/mud.gi
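The update described above can be written down in one dimension, where every density has a closed form. The sketch below evaluates the updated density for a linear map with Gaussian initial, observed, and predicted (pushforward) densities, and locates its maximiser, the MUD point, by grid search; the map and the three Gaussians are illustrative choices. Because the single parameter direction is fully informed by the data here, the initial-density factor cancels against the predicted density and the MUD point maps exactly onto the observed mean:

```python
import math

# 1-D sketch of the updated density and its MUD point for a linear map
# Q(lam) = a * lam. All numbers below are illustrative.
def gauss(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

a = 2.0                          # the linear map Q(lam) = a * lam
init_mean, init_sd = 0.0, 1.0    # initial density on the parameter
obs_mean, obs_sd = 1.0, 0.25     # observed density on the model output
pred_sd = abs(a) * init_sd       # pushforward (predicted) density scale

def updated(lam):
    """Updated density: initial times the observed-to-predicted ratio."""
    q = a * lam
    return (gauss(lam, init_mean, init_sd)
            * gauss(q, obs_mean, obs_sd)
            / gauss(q, a * init_mean, pred_sd))

# MUD point: the maximiser of the updated density (grid search here).
grid = [i / 1000.0 for i in range(-3000, 3001)]
mud = max(grid, key=updated)     # lands at obs_mean / a
```

In a direction not informed by the data the ratio would be flat and the initial density would decide the answer, which is the "selective regularization" behaviour described above.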
Optimisation Methods For Training Deep Neural Networks in Speech Recognition
Automatic Speech Recognition (ASR) is an example of a sequence-to-sequence classification task where, given an acoustic waveform, the goal is to produce the correct word-level hypotheses. In machine learning, a classification problem such as ASR is solved in two stages: an inference stage that models the uncertainty associated with the choice of hypothesis given the acoustic waveform using a mathematical model, and a decision stage which employs the inference model in conjunction with decision theory to make optimal class assignments. With the advent of careful network initialisation and GPU computing, hybrid Hidden Markov Models (HMMs) augmented with Deep Neural Networks (DNNs) have been shown to outperform traditional HMMs using Gaussian Mixture Models (GMMs) in solving the inference problem for ASR. In comparison to GMMs, DNNs possess a better capability to model the underlying non-linear data manifold due to their deep and complex structure. While the structure of such models gives rich modelling capability, it also creates complex dependencies between the parameters which can make learning difficult via first-order stochastic gradient descent (SGD). The task of finding the best procedure to train DNNs continues to be an active area of research and has been made even more challenging by the availability of ever more training data. This thesis focuses on designing better optimisation approaches to train hybrid HMM-DNN models using a sequence-level discriminative criterion, a natural loss function that preserves the sequential ordering of frames within a spoken utterance. The thesis presents an implementation of the second-order Hessian Free (HF) optimisation method, and shows how the method can be made efficient through appropriate modifications to the Conjugate Gradient algorithm. To achieve better convergence than SGD, this work also explores the Natural Gradient (NG) method to train DNNs with discriminative sequence training.
In the DNN literature, the NG method has been applied to train models under the Maximum Likelihood objective criterion. A novel contribution of this thesis is to extend this approach to the domain of Minimum Bayes Risk objective functions for discriminative sequence training. With sigmoid models trained on 50-hour and 200-hour training sets from the Multi-Genre Broadcast 1 (MGB1) transcription task, the NG method applied in a HF-styled optimisation framework is shown to achieve better Word Error Rate (WER) reductions on the MGB1 development set than SGD-based sequence training.
This thesis also addresses the particular issue of overfitting between the training criterion and WER that primarily arises during sequence training of DNN models using Rectified Linear Units (ReLUs) as activation functions. It is shown how, by scaling with the Gauss-Newton matrix, the HF method, unlike other approaches, can overcome this issue. Since different optimisers work best with different models, it is attractive to have a consistent optimisation framework that is agnostic to the choice of activation function. To address this, the thesis develops the geometry of the underlying function space captured by different realisations of DNN model parameters, and presents the design considerations for an optimisation algorithm to be well defined on this space. Building on this analysis, a novel optimisation technique called NGHF is presented that uses both the direction of steepest descent on a probabilistic manifold and local curvature information to effectively probe the error surface. The method relies on an alternative derivation of Taylor's theorem using the concepts of manifolds, tangent vectors and directional derivatives from the perspective of Information Geometry. Apart from being well defined on the function space, when framed within a HF-style optimisation framework, NGHF is shown to achieve the greatest WER reductions from sequence training on the MGB1 development set with both sigmoid and ReLU based models trained on the 200-hour MGB1 training set. An evaluation of the above optimisation methods in training different DNN model architectures is also presented.
IDB Cambridge International Scholarshi
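The computational core shared by HF and the NGHF variant is solving a Newton-like system with conjugate gradients using only matrix-vector products, never the curvature matrix itself. A minimal sketch is given below; the explicit 2x2 matrix stands in for the Gauss-Newton curvature that a real implementation would apply via an R-operator pass through the network:

```python
# Conjugate gradients for H d = -g using only Hessian-vector products
# hv(v). This is the inner solver of Hessian-free optimisation; in DNN
# training, hv would be a Gauss-Newton-vector product computed by
# forward/backward passes rather than an explicit matrix.
def cg(hv, g, iters=50, tol=1e-10):
    n = len(g)
    d = [0.0] * n
    r = [-gi for gi in g]          # residual of H d = -g at d = 0
    p = r[:]
    rs = sum(x * x for x in r)
    for _ in range(iters):
        hp = hv(p)
        alpha = rs / sum(pi * hpi for pi, hpi in zip(p, hp))
        d = [di + alpha * pi for di, pi in zip(d, p)]
        r = [ri - alpha * hpi for ri, hpi in zip(r, hp)]
        rs_new = sum(x * x for x in r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return d

# Illustrative curvature H = diag(1, 10) and gradient g = (1, 1):
# the returned step is the Newton direction -H^{-1} g.
H = [[1.0, 0.0], [0.0, 10.0]]
hv = lambda v: [sum(H[i][j] * v[j] for j in range(2)) for i in range(2)]
step = cg(hv, [1.0, 1.0])
```

Truncating the CG iteration, and the choice of curvature matrix supplied to `hv` (Gauss-Newton for HF, a Fisher-based matrix for NG/NGHF), are exactly the design levers the thesis studies.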