
3,587 research outputs found

Application of an efficient Bayesian discretization method to biomedical data

Author: Cooper Gregory F
Gopalakrishnan Vanathi
Lustgarten Jonathan L
Visweswaran Shyam
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Several data mining methods require data that are discrete, and other methods often perform better with discrete data. We introduce an efficient Bayesian discretization (EBD) method for optimal discretization of variables that runs efficiently on high-dimensional biomedical datasets. The EBD method consists of two components, namely, a Bayesian score to evaluate discretizations and a dynamic programming search procedure to efficiently search the space of possible discretizations. We compared the performance of EBD to Fayyad and Irani's (FI) discretization method, which is commonly used for discretization. Results: On 24 biomedical datasets obtained from high-throughput transcriptomic and proteomic studies, the classification performances of the C4.5 classifier and the naïve Bayes classifier were statistically significantly better when the predictor variables were discretized using EBD over FI. EBD was statistically significantly more stable to the variability of the datasets than FI. However, EBD was less robust, though not statistically significantly so, than FI and produced slightly more complex discretizations than FI. Conclusions: On a range of biomedical datasets, a Bayesian discretization method (EBD) yielded better classification performance and stability but was less robust than the widely used FI discretization method. The EBD discretization method is easy to implement, permits the incorporation of prior knowledge and belief, and is sufficiently fast for application to high-dimensional data.
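The two components named in this abstract, a score for candidate discretizations and a dynamic programming search over cut points, can be illustrated with a minimal sketch. The score used here is an assumption: a Dirichlet-multinomial marginal likelihood with a uniform prior per interval, standing in for the paper's Bayesian score, and with no prior over the number of intervals. The function names `interval_log_score` and `best_discretization` are hypothetical and are not taken from the paper.

```python
from math import lgamma


def interval_log_score(counts):
    """Log marginal likelihood of the class counts in one interval,
    assuming a uniform Dirichlet prior (a stand-in, not the EBD score)."""
    k = len(counts)
    n = sum(counts)
    return lgamma(k) - lgamma(k + n) + sum(lgamma(c + 1) for c in counts)


def best_discretization(values, labels, n_classes):
    """Return (cut_points, log_score) maximizing the summed interval scores."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    xs = [values[i] for i in order]
    ys = [labels[i] for i in order]
    n = len(xs)

    # prefix[i][c] = number of class-c labels among the first i sorted points
    prefix = [[0] * n_classes]
    for y in ys:
        row = prefix[-1][:]
        row[y] += 1
        prefix.append(row)

    neg_inf = float("-inf")
    best = [0.0] + [neg_inf] * n   # best[j]: best score for the first j points
    back = [0] * (n + 1)           # back[j]: start index of the last interval
    for j in range(1, n + 1):
        # A cut cannot separate two identical values.
        if j < n and xs[j] == xs[j - 1]:
            continue
        for i in range(j):
            if best[i] == neg_inf:
                continue
            counts = [prefix[j][c] - prefix[i][c] for c in range(n_classes)]
            score = best[i] + interval_log_score(counts)
            if score > best[j]:
                best[j] = score
                back[j] = i

    # Walk the back-pointers to recover cut points as midpoints.
    cuts = []
    j = n
    while j > 0:
        i = back[j]
        if i > 0:
            cuts.append((xs[i - 1] + xs[i]) / 2)
        j = i
    return sorted(cuts), best[n]
```

Because the score decomposes into a sum over intervals, the search needs only a quadratic number of interval evaluations rather than enumerating every subset of cut points.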

Crossref

Springer - Publisher Connector

PubMed Central

D-Scholarship@Pitt

A traffic classification method using machine learning algorithm

Author: Chishti Hamayoun Rauf
Publication venue: University of Bedfordshire
Publication date: 01/01/2013

Drawing on concepts from attack investigation in the IT industry, this work designs a traffic classification method that uses data mining and machine learning techniques to distinguish normal from malicious traffic. This classification will help in learning about the unknown attacks faced by the IT industry. Traffic classification is not a new concept; plenty of work has been done to classify network traffic for heterogeneous applications. Existing techniques (payload-based, port-based, and statistical) each have their own pros and cons, which are discussed later in this work, but classification using machine learning techniques is still an open field to explore and has provided very promising results so far.

University of Bedfordshire Repository

A Partially Reflecting Random Walk on Spheres Algorithm for Electrical Impedance Tomography

Author: Maire Sylvain
Simon Martin
Publication date: 15/02/2015

In this work, we develop a probabilistic estimator for the voltage-to-current map arising in electrical impedance tomography. This novel so-called partially reflecting random walk on spheres estimator enables Monte Carlo methods to compute the voltage-to-current map in an embarrassingly parallel manner, which is an important issue with regard to the corresponding inverse problem. Our method uses the well-known random walk on spheres algorithm inside subdomains where the diffusion coefficient is constant and employs replacement techniques motivated by finite difference discretization to deal with both mixed boundary conditions and interface transmission conditions. We analyze the global bias and the variance of the new estimator both theoretically and experimentally. In a second step, the variance is considerably reduced via a novel control variate conditional sampling technique
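For orientation, the classical random walk on spheres algorithm that this abstract builds on can be sketched as follows. This is only the textbook version for the Laplace equation with Dirichlet data on the unit disk. It does not implement the paper's partially reflecting estimator, mixed boundary conditions, or interface transmission conditions. The function names and the tolerance `eps` are assumptions made for illustration.

```python
import math
import random


def walk_on_spheres(x, y, boundary_value, eps=1e-3, rng=random):
    """One walk started at (x, y) inside the unit disk.

    Each step jumps to a uniformly random point on the largest circle
    centred at the current position that still fits inside the disk.
    The walk stops within eps of the boundary and returns the boundary
    data at the nearest boundary point.
    """
    while True:
        r = math.hypot(x, y)
        dist = 1.0 - r                      # distance to the unit circle
        if dist < eps:
            return boundary_value(x / r, y / r)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        x += dist * math.cos(theta)
        y += dist * math.sin(theta)


def estimate_harmonic(x, y, boundary_value, n_walks=20000, seed=0):
    """Monte Carlo estimate of the harmonic function with the given
    boundary data, evaluated at (x, y)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walks):
        total += walk_on_spheres(x, y, boundary_value, rng=rng)
    return total / n_walks
```

The walks are independent of each other, which is what makes this family of estimators embarrassingly parallel.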

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

A Geometric Approach to Pairwise Bayesian Alignment of Functional Data Using Importance Sampling

Author: Kurtek Sebastian
Publication date: 01/01/2017

We present a Bayesian model for pairwise nonlinear registration of functional data. We use the Riemannian geometry of the space of warping functions to define appropriate prior distributions and sample from the posterior using importance sampling. A simple square-root transformation is used to simplify the geometry of the space of warping functions, which allows for computation of sample statistics, such as the mean and median, and a fast implementation of a k-means clustering algorithm. These tools allow for efficient posterior inference, where multiple modes of the posterior distribution corresponding to multiple plausible alignments of the given functions are found. We also show pointwise 95% credible intervals to assess the uncertainty of the alignment in different clusters. We validate this model using simulations and present multiple examples on real data from different application domains including biometrics and medicine.
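The abstract's use of importance sampling to draw from a posterior can be illustrated with a generic self-normalized estimator. This sketch is a scalar toy and an assumption throughout: it proposes from the prior and weights by the likelihood, and it does not touch warping functions or their Riemannian geometry. The name `snis_posterior_mean` is hypothetical.

```python
import math
import random


def snis_posterior_mean(log_likelihood, sample_prior, n_samples=20000, seed=0):
    """Self-normalized importance sampling with the prior as proposal.

    Returns the estimated posterior mean and the effective sample size.
    """
    rng = random.Random(seed)
    thetas = [sample_prior(rng) for _ in range(n_samples)]
    log_w = [log_likelihood(t) for t in thetas]

    # Subtract the maximum log-weight before exponentiating for stability.
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]

    total = sum(w)
    mean = sum(wi * t for wi, t in zip(w, thetas)) / total
    ess = total ** 2 / sum(wi * wi for wi in w)
    return mean, ess
```

With a standard normal prior and one unit-variance Gaussian observation y, the exact posterior mean is y/2, which gives a simple check on the estimator.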

arXiv.org e-Print Archive

Crossref

A TV-Gaussian prior for infinite-dimensional Bayesian inverse problems and its numerical implementations

Author: Aster R C
Dashti M
Feng Z
Gelman A
Helin T
Jinglai Li
Kaipio J
Kass R E
Lassas M
Liu J S
Zhewei Yao
Zixi Hu
Publication venue: IOP Publishing
Publication date: 17/01/2016

Many scientific and engineering problems require performing Bayesian inference in function spaces, in which the unknowns are of infinite dimension. In such problems, choosing an appropriate prior distribution is an important task. In particular, we consider problems where the function to infer is subject to sharp jumps, which render the commonly used Gaussian measures unsuitable. On the other hand, the so-called total variation (TV) prior can only be defined in a finite-dimensional setting, and does not lead to a well-defined posterior measure in function spaces. In this work we present a TV-Gaussian (TG) prior to address such problems, where the TV term is used to detect sharp jumps of the function, and the Gaussian distribution is used as a reference measure so that it results in a well-defined posterior measure in the function space. We also present an efficient Markov Chain Monte Carlo (MCMC) algorithm to draw samples from the posterior distribution of the TG prior. With numerical examples, we demonstrate the performance of the TG prior and the efficiency of the proposed MCMC algorithm.
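One way to picture sampling under a Gaussian reference measure with an added TV term is a preconditioned Crank-Nicolson (pCN) Metropolis sampler on a discretized function. The sketch below is an assumption-laden illustration and not the paper's algorithm: it takes the reference measure to be a standard normal on each grid point, observes the function directly with Gaussian noise, and treats the data misfit plus a weighted discrete total variation as the potential. All names and parameter values are hypothetical.

```python
import math
import random


def total_variation(u):
    """Discrete total variation: sum of absolute neighbour differences."""
    return sum(abs(b - a) for a, b in zip(u, u[1:]))


def potential(u, data, noise_std, tv_weight):
    """Data misfit plus TV penalty, taken relative to the Gaussian reference."""
    misfit = sum((ui - di) ** 2 for ui, di in zip(u, data)) / (2.0 * noise_std ** 2)
    return misfit + tv_weight * total_variation(u)


def pcn_tv_gaussian(data, n_steps=5000, beta=0.2, noise_std=0.3,
                    tv_weight=1.0, seed=0):
    """Run pCN and return (running mean of the chain, acceptance rate)."""
    rng = random.Random(seed)
    n = len(data)
    u = [0.0] * n
    phi_u = potential(u, data, noise_std, tv_weight)
    running_sum = [0.0] * n
    accepted = 0
    shrink = math.sqrt(1.0 - beta ** 2)
    for _ in range(n_steps):
        # The pCN proposal leaves the N(0, I) reference measure invariant.
        v = [shrink * ui + beta * rng.gauss(0.0, 1.0) for ui in u]
        phi_v = potential(v, data, noise_std, tv_weight)
        # The acceptance ratio involves only the potential, not the reference.
        if rng.random() < math.exp(min(0.0, phi_u - phi_v)):
            u, phi_u = v, phi_v
            accepted += 1
        running_sum = [s + ui for s, ui in zip(running_sum, u)]
    mean = [s / n_steps for s in running_sum]
    return mean, accepted / n_steps
```

Because the proposal preserves the Gaussian reference, the accept/reject step depends only on the potential, which is why this family of samplers is a common choice when the prior is defined relative to a Gaussian measure.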

arXiv.org e-Print Archive

University of Liverpool Repository

Crossref

University of Birmingham Research Portal

Tensor Computation: A New Framework for High-Dimensional Problems in EDA

Author: Batselier Kim
Daniel Luca
Liu Haotian
Wong Ngai
Zhang Zheng
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/10/2016

Many critical EDA problems suffer from the curse of dimensionality, i.e. the very fast-scaling computational burden produced by a large number of parameters and/or unknown variables. This phenomenon may be caused by multiple spatial or temporal factors (e.g. 3-D field solver discretizations and multi-rate circuit simulation), nonlinearity of devices and circuits, a large number of design or optimization parameters (e.g. full-chip routing/placement and circuit sizing), or extensive process variations (e.g. variability/reliability analysis and design for manufacturability). The computational challenges generated by such high-dimensional problems are generally hard to handle efficiently with traditional EDA core algorithms that are based on matrix and vector computation. This paper presents "tensor computation" as an alternative general framework for the development of efficient EDA algorithms and tools. A tensor is a high-dimensional generalization of a matrix and a vector, and is a natural choice for both storing and efficiently solving high-dimensional EDA problems. This paper gives a basic tutorial on tensors, demonstrates some recent examples of EDA applications (e.g., nonlinear circuit modeling and high-dimensional uncertainty quantification), and suggests further open EDA problems where the use of tensor computation could be of advantage. Comment: 14 figures. Accepted by IEEE Trans. CAD of Integrated Circuits and System
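The storage side of the curse of dimensionality mentioned in this abstract can be made concrete with a small sketch. It assumes the canonical polyadic (CP) format, one common low-rank tensor representation chosen here for illustration only. The helper names are hypothetical.

```python
def cp_entry(factors, index):
    """Entry of a tensor stored in CP format.

    factors[k] is an n_k-by-R matrix (list of rows); the entry at
    index (i_1, ..., i_d) is sum_r prod_k factors[k][i_k][r].
    """
    rank = len(factors[0][0])
    total = 0.0
    for r in range(rank):
        term = 1.0
        for factor, i in zip(factors, index):
            term *= factor[i][r]
        total += term
    return total


def dense_storage(n, d):
    """Number of entries in a dense d-way tensor with n points per mode."""
    return n ** d


def cp_storage(n, d, rank):
    """Number of stored values for the same tensor in rank-`rank` CP format."""
    return d * n * rank
```

For example, with 10 points per mode and 6 modes, a dense tensor holds 10^6 entries, while a rank-3 CP representation stores 180 values. The saving applies only when the tensor is well approximated at low rank.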

arXiv.org e-Print Archive

DSpace@MIT

Crossref

HKU Scholars Hub