Noise and nonlinearities in high-throughput data
High-throughput data analyses are becoming common in biology, communications,
economics and sociology. The vast amounts of data are usually represented in
the form of matrices and can be considered as knowledge networks. Spectra-based
approaches have proved useful in extracting hidden information within such
networks and for estimating missing data, but these methods are based
essentially on linear assumptions. The physical models of matching, when
applicable, often suggest non-linear mechanisms that may sometimes be
mistaken for noise. The use of non-linear models in data analysis, however,
may require the introduction of many parameters, which lowers the statistical
weight of the model. Depending on the quality of the data, a simpler linear
analysis may therefore be preferable to more complex approaches.
In this paper, we show how a simple non-parametric Bayesian model may be used
to explore the role of non-linearities and noise in synthetic and experimental
data sets.

Comment: 12 pages, 3 figures
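The abstract does not spell out the Bayesian model itself; as a rough illustration of the linear-versus-non-parametric trade-off it describes, here is a minimal NumPy sketch on synthetic data (the kernel smoother is a generic stand-in, not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a hidden nonlinearity plus Gaussian noise.
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, size=x.size)

# Linear baseline (the "linear assumption" of spectra-based approaches).
slope, intercept = np.polyfit(x, y, 1)
lin_resid = y - (slope * x + intercept)

# Non-parametric stand-in: a Nadaraya-Watson kernel smoother.
def kernel_smooth(x, y, bandwidth=0.05):
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / bandwidth) ** 2)
    return (w * y[None, :]).sum(axis=1) / w.sum(axis=1)

np_resid = y - kernel_smooth(x, y)

# The non-parametric fit explains structure the line cannot.
print(np.var(lin_resid) > np.var(np_resid))  # True
```

The point mirrors the abstract's caveat: the smoother wins here because the data are plentiful and clean; with few, noisy points its extra flexibility would mostly fit noise.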
Consensus and meta-analysis regulatory networks for combining multiple microarray gene expression datasets
Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase in publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust with greater confidence, and place less reliance on a single dataset. However, combining datasets directly can be difficult, as experiments are often conducted on different microarray platforms and in different laboratories, leading to inherent biases in the data that are not always removed through pre-processing such as normalisation. In this paper we compare two frameworks for combining microarray datasets to model regulatory networks: pre- and post-learning aggregation. In pre-learning approaches, such as using simple scale-normalisation prior to the concatenation of datasets, a model is learnt from a combined dataset, whilst in post-learning aggregation individual models are learnt from each dataset and the models are combined. We present two novel approaches for post-learning aggregation, each based on aggregating high-level features of Bayesian network models that have been generated from different microarray expression datasets. Meta-analysis Bayesian networks are based on combining statistical confidences attached to network edges, whilst Consensus Bayesian networks identify consistent network features across all datasets. We apply both approaches to multiple datasets from synthetic and real (Escherichia coli and yeast) networks and demonstrate that both methods can improve on networks learnt from a single dataset or an aggregated dataset formed using a standard scale-normalisation.
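A minimal sketch of the two post-learning aggregation ideas, using hypothetical edge-confidence tables (the gene pairs and scores are invented for illustration, and the Bayesian-network learning step itself is elided):

```python
# Hypothetical per-dataset edge confidences: (regulator, target) -> confidence.
net1 = {("lexA", "recA"): 0.9, ("crp", "lacZ"): 0.6, ("fnr", "narG"): 0.2}
net2 = {("lexA", "recA"): 0.8, ("crp", "lacZ"): 0.3, ("arcA", "sdhC"): 0.7}

def meta_analysis(nets, threshold=0.5):
    """Average each edge's confidence across datasets; keep strong edges."""
    edges = set().union(*nets)
    avg = {e: sum(n.get(e, 0.0) for n in nets) / len(nets) for e in edges}
    return {e for e, c in avg.items() if c >= threshold}

def consensus(nets, threshold=0.5):
    """Keep only edges that clear the threshold in *every* dataset."""
    shared = set.intersection(*(set(n) for n in nets))
    return {e for e in shared if all(n[e] >= threshold for n in nets)}

print(sorted(meta_analysis([net1, net2])))  # [('lexA', 'recA')]
print(sorted(consensus([net1, net2])))      # [('lexA', 'recA')]
```

Both rules agree on the strongly supported edge here; they diverge when an edge is strong in one dataset but absent or weak in another, which is exactly where the two aggregation philosophies differ.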
On "Sexual contacts and epidemic thresholds," models and inference for sexual partnership distributions
Recent work has focused attention on statistical inference for the population
distribution of the number of sexual partners based on survey data.
The characteristics of these distributions are of interest as components of
mathematical models for the transmission dynamics of sexually-transmitted
diseases (STDs). Such information can be used both to calibrate theoretical
models, to make predictions for real populations, and as a tool for guiding
public health policy.
Our previous work on this subject has developed likelihood-based statistical
methods for inference that allow for low-dimensional, semi-parametric models.
Inference has been based on several proposed stochastic process models for the
formation of sexual partnership networks. We have also developed model
selection criteria to choose between competing models, and assessed the fit of
different models to three populations: Uganda, Sweden, and the USA. Throughout
this work, we have emphasized the correct assessment of the uncertainty of the
estimates based on the data analyzed. We have also widened the question of
interest to the limitations of inferences from such data, and the utility of
degree-based epidemiological models more generally.
In this paper we address further statistical issues that are important in
this area, and a number of confusions that have arisen in interpreting our
work. In particular, we consider the use of cumulative lifetime partner
distributions, heaping and other issues raised by Liljeros et al. in a recent
working paper.

Comment: 22 pages, 5 figures in linked working paper
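As a toy illustration of likelihood-based inference for a heavy-tailed partner distribution (the continuous Pareto model and the Hill-type maximum-likelihood estimator below are generic stand-ins, not the paper's semi-parametric stochastic process models):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical survey sample: partner counts drawn from a continuous
# Pareto (power-law) distribution with exponent alpha = 3 via inverse CDF.
xmin = 1.0
true_alpha = 3.0
data = xmin * (1.0 - rng.random(5000)) ** (-1.0 / (true_alpha - 1.0))

# Maximum-likelihood estimate of the exponent (Hill estimator):
#   alpha_hat = 1 + n / sum(log(x_i / xmin))
alpha_hat = 1.0 + data.size / np.log(data / xmin).sum()

print(f"alpha_hat = {alpha_hat:.2f}")
```

Real survey data complicate this picture considerably, which is the paper's theme: heaping at round numbers, cumulative lifetime counts, and finite-sample uncertainty all affect whether a fitted exponent is meaningful.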
Forecasting of financial data: a novel fuzzy logic neural network based on error-correction concept and statistics
First, this paper investigates the effect of good and bad news on volatility in the BUX return time series using asymmetric ARCH models. Then, the accuracy of forecasting models based on statistical (stochastic) methods, machine learning methods, and a soft/granular RBF network is investigated. To forecast the high-frequency financial data, we apply statistical ARMA and asymmetric GARCH-class models. A novel RBF network architecture is proposed based on the incorporation of an error-correction mechanism, which improves the forecasting ability of feed-forward neural networks. These proposed modelling approaches and SVM models are applied to predict the high-frequency time series of the BUX stock index. We found that it is possible to enhance forecast accuracy and achieve significant risk reduction in managerial decision making by applying intelligent forecasting models based on the latest information technologies. On the other hand, we showed that statistical GARCH-class models can identify the presence of leverage effects and react to good and bad news.
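The abstract does not specify the RBF architecture; the sketch below illustrates the error-correction idea on toy data, feeding the lagged one-step forecast error back in as an extra feature (the series, basis widths, and ridge solver are all assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "return" series: AR(1) signal plus noise, standing in for BUX returns.
T = 400
r = np.zeros(T)
for t in range(1, T):
    r[t] = 0.6 * r[t - 1] + rng.normal(0.0, 0.1)

def rbf_features(x, centers, width):
    """Gaussian radial-basis features of a scalar input."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

centers = np.linspace(r.min(), r.max(), 10)
width = (r.max() - r.min()) / 8.0

# Plain RBF forecaster: predict r[t] from r[t-1] by ridge least squares.
X = rbf_features(r[:-1], centers, width)
w = np.linalg.solve(X.T @ X + 1e-3 * np.eye(X.shape[1]), X.T @ r[1:])
pred = X @ w

# Error-correction variant: append the lagged forecast error as feedback.
err = np.concatenate([[0.0], (r[1:] - pred)[:-1]])
X_ec = np.column_stack([X, err])
w_ec = np.linalg.solve(X_ec.T @ X_ec + 1e-3 * np.eye(X_ec.shape[1]),
                       X_ec.T @ r[1:])
pred_ec = X_ec @ w_ec

mse = np.mean((r[1:] - pred) ** 2)
mse_ec = np.mean((r[1:] - pred_ec) ** 2)
print(f"plain: {mse:.5f}  error-corrected: {mse_ec:.5f}")
```

In-sample the feedback term cannot hurt a least-squares fit; whether it helps out-of-sample depends on the error dynamics, which is the empirical question the paper addresses on real BUX data.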
A Global Model of β-Decay Half-Lives Using Neural Networks
Statistical modeling of nuclear data using artificial neural networks (ANNs)
and, more recently, support vector machines (SVMs), is providing novel
approaches to systematics that are complementary to phenomenological and
semi-microscopic theories. We present a global model of β-decay
half-lives for the class of nuclei that decay 100% by the β⁻ mode in their
ground states. A fully-connected multilayered feed-forward network has been
trained using the Levenberg-Marquardt algorithm, Bayesian regularization, and
cross-validation. The half-life estimates generated by the model are discussed
and compared with the available experimental data, with previous results
obtained with neural networks, and with estimates coming from traditional
global nuclear models. Predictions of the new neural-network model are given
for nuclei far from stability, with particular attention to those involved in
r-process nucleosynthesis. This study demonstrates that in the framework of the
β-decay problem considered here, global models based on ANNs can at
least match the predictive performance of the best conventional global models
rooted in nuclear theory. Accordingly, such statistical models can provide a
valuable tool for further mapping of the nuclidic chart.

Comment: Proceedings of the 16th Panhellenic Symposium of the Hellenic Nuclear
Physics Society
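As a schematic of the approach (a small feed-forward network with an L2 penalty standing in very loosely for Bayesian regularisation, trained on a synthetic smooth surface rather than real nuclear data — all inputs and targets here are invented):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical stand-in data: (Z, N) pairs mapped to a smooth synthetic
# "log half-life" surface. NOT real nuclear data.
Z = rng.integers(20, 80, size=300).astype(float)
N = Z + rng.integers(0, 30, size=300).astype(float)
X = np.column_stack([Z, N])
X = (X - X.mean(axis=0)) / X.std(axis=0)        # standardise inputs
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, 300)

# One hidden layer of tanh units; full-batch gradient descent with a
# small L2 penalty as a crude regulariser.
H, lr, lam = 16, 0.05, 1e-4
W1 = rng.normal(0.0, 0.5, (2, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.1, H);      b2 = 0.0

for _ in range(2000):
    h = np.tanh(X @ W1 + b1)
    out = h @ W2 + b2
    g = (out - y) / y.size                      # gradient of 0.5 * MSE
    gW2 = h.T @ g + lam * W2
    gb2 = g.sum()
    gh = np.outer(g, W2) * (1.0 - h ** 2)       # backprop through tanh
    gW1 = X.T @ gh + lam * W1
    gb1 = gh.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2)
print(f"training MSE: {mse:.4f}  (target variance: {np.var(y):.4f})")
```

The real model additionally relies on Levenberg-Marquardt training, Bayesian regularization, and cross-validation, which matter precisely because nuclear data are sparse far from stability.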