Noise and nonlinearities in high-throughput data
High-throughput data analyses are becoming common in biology, communications,
economics and sociology. The vast amounts of data are usually represented in
the form of matrices and can be considered as knowledge networks. Spectra-based
approaches have proved useful in extracting hidden information within such
networks and for estimating missing data, but these methods are based
essentially on linear assumptions. The physical models of matching, when
applicable, often suggest non-linear mechanisms that may sometimes be
mistaken for noise. The use of non-linear models in data analysis, however,
may require the introduction of many parameters, which lowers the statistical
weight of the model. Depending on the quality of the data, a simpler linear
analysis may therefore be preferable to more complex approaches.
In this paper, we show how a simple non-parametric Bayesian model may be used
to explore the role of non-linearities and noise in synthetic and experimental
data sets.

Comment: 12 pages, 3 figures
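The abstract does not spell out the Bayesian model itself; as a rough illustration of the linear-versus-non-parametric trade-off it describes, here is a minimal NumPy sketch on synthetic data (the kernel smoother is a generic stand-in, not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a hidden nonlinearity plus Gaussian noise.
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, size=x.size)

# Linear baseline (the "linear assumption" of spectra-based approaches).
slope, intercept = np.polyfit(x, y, 1)
lin_resid = y - (slope * x + intercept)

# Non-parametric stand-in: a Nadaraya-Watson kernel smoother.
def kernel_smooth(x, y, bandwidth=0.05):
    w = np.exp(-0.5 * ((x[:, None] - x[None, :]) / bandwidth) ** 2)
    return (w * y[None, :]).sum(axis=1) / w.sum(axis=1)

np_resid = y - kernel_smooth(x, y)

# The non-parametric fit explains structure the line cannot.
print(np.var(lin_resid) > np.var(np_resid))  # True
```

The point mirrors the abstract's caveat: the smoother wins here because the data are plentiful and clean; with few, noisy points its extra flexibility would mostly fit noise.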
Consensus and meta-analysis regulatory networks for combining multiple microarray gene expression datasets
Microarray data is a key source of experimental data for modelling gene regulatory interactions from expression levels. With the rapid increase in publicly available microarray data comes the opportunity to produce regulatory network models based on multiple datasets. Such models are potentially more robust with greater confidence, and place less reliance on a single dataset. However, combining datasets directly can be difficult, as experiments are often conducted on different microarray platforms and in different laboratories, leading to inherent biases in the data that are not always removed through pre-processing such as normalisation. In this paper we compare two frameworks for combining microarray datasets to model regulatory networks: pre- and post-learning aggregation. In pre-learning approaches, such as using simple scale-normalisation prior to the concatenation of datasets, a model is learnt from a combined dataset, whilst in post-learning aggregation individual models are learnt from each dataset and the models are combined. We present two novel approaches for post-learning aggregation, each based on aggregating high-level features of Bayesian network models that have been generated from different microarray expression datasets. Meta-analysis Bayesian networks are based on combining statistical confidences attached to network edges, whilst Consensus Bayesian networks identify consistent network features across all datasets. We apply both approaches to multiple datasets from synthetic and real (Escherichia coli and yeast) networks and demonstrate that both methods can improve on networks learnt from a single dataset or an aggregated dataset formed using a standard scale-normalisation.
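A minimal sketch of the two post-learning aggregation ideas, using hypothetical edge-confidence tables (the gene pairs and scores are invented for illustration, and the Bayesian-network learning step itself is elided):

```python
# Hypothetical per-dataset edge confidences: (regulator, target) -> confidence.
net1 = {("lexA", "recA"): 0.9, ("crp", "lacZ"): 0.6, ("fnr", "narG"): 0.2}
net2 = {("lexA", "recA"): 0.8, ("crp", "lacZ"): 0.3, ("arcA", "sdhC"): 0.7}

def meta_analysis(nets, threshold=0.5):
    """Average each edge's confidence across datasets; keep strong edges."""
    edges = set().union(*nets)
    avg = {e: sum(n.get(e, 0.0) for n in nets) / len(nets) for e in edges}
    return {e for e, c in avg.items() if c >= threshold}

def consensus(nets, threshold=0.5):
    """Keep only edges that clear the threshold in *every* dataset."""
    shared = set.intersection(*(set(n) for n in nets))
    return {e for e in shared if all(n[e] >= threshold for n in nets)}

print(sorted(meta_analysis([net1, net2])))  # [('lexA', 'recA')]
print(sorted(consensus([net1, net2])))      # [('lexA', 'recA')]
```

Both rules agree on the strongly supported edge here; they diverge when an edge is strong in one dataset but absent or weak in another, which is exactly where the two aggregation philosophies differ.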
On "Sexual contacts and epidemic thresholds," models and inference for sexual partnership distributions
Recent work has focused attention on statistical inference for the population
distribution of the number of sexual partners based on survey data.
The characteristics of these distributions are of interest as components of
mathematical models for the transmission dynamics of sexually-transmitted
diseases (STDs). Such information can be used both to calibrate theoretical
models, to make predictions for real populations, and as a tool for guiding
public health policy.
Our previous work on this subject has developed likelihood-based statistical
methods for inference that allow for low-dimensional, semi-parametric models.
Inference has been based on several proposed stochastic process models for the
formation of sexual partnership networks. We have also developed model
selection criteria to choose between competing models, and assessed the fit of
different models to three populations: Uganda, Sweden, and the USA. Throughout
this work, we have emphasized the correct assessment of the uncertainty of the
estimates based on the data analyzed. We have also widened the question of
interest to the limitations of inferences from such data, and the utility of
degree-based epidemiological models more generally.
In this paper we address further statistical issues that are important in
this area, and a number of confusions that have arisen in interpreting our
work. In particular, we consider the use of cumulative lifetime partner
distributions, heaping and other issues raised by Liljeros et al. in a recent
working paper.

Comment: 22 pages, 5 figures in linked working paper
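As a toy illustration of likelihood-based inference for a heavy-tailed partner distribution (the continuous Pareto model and the Hill-type maximum-likelihood estimator below are generic stand-ins, not the paper's semi-parametric stochastic process models):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical survey sample: partner counts drawn from a continuous
# Pareto (power-law) distribution with exponent alpha = 3 via inverse CDF.
xmin = 1.0
true_alpha = 3.0
data = xmin * (1.0 - rng.random(5000)) ** (-1.0 / (true_alpha - 1.0))

# Maximum-likelihood estimate of the exponent (Hill estimator):
#   alpha_hat = 1 + n / sum(log(x_i / xmin))
alpha_hat = 1.0 + data.size / np.log(data / xmin).sum()

print(f"alpha_hat = {alpha_hat:.2f}")
```

Real survey data complicate this picture considerably, which is the paper's theme: heaping at round numbers, cumulative lifetime counts, and finite-sample uncertainty all affect whether a fitted exponent is meaningful.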
Forecasting of financial data: a novel fuzzy logic neural network based on error-correction concept and statistics
First, this paper investigates the effect of good and bad news on volatility in the BUX return time series using asymmetric ARCH models. Then, the accuracy of forecasting models based on statistical (stochastic) methods, machine learning methods, and a soft/granular RBF network is investigated. To forecast the high-frequency financial data, we apply statistical ARMA and asymmetric GARCH-class models. A novel RBF network architecture is proposed based on the incorporation of an error-correction mechanism, which improves the forecasting ability of feed-forward neural networks. These proposed modelling approaches and SVM models are applied to predict the high-frequency time series of the BUX stock index. We found that it is possible to enhance forecast accuracy and achieve significant risk reduction in managerial decision making by applying intelligent forecasting models based on the latest information technologies. On the other hand, we showed that statistical GARCH-class models can identify the presence of leverage effects and react to good and bad news.
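The abstract does not specify the RBF architecture; the sketch below illustrates the error-correction idea on toy data, feeding the lagged one-step forecast error back in as an extra feature (the series, basis widths, and ridge solver are all assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "return" series: AR(1) signal plus noise, standing in for BUX returns.
T = 400
r = np.zeros(T)
for t in range(1, T):
    r[t] = 0.6 * r[t - 1] + rng.normal(0.0, 0.1)

def rbf_features(x, centers, width):
    """Gaussian radial-basis features of a scalar input."""
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

centers = np.linspace(r.min(), r.max(), 10)
width = (r.max() - r.min()) / 8.0

# Plain RBF forecaster: predict r[t] from r[t-1] by ridge least squares.
X = rbf_features(r[:-1], centers, width)
w = np.linalg.solve(X.T @ X + 1e-3 * np.eye(X.shape[1]), X.T @ r[1:])
pred = X @ w

# Error-correction variant: append the lagged forecast error as feedback.
err = np.concatenate([[0.0], (r[1:] - pred)[:-1]])
X_ec = np.column_stack([X, err])
w_ec = np.linalg.solve(X_ec.T @ X_ec + 1e-3 * np.eye(X_ec.shape[1]),
                       X_ec.T @ r[1:])
pred_ec = X_ec @ w_ec

mse = np.mean((r[1:] - pred) ** 2)
mse_ec = np.mean((r[1:] - pred_ec) ** 2)
print(f"plain: {mse:.5f}  error-corrected: {mse_ec:.5f}")
```

In-sample the feedback term cannot hurt a least-squares fit; whether it helps out-of-sample depends on the error dynamics, which is the empirical question the paper addresses on real BUX data.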
A Global Model of β-Decay Half-Lives Using Neural Networks
Statistical modeling of nuclear data using artificial neural networks (ANNs)
and, more recently, support vector machines (SVMs), is providing novel
approaches to systematics that are complementary to phenomenological and
semi-microscopic theories. We present a global model of β-decay
half-lives for the class of nuclei that decay 100% by the β⁻ mode in their
ground states. A fully-connected multilayered feed-forward network has been
trained using the Levenberg-Marquardt algorithm, Bayesian regularization, and
cross-validation. The half-life estimates generated by the model are discussed
and compared with the available experimental data, with previous results
obtained with neural networks, and with estimates coming from traditional
global nuclear models. Predictions of the new neural-network model are given
for nuclei far from stability, with particular attention to those involved in
r-process nucleosynthesis. This study demonstrates that in the framework of the
β-decay problem considered here, global models based on ANNs can at
least match the predictive performance of the best conventional global models
rooted in nuclear theory. Accordingly, such statistical models can provide a
valuable tool for further mapping of the nuclidic chart.

Comment: Proceedings of the 16th Panhellenic Symposium of the Hellenic Nuclear
Physics Society
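As a schematic of the approach (a small feed-forward network with an L2 penalty standing in very loosely for Bayesian regularisation, trained on a synthetic smooth surface rather than real nuclear data — all inputs and targets here are invented):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical stand-in data: (Z, N) pairs mapped to a smooth synthetic
# "log half-life" surface. NOT real nuclear data.
Z = rng.integers(20, 80, size=300).astype(float)
N = Z + rng.integers(0, 30, size=300).astype(float)
X = np.column_stack([Z, N])
X = (X - X.mean(axis=0)) / X.std(axis=0)        # standardise inputs
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0.0, 0.05, 300)

# One hidden layer of tanh units; full-batch gradient descent with a
# small L2 penalty as a crude regulariser.
H, lr, lam = 16, 0.05, 1e-4
W1 = rng.normal(0.0, 0.5, (2, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.1, H);      b2 = 0.0

for _ in range(2000):
    h = np.tanh(X @ W1 + b1)
    out = h @ W2 + b2
    g = (out - y) / y.size                      # gradient of 0.5 * MSE
    gW2 = h.T @ g + lam * W2
    gb2 = g.sum()
    gh = np.outer(g, W2) * (1.0 - h ** 2)       # backprop through tanh
    gW1 = X.T @ gh + lam * W1
    gb1 = gh.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2)
print(f"training MSE: {mse:.4f}  (target variance: {np.var(y):.4f})")
```

The real model additionally relies on Levenberg-Marquardt training, Bayesian regularization, and cross-validation, which matter precisely because nuclear data are sparse far from stability.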