Searching for the scale of homogeneity
We introduce a statistical quantity, known as the K function, related to the integral of the two-point correlation function. It gives us straightforward information about the scale where clustering dominates and the scale at which homogeneity is reached. We evaluate the correlation dimension, D_2, as the local slope of the log-log plot of the K function. We apply this statistic to several stochastic point fields, to three numerical simulations describing the distribution of clusters and finally to real galaxy redshift surveys. Four different galaxy catalogues have been analysed using this technique: the Center for Astrophysics I and the Perseus-Pisces redshift surveys (these two lying in our local neighbourhood), and the Stromlo-APM and the 1.2 Jy IRAS redshift surveys (these two encompassing a larger volume). In all cases, this cumulant quantity shows the fingerprint of the transition to homogeneity. The reliability of the estimates is clearly demonstrated by the results from controllable point sets, such as the segment Cox processes. In the cluster distribution models, as well as in the real galaxy catalogues, we never see long plateaus when plotting D_2 as a function of the scale, leaving no hope for unbounded fractal distributions.
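To make the statistic concrete, here is a minimal Python sketch (not the paper's code) of the underlying idea: count the mean number of neighbours within a radius r for a 3-D point set, a K-function-like cumulant, and read the correlation dimension off as the local slope of its log-log plot. Boundary corrections and the specific estimator used in the paper are omitted, and the function and variable names are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def correlation_dimension(points, radii):
    """Local slope D2(r) of the log-log plot of the mean neighbour count
    within r (a K-function-like cumulant). Boundary effects are ignored."""
    tree = cKDTree(points)
    n = len(points)
    # count_neighbors against itself counts ordered pairs, including self-pairs.
    pair_counts = tree.count_neighbors(tree, radii).astype(float)
    mean_neighbours = pair_counts / n - 1.0
    log_r, log_k = np.log(radii), np.log(mean_neighbours)
    # For a homogeneous 3-D distribution the slope tends to 3 on large scales.
    return np.gradient(log_k, log_r)

# Example: a homogeneous Poisson point field in the unit box.
rng = np.random.default_rng(0)
points = rng.uniform(size=(20000, 3))
radii = np.logspace(-1.7, -0.8, 12)
print(np.round(correlation_dimension(points, radii), 2))
```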
Genetic networks in Parkinson's and Alzheimer's disease
Parkinson's disease (PD) and Alzheimer's disease (AD) are the most common neurodegenerative diseases, and there is increasing evidence that they share common physiological and pathological links. Here we have conducted the largest network analysis of PD and AD based on their gene expression in blood to date. We identified modules that were not preserved between disease and healthy control (HC) networks, and important hub genes and transcription factors (TFs) within these modules. The PD module that was not preserved in HCs was associated with insulin resistance, and HDAC6 was identified as a hub gene in this module that may influence tau phosphorylation and autophagic flux in neurodegenerative disease. The AD module associated with regulation of lipolysis in adipocytes and neuroactive ligand-receptor interaction was not preserved in healthy and mild cognitive impairment networks, and the key hubs TRPC5 and BRAP were identified as potential targets for therapeutic treatment of AD. Our study demonstrated that PD and AD share common disrupted genetics and identified novel pathways, hub genes and TFs that may be new areas for mechanistic study and important targets in both diseases.
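As a rough illustration of the network step described above (not the authors' actual pipeline, which the abstract does not detail), hub genes within a co-expression module are often ranked by intramodular connectivity, i.e. the sum of their soft-thresholded correlation adjacencies. The soft-threshold power and the toy data below are assumptions.

```python
import numpy as np
import pandas as pd

def hub_genes(expr: pd.DataFrame, power: int = 6, top_n: int = 5):
    """Rank genes by intramodular connectivity in a signed weighted
    co-expression network (WGCNA-style soft thresholding)."""
    corr = expr.corr()                       # gene-by-gene Pearson correlation
    adjacency = ((1 + corr) / 2) ** power    # signed adjacency in [0, 1]
    np.fill_diagonal(adjacency.values, 0)    # ignore self-connections
    connectivity = adjacency.sum(axis=1)     # intramodular connectivity k_i
    return connectivity.sort_values(ascending=False).head(top_n)

# Toy example: 100 samples by 50 genes of random expression values.
rng = np.random.default_rng(1)
expr = pd.DataFrame(rng.normal(size=(100, 50)),
                    columns=[f"gene{i}" for i in range(50)])
print(hub_genes(expr))
```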
Bayesian statistics in the design and analysis of cluster randomised controlled trials and their reporting quality: a methodological systematic review
Background: In a cluster randomised controlled trial (CRCT), randomisation units are "clusters" such as schools or GP practices. This has methodological implications for study design and statistical analysis, since clustering often leads to correlation between observations which, if not accounted for, can lead to spurious conclusions of efficacy/effectiveness. Bayesian methodology offers a flexible, intuitive framework to deal with such issues, but its use within CRCT design and analysis appears limited. This review aims to explore and quantify the use of Bayesian methodology in the design and analysis of CRCTs, and to appraise the quality of reporting against CONSORT guidelines.
Methods: We sought to identify all reported/published CRCTs that incorporated Bayesian methodology, and papers reporting development of new Bayesian methodology in this context, without restriction on publication date or location. We searched Medline, Embase and the Cochrane Central Register of Controlled Trials (CENTRAL). Reporting quality metrics according to the CONSORT extension for CRCTs were collected, as well as demographic data, the type and nature of Bayesian methodology used, journal endorsement of CONSORT guidelines, and statistician involvement.
Results: Twenty-seven publications were included, six from an additional hand search. Eleven (40.7%) were reports of CRCT results: seven (25.9%) were primary results papers and four (14.8%) reported secondary results. Thirteen papers (48.1%) reported Bayesian methodological developments, and the remaining three (11.1%) compared different methods. Four (57.1%) of the primary results papers described the method of sample size calculation; none clearly accounted for clustering. Six (85.7%) clearly accounted for clustering in the analysis. All results papers reported use of Bayesian methods in the analysis but none in the design or sample size calculation.
Conclusions: The popularity of the CRCT design has increased rapidly in the last twenty years, but this has not been mirrored by an uptake of Bayesian methodology in this context. Among studies using Bayesian methodology there were some differences in reporting quality compared to CRCTs in general, but this study provided insufficient data to draw firm conclusions. There is an opportunity to further develop Bayesian methodology for the design and analysis of CRCTs in order to expand the accessibility, availability and, ultimately, use of this approach.
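For readers unfamiliar with how a Bayesian analysis can "account for clustering", the sketch below fits a hierarchical random-intercept model to simulated CRCT data. It assumes the PyMC library and a Gaussian outcome; it is one illustrative specification, not a model taken from any of the reviewed trials.

```python
import numpy as np
import pymc as pm

# Simulated CRCT: 20 clusters randomised 1:1, 30 patients per cluster.
rng = np.random.default_rng(42)
n_clusters, n_per = 20, 30
treat = np.repeat(rng.permutation([0, 1] * (n_clusters // 2)), n_per)
cluster = np.repeat(np.arange(n_clusters), n_per)
cluster_effect = rng.normal(0.0, 0.5, size=n_clusters)   # between-cluster variation
y = 1.0 + 0.8 * treat + cluster_effect[cluster] + rng.normal(0, 1, size=n_clusters * n_per)

with pm.Model() as crct:
    alpha = pm.Normal("alpha", 0.0, 10.0)                # overall intercept
    beta = pm.Normal("beta", 0.0, 10.0)                  # treatment effect
    sigma_c = pm.HalfNormal("sigma_cluster", 1.0)        # between-cluster SD
    sigma_e = pm.HalfNormal("sigma_resid", 1.0)          # residual SD
    u = pm.Normal("u", 0.0, sigma_c, shape=n_clusters)   # cluster random intercepts
    mu = alpha + beta * treat + u[cluster]
    pm.Normal("y_obs", mu=mu, sigma=sigma_e, observed=y)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=42)

print(float(idata.posterior["beta"].mean()))   # posterior mean treatment effect
```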
Spatial variation of Anopheles-transmitted Wuchereria bancrofti and Plasmodium falciparum infection densities in Papua New Guinea.
The spatial variation of Wuchereria bancrofti and Plasmodium falciparum infection densities was measured in a rural area of Papua New Guinea where they share anopheline vectors. The spatial correlation of W. bancrofti was found to reduce by half over an estimated distance of 1.7 km, much smaller than the 50 km grid used by the World Health Organization rapid mapping method. For P. falciparum, negligible spatial correlation was found. After mass treatment with anti-filarial drugs, there was negligible correlation between the changes in the densities of the two parasites.
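A hedged sketch of the kind of calculation behind a "correlation halves at 1.7 km" statement: fit an exponential correlogram rho(d) = exp(-d/phi) to binned pairwise correlations and report the distance phi*ln(2) at which correlation drops by half. The binned values below are invented, and the exponential form is an assumption rather than the paper's fitted geostatistical model.

```python
import numpy as np
from scipy.optimize import curve_fit

def exponential_correlogram(d, phi):
    """Exponential spatial correlation model rho(d) = exp(-d / phi)."""
    return np.exp(-d / phi)

# Illustrative binned correlogram: mid-bin distances (km) and estimated correlations.
distances = np.array([0.5, 1.0, 2.0, 3.0, 5.0, 8.0])
correlations = np.array([0.80, 0.65, 0.45, 0.30, 0.15, 0.05])

(phi_hat,), _ = curve_fit(exponential_correlogram, distances, correlations, p0=[2.0])
half_distance = phi_hat * np.log(2)   # distance at which correlation drops by half
print(f"decay range phi = {phi_hat:.2f} km, half-correlation distance = {half_distance:.2f} km")
```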
Gene expression meta-analysis of Parkinson’s disease and its relationship with Alzheimer’s disease
Parkinson's disease (PD) and Alzheimer's disease (AD) are the most common neurodegenerative diseases and have been suggested to share common pathological and physiological links. Understanding the cross-talk between them could reveal potential for the development of new strategies for early diagnosis and therapeutic intervention, thus improving the quality of life of those affected. Here we have conducted a novel meta-analysis to identify differentially expressed genes (DEGs) in PD microarray datasets comprising 69 PD and 57 control brain samples, the largest cohort for such studies to date. Using the identified DEGs, we performed pathway, upstream-regulator and protein-protein interaction analyses. We identified 1046 DEGs, the majority of which (739/1046) were downregulated in PD. YWHAZ and other genes coding for 14-3-3 proteins were identified as important DEGs in signaling pathways and in protein-protein interaction networks (PPIN). Perturbed pathways also include mitochondrial dysfunction and oxidative stress. There was a significant overlap in DEGs between PD and AD, and over 99% of these were differentially expressed in the same direction (up or down) across the diseases. REST was identified as an upstream regulator in both diseases. Our study demonstrates that PD and AD share significant common DEGs and pathways, and identifies novel genes, pathways and upstream regulators that may be important targets for therapy in both diseases.
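One simple way to quantify the significance of an overlap between two DEG lists (illustrative only; the abstract does not state the authors' exact test) is a hypergeometric test against a common gene background. The background size, AD list size and overlap below are hypothetical; only the 1046 PD DEGs come from the abstract.

```python
from scipy.stats import hypergeom

def overlap_pvalue(background, set_a, set_b, overlap):
    """P(overlap >= observed) when set_b is drawn at random from the background."""
    # sf(k - 1) gives P(X >= k) for the hypergeometric distribution.
    return hypergeom.sf(overlap - 1, background, set_a, set_b)

# Hypothetical numbers: 20,000 background genes, 1046 PD DEGs (from the abstract),
# 900 AD DEGs and 300 genes in common (both invented for illustration).
p = overlap_pvalue(background=20000, set_a=1046, set_b=900, overlap=300)
print(f"hypergeometric overlap p-value: {p:.3g}")
```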
Bayesian modelling of ultra high-frequency financial data
The availability of ultra high-frequency (UHF) data on transactions has revolutionised data processing and statistical modelling techniques in finance. The unique characteristics of such data, e.g. the discrete structure of price changes, unequally spaced time intervals and multiple transactions, have introduced new theoretical and computational challenges. In this study, we develop a Bayesian framework for modelling integer-valued variables to capture the fundamental properties of price change. We propose the application of the zero-inflated Poisson difference (ZPD) distribution for modelling UHF data and assess the effect of covariates on the behaviour of price change. For this purpose, we present two modelling schemes. The first is based on the analysis of the data after the market closes for the day and is referred to as off-line data processing; in this case, the Bayesian interpretation and analysis are undertaken using Markov chain Monte Carlo methods. The second scheme introduces the dynamic ZPD model, which is implemented through sequential Monte Carlo methods (also known as particle filters); this procedure enables us to update our inference as new transactions take place and is known as online data processing. We apply our models to a set of FTSE100 index changes. Based on the probability integral transform, modified for the case of integer-valued random variables, we show that our models are capable of explaining well the observed distribution of price changes. We then apply the deviance information criterion, and introduce its sequential version, for the purpose of model comparison for off-line and online modelling, respectively. Moreover, in order to add more flexibility to the tails of the ZPD distribution, we introduce the zero-inflated generalised Poisson difference distribution and outline its possible application for modelling UHF data.
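A minimal sketch of the zero-inflated Poisson difference (ZPD) probability mass function described above: a point mass at zero mixed with a Skellam (Poisson difference) distribution, here built on SciPy's skellam. Parameter values are illustrative, and the thesis's covariate and dynamic extensions are not shown.

```python
import numpy as np
from scipy.stats import skellam

def zpd_pmf(k, pi, mu1, mu2):
    """Zero-inflated Poisson difference (ZPD) pmf:
    a point mass pi at zero mixed with a Skellam(mu1, mu2) distribution."""
    k = np.asarray(k)
    base = skellam.pmf(k, mu1, mu2)
    return np.where(k == 0, pi + (1 - pi) * base, (1 - pi) * base)

# Example: price changes (in ticks) from -3 to 3 with heavy zero inflation.
ticks = np.arange(-3, 4)
print(dict(zip(ticks.tolist(), zpd_pmf(ticks, pi=0.6, mu1=0.8, mu2=0.7).round(4))))
```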
Bayesian quantile regression
The paper introduces the idea of Bayesian quantile regression employing a likelihood function based on the asymmetric Laplace distribution. It is shown that, irrespective of the original distribution of the data, the use of the asymmetric Laplace distribution is a very natural and effective way of modelling Bayesian quantile regression. The paper also demonstrates that improper uniform priors for the unknown model parameters yield a proper joint posterior. The approach is illustrated via a simulated data set and two real data sets. Keywords: asymmetric Laplace distribution; Bayesian inference; Markov chain Monte Carlo methods; quantile regression.
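A compact illustration of the approach on simulated data: with the scale fixed at 1, the asymmetric Laplace working likelihood reduces to the negative check (pinball) loss, and with flat priors a random-walk Metropolis sampler explores the posterior of the regression coefficients. This is a re-implementation sketch, not the authors' code; the step size and iteration counts are arbitrary.

```python
import numpy as np

def ald_loglik(beta, X, y, tau):
    """Log-likelihood under the asymmetric Laplace distribution (scale fixed at 1):
    minus the sum of rho_tau(residual), where rho_tau(u) = u * (tau - I(u < 0))."""
    u = y - X @ beta
    return -np.sum(u * (tau - (u < 0)))

def metropolis_quantreg(X, y, tau, n_iter=5000, step=0.05, seed=0):
    """Random-walk Metropolis for Bayesian quantile regression with flat priors."""
    rng = np.random.default_rng(seed)
    beta = np.zeros(X.shape[1])
    samples = np.empty((n_iter, X.shape[1]))
    loglik = ald_loglik(beta, X, y, tau)
    for i in range(n_iter):
        prop = beta + step * rng.standard_normal(beta.shape)
        loglik_prop = ald_loglik(prop, X, y, tau)
        if np.log(rng.uniform()) < loglik_prop - loglik:   # accept/reject step
            beta, loglik = prop, loglik_prop
        samples[i] = beta
    return samples

# Simulated data: median regression (tau = 0.5) on a single covariate.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 200)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.standard_t(df=3, size=200)
draws = metropolis_quantreg(X, y, tau=0.5)
print(draws[2500:].mean(axis=0))   # posterior means of intercept and slope
```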
Blood biomarker-based classification study for neurodegenerative diseases
As the population ages, neurodegenerative diseases are becoming more prevalent, making it crucial to comprehend the underlying disease mechanisms and to identify biomarkers that allow early diagnosis and effective screening for clinical trials. Thanks to advancements in gene expression profiling, it is now possible to search for disease biomarkers on an unprecedented scale. Here we applied a selection of five machine learning (ML) approaches, in combination with multiple feature selection methods, to identify blood-based biomarkers for Alzheimer's disease (AD) and Parkinson's disease (PD). Based on ROC AUC performance, one optimal random forest (RF) model was identified for AD with 159 gene markers (ROC AUC = 0.886), while one optimal RF model was identified for PD (ROC AUC = 0.743). Additionally, deep learning approaches were applied and compared with the traditional ML approaches to evaluate their potential for future work. We demonstrated that convolutional neural networks perform consistently well across both the Alzheimer's (ROC AUC = 0.810) and Parkinson's (ROC AUC = 0.715) datasets, suggesting their potential in gene expression biomarker detection with further tuning of their architecture.
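A hedged sketch of the general workflow the abstract describes (univariate feature filtering plus a random forest, scored by cross-validated ROC AUC) on synthetic data. The data, the filter (SelectKBest with an F-test) and all parameter values are assumptions; only the choice of 159 retained markers mirrors the AD model mentioned above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic "gene expression" data: 300 samples, 2000 genes, 50 informative.
X, y = make_classification(n_samples=300, n_features=2000, n_informative=50,
                           random_state=0)

# Doing feature selection inside the pipeline avoids leaking test data into the filter.
model = Pipeline([
    ("select", SelectKBest(f_classif, k=159)),            # keep 159 markers
    ("rf", RandomForestClassifier(n_estimators=500, random_state=0)),
])

auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"mean ROC AUC: {auc.mean():.3f}")
```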
Bayesian nonparametric quantile regression using splines
A new technique based on Bayesian quantile regression that models the dependence of a quantile of one variable on the values of another using a natural cubic spline is presented. Inference is based on the posterior density of the spline and an associated smoothing parameter, and is performed by means of a Markov chain Monte Carlo algorithm. The new technique is applied to two real environmental data sets and to simulated data for which polynomial modelling is inappropriate. An aid to making a good choice of proposal density in the Metropolis-Hastings algorithm is discussed. The new nonparametric methodology provides more flexible modelling than the currently used Bayesian parametric quantile regression approach.
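To show what "a natural cubic spline with an associated smoothing parameter" can look like in practice, the sketch below builds a natural cubic spline basis via the truncated-power construction and defines a log-posterior that combines the check-loss (asymmetric Laplace) fit term with a ridge-type smoothness penalty standing in for the smoothing-parameter prior. It is an illustrative approximation of the paper's model, not its implementation; the Metropolis sampler sketched under "Bayesian quantile regression" above can be reused on this log-posterior.

```python
import numpy as np

def natural_cubic_basis(x, knots):
    """Natural cubic spline basis via the truncated-power construction."""
    x = np.asarray(x, dtype=float)
    K = len(knots)
    def d(k):  # helper enforcing linearity beyond the boundary knots
        return (np.maximum(x - knots[k], 0) ** 3
                - np.maximum(x - knots[-1], 0) ** 3) / (knots[-1] - knots[k])
    cols = [np.ones_like(x), x] + [d(k) - d(K - 2) for k in range(K - 2)]
    return np.column_stack(cols)

def log_posterior(beta, B, y, tau, lam):
    """Check-loss (asymmetric Laplace) fit term plus a second-difference
    smoothness penalty standing in for the smoothing-parameter prior."""
    u = y - B @ beta
    return -np.sum(u * (tau - (u < 0))) - lam * np.sum(np.diff(beta, 2) ** 2)

# Example basis on simulated data; a Metropolis sampler can explore this posterior.
rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 10, 200))
y = np.sin(x) + rng.standard_t(df=3, size=200)
B = natural_cubic_basis(x, knots=np.linspace(0, 10, 8))
print(B.shape, log_posterior(np.zeros(B.shape[1]), B, y, tau=0.5, lam=1.0))
```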