16 research outputs found

    Searching for the scale of homogeneity

    We introduce a statistical quantity, known as the $K$ function, related to the integral of the two-point correlation function. It gives us straightforward information about the scale where clustering dominates and the scale at which homogeneity is reached. We evaluate the correlation dimension, $D_2$, as the local slope of the log-log plot of the $K$ function. We apply this statistic to several stochastic point fields, to three numerical simulations describing the distribution of clusters and finally to real galaxy redshift surveys. Four different galaxy catalogues have been analysed using this technique: the Center for Astrophysics I, the Perseus-Pisces redshift surveys (these two lying in our local neighbourhood), the Stromlo-APM and the 1.2 Jy IRAS redshift surveys (these two encompassing a larger volume). In all cases, this cumulant quantity shows the fingerprint of the transition to homogeneity. The reliability of the estimates is clearly demonstrated by the results from controllable point sets, such as the segment Cox processes. In the cluster distribution models, as well as in the real galaxy catalogues, we never see long plateaus when plotting $D_2$ as a function of the scale, leaving no hope for unbounded fractal distributions.
    Comment: 9 pages, 11 figures, MNRAS, in press; minor revision and added reference
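
    As a rough illustration of reading a correlation dimension off the local log-log slope of a K function, the sketch below uses a naive estimator with no edge correction, which is an assumption of this example and not the estimator of the paper. The function names and the test point set (equally spaced points on a line, for which the slope should be close to 1) are hypothetical. For a homogeneous distribution in three dimensions, K(r) grows as r^3, so the slope would approach 3.

```python
import math

def k_function(points, r, volume=1.0):
    """Naive K(r): mean number of neighbours within r, divided by number density.

    No edge correction is applied (a simplifying assumption of this sketch).
    """
    n = len(points)
    ordered_pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= r:
                ordered_pairs += 2  # count both (i, j) and (j, i)
    density = n / volume
    return ordered_pairs / (n * density)

def local_slope(points, r1, r2, volume=1.0):
    """Slope of log K against log r between the scales r1 and r2."""
    k1 = k_function(points, r1, volume)
    k2 = k_function(points, r2, volume)
    return math.log(k2 / k1) / math.log(r2 / r1)

# Hypothetical test set: 200 equally spaced points on a straight line in 3D.
line = [(float(i), 0.0, 0.0) for i in range(200)]
d2_line = local_slope(line, 10.5, 20.5)
print(f"estimated D2 for points on a line: {d2_line:.3f}")
```

    The volume argument only rescales K(r) and cancels in the slope, which is why it can be left at its default here.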

    Spatial variation of Anopheles-transmitted Wuchereria bancrofti and Plasmodium falciparum infection densities in Papua New Guinea.

    RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
    The spatial variation of Wuchereria bancrofti and Plasmodium falciparum infection densities was measured in a rural area of Papua New Guinea where they share anopheline vectors. The spatial correlation of W. bancrofti was found to reduce by half over an estimated distance of 1.7 km, much smaller than the 50 km grid used by the World Health Organization rapid mapping method. For P. falciparum, negligible spatial correlation was found. After mass treatment with anti-filarial drugs, there was negligible correlation between the changes in the densities of the two parasites.
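
    To give a feel for how small 1.7 km is relative to a 50 km grid, the sketch below assumes that the correlation decays exponentially with distance. The exponential form is an assumption of this example (the abstract only reports the halving distance), and the names are hypothetical.

```python
import math

HALF_DISTANCE_KM = 1.7   # reported distance over which the correlation halves
WHO_GRID_KM = 50.0       # grid spacing of the rapid mapping method

def correlation(distance_km, half_distance_km=HALF_DISTANCE_KM):
    """Assumed exponential decay: the correlation halves every half_distance_km."""
    return 0.5 ** (distance_km / half_distance_km)

# Equivalent range parameter phi in exp(-d / phi), under the same assumption.
range_parameter_km = HALF_DISTANCE_KM / math.log(2)
corr_at_grid = correlation(WHO_GRID_KM)

print(f"range parameter: {range_parameter_km:.2f} km")
print(f"correlation at 50 km: {corr_at_grid:.1e}")
```

    Under this assumed model, two sites one grid cell apart would be effectively uncorrelated.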

    Gene expression meta-analysis of Parkinson’s disease and its relationship with Alzheimer’s disease

    Parkinson’s disease (PD) and Alzheimer’s disease (AD) are the most common neurodegenerative diseases and have been suggested to share common pathological and physiological links. Understanding the cross-talk between them could reveal potential for the development of new strategies for early diagnosis and therapeutic intervention, thus improving the quality of life of those affected. Here we have conducted a novel meta-analysis to identify differentially expressed genes (DEGs) in PD microarray datasets comprising 69 PD and 57 control brain samples, which is the biggest cohort for such studies to date. Using the identified DEGs, we performed pathway, upstream and protein-protein interaction analysis. We identified 1046 DEGs, of which a majority (739/1046) were downregulated in PD. YWHAZ and other genes coding 14-3-3 proteins are identified as important DEGs in signaling pathways and in protein-protein interaction networks (PPIN). Perturbed pathways also include mitochondrial dysfunction and oxidative stress. There was a significant overlap in DEGs between PD and AD, and over 99% of these were differentially expressed in the same up or down direction across the diseases. REST was identified as an upstream regulator in both diseases. Our study demonstrates that PD and AD share significant common DEGs and pathways, and identifies novel genes, pathways and upstream regulators which may be important targets for therapy in both diseases.

    Bayesian modelling of ultra high-frequency financial data

    The availability of ultra high-frequency (UHF) data on transactions has revolutionised data processing and statistical modelling techniques in finance. The unique characteristics of such data, e.g. discrete structure of price change, unequally spaced time intervals and multiple transactions, have introduced new theoretical and computational challenges. In this study, we develop a Bayesian framework for modelling integer-valued variables to capture the fundamental properties of price change. We propose the application of the zero inflated Poisson difference (ZPD) distribution for modelling UHF data and assess the effect of covariates on the behaviour of price change. For this purpose, we present two modelling schemes. The first is based on the analysis of the data after the market closes for the day and is referred to as off-line data processing. In this case, the Bayesian interpretation and analysis are undertaken using Markov chain Monte Carlo methods. The second modelling scheme introduces the dynamic ZPD model, which is implemented through Sequential Monte Carlo methods (also known as particle filters). This procedure enables us to update our inference from data as new transactions take place and is known as online data processing. We apply our models to a set of FTSE100 index changes. Based on the probability integral transform, modified for the case of integer-valued random variables, we show that our models are capable of explaining the observed distribution of price change well. We then apply the deviance information criterion and introduce its sequential version for the purpose of model comparison for off-line and online modelling, respectively. Moreover, in order to add more flexibility to the tails of the ZPD distribution, we introduce the zero inflated generalised Poisson difference distribution and outline its possible application for modelling UHF data.
    EThOS - Electronic Theses Online Service, GB
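
    A minimal sketch of a zero inflated Poisson difference probability mass function follows. It assumes the usual zero-inflation construction (an extra point mass p at zero mixed with the distribution of the difference of two independent Poisson variables); the thesis may parameterise the ZPD differently, and the function names and parameter values are hypothetical.

```python
import math

def poisson_difference_pmf(k, lam1, lam2, terms=100):
    """P(X - Y = k) for independent X ~ Poisson(lam1) and Y ~ Poisson(lam2).

    Computed by summing the joint probabilities P(X = y + k) P(Y = y) over y,
    truncated after `terms` terms.
    """
    start = max(0, -k)  # smallest y for which x = y + k is non-negative
    total = 0.0
    for y in range(start, start + terms):
        x = y + k
        log_term = (x * math.log(lam1) - math.lgamma(x + 1)
                    + y * math.log(lam2) - math.lgamma(y + 1))
        total += math.exp(log_term)
    return math.exp(-(lam1 + lam2)) * total

def zpd_pmf(k, p, lam1, lam2):
    """Zero inflated version: extra mass p at zero (assumed construction)."""
    base = (1.0 - p) * poisson_difference_pmf(k, lam1, lam2)
    return base + p if k == 0 else base

# Hypothetical parameter values, for illustration only.
p, lam1, lam2 = 0.3, 1.5, 1.2
total_mass = sum(zpd_pmf(k, p, lam1, lam2) for k in range(-30, 31))
print(f"P(price change = 0) = {zpd_pmf(0, p, lam1, lam2):.4f}")
print(f"total mass on [-30, 30] = {total_mass:.6f}")
```

    Working on the log scale with math.lgamma avoids overflow in the factorials when the truncation length is large.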

    Bayesian quantile regression

    The paper introduces the idea of Bayesian quantile regression employing a likelihood function that is based on the asymmetric Laplace distribution. It is shown that irrespective of the original distribution of the data, the use of the asymmetric Laplace distribution is a very natural and effective way for modelling Bayesian quantile regression. The paper also demonstrates that improper uniform priors for the unknown model parameters yield a proper joint posterior. The approach is illustrated via a simulated and two real data sets.
    Keywords: Asymmetric Laplace distribution; Bayesian inference; Markov chain Monte Carlo methods; Quantile regression
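
    The link between the asymmetric Laplace likelihood and quantiles can be sketched in a few lines. The density used here, f(y) = tau (1 - tau) / sigma * exp(-rho_tau((y - mu) / sigma)) with check loss rho_tau(u) = u (tau - 1[u < 0]), is the standard form; the data set and function names are hypothetical, and only a location parameter is fitted (no regression covariates), which is a simplification for illustration.

```python
import math

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def al_log_likelihood(data, mu, tau, sigma=1.0):
    """Log-likelihood of an asymmetric Laplace model with location mu."""
    n = len(data)
    loss = sum(check_loss((y - mu) / sigma, tau) for y in data)
    return n * math.log(tau * (1.0 - tau) / sigma) - loss

# Hypothetical data: the values 1, 2, ..., 99.
data = [float(v) for v in range(1, 100)]
tau = 0.25

# Maximising the likelihood over candidate locations recovers the
# 0.25 sample quantile, because it minimises the summed check loss.
best_mu = max(data, key=lambda mu: al_log_likelihood(data, mu, tau))
print(f"maximum likelihood location for tau = {tau}: {best_mu}")
```

    With a flat prior on the location, the posterior is proportional to this likelihood, so its mode coincides with the sample quantile in this simplified setting.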

    Blood biomarker-based classification study for neurodegenerative diseases

    As the population ages, neurodegenerative diseases are becoming more prevalent, making it crucial to comprehend the underlying disease mechanisms and identify biomarkers to allow for early diagnosis and effective screening for clinical trials. Thanks to advancements in gene expression profiling, it is now possible to search for disease biomarkers on an unprecedented scale. Here we applied a selection of five machine learning (ML) approaches to identify blood-based biomarkers for Alzheimer's (AD) and Parkinson's disease (PD) with the application of multiple feature selection methods. Based on ROC AUC performance, one optimal random forest (RF) model was discovered for AD with 159 gene markers (ROC AUC = 0.886), while one optimal RF model was discovered for PD (ROC AUC = 0.743). Additionally, in comparison to traditional ML approaches, deep learning approaches were applied to evaluate their potential applications in future works. We demonstrated that convolutional neural networks perform consistently well across both the Alzheimer's (ROC AUC = 0.810) and Parkinson's (ROC AUC = 0.715) datasets, suggesting their potential in gene expression biomarker detection with increased tuning of their architecture.
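
    Since the models above are compared by ROC AUC, a small sketch of what that number measures may help. It computes the AUC as the fraction of (case, control) pairs in which the case receives the higher score, with ties counted as one half. The labels and scores are made-up toy values, not data from the study, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outranks a negative."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Toy example: 1 = case, 0 = control; scores are classifier outputs.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = roc_auc(toy_labels, toy_scores)
print(f"ROC AUC = {example_auc}")
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives context for reported values such as 0.886 and 0.743.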

    Bayesian nonparametric quantile regression using splines

    A new technique based on Bayesian quantile regression that models the dependence of a quantile of one variable on the values of another using a natural cubic spline is presented. Inference is based on the posterior density of the spline and an associated smoothing parameter and is performed by means of a Markov chain Monte Carlo algorithm. Examples of the application of the new technique to two real environmental data sets and to simulated data for which polynomial modelling is inappropriate are given. An aid for making a good choice of proposal density in the Metropolis-Hastings algorithm is discussed. The new nonparametric methodology provides more flexible modelling than the currently used Bayesian parametric quantile regression approach.
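
    The role of the proposal density in a Metropolis-Hastings sampler can be illustrated with a much simpler target than the spline posterior of the paper. The sketch below samples a single quantile location under an asymmetric Laplace likelihood with a flat prior, using a Gaussian random-walk proposal. The scalar target, the data, the step size and all names are assumptions of this example, not the authors' algorithm.

```python
import math
import random

def check_loss_sum(data, mu, tau):
    """Summed quantile check loss of the residuals y - mu."""
    total = 0.0
    for y in data:
        u = y - mu
        total += u * (tau - (1.0 if u < 0 else 0.0))
    return total

def metropolis_quantile(data, tau, start, step_sd, n_iter, seed=0):
    """Random-walk Metropolis for the location mu under a flat prior."""
    rng = random.Random(seed)
    mu = start
    log_post = -check_loss_sum(data, mu, tau)
    draws, accepted = [], 0
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, step_sd)
        log_post_new = -check_loss_sum(data, proposal, tau)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, log_post_new - log_post)):
            mu, log_post = proposal, log_post_new
            accepted += 1
        draws.append(mu)
    return draws, accepted / n_iter

# Hypothetical data: the values 1, 2, ..., 99; target is the 0.25 quantile.
data = [float(v) for v in range(1, 100)]
draws, acceptance_rate = metropolis_quantile(
    data, tau=0.25, start=25.0, step_sd=1.0, n_iter=5000)
kept = draws[500:]  # discard an initial burn-in segment
posterior_mean = sum(kept) / len(kept)
print(f"posterior mean = {posterior_mean:.2f}, acceptance = {acceptance_rate:.2f}")
```

    Monitoring the acceptance rate while varying step_sd is one simple way to judge a proposal: a very small step accepts almost everything but explores slowly, while a very large step is rarely accepted.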

    Additional file 1: of Gene expression meta-analysis of Parkinson’s disease and its relationship with Alzheimer’s disease

    Table S1. Information about each study used in our meta-analysis after removal of outlier samples. Table S2. Differentially expressed genes identified in our meta-analysis that have been identified as PD risk genes in a recent GWAS meta-analysis [33]. Table S3. IPA canonical pathway analysis for significant pathways identified using all PD DEGs, included with the information for pathways shared with those identified as significant using all AD DEGs. Table S4. IPA canonical pathway analysis for significant pathways identified using down-regulated PD DEGs. Table S5. IPA upstream regulator analysis for up and down regulated PD DEGs analysed separately. Table S6. Top 10 hubs found in the protein-protein interaction network (PPIN) analysis subnetwork created using the top 30 PD DEGs. Table S7. The direction of differential expression between the common DEGs found between AD and PD. Figure S1. Selecting filtering threshold for microarray data. The percentage of studies called absent in a mas5 present absent call for each probe was calculated, and threshold determined by minimizing Anderson-Darling normality tests and giving optimal Q-Q plot of the Z-scores after meta-analysis. The Q-Q plot for (A) 5%, (B) 10%, (C) 15%, (D) 20% and (E) 30% filtering. After 15% filtering A-D p-values were minimized (F) and the 15% Q-Q plot gave closest values to normality. A-D is Anderson-Darling normality test. Figure S2. RNAseq data vs. microarray gene expression data. Average absolute expression level of RNA-seq log2(TPM) of SN tissue from GTEx database plotted against RMA normalised and filtered intensity of microarray control and PD data used in this meta-analysis. The Pearson correlation coefficient between the control microarray data and healthy RNA-seq data (A) is 0.70 (pvalu