1,856 research outputs found

    Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics.

    Background: Single-cell transcriptomics allows researchers to investigate complex communities of heterogeneous cells. It can be applied to stem cells and their descendants in order to chart the progression from multipotent progenitors to fully differentiated cells. While a variety of statistical and computational methods have been proposed for inferring cell lineages, the problem of accurately characterizing multiple branching lineages remains difficult to solve. Results: We introduce Slingshot, a novel method for inferring cell lineages and pseudotimes from single-cell gene expression data. In previously published datasets, Slingshot correctly identifies the biological signal for one to three branching trajectories. Additionally, our simulation study shows that Slingshot infers more accurate pseudotimes than other leading methods. Conclusions: Slingshot is a uniquely robust and flexible tool which combines the highly stable techniques necessary for noisy single-cell data with the ability to identify multiple trajectories. Accurate lineage inference is a critical step in the identification of dynamic temporal gene expression.

    Bringing Statistical Learning Machines Together for Hydro-Climatological Predictions - Case Study for Sacramento San Joaquin River Basin, California

    Study region: Sacramento San Joaquin (SSJ) River Basin, California. Study focus: The study forecasts streamflow at a regional scale within the SSJ river basin using large-scale climate variables. The proposed approach eliminates the bias resulting from predefined indices at the regional scale. The study was performed for eight unimpaired streamflow stations from 1962–2016. First, the Singular Value Decomposition (SVD) teleconnections of the streamflow corresponding to 500 mbar geopotential height, sea surface temperature, 500 mbar specific humidity (SHUM500), and 500 mbar U-wind (U500) were obtained. Second, the skillful SVD teleconnections were screened non-parametrically. Finally, the screened teleconnections were used as the streamflow predictors in the non-linear regression models (K-nearest neighbor regression and data-driven support vector machine). New hydrological insights: The SVD results identified new spatial regions that have not been included in existing predefined indices. The nonparametric model indicated that the teleconnections of SHUM500 and U500 are better streamflow predictors than the other climate variables. The regression models were able to capture most of the sustained low flows, showing the approach to be effective for drought-affected regions. The proposed approach also showed better forecasting skill with preprocessed large-scale climate variables than with the predefined indices. The proposed study is simple, yet robust in providing qualitative streamflow forecasts that may assist water managers in making policy-related decisions when planning and managing watersheds.
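    The SVD-then-regression pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `svd_expansion_series` and `knn_predict`, the array shapes, and the choice to standardize both fields are all hypothetical, and the non-parametric screening step is omitted.

```python
import numpy as np

def svd_expansion_series(field, flow, n_modes=1):
    """Leading SVD modes of the field/streamflow cross-covariance (illustrative).

    field: (time, grid) climate variable, e.g. one value per grid cell per year
    flow:  (time, stations) streamflow
    Returns the temporal expansion series (time, n_modes) and singular values.
    """
    f = (field - field.mean(axis=0)) / field.std(axis=0)
    q = (flow - flow.mean(axis=0)) / flow.std(axis=0)
    cov = f.T @ q / (f.shape[0] - 1)           # (grid, stations) cross-covariance
    u, s, _ = np.linalg.svd(cov, full_matrices=False)
    # Project the field onto the leading left singular vectors.
    # The sign of each mode is arbitrary.
    return f @ u[:, :n_modes], s

def knn_predict(train_x, train_y, query_x, k=5):
    """Plain K-nearest-neighbor regression: average the k closest targets."""
    dists = np.linalg.norm(query_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]  # (n_query, k) neighbor indices
    return train_y[nearest].mean(axis=1)
```

    In a real forecast the SVD would be fitted on training years only and the climate field would lead the streamflow in time; the sketch ignores both points for brevity.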

    Bootstrap Methods for Heavy-Tail or Autocorrelated Distributions with an Empirical Application

    Chapter One: The Truncated Wild Bootstrap for the Asymmetric Infinite Variance Case. The wild bootstrap method proposed by Cavaliere et al. (2013) to perform hypothesis testing for the location parameter in the location model, with errors in the domain of attraction of an asymmetric stable law, is inappropriate. Hence, we introduce a new bootstrap test procedure that overcomes the failure of Efron’s (1979) resampling bootstrap. This bootstrap test exploits the Wild Bootstrap of Cavaliere et al. (2013) and the central limit theorem of trimmed variables of Berkes et al. (2012) to deliver confidence sets with correct asymptotic coverage probabilities for asymmetric heavy-tailed data. The methodology of this bootstrap method entails locating cut-off values such that all data between these two values satisfy the central limit theorem conditions. Therefore, the proposed bootstrap will be termed the Truncated Wild Bootstrap (TWB) since it takes advantage of both findings. Simulation evidence to assess the quality of inference of available bootstrap tests for this particular model reveals that, on most occasions, the TWB performs better than the Parametric Bootstrap (PB) of Cornea-Madeira & Davidson (2015). In addition, the TWB test scheme is superior to the PB because this procedure can test the location parameter when the index of stability is below one, whereas the PB has no power in such a case. Moreover, the TWB is also superior to the PB when the tail index is close to 1 and the distribution is heavily skewed, unless the tail index is exactly 1 and the scale parameter is very high. Chapter Two: A Frequency Domain Wild Bootstrap for Dependent Data. In this chapter a resampling method is proposed for a stationary dependent time series, based on Rademacher wild bootstrap draws from the Fourier transform of the data. The main distinguishing feature of our method is that the bootstrap draws share their periodogram identically with the sample, implying sound properties under dependence of arbitrary form. A drawback of the basic procedure is that the bootstrap distribution of the mean is degenerate. We show that a simple Gaussian augmentation overcomes this difficulty. Monte Carlo evidence indicates a favourable comparison with alternative methods in tests of location and significance in a regression model with autocorrelated shocks, and also of unit roots. Chapter Three: Frequency-based Bootstrap Methods for DC Pension Plan Strategy Evaluation. The use of conventional bootstrap methods, such as the Standard Bootstrap and the Moving Block Bootstrap, to produce long-run returns in order to rank one strategy over the others based on its associated reward and risk might be misleading. Therefore, in this chapter, we use a simple pension model that is mainly concerned with long-term wealth accumulation to assess, for the first time in the pension literature, different bootstrap methods. We find that the Multivariate Fourier Bootstrap gives the most satisfactory result in its ability to mimic the true distribution, as measured by Cramér-von Mises statistics. We also address the disagreement in the pension literature on selecting the best pension plan strategy. We present a comprehensive study comparing different strategies using different bootstrap procedures with different cash-flow performance (CFP) measures across a range of countries. We find that bootstrap methods play a critical role in determining the optimal strategy. Additionally, different CFP measures rank pension plans differently across countries and bootstrap methods.
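    The basic frequency-domain draw described in Chapter Two can be illustrated with a short sketch. It assumes the draw multiplies each Fourier coefficient of the centred series by an independent Rademacher sign; the function names are hypothetical, the thesis's exact construction may differ, and the Gaussian augmentation is deliberately left out because the abstract does not specify it.

```python
import numpy as np

def periodogram(x):
    """Periodogram of the centred series at the non-negative Fourier frequencies."""
    x = np.asarray(x, dtype=float)
    return np.abs(np.fft.rfft(x - x.mean())) ** 2 / x.size

def fourier_wild_bootstrap(x, rng):
    """One bootstrap draw: flip the sign of each Fourier coefficient at random.

    Assumption: Rademacher (+1/-1) multipliers applied to the real-input FFT
    of the centred data, then inverted back to the time domain.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    coeffs = np.fft.rfft(x - mean)
    signs = rng.choice([-1.0, 1.0], size=coeffs.size)
    # A sign flip leaves |coefficient| unchanged, so the periodogram is preserved.
    centred_draw = np.fft.irfft(coeffs * signs, n=x.size)
    # The zero-frequency coefficient is zero, so every draw has the sample mean.
    return centred_draw + mean
```

    Every draw returned by this sketch has the same periodogram and the same mean as the input, which shows both the property the abstract highlights and the degeneracy of the bootstrap distribution of the mean that the augmentation is meant to repair.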

    Langevin and Hamiltonian based Sequential MCMC for Efficient Bayesian Filtering in High-dimensional Spaces

    Nonlinear non-Gaussian state-space models arise in numerous applications in statistics and signal processing. In this context, one of the most successful and popular approximation techniques is the Sequential Monte Carlo (SMC) algorithm, also known as particle filtering. Nevertheless, this method tends to be inefficient when applied to high-dimensional problems. In this paper, we focus on another class of sequential inference methods, namely the Sequential Markov Chain Monte Carlo (SMCMC) techniques, which represent a promising alternative to SMC methods. After providing a unifying framework for the class of SMCMC approaches, we propose novel efficient strategies based on the principle of Langevin diffusion and Hamiltonian dynamics in order to cope with the increasing number of high-dimensional applications. Simulation results show that the proposed algorithms achieve significantly better performance compared to existing algorithms.
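    For readers unfamiliar with Langevin-based proposals, the sketch below shows one step of the standard Metropolis-adjusted Langevin algorithm (MALA) for a generic target density. It is a textbook building block, not the SMCMC scheme of the paper; the function name `mala_step` and its arguments are hypothetical.

```python
import numpy as np

def mala_step(x, log_p, grad_log_p, step, rng):
    """One Metropolis-adjusted Langevin step targeting exp(log_p).

    The proposal drifts along the gradient of the log density and adds
    Gaussian noise; a Metropolis-Hastings test corrects the discretization.
    Returns the new state and whether the proposal was accepted.
    """
    mean_fwd = x + 0.5 * step * grad_log_p(x)
    prop = mean_fwd + np.sqrt(step) * rng.standard_normal(x.shape)
    mean_bwd = prop + 0.5 * step * grad_log_p(prop)
    # Log proposal densities, up to a constant shared by both directions.
    log_q_fwd = -np.sum((prop - mean_fwd) ** 2) / (2.0 * step)
    log_q_bwd = -np.sum((x - mean_bwd) ** 2) / (2.0 * step)
    log_alpha = log_p(prop) - log_p(x) + log_q_bwd - log_q_fwd
    if np.log(rng.uniform()) < log_alpha:
        return prop, True
    return x, False
```

    Because the proposal uses gradient information, it tends to keep a usable acceptance rate as the dimension grows, which is the general motivation for Langevin and Hamiltonian moves in high-dimensional filtering.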