Multi-sample nonparametric treatments comparison in medical follow-up study with unequal observation processes through simulation and bladder tumour case study
In medical follow-up studies, disease recurrence processes evolve in continuous time, while patients are usually monitored at distinct and differing intervals. Most existing methods assume identical observation processes and may therefore give misleading results in this setting. To address this, a nonparametric test is proposed that allows unequal observation processes; it is based on the integrated weighted difference between the mean cumulative functions, which characterize both the recurrent processes and the observation processes conditional on treatment. The empirical power of the proposed test is investigated through a Monte Carlo simulation study and a bladder tumour case study. The results are in line with earlier research: the proposed test procedure works well in practical situations and has good power to detect treatment differences. Keywords: nonparametric; unequal observation; multi-sample; treatments comparison
New multi-sample nonparametric tests for panel count data
This paper considers the problem of multi-sample nonparametric comparison of
counting processes with panel count data, which arise naturally when recurrent
events are considered. Such data frequently occur in medical follow-up studies
and reliability experiments, for example. For the problem considered, we
construct two new classes of nonparametric test statistics based on the
accumulated weighted differences between the rates of increase of the estimated
mean functions of the counting processes over observation times, wherein the
nonparametric maximum likelihood approach is used to estimate the mean function
instead of the nonparametric maximum pseudo-likelihood. The asymptotic
distributions of the proposed statistics are derived and their finite-sample
properties are examined through Monte Carlo simulations. The simulation results
show that the proposed methods work quite well and are more powerful than the
existing test procedures. Two real data sets are analyzed and presented as
illustrative examples. Comment: Published in the Annals of Statistics
(http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/08-AOS599 by the
Institute of Mathematical Statistics (http://www.imstat.org
Semiparametric and nonparametric methods for the analysis of panel count data
Panel count data are one type of event-history data concerning recurrent events. Ideally, subjects in an event-history study are monitored continuously, so that for events that may recur over time, the exact time of each occurrence can be recorded. Data obtained in such cases are commonly referred to as recurrent event data (Cook and Lawless, 2007). In reality, however, subjects may only be observed at their clinical visits or at discrete times. As a result, instead of observing the exact event times, one only knows the numbers of events that happen between the observation times. Such interval-censored recurrent event data are usually referred to as panel count data (Kalbfleisch and Lawless, 1985; Sun and Kalbfleisch, 1995; Thall and Lachin, 1988). The primary interest with panel count data lies in the underlying recurrent event process. For the analysis, however, one also needs to consider the times when the observations occur, which can be regarded as realizations of an observation process with follow-up times. This dissertation consists of four parts. In the first part, we consider regression analysis of panel count data with dependent observation processes, where the follow-up times may be subject to a terminal event such as death. A semiparametric transformation model is presented for the mean function of the underlying recurrent event process among survivors. To estimate the regression parameters, an estimating equation approach is proposed and the inverse survival probability weighting technique is used. In addition, the asymptotic distribution of the proposed estimate is derived and a model checking procedure is presented. Simulation studies are conducted to evaluate the finite-sample properties of the proposed approach, and the approach is applied to a bladder cancer study. The second part focuses on regression analysis of multivariate panel count data in the presence of a terminal event. 
Both the observation process and the terminal event may be correlated with the recurrent event process
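The distinction drawn above between recurrent event data (exact event times) and panel count data (only counts between observation times) can be illustrated with a minimal Python sketch. The function name `panel_counts` and the example times are hypothetical and are not taken from the dissertation; follow-up is assumed to start at time 0.

```python
from bisect import bisect_right


def panel_counts(event_times, observation_times):
    """Reduce exact event times to the counts seen between observation times.

    Returns one count per observation time: the number of events in the
    interval (previous observation, current observation]. Events after the
    last observation time are never seen and are therefore not counted.
    """
    events = sorted(event_times)
    counts = []
    previous_cumulative = 0
    for obs in sorted(observation_times):
        # Number of events with time <= obs.
        cumulative = bisect_right(events, obs)
        counts.append(cumulative - previous_cumulative)
        previous_cumulative = cumulative
    return counts


# Hypothetical subject: five events, but only three clinic visits.
exact_times = [0.5, 1.2, 1.9, 3.4, 7.0]
visit_times = [1.0, 2.0, 5.0]
counts = panel_counts(exact_times, visit_times)
print(counts)
```

Only `visit_times` and `counts` would be available to the analyst in a panel count study; the exact times are lost, which is why the observation process itself has to be taken into account.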
Bayesian Estimation Under Informative Sampling
Bayesian analysis is increasingly popular for use in social science and other
application areas where the data are observations from an informative sample.
An informative sampling design leads to inclusion probabilities that are
correlated with the response variable of interest. Model inference performed on
the observed sample taken from the population will be biased for the population
generative model under informative sampling since the balance of information in
the sample data is different from that for the population. Typical approaches
to account for an informative sampling design under Bayesian estimation are
often difficult to implement because they require re-parameterization of the
hypothesized generating model, or focus on design, rather than model-based,
inference. We propose to construct a pseudo-posterior distribution that
utilizes sampling weights based on the marginal inclusion probabilities to
exponentiate the likelihood contribution of each sampled unit, which weights
the information in the sample back to the population. Our approach provides a
nearly automated estimation procedure applicable to any model specified by the
data analyst for the population and retains the population model
parameterization and posterior sampling geometry. We construct conditions on
known marginal and pairwise inclusion probabilities that define a class of
sampling designs where consistency of the pseudo posterior is
guaranteed. We demonstrate our method on an application concerning the Bureau
of Labor Statistics Job Openings and Labor Turnover Survey. Comment: 24 pages, 3 figure
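The idea of exponentiating each sampled unit's likelihood contribution by a sampling weight can be sketched in Python for a deliberately simple case. The assumptions here are not from the abstract: a Normal model with known standard deviation, a flat prior on the mean, and inverse-inclusion-probability weights rescaled to sum to the sample size. All function names are hypothetical.

```python
import math


def normalize_weights(inclusion_probs):
    """Inverse-probability weights, rescaled to sum to the sample size n
    (the rescaling is an assumption of this sketch)."""
    raw = [1.0 / p for p in inclusion_probs]
    n = len(raw)
    total = sum(raw)
    return [n * w / total for w in raw]


def pseudo_log_likelihood(mu, y, weights, sigma=1.0):
    """Log of the product of Normal(mu, sigma^2) densities, each raised to
    the power of its sampling weight."""
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(
        w * (const - 0.5 * ((yi - mu) / sigma) ** 2)
        for yi, w in zip(y, weights)
    )


def pseudo_posterior_normal_mean(y, weights, sigma=1.0):
    """With a flat prior on mu, the pseudo-posterior is Normal with the
    weighted mean as its centre and variance sigma^2 / sum(weights)."""
    total = sum(weights)
    mean = sum(w * yi for yi, w in zip(y, weights)) / total
    variance = sigma ** 2 / total
    return mean, variance


# Hypothetical sample: the third unit was unlikely to be sampled,
# so it stands in for more of the population.
y = [1.0, 2.0, 6.0]
inclusion_probs = [0.5, 0.5, 0.1]
w = normalize_weights(inclusion_probs)
post_mean, post_var = pseudo_posterior_normal_mean(y, w)
print(post_mean, post_var)
```

In this toy case the pseudo-posterior centre is the weighted mean rather than the plain sample mean, which is the sense in which the weights shift the information in the sample back towards the population.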
Studies in Astronomical Time Series Analysis. VI. Bayesian Block Representations
This paper addresses the problem of detecting and characterizing local
variability in time series and other forms of sequential data. The goal is to
identify and characterize statistically significant variations, at the same
time suppressing the inevitable corrupting observational errors. We present a
simple nonparametric modeling technique and an algorithm implementing it - an
improved and generalized version of Bayesian Blocks (Scargle 1998) - that finds
the optimal segmentation of the data in the observation interval. The structure
of the algorithm allows it to be used in either a real-time trigger mode, or a
retrospective mode. Maximum likelihood or marginal posterior functions to
measure model fitness are presented for events, binned counts, and measurements
at arbitrary times with known error distributions. Problems addressed include
those connected with data gaps, variable exposure, extension to piecewise
linear and piecewise exponential representations, multi-variate time series
data, analysis of variance, data on the circle, other data modes, and dispersed
data. Simulations provide evidence that the detection efficiency for weak
signals is close to a theoretical asymptotic limit derived by Arias-Castro,
Donoho and Huo (2003). In the spirit of Reproducible Research (Donoho et al.
2008), all of the code and data necessary to reproduce all of the figures in
this paper are included as auxiliary material. Comment: Added some missing script files and updated other ancillary data
(code and data files). To be submitted to the Astrophysical Journa
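An optimal segmentation of event data can be found by dynamic programming over candidate block edges. The Python sketch below is a simplified illustration, not the paper's code. It assumes distinct event times (at least two), uses the block fitness N (log N - log T) for a block with N events and length T, and subtracts a constant per-block penalty. The function name and the `ncp_prior` default are hypothetical choices.

```python
import math


def bayesian_blocks_events(times, ncp_prior=4.0):
    """Return block edges for distinct event times via dynamic programming.

    Assumes at least two distinct times. Each block with `count` events and
    length `width` scores count * (log(count) - log(width)) - ncp_prior.
    """
    t = sorted(times)
    n = len(t)
    # Candidate edges: start of data, midpoints between events, end of data.
    edges = [t[0]] + [0.5 * (a + b) for a, b in zip(t[:-1], t[1:])] + [t[-1]]
    best = [0.0] * n   # best total fitness for the first k + 1 cells
    last = [0] * n     # start cell of the final block in that optimum
    for k in range(n):
        best_fit, best_start = -math.inf, 0
        for r in range(k + 1):
            count = k - r + 1                  # events in cells r..k
            width = edges[k + 1] - edges[r]    # length of that block
            fit = count * (math.log(count) - math.log(width)) - ncp_prior
            if r > 0:
                fit += best[r - 1]             # optimum for cells before r
            if fit > best_fit:
                best_fit, best_start = fit, r
        best[k] = best_fit
        last[k] = best_start
    # Walk back through the stored block starts to recover change points.
    change_points = []
    index = n
    while index > 0:
        change_points.append(index)
        index = last[index - 1]
    change_points.append(0)
    change_points.reverse()
    return [edges[i] for i in change_points]


# Hypothetical data: a burst of 20 closely spaced events, then 5 sparse ones.
times = [0.01 * i for i in range(20)] + [10.0, 20.0, 30.0, 40.0, 50.0]
edges = bayesian_blocks_events(times, ncp_prior=4.0)
single_block = bayesian_blocks_events(times, ncp_prior=1e6)
print(edges)
```

The penalty controls how readily new blocks are introduced: a very large value forces a single block, while smaller values allow the segmentation to follow changes in the event rate. The double loop is quadratic in the number of events, which is adequate for a small illustration.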
Economic Decision-making in Poverty Depletes Behavioral Control
Economic theory and common sense suggest that time preference can cause or perpetuate poverty. Might poverty also or instead cause impatient or impulsive behavior? This paper reports a randomized lab experiment and a partially randomized field experiment, both in India, and an analysis of the American Time Use Survey. In all three studies, poverty is associated with diminished behavioral control. The primary contribution is to isolate the direction of causality from poverty to behavior; three theoretical mechanisms from psychology cannot be definitively separated. One supported explanation is that poverty, by making economic decision-making more difficult for the poor, depletes cognitive control. Keywords: impatient, impulsive behavior, poverty, psychology, cognitive control
Normalization and microbial differential abundance strategies depend upon data characteristics
Background: Data from 16S ribosomal RNA (rRNA) amplicon sequencing present challenges to ecological and statistical interpretation. In particular, library sizes often vary over several orders of magnitude, and the data contain many zeros. Although we are typically interested in comparing the relative abundance of taxa in the ecosystems of two or more groups, we can only measure the taxon relative abundance in specimens obtained from those ecosystems. Because comparing taxon relative abundance in the specimens is not equivalent to comparing taxon relative abundance in the ecosystems, this presents a special challenge. Second, because the relative abundances of taxa in the specimen (as well as in the ecosystem) sum to 1, these are compositional data. Because compositional data are constrained to the simplex (they sum to 1) rather than unconstrained in Euclidean space, many standard methods of analysis are not applicable. Here, we evaluate how these challenges impact the performance of existing normalization methods and differential abundance analyses. Results: Effects on normalization: Most normalization methods enable successful clustering of samples according to biological origin when the groups differ substantially in their overall microbial composition. Rarefying clusters samples according to biological origin more clearly than other normalization techniques do for ordination metrics based on presence or absence. Alternate normalization measures are potentially vulnerable to artifacts due to library size. Effects on differential abundance testing: We build on previous work to evaluate seven proposed statistical methods using rarefied as well as raw data. Our simulation studies suggest that the false discovery rates of many differential abundance-testing methods are not increased by rarefying itself, although rarefying does result in a loss of sensitivity because a portion of the available data is eliminated. 
For groups with large (~10×) differences in the average library size, rarefying lowers the false discovery rate. DESeq2, without addition of a constant, increases sensitivity on smaller datasets (<20 samples per group) but tends towards a higher false discovery rate with more samples, very uneven (~10×) library sizes, and/or compositional effects. For drawing inferences regarding taxon abundance in the ecosystem, analysis of composition of microbiomes (ANCOM) is not only very sensitive (for >20 samples per group) but also, critically, the only method tested that has good control of the false discovery rate. Conclusions: These findings guide which normalization and differential abundance techniques to use based on the data characteristics of a given study
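Rarefying, as discussed above, subsamples each sample's reads without replacement down to a common library size, and samples with fewer reads than that size are dropped. The following Python sketch illustrates the operation on a hypothetical three-taxon count table. The function name, the depth of 100 reads and the counts are illustrative assumptions, not values from the study.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample one sample's taxon counts without replacement to `depth` reads.

    Returns None when the sample has fewer than `depth` reads, mirroring the
    usual practice of discarding such samples.
    """
    if sum(counts) < depth:
        return None
    # Expand counts into one entry per read, labelled by taxon index.
    reads = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    kept = rng.sample(reads, depth)
    rarefied = [0] * len(counts)
    for taxon in kept:
        rarefied[taxon] += 1
    return rarefied


# Hypothetical count table: library sizes of 100, 1000 and 30 reads.
table = {"A": [60, 30, 10], "B": [700, 200, 100], "C": [20, 5, 5]}
rng = random.Random(0)  # fixed seed so the subsample is reproducible
rarefied = {name: rarefy(counts, 100, rng) for name, counts in table.items()}
print(rarefied)
```

Sample B loses 900 of its 1000 reads and sample C is discarded entirely, which is the loss of data, and hence of sensitivity, that the abstract refers to.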