Nonparametric relevance-shifted multiple testing procedures for the analysis of high-dimensional multivariate data with small sample sizes

Author: AI Fleishman
C Frömke
C Li
Cornelia Frömke
D Hauschke
DC Polacek
DJ Schaid
E Witt
J Khan
JF Chich
L Guo
LA Hothorn
Ludwig A Hothorn
N Zimmermann
NF Cariello
OG Troyanskaya
PH Westfall
S Dudoit
S Holm
S Kropf
S Lange
Siegfried Kropf
T Speed
VR Iyer
Y Benjamini
Y Ge
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: In many research areas it is necessary to find differences between treatment groups with several variables. For example, studies of microarray data seek to find a significant difference in location parameters from zero, or from one for ratios thereof, for each variable. However, in some studies a significant deviation of the difference in locations from zero (or from 1 in terms of the ratio) is biologically meaningless. A relevant difference or ratio is sought in such cases. Results: This article addresses the use of relevance-shifted tests on ratios for a multivariate parallel two-sample group design. Two empirical procedures are proposed which embed the relevance-shifted test on ratios. As both procedures test a hypothesis for each variable, the resulting multiple testing problem has to be considered. Hence, the procedures include a multiplicity correction. Both procedures are extensions of available procedures for point null hypotheses achieving exact control of the familywise error rate. Whereas the shift of the null hypothesis alone would give straightforward solutions, the problems that motivate the empirical considerations discussed here arise from the fact that the shift is considered in both directions and the whole parameter space between these two limits has to be accepted as the null hypothesis. Conclusion: The first procedure to be discussed uses a permutation algorithm and is appropriate for designs with a moderately large number of observations. However, many experiments have limited sample sizes. In that case the second procedure might be more appropriate, where multiplicity is corrected according to a concept of data-driven order of hypotheses.
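
In symbols, the two-sided relevance-shifted null hypothesis described in this abstract can be sketched as follows. The notation is assumed here for illustration and is not taken from the abstract: location parameters \mu_{1j} and \mu_{2j} of the two groups for variable j, p variables in total, and relevance limits \theta_L < 1 < \theta_U on the ratio scale.

```latex
H_{0j}:\ \theta_L \le \frac{\mu_{1j}}{\mu_{2j}} \le \theta_U
\quad\text{versus}\quad
H_{1j}:\ \frac{\mu_{1j}}{\mu_{2j}} < \theta_L
\ \text{ or } \
\frac{\mu_{1j}}{\mu_{2j}} > \theta_U,
\qquad j = 1, \dots, p .
```

The null hypothesis is an interval rather than a point, which is the feature the abstract identifies as the source of difficulty.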

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Institutionelles Repositorium der Leibniz Universität Hannover

Server für wissenschaftliche Schriften der Hochschule Hannover

Relevance-shifted tests for high dimensional data with small sample sizes

Author: Frömke Cornelia
Publication venue: Hannover: Gottfried Wilhelm Leibniz Universität Hannover
Publication date: 01/01/2006

[no abstract]

Institutionelles Repositorium der Leibniz Universität Hannover

Diverse correlation structures in gene expression data and their utility in improving statistical inference

Author: Klebanov Lev
Yakovlev Andrei
Publication venue: Institute of Mathematical Statistics
Publication date: 13/12/2007

It is well known that correlations in microarray data represent a serious nuisance deteriorating the performance of gene selection procedures. This paper is intended to demonstrate that the correlation structure of microarray data provides a rich source of useful information. We discuss distinct correlation substructures revealed in microarray gene expression data by an appropriate ordering of genes. These substructures include stochastic proportionality of expression signals in a large percentage of all gene pairs, negative correlations hidden in ordered gene triples, and a long sequence of weakly dependent random variables associated with ordered pairs of genes. The reported striking regularities are of general biological interest and they also have far-reaching implications for theory and practice of statistical methods of microarray data analysis. We illustrate the latter point with a method for testing differential expression of nonoverlapping gene pairs. While designed for testing a different null hypothesis, this method provides an order of magnitude more accurate control of type 1 error rate compared to conventional methods of individual gene expression profiling. In addition, this method is robust to technical noise. Quantitative inference of the correlation structure has the potential to extend the analysis of microarray data far beyond currently practiced methods. Comment: Published at http://dx.doi.org/10.1214/07-AOAS120 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

arXiv.org e-Print Archive

Crossref

Dependence in macroeconomic variables: Assessing instantaneous and persistent relations between and within time series

Author: Maxand Simone
Publication date: 29/08/2017

Georg-August-University Göttingen

Change-point Problem and Regression: An Annotated Bibliography

Author: Asgharian Masoud
Khodadadi Ahmad
Publication venue: Collection of Biostatistics Research Archive
Publication date: 12/11/2008

The problems of identifying changes at unknown times and of estimating the location of changes in stochastic processes are referred to as the change-point problem or, in the Eastern literature, as disorder. The change-point problem, first introduced in the quality control context, has since developed into a fundamental problem in the areas of statistical control theory, stationarity of a stochastic process, estimation of the current position of a time series, testing and estimation of change in the patterns of a regression model, and most recently in the comparison and matching of DNA sequences in microarray data analysis. Numerous methodological approaches have been implemented in examining change-point models. Maximum-likelihood estimation, Bayesian estimation, isotonic regression, piecewise regression, quasi-likelihood and non-parametric regression are among the methods which have been applied to resolving challenges in change-point problems. Grid-searching approaches have also been used to examine the change-point problem. Statistical analysis of change-point problems depends on the method of data collection. If the data collection is ongoing until some random time, then the appropriate statistical procedure is called sequential. If, however, a large finite set of data is collected with the purpose of determining if at least one change-point occurred, then this may be referred to as non-sequential. Not surprisingly, both the former and the latter have a rich literature, with much of the earlier work focusing on sequential methods inspired by applications in quality control for industrial processes. In the regression literature, the change-point model is also referred to as two- or multiple-phase regression, switching regression, segmented regression, two-stage least squares (Shaban, 1980), or broken-line regression. The area of the change-point problem has been the subject of intensive research in the past half-century.
The subject has evolved considerably and found applications in many different areas. It seems all but impossible to summarize all of the research carried out over the past 50 years on the change-point problem. We have therefore confined ourselves to those articles on change-point problems which pertain to regression. The important branch of sequential procedures in change-point problems has been left out entirely. We refer the readers to the seminal review papers by Lai (1995, 2001). The so-called structural change models, which occupy a considerable portion of the research in the area of change-point, particularly among econometricians, have not been fully considered. We refer the reader to Perron (2005) for an updated review in this area. Articles on change-point in time series are considered only if the methodologies presented in the paper pertain to regression analysis.
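
The grid-searching idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single change point, a non-sequential (fixed) data set, and an ordinary least-squares straight line on each side of the split. The function name `fit_two_phase` and the parameter `min_seg` are hypothetical names chosen for this example, not taken from the bibliography.

```python
import numpy as np


def fit_two_phase(x, y, min_seg=3):
    """Grid search for one change point in a two-phase linear regression.

    Tries every admissible split of the x-sorted data, fits a separate
    least-squares line to each side, and keeps the split with the
    smallest total residual sum of squares (SSE).
    Returns (first x value of the second phase, total SSE).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]

    best_k, best_sse = None, np.inf
    # k is the index of the first observation in the second phase;
    # min_seg keeps at least that many points in each phase.
    for k in range(min_seg, len(x) - min_seg + 1):
        sse = 0.0
        for xs, ys in ((x[:k], y[:k]), (x[k:], y[k:])):
            design = np.column_stack([np.ones_like(xs), xs])
            coef, *_ = np.linalg.lstsq(design, ys, rcond=None)
            sse += float(np.sum((ys - design @ coef) ** 2))
        if sse < best_sse:
            best_k, best_sse = k, sse
    return x[best_k], best_sse


# Illustrative (made-up) data: the line changes level and slope at x = 10.
x_demo = np.arange(20.0)
y_demo = np.where(x_demo < 10, x_demo, 50.0 - 2.0 * x_demo)
change_point, total_sse = fit_two_phase(x_demo, y_demo)
print(change_point, total_sse)
```

An exhaustive search like this costs one pair of line fits per candidate split, which is acceptable for small data sets; the likelihood-based and Bayesian methods listed in the abstract are alternatives that this sketch does not cover.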

Collection Of Biostatistics Research Archive

Earthquake modelling at the country level using aggregated spatio-temporal point processes

Author: Lieshout M.N.M.
Stein A.
Publication venue: Springer
Publication date: 01/01/2011

The goal of this paper is to derive a hazard map for earthquake occurrences in Pakistan from a catalogue that contains spatial coordinates of shallow earthquakes of magnitude 4.5 or larger aggregated over calendar years. We test relative temporal stationarity by the KPSS statistic and use the inhomogeneous J-function to test for inter-point interactions. We then formulate a cluster model, and de-convolve in order to calculate the hazard map, and verify that no particular year has an undue influence on the map. Within the borders of the single country, the KPSS test did not show any deviation from homogeneity in the spatial intensities. The inhomogeneous J-function indicated clustering that could not be attributed to inhomogeneity, and the analysis of aftershocks showed some evidence of two major shocks instead of one during the 2005 Kashmir earthquake disaster. Thus, the spatial point pattern analysis carried out for these data was insightful in various aspects and the hazard map that was obtained may lead to improved measures to protect the population against the disastrous effects of earthquakes.

Repository TU/e

Springer - Publisher Connector

CWI's Institutional Repository

Pure OAI Repository

University of Twente Research Information

Econometrics: A bird's eye view

Author: Geweke J
Horowitz JL
Pesaran MH
Publication venue: Macmillan
Publication date: 01/01/2008

As a unified discipline, econometrics is still relatively young and has been transforming and expanding very rapidly over the past few decades. Major advances have taken place in the analysis of cross-sectional data by means of semi-parametric and non-parametric techniques. Heterogeneity of economic relations across individuals, firms and industries is increasingly acknowledged, and attempts have been made to take it into account either by integrating out its effects or by modeling the sources of heterogeneity when suitable panel data exist. The counterfactual considerations that underlie policy analysis and treatment evaluation have been given a more satisfactory foundation. New time series econometric techniques have been developed and employed extensively in the areas of macroeconometrics and finance. Non-linear econometric techniques are used increasingly in the analysis of cross-section and time series observations. Applications of Bayesian techniques to econometric problems have been given new impetus, largely thanks to advances in computer power and computational techniques. The use of Bayesian techniques has in turn provided investigators with a unifying framework in which the tasks of forecasting, decision making, model evaluation and learning can be considered as parts of the same interactive and iterative process, thus paving the way for establishing the foundation of "real time econometrics". This paper attempts to provide an overview of some of these developments.

Apollo (Cambridge)

Bayesian Cluster Analysis

Author: Wade Sara K
Publication date: 15/05/2023

Edinburgh Research Explorer