27,238 research outputs found

Modeling association between DNA copy number and gene expression with constrained piecewise linear regression splines

Author: Leday Gwenaël G. R.
van de Wiel Mark A.
van der Vaart Aad W.
van Wieringen Wessel N.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2013

DNA copy number and mRNA expression are widely used data types in cancer studies, and combined they provide more insight than either does separately. Whereas in the existing literature the form of the relationship between these two types of markers is fixed a priori, in this paper we model their association. We employ piecewise linear regression splines (PLRS), which combine good interpretability with sufficient flexibility to identify any plausible type of relationship. The specification of the model leads to estimation and model selection in a constrained, nonstandard setting. We provide methodology for testing the effect of DNA on mRNA and choosing the appropriate model. Furthermore, we present a novel approach to obtain reliable confidence bands for constrained PLRS, which incorporates model uncertainty. The procedures are applied to colorectal and breast cancer data. Common assumptions are found to be potentially misleading for biologically relevant genes. More flexible models may bring more insight into the interaction between the two markers. Comment: Published at http://dx.doi.org/10.1214/12-AOAS605 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org
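As a rough illustration of what a piecewise linear regression spline looks like, the sketch below evaluates one in the standard truncated-power (hinge) form: a straight line plus a slope change at each knot. The function name `plrs_value`, the knot positions and the coefficients are all hypothetical, and the sketch does not reproduce the constraints, tests or confidence bands described in the abstract.

```python
def plrs_value(x, intercept, slope, knots, slope_changes):
    """Evaluate a piecewise linear spline at x.

    f(x) = intercept + slope * x + sum_j slope_changes[j] * max(x - knots[j], 0)

    Illustrative assumption: knots and coefficients are supplied by the
    caller; nothing is estimated from data here.
    """
    value = intercept + slope * x
    for knot, change in zip(knots, slope_changes):
        # Each hinge term is zero left of its knot and linear to the right,
        # so the curve stays continuous while its slope changes at the knot.
        value += change * max(x - knot, 0.0)
    return value


# Hypothetical example: flat until x = 1, then rising with slope 2.
for x in (0.0, 1.0, 2.5):
    print(x, plrs_value(x, intercept=0.5, slope=0.0, knots=[1.0], slope_changes=[2.0]))
```

Because every term is linear in the coefficients, such a model can be fitted by ordinary or constrained least squares once the knots are fixed.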

arXiv.org e-Print Archive

VU Research Portal

Crossref

Leiden University Scholary Publications

Slope heuristics and V-Fold model selection in heteroscedastic regression using strongly localized bases

Author: Navarro Fabien
Saumard Adrien
Publication venue: 'EDP Sciences'
Publication date: 02/09/2016

We investigate the optimality for model selection of the so-called slope heuristics, V-fold cross-validation and V-fold penalization in a heteroscedastic regression context with random design. We consider a new class of linear models that we call strongly localized bases and that generalize histograms, piecewise polynomials and compactly supported wavelets. We derive sharp oracle inequalities that prove the asymptotic optimality of the slope heuristics (when the optimal penalty shape is known) and of V-fold penalization. Furthermore, V-fold cross-validation seems to be suboptimal for a fixed value of V, since it asymptotically recovers the oracle learned from a sample size equal to 1-V^{-1} of the original amount of data. Our results are based on genuine concentration inequalities for the true and empirical excess risks that are of independent interest. We show in our experiments the good behavior of the slope heuristics for the selection of linear wavelet models. Furthermore, V-fold cross-validation and V-fold penalization have comparable efficiency.
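The remark about a sample size equal to 1-V^{-1} of the data can be made concrete with a small sketch of how V-fold cross-validation partitions the indices. The helper name `v_fold_splits` is hypothetical, and the folds are taken as contiguous blocks without shuffling, which is a simplifying assumption rather than anything stated in the abstract.

```python
def v_fold_splits(n, v):
    """Yield (train_indices, validation_indices) for V-fold cross-validation.

    Illustrative assumption: folds are contiguous blocks of indices;
    in practice the data are usually shuffled first.
    """
    if not 2 <= v <= n:
        raise ValueError("need 2 <= v <= n")
    # Spread n indices over v folds whose sizes differ by at most one.
    base, extra = divmod(n, v)
    start = 0
    for fold in range(v):
        size = base + (1 if fold < extra else 0)
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, validation
        start += size


n, v = 100, 5
for train, validation in v_fold_splits(n, v):
    # Each estimator is trained on n * (1 - 1/V) points: 80 of 100 when V = 5.
    print(len(train), len(validation))
```

With V fixed, every candidate model is trained on only this fraction of the sample, which is the source of the suboptimality the abstract describes.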

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

Numérisation de Documents Anciens Mathématiques

Moving sum procedure for change point detection under piecewise linearity

Author: Cho Haeran
Kim Joonpyo
Oh Hee-Seok
Publication date: 20/01/2023

We propose a computationally and statistically efficient procedure for segmenting univariate data under piecewise linearity. The proposed moving sum (MOSUM) methodology detects multiple change points where the underlying signal undergoes discontinuous jumps and/or slope changes. Theoretically, it controls the family-wise error rate at a given significance level asymptotically and achieves consistency in multiple change point detection, as well as matching the minimax optimal rate of estimation when the signal is piecewise linear and continuous, all under weak assumptions permitting serial dependence and heavy-tailedness. Computationally, the complexity of the MOSUM procedure is O(n), which, combined with its good performance on simulated datasets, makes it highly attractive in comparison with existing methods. We further demonstrate its good performance on a real data example on rolling element-bearing prognostics.
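To show why a moving sum scan can run in O(n) time, the sketch below computes the classical MOSUM statistic for a change in mean using prefix sums. This is a deliberately simpler variant than the procedure in the abstract, which targets jumps and slope changes under piecewise linearity. The function name `mosum_mean_change`, the bandwidth `g` and the toy data are hypothetical, and the statistic is left unscaled by any noise-level estimate.

```python
from itertools import accumulate
from math import sqrt


def mosum_mean_change(x, g):
    """Return {k: |sum(x[k:k+g]) - sum(x[k-g:k])| / sqrt(2g)} for all valid k.

    Simplified mean-change MOSUM, not the piecewise linear version.
    Prefix sums make each window sum O(1), so the full scan is O(n).
    """
    n = len(x)
    prefix = [0.0] + list(accumulate(x))
    stats = {}
    for k in range(g, n - g + 1):
        left = prefix[k] - prefix[k - g]    # sum over the g points before k
        right = prefix[k + g] - prefix[k]   # sum over the g points from k on
        stats[k] = abs(right - left) / sqrt(2 * g)
    return stats


# Toy signal with a single jump in mean at index 10.
signal = [0.0] * 10 + [1.0] * 10
stats = mosum_mean_change(signal, g=5)
print(max(stats, key=stats.get))
```

Large values of the statistic flag candidate change points; a full procedure would also need a threshold and a rule for selecting local maximizers.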

arXiv.org e-Print Archive

Moving sum procedure for change point detection under piecewise linearity

Author: Cho Haeran
Kim Joonpyo
Oh Hee-Seok
Publication date: 09/08/2022

arXiv.org e-Print Archive

Explore Bristol Research

The MINI mixed finite element for the Stokes problem: An experimental investigation

Author: Boffi Daniele
Cioncolini Andrea
Publication venue: 'Elsevier BV'
Publication date: 26/12/2018

Super-convergence of order 1.5 in pressure and velocity has been experimentally investigated for the two-dimensional Stokes problem discretised with the MINI mixed finite element. Even though the classic mixed finite element theory for the MINI element guarantees linear convergence for the total error, recent theoretical results indicate that super-convergence of order 1.5 in pressure and of the linear part of the computed velocity to the piecewise linear nodal interpolation of the exact velocity is in fact possible with structured, three-directional triangular meshes. The numerical experiments presented here suggest a more general validity of super-convergence of order 1.5, possibly extending to automatically generated and unstructured triangulations. In addition, the approximating properties of the complete computed velocity have been compared with the approximating properties of the piecewise-linear part of the computed velocity, finding that the former is generally closer to the exact velocity, whereas the latter conserves mass better.

arXiv.org e-Print Archive

Crossref

The University of Manchester - Institutional Repository

Longitudinal analysis on AQI in 3 main economic zones of China

Author: Wu Kailin
Publication date: 09/10/2014

In modern China, air pollution has become a major environmental problem. Over the last 2 years, the air pollution problem, as measured by PM 2.5 (particulate matter), has been getting worse. My report aims to carry out a longitudinal data analysis of the air quality index (AQI) in 3 main economic zones in China. Longitudinal data, or repeated measures data, can be viewed as multilevel data with repeated measurements nested within individuals. I arrive at some conclusions about why the 3 areas have different AQI, which is mainly attributed to factors like population, GDP, temperature and humidity, and to other factors like whether the area is inland or by the sea. The residual variance is partitioned into a between-zone component (the variance of the zone-level residuals) and a within-zone component (the variance of the city-level residuals). The zone residuals represent unobserved zone characteristics that affect AQI. In this report, the model building mainly follows the sequence described by West et al. (2007) with respect to the bottom-up procedures, and the reference by Singer, J. D., & Willett, J. B. (2003), which includes the non-linear situations. This report also compares the quartic curve model with the piecewise growth model on this data. The final model I reached is a piecewise model with time-level and zone-level predictors and also with temperature by time interactions.

Texas ScholarWorks