Concentration inequalities for order statistics
This note describes non-asymptotic variance and tail bounds for order
statistics of samples of independent identically distributed random variables.
Those bounds are checked to be asymptotically tight when the sampling
distribution belongs to a maximum domain of attraction. If the sampling
distribution has non-decreasing hazard rate (this includes the Gaussian
distribution), we derive an exponential Efron-Stein inequality for order
statistics: an inequality connecting the logarithmic moment generating function
of centered order statistics with exponential moments of Efron-Stein
(jackknife) estimates of variance. We use this general connection to derive
variance and tail bounds for order statistics of Gaussian samples. Those bounds
are not within the scope of the Tsirelson-Ibragimov-Sudakov
Gaussian concentration inequality. Proofs are elementary and combine
Rényi's representation of order statistics and the so-called entropy approach
to concentration inequalities popularized by M. Ledoux. Comment: 13 pages
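As a rough illustration of the Rényi representation mentioned in this abstract, the sketch below builds the order statistics of a standard exponential sample from independent exponentials, using the classical identity that the k-th order statistic is distributed as the sum of E_i / (n - i + 1) for i = 1..k. The function name `renyi_order_statistics` is hypothetical and the code is not taken from the paper; it only illustrates the representation, not the concentration bounds.

```python
import random


def renyi_order_statistics(exponentials):
    """Turn i.i.d. standard exponentials E_1..E_n into a vector distributed
    as the increasing order statistics of n standard exponentials, via
    Y_(k) = sum_{i=1}^{k} E_i / (n - i + 1)."""
    n = len(exponentials)
    order_stats = []
    running = 0.0
    for i, e in enumerate(exponentials, start=1):
        running += e / (n - i + 1)  # i-th normalised spacing
        order_stats.append(running)
    return order_stats


# Small sanity check (assumed setup): the mean of the simulated maximum
# should be close to the harmonic number H_n = 1 + 1/2 + ... + 1/n.
rng = random.Random(0)
n, reps = 10, 5000
mean_max = sum(
    renyi_order_statistics([rng.expovariate(1.0) for _ in range(n)])[-1]
    for _ in range(reps)
) / reps
harmonic = sum(1.0 / j for j in range(1, n + 1))
print(mean_max, harmonic)
```

Because the spacings are independent, quantities such as the variance of an order statistic reduce to sums over independent terms, which is what makes the representation convenient for elementary proofs.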
Tail index estimation, concentration and adaptivity
This paper presents an adaptive version of the Hill estimator based on
Lepski's model selection method. This simple data-driven index selection method
is shown to satisfy an oracle inequality and is checked to achieve the lower
bound recently derived by Carpentier and Kim. In order to establish the oracle
inequality, we derive non-asymptotic variance bounds and concentration
inequalities for Hill estimators. These concentration inequalities are derived
from Talagrand's concentration inequality for smooth functions of independent
exponentially distributed random variables combined with three tools of Extreme
Value Theory: the quantile transform, Karamata's representation of slowly
varying functions, and Rényi's characterisation of the order statistics of
exponential samples. The performance of this computationally and conceptually
simple method is illustrated using Monte-Carlo simulations
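For readers unfamiliar with the estimator, here is a minimal sketch of the plain (non-adaptive) Hill estimator for a fixed number k of upper order statistics. It does not implement the Lepski-type selection of k described in the abstract; the function name `hill_estimator` and the toy sample are assumptions made for illustration.

```python
import math


def hill_estimator(sample, k):
    """Plain Hill estimator of the tail index using the k largest values:
    (1/k) * sum_{i=1}^{k} log(X_(n-i+1) / X_(n-k))."""
    n = len(sample)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    ordered = sorted(sample)
    threshold = ordered[n - k - 1]  # X_(n-k), the (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("the threshold order statistic must be positive")
    log_threshold = math.log(threshold)
    return sum(math.log(x) - log_threshold for x in ordered[n - k:]) / k


# Toy sample (assumed): 1, e, e^2, e^3, using the two largest points.
toy = [math.exp(j) for j in range(4)]
print(hill_estimator(toy, 2))
```

An adaptive procedure would compute this quantity over a range of k and choose among the candidates in a data-driven way; that selection step is deliberately left out here.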
Real-time prediction of severe influenza epidemics using Extreme Value Statistics
Each year, seasonal influenza epidemics cause hundreds of thousands of deaths
worldwide and put high loads on health care systems. A main concern for
resource planning is the risk of exceptionally severe epidemics. Taking
advantage of the weekly influenza cases reporting in France, we use recent
results on multivariate GP models in Extreme Value Statistics to develop
methods for real-time prediction of the risk that an ongoing epidemic will be
exceptionally severe and for real-time detection of anomalous epidemics.
Quality of predictions is assessed on observed and simulated data
Predicting extremes: influenza epidemics in France
Influenza epidemics each year cause hundreds of thousands of deaths worldwide and put high loads on health care systems, in France and elsewhere. A main concern for resource planning in public health is the risk of an extreme and dangerous epidemic. Sizes of epidemics are measured by the number of visits to doctors caused by Influenza-Like Illness (ILI), and health care planning relies on prediction of ILI rates. We use recent results on the multivariate Generalized Pareto (GP) distributions in Extreme Value Statistics to develop methods for real-time prediction of risks of exceeding very high levels and for detection of unusual and potentially very dangerous epidemics. Based on the observation of the first two weeks of the epidemic, the GP method for real-time prediction is employed to predict ILI rates of the third week and the total size of the epidemic for extreme influenza epidemics in France. We then apply a general anomaly detection framework to the ILI rates during the first three weeks of the epidemic for early detection of unusual extreme epidemics. As an additional input to resource planning, we use standard methods from extreme value statistics to estimate the risk of exceedance of high ILI levels in future years. The new methods are expected to be broadly applicable in health care planning and in many other areas of science and technology
Dynamics and rheology of vesicles in a shear flow under gravity and microgravity
The behaviour of a vesicle suspension in a simple shear flow between plates (Couette flow) was investigated experimentally in parabolic flight and sounding rocket experiments by Digital Holographic Microscopy. The lift force which pushes deformable vesicles away from walls was quantitatively investigated and is found to be rather well described by a theoretical model by Olla [1]. At longer shearing times, vesicles reach a steady distribution about the center plane of the shear flow chamber, through a balance between the lift force and shear induced diffusion due to hydrodynamic interactions between vesicles. This steady distribution was investigated in the BIOMICS experiment in the MASER 11 sounding rocket. The results allow an estimation of self-diffusion coefficients in vesicle suspensions and reveal possible segregation phenomena in polydisperse suspensions
A mixed-integer heuristic for the structural optimization of a cruise ship
A heuristic approach is proposed to solve the structural optimization problem of a cruise ship.
The challenge of optimization is to define the scantling of the structure of a ship in order to minimize the weight or the production cost. The variables are the dimensions and positions of the constitutive elements of the structure: they are discrete by nature. The objective functions are nonlinear functions. The structure is submitted to geometric constraints and to structural constraints. The geometric constraints are linear functions and the structural constraints are implicit functions requiring a high computation cost. The problem belongs to the class of mixed-integer nonlinear problems (MINLP).
A local heuristic of the type “dive and fix” is combined with a solver based on approximation methods. The solver is used as a black-box tool to perform the structural analysis and solve the nonlinear optimization problems (NLP) defined by the heuristic. The heuristic is designed to always provide a discrete feasible solution. Experiments on a real-size structure demonstrate that the optimal value of the mixed-integer problem is of the same magnitude as the optimal value of the optimization problem for which all the variables can take continuous values
High resolution imaging of massive young stellar objects and a sample of molecular outflow sources
This thesis contains a study of millimetre wavelength observations of massive young stellar objects (MYSOs) via both interferometric and single dish observations. First, the high angular resolution observations (up to ∼0.1”) from a variety of interferometers of the MYSO S140 IRS1 are presented. This source is one of only two prototypes that have ionised equatorial emission from a radiatively driven disc wind. The observations confirm that IRS1 has a dusty disc at a position angle compatible with that of the disc wind emission, and confirm the disc wind nature for the first time.
Secondly, the observations of S140 IRS1 are modelled using a 2D axisymmetric radiative transfer code. Extensive models producing synthetic data at millimetre wavelengths were developed. These models show that on the largest scales, typically accessible with single dish observations or compact interferometric configurations, the spectral energy distribution is relatively unchanged by the addition of a compact dust disc. However, a disc is required to match the interferometric visibilities at the smaller scales. The position angle of the disc is well constrained via a newly developed 2D visibility fitting method. The models, however, are degenerate, and there is a range of realistic best-fitting discs.
The third section presents the single dish observations of the core material traced by C18O around 99 MYSOs and compact HII regions from the RMS survey. A method to calculate the core masses and velocity extent is reported. The method is accurate and robust, and can be applied to any molecular line emission. An updated distance-limited sample contains 87 sources and is complete to 10^3 L⊙. It is a representative sample of MYSOs and HII regions. All of the cores harbour at least one massive protostar.
Finally, methodologies to establish outflow parameters via 12CO (3-2) and 13CO (3-2) data are investigated. Multiple techniques are trialed for a well studied test source, IRAS 20126+4104, and a repeatable outflow analysis pathway is described. In more complex regions using the 12CO emission to identify outflows and determine the mass is more difficult and an alternative method is suggested. Moreover, the dynamical timescale of the outflows and the dynamical parameters are estimated in a spatial sense rather than using a simple average. Such analysis will aid in categorising different outflows from the full sample
Exploiting the noise: improving biomarkers with ensembles of data analysis methodologies.
Background: The advent of personalized medicine requires robust, reproducible biomarkers that indicate which treatment will maximize therapeutic benefit while minimizing side effects and costs. Numerous molecular signatures have been developed over the past decade to fill this need, but their validation and up-take into clinical settings has been poor. Here, we investigate the technical reasons underlying reported failures in biomarker validation for non-small cell lung cancer (NSCLC). Methods: We evaluated two published prognostic multi-gene biomarkers for NSCLC in an independent 442-patient dataset. We then systematically assessed how technical factors influenced validation success. Results: Both biomarkers validated successfully (biomarker #1: hazard ratio (HR) 1.63, 95% confidence interval (CI) 1.21 to 2.19, P = 0.001; biomarker #2: HR 1.42, 95% CI 1.03 to 1.96, P = 0.030). Further, despite being underpowered for stage-specific analyses, both biomarkers successfully stratified stage II patients and biomarker #1 also stratified stage IB patients. We then systematically evaluated reasons for reported validation failures and find they can be directly attributed to technical challenges in data analysis. By examining 24 separate pre-processing techniques we show that minor alterations in pre-processing can change a successful prognostic biomarker (HR 1.85, 95% CI 1.37 to 2.50, P < 0.001) into one indistinguishable from random chance (HR 1.15, 95% CI 0.86 to 1.54, P = 0.348). Finally, we develop a new method, based on ensembles of analysis methodologies, to exploit this technical variability to improve biomarker robustness and to provide an independent confidence metric. Conclusions: Biomarkers comprise a fundamental component of personalized medicine. We first validated two NSCLC prognostic biomarkers in an independent patient cohort. Power analyses demonstrate that even this large, 442-patient cohort is under-powered for stage-specific analyses. We then use these results to discover an unexpected sensitivity of validation to subtle data analysis decisions. Finally, we develop a novel algorithmic approach to exploit this sensitivity to improve biomarker robustness
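The hazard ratios, confidence intervals and P values quoted in this abstract can be cross-checked with a short calculation. The sketch below assumes the reported 95% intervals are Wald intervals on the log hazard ratio scale, which the abstract does not state, so the recovered P values are approximations only. The function name `wald_p_from_hr_ci` is hypothetical.

```python
import math


def wald_p_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided P value from a hazard ratio
    and its 95% CI, assuming a Wald interval on the log scale."""
    # Standard error of log(HR) implied by the width of the interval.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Values quoted in the abstract.
for label, hr, lo, hi in [
    ("biomarker #1", 1.63, 1.21, 2.19),
    ("biomarker #2", 1.42, 1.03, 1.96),
    ("altered pre-processing", 1.15, 0.86, 1.54),
]:
    z, p = wald_p_from_hr_ci(hr, lo, hi)
    print(f"{label}: z = {z:.2f}, approximate P = {p:.3f}")
```

Small discrepancies from the published P values are expected, since the inputs are rounded and the original analysis may have used a different test.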
Suicide assisted by right-to-die associations: a population based cohort study
Background: In Switzerland, assisted suicide is legal but there is concern that vulnerable or disadvantaged groups are more likely to die in this way than other people. We examined socio-economic factors associated with assisted suicide. Methods: We linked the suicides assisted by right-to-die associations during 2003-08 to a census-based longitudinal study of the Swiss population. We used Cox and logistic regression models to examine associations with gender, age, marital status, education, religion, type of household, urbanization, neighbourhood socio-economic position and other variables. Separate analyses were done for younger (25 to 64 years) and older (65 to 94 years) people. Results: Analyses were based on 5 004 403 Swiss residents and 1301 assisted suicides (439 in the younger and 862 in the older group). In 1093 (84.0%) assisted suicides, an underlying cause was recorded; cancer was the most common cause (508, 46.5%). In both age groups, assisted suicide was more likely in women than in men, those living alone compared with those living with others and in those with no religious affiliation compared with Protestants or Catholics. The rate was also higher in more educated people, in urban compared with rural areas and in neighbourhoods of higher socio-economic position. In older people, assisted suicide was more likely in the divorced compared with the married; in younger people, having children was associated with a lower rate. Conclusions: Assisted suicide in Switzerland was associated with female gender and situations that may indicate greater vulnerability such as living alone or being divorced, but also with higher education and higher socio-economic position