
    Semiparametric curve alignment and shift density estimation for biological data

    Assume that we observe a large number of curves, all with an identical but unknown shape, each subject to a different random shift. The objective is to estimate the individual time shifts and their distribution. This problem arises in several biological applications, such as neuroscience and ECG signal processing, where one wishes to estimate the distribution of the elapsed time between repetitive pulses, possibly at a low signal-to-noise ratio and without knowledge of the pulse shape. We suggest an M-estimator leading to a three-stage algorithm: we split the data set into blocks, estimate the shifts on each block by minimizing a cost criterion based on a functional of the periodogram, and then plug the estimated shifts into a standard density estimator. We show that under mild regularity assumptions the density estimate converges weakly to the true shift distribution. The theory is applied both to simulations and to the alignment of real ECG signals. The estimator of the shift distribution performs well, even at low signal-to-noise ratio, and is shown to outperform standard methods for curve alignment. (30 pages; v5: minor changes and a correction in the proof of Proposition 3.)
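    As a rough illustration of the shift-estimation step, the following Python sketch aligns each curve to a reference template by maximizing a circular cross-correlation computed with the FFT. This is a simplified stand-in for the periodogram-based cost criterion described above, not the paper's M-estimator; the function name estimate_shifts_fft and the use of the block mean as an initial template are illustrative assumptions.

```python
import numpy as np

def estimate_shifts_fft(curves, ref=None):
    """Estimate the integer time shift of each curve relative to a reference
    template by maximizing the circular cross-correlation, computed via the FFT.
    Cross-correlation is a simple stand-in for the periodogram-based cost."""
    curves = np.asarray(curves, dtype=float)   # shape (n_curves, n_samples)
    n = curves.shape[1]
    if ref is None:
        ref = curves.mean(axis=0)              # crude initial template
    F_ref = np.fft.rfft(ref)
    shifts = np.empty(len(curves), dtype=int)
    for j, y in enumerate(curves):
        xcorr = np.fft.irfft(F_ref.conj() * np.fft.rfft(y), n=n)
        lag = int(np.argmax(xcorr))            # delay of curve j w.r.t. the template
        shifts[j] = lag - n if lag > n // 2 else lag   # map to signed shifts
    return shifts
```

    The estimated shifts can then be passed to a standard density estimator, as in the third stage of the algorithm.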

    A Robbins-Monro procedure for estimation in semiparametric regression models

    This paper is devoted to the parametric estimation of a shift together with the nonparametric estimation of a regression function in a semiparametric regression model. We implement a very efficient and easy-to-handle Robbins-Monro procedure. On the one hand, we propose a stochastic algorithm similar to that of Robbins-Monro in order to estimate the shift parameter; a preliminary evaluation of the regression function is not necessary for this step. On the other hand, we make use of a recursive Nadaraya-Watson estimator for the estimation of the regression function; this kernel estimator takes into account the previous estimation of the shift parameter. We establish the almost sure convergence of both the Robbins-Monro and the Nadaraya-Watson estimators, and also provide the asymptotic normality of our estimates. Finally, we illustrate our semiparametric estimation procedure on simulated and real data. (Published at http://dx.doi.org/10.1214/12-AOS969 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org).)
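    The two generic building blocks mentioned above, a Robbins-Monro recursion and a recursive Nadaraya-Watson estimator, can be sketched as follows in Python. This is a minimal illustration under simplifying assumptions: the user-supplied noisy_score function, the step-size constant c, the bandwidth rule h_n = h0 * n^(-1/5), and the Gaussian kernel are placeholders, not the paper's specific estimating function or tuning.

```python
import numpy as np

def robbins_monro(noisy_score, theta0, n_iter=5000, c=1.0):
    """Generic Robbins-Monro recursion theta_{n+1} = theta_n + gamma_n * Z_n,
    where Z_n = noisy_score(theta_n) is a noisy evaluation of a score function
    whose root is the target parameter and gamma_n = c / n is the step size."""
    theta = float(theta0)
    for n in range(1, n_iter + 1):
        theta += (c / n) * noisy_score(theta)
    return theta

class RecursiveNadarayaWatson:
    """Recursive Nadaraya-Watson regression estimate on a fixed grid, updated
    one observation (x, y) at a time with a Gaussian kernel and bandwidth
    h_n = h0 * n**(-1/5)."""
    def __init__(self, grid, h0=0.5):
        self.grid = np.asarray(grid, dtype=float)
        self.h0, self.n = h0, 0
        self.num = np.zeros_like(self.grid)    # running kernel-weighted sum of y
        self.den = np.zeros_like(self.grid)    # running sum of kernel weights

    def update(self, x, y):
        self.n += 1
        h = self.h0 * self.n ** (-0.2)
        w = np.exp(-0.5 * ((self.grid - x) / h) ** 2) / (h * np.sqrt(2 * np.pi))
        self.num += w * y
        self.den += w

    def estimate(self):
        return self.num / np.maximum(self.den, 1e-12)
```

    In the semiparametric setting, the kernel update would be applied to covariates re-centred by the current shift estimate, so that the two recursions run jointly.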

    Semiparametric Curve Alignment and Shift Density Estimation: ECG Data Processing Revisited

    In this contribution we address a problem stemming from functional data analysis. Assuming that we have at our disposal a large number of recorded curves with identical shape but different shifts, the objective is to estimate the time shifts as well as their distribution. Such an objective appears in several biological applications, for example ECG signal processing, where we are interested in estimating the distribution of elapsed durations between repetitive pulses, possibly at a low signal-to-noise ratio and without any knowledge of the pulse shape. The problem is solved within a semiparametric framework, that is, without any assumption on the shape. We suggest an M-estimator leading to two different algorithms whose main steps are as follows: we split the dataset into blocks, estimate the shifts on each block by minimizing a cost criterion based on a functional of the periodogram, and then plug the estimated shifts into a standard density estimator. Some theoretical insights are presented, showing that under mild assumptions the alignment can be done efficiently. Results are presented on simulations as well as on real ECG data, and the algorithms are compared to the methods used by practitioners in this setting. The results show that the presented method outperforms the standard ones, leading to a more accurate estimation of the average heart pulse and of the distribution of elapsed times between heart pulses, even at a low signal-to-noise ratio (SNR).
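    The final plug-in step, re-aligning the curves with the estimated shifts, averaging to recover the common pulse shape, and smoothing the shifts with a kernel density estimator, might look as follows in Python. This is a hedged sketch assuming integer shifts and SciPy's gaussian_kde; it is not the exact estimator analysed in the paper.

```python
import numpy as np
from scipy.stats import gaussian_kde

def align_and_summarize(curves, shifts, grid):
    """Given curves of shape (n_curves, n_samples) and their estimated integer
    shifts, re-align each curve by rolling it back, average the aligned curves
    to obtain a mean pulse, and estimate the shift density with a Gaussian KDE
    evaluated on `grid` (the 'plug-in' step)."""
    curves = np.asarray(curves, dtype=float)
    aligned = np.stack([np.roll(y, -s) for y, s in zip(curves, shifts)])
    mean_pulse = aligned.mean(axis=0)                       # common shape estimate
    shift_density = gaussian_kde(np.asarray(shifts, float))(grid)
    return mean_pulse, shift_density
```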

    Fréchet means of curves for signal averaging and application to ECG data analysis

    Signal averaging is the process of computing a mean shape from a set of noisy signals. In the presence of geometric variability in time in the data, the usual Euclidean mean of the raw data yields a mean pattern that does not reflect the typical shape of the observed signals. In this setting, it is necessary to use alignment techniques for a precise synchronization of the signals, and then to average the aligned data to obtain a consistent mean shape. In this paper, we study the numerical performance of Fréchet means of curves, which extend the usual Euclidean mean to spaces endowed with non-Euclidean metrics. This yields a new algorithm for signal averaging and for estimating the time variability of a set of signals. We apply this approach to the analysis of heartbeats from ECG records.
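    A simple instance of this idea for time-shifted curves is an iterative template estimate that alternates between aligning each curve to the current template and re-averaging, sketched below in Python. The shift-only alignment and the fixed number of iterations are simplifying assumptions; the paper considers more general non-Euclidean metrics.

```python
import numpy as np

def frechet_mean_shift(curves, n_iter=10):
    """Iterative template estimate under a shift-invariant distance:
    alternate between (i) aligning each curve to the current template by
    circular cross-correlation and (ii) averaging the aligned curves.
    This is one simple instance of a Frechet mean of curves."""
    curves = np.asarray(curves, dtype=float)   # shape (n_curves, n_samples)
    n = curves.shape[1]
    template = curves.mean(axis=0)             # Euclidean mean as starting point
    for _ in range(n_iter):
        F_t = np.fft.rfft(template)
        aligned = []
        for y in curves:
            xcorr = np.fft.irfft(F_t.conj() * np.fft.rfft(y), n=n)
            lag = int(np.argmax(xcorr))        # delay of y relative to template
            aligned.append(np.roll(y, -lag))   # undo the delay
        template = np.mean(aligned, axis=0)    # re-average the synchronized curves
    return template
```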

    Wavelet methods for locally stationary time series

    Time series data can often possess complex and dynamic characteristics. Two key statistical properties of time series, the mean (first-order) and autocovariance (second-order), commonly change over time. Modelling this evolution of so-called nonstationary time series is crucial to making informed inference on the data. This thesis focuses on wavelet-based methodology for the simultaneous modelling of first- and second-order nonstationary time series, to which we provide three main contributions. First, we propose a method using differencing to jointly estimate the time-varying trend and second-order structure of a time series within the locally stationary wavelet (LSW) processes framework. We describe a wavelet-based estimator of the second-order structure of the original time series obtained by differencing, and show how this can be incorporated into the estimation of the trend of the time series. Second, we propose a framework for modelling series with simultaneously time-varying first- and second-order structure by removing the restrictive zero-mean assumption of LSW processes and extending the model to include a trend component. We develop the associated estimation theory for both first- and second-order quantities and show that, under appropriate assumptions on the trend, our estimators achieve good properties independently of each other. Finally, we consider simultaneous modelling of first- and second-order structure in the scenario where the mean function is piecewise constant. We propose a likelihood-based method using wavelets to detect changes in mean in time series that exhibit time-varying autocovariance; this allows for a more flexible changepoint model, since existing methods commonly assume the second-order structure to be independent and identically distributed. The performance of the method is investigated via simulation, and the method is shown to perform well in a variety of time series scenarios.
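    As a rough illustration of the differencing idea in the first contribution, the sketch below forms a raw wavelet periodogram of the first-differenced series using the stationary wavelet transform from the PyWavelets package (pywt). It is only a crude proxy for the thesis's LSW estimators: the bias correction and smoothing steps are omitted, and the choice of Haar wavelet and four levels is arbitrary.

```python
import numpy as np
import pywt

def differenced_wavelet_periodogram(x, wavelet="haar", level=4):
    """Rough look at time-varying second-order structure: first-difference the
    series to suppress a smooth trend, apply a stationary (non-decimated)
    wavelet transform, and square the detail coefficients to form a raw
    wavelet periodogram (one row per scale). pywt.swt requires the series
    length to be a multiple of 2**level, so the series is truncated."""
    d = np.diff(np.asarray(x, dtype=float))
    d = d[: (len(d) // 2**level) * 2**level]        # truncate to a valid length
    coeffs = pywt.swt(d, wavelet, level=level)      # list of (approx, detail) per level
    periodogram = np.vstack([cD**2 for _, cD in coeffs])
    return periodogram
```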

    Abstracts of Papers, 79th Annual Meeting of the Virginia Academy of Science, May 22-25, 2001, James Madison University, Harrisonburg, Virginia

    Abstracts of papers that were presented at the 79th Annual Meeting of the Virginia Academy of Science, May 22-25, 2001, James Madison University, Harrisonburg, Virginia

    Functional Analysis of Genomic Variation and Impact on Molecular and Higher Order Phenotypes

    Reverse genetics methods, particularly the production of gene knockouts and knockins, have revolutionized the understanding of gene function. High-throughput sequencing now makes it practical to exploit reverse genetics to simultaneously study the functions of thousands of normal sequence variants and spontaneous mutations that segregate in intercross and backcross progeny generated by mating completely sequenced parental lines. To evaluate this new reverse genetic method we resequenced the genome of one of the oldest inbred strains of mice, DBA/2J, the father of the large family of BXD recombinant inbred strains. We analyzed ~100X whole-genome sequence data for the DBA/2J strain relative to C57BL/6J, the reference strain for all mouse genomics and the mother of the BXD family. We generated the most detailed picture of molecular variation between the two mouse strains to date and identified 5.4 million sequence polymorphisms, including 4.46 million single nucleotide polymorphisms (SNPs), 0.94 million insertions/deletions (indels), and 20,000 structural variants. We systematically scanned massive databases of molecular phenotypes and ~4,000 classical phenotypes to detect linked functional consequences of sequence variants. In the majority of cases we successfully recovered known genotype-to-phenotype associations, and in several cases we linked sequence variants to novel phenotypes (Ahr, Fh1, Entpd2, and Col6a5). However, our most striking and consistent finding is that apparently deleterious homozygous SNPs, indels, and structural variants have undetectable or very modest additive effects on phenotypes.

    Event impact analysis for time series

    Time series arise in a variety of application domains, wherever data points are recorded over time and stored for subsequent analysis. A critical question is whether the occurrence of events such as natural disasters, technical faults, or political interventions leads to changes in a time series, for example temporary deviations from its typical behavior. The vast majority of existing research on this topic focuses on the specific impact of a single event on a time series, while methods to generically capture the impact of a recurring event are scarce. In this thesis, we fill this gap by introducing a novel framework for event impact analysis in the case of randomly recurring events. We develop a statistical perspective on the problem and provide a generic notion of event impacts based on a statistical independence relation. The main problem we address is that of establishing the presence of event impacts in stationary time series using statistical independence tests. Tests for event impacts should be generic, powerful, and computationally efficient. We develop two algorithmic test strategies for event impacts that satisfy these properties: the first is based on coincidences between events and peaks in the time series, while the second is based on multiple marginal associations. We also discuss a selection of follow-up questions, including ways to measure, model, and visualize event impacts, and the relationship between event impact analysis and anomaly detection in time series. Finally, we provide a first method to study event impacts in nonstationary time series. We evaluate our methodological contributions on several real-world datasets and study their performance within large-scale simulation studies.
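    A minimal version of the coincidence-based strategy can be sketched as a Monte Carlo test in Python: count how many events are followed by a peak of the series within a short window and compare this count with what random event times would produce. The peak detector, window length, and uniform resampling of event times are simplifying assumptions and do not reproduce the thesis's exact test.

```python
import numpy as np
from scipy.signal import find_peaks

def coincidence_test(x, event_times, window=5, n_perm=999, rng=None):
    """Monte Carlo test for event impacts based on event-peak coincidences:
    count how many events are followed by a peak of the series within
    `window` samples, then compare this count against the distribution
    obtained by redrawing the event times uniformly at random."""
    rng = np.random.default_rng(rng)
    peaks, _ = find_peaks(np.asarray(x, dtype=float))
    def count(events):
        return sum(np.any((peaks >= e) & (peaks <= e + window)) for e in events)
    observed = count(event_times)
    null = [count(rng.integers(0, len(x), size=len(event_times)))
            for _ in range(n_perm)]
    p_value = (1 + sum(c >= observed for c in null)) / (n_perm + 1)
    return observed, p_value
```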

    An assessment of the welfare of non-human primates used in neuroscience research

    The Animals (Scientific Procedures) Act 1986 governs the use of animals in scientific research in the UK. Embedded within this is a requirement to implement the 3Rs: replacement, reduction, and refinement, a key mechanism for minimising the pain, suffering, distress, and lasting harm of research models. To adhere fully to these principles, it is imperative to assess animal welfare. Non-human primates' (NHPs') similarities to humans make them both an essential biomedical research model and a species particularly vulnerable to welfare challenges. This thesis investigates non-invasive, objective methods of welfare assessment and applies them alongside neuroscience research to monitor the welfare of rhesus macaques (Macaca mulatta). Accelerometers monitored changes in activity levels following general anaesthesia (GA) and revealed a decline in activity for 5.7 days following GA alone; 2.6 days of activity decline were observed after surgery under GA, followed by an activity increase, suggesting post-operative behavioural change. Accelerometers were also used to create a rule-based model for automated behavioural assessment, which groups macaque activity into 5 species-typical behavioural categories with an overall accuracy of 69%. Physiological welfare parameters were assessed by detecting cortisol in faeces and hair. Faecal cortisol levels were significantly elevated for several days following GA, with a longer and more profound increase after surgery. Hair samples are a valuable measure of chronic stress and cumulative experience, facilitating longitudinal cortisol assessment without influence from transient stressors; hair cortisol levels were significantly elevated post-surgery and, in some cases, following social disruption. A customised, neck-based ECG was designed to monitor heart rate variability. The ECG was trialled on anaesthetised and restrained NHPs, dogs, and sheep, and proof of concept was achieved, with R-waves detectable in all species. Overall, this thesis demonstrates how welfare and neuroscience research can be conducted in parallel for a better understanding of the lifetime experience of animal models. This is an essential step towards implementing refinements and improving welfare, which may ultimately improve public perception and help to form the evidence base required to inform and drive policy change.
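    For the ECG component, a minimal Python sketch of extracting R-waves and two standard heart-rate-variability summaries from a sampled signal is given below. The amplitude threshold, minimum RR interval, and the choice of SDNN and RMSSD are illustrative assumptions and not the processing pipeline used with the customised neck-based device.

```python
import numpy as np
from scipy.signal import find_peaks

def rr_intervals_and_hrv(ecg, fs, min_rr_s=0.3):
    """Detect R-waves as prominent peaks at least `min_rr_s` seconds apart,
    convert peak positions into RR intervals (seconds), and compute two
    common heart-rate-variability summaries: SDNN and RMSSD."""
    ecg = np.asarray(ecg, dtype=float)
    height = ecg.mean() + 2 * ecg.std()            # crude amplitude threshold
    peaks, _ = find_peaks(ecg, height=height, distance=int(min_rr_s * fs))
    rr = np.diff(peaks) / fs                       # RR intervals in seconds
    sdnn = rr.std(ddof=1)                          # overall variability
    rmssd = np.sqrt(np.mean(np.diff(rr) ** 2))     # short-term variability
    return rr, sdnn, rmssd
```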