103 research outputs found
Survival analysis with delayed entry in selected families with application to human longevity
In the field of aging research, family-based sampling study designs are commonly used to study the lifespans of long-lived family members. However, the specific sampling procedure should be carefully taken into account in order to avoid biases. This work is motivated by the Leiden Longevity Study, a family-based cohort of long-lived siblings. Families were invited to participate in the study if at least two siblings were ‘long-lived’, where ‘long-lived’ meant being older than 89 years for men or older than 91 years for women. As a result, more than 400 families were included in the study and followed for around 10 years. For estimation of marker-specific survival probabilities and correlations among lifetimes of family members, delayed entry due to outcome-dependent sampling mechanisms has to be taken into account. We consider shared frailty models to model left-truncated correlated survival data. The treatment of left truncation in shared frailty models is still an open issue and the literature on this topic is scarce. We show that the current approaches provide, in general, biased estimates and we propose a new method to tackle this selection problem by applying a correction on the likelihood estimation by means of inverse probability weighting at the family level.
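As a minimal sketch of why delayed entry matters, consider the much simpler case of independent exponential lifetimes with left truncation. This is not the shared frailty or inverse probability weighting method described in the abstract. Each subject contributes f(t)/S(entry) to the likelihood, which for a constant hazard gives the closed-form estimate events / sum(exit - entry). The function names and the toy data are hypothetical.

```python
def exponential_rate_naive(exit_ages, events):
    """Rate MLE that ignores delayed entry: events / total age at exit."""
    return sum(events) / sum(exit_ages)


def exponential_rate_left_truncated(entry_ages, exit_ages, events):
    """Rate MLE conditioning on survival to entry: events / time observed at risk."""
    time_at_risk = sum(x - e for e, x in zip(entry_ages, exit_ages))
    return sum(events) / time_at_risk


# Hypothetical toy data: ages at study entry, ages at death or censoring,
# and event indicators (1 = death observed, 0 = censored).
entry = [90.0, 92.0, 91.0, 95.0]
exit_ = [94.0, 97.0, 99.0, 96.0]
died = [1, 1, 0, 1]

naive = exponential_rate_naive(exit_, died)
adjusted = exponential_rate_left_truncated(entry, exit_, died)
print(f"naive hazard: {naive:.4f}, left-truncation adjusted: {adjusted:.4f}")
```

The naive estimate counts the years before study entry as time at risk, although nobody in the sample could have died during those years, so it understates the hazard.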
Sequential double cross-validation for assessment of added predictive ability in high-dimensional omic applications
Enriching existing predictive models with new biomolecular markers is an important task in the new multi-omic era. Clinical studies increasingly include new sets of omic measurements which may prove their added value in terms of predictive performance. We introduce a two-step approach for the assessment of the added predictive ability of omic predictors, based on sequential double cross-validation and regularized regression models. We propose several performance indices to summarize the two-stage prediction procedure and a permutation test to formally assess the added predictive value of a second omic set of predictors over a primary omic source. The performance of the test is investigated through simulations. We illustrate the new method through the systematic assessment and comparison of the performance of transcriptomics and metabolomics sources in the prediction of body mass index (BMI) using longitudinal data from the Dietary, Lifestyle, and Genetic determinants of Obesity and Metabolic syndrome (DILGOM) study, a population-based cohort from Finland
Genetic, household and spatial clustering of leprosy on an island in Indonesia: a population-based study
Background: It is generally accepted that genetic factors play a role in susceptibility to both leprosy per se and leprosy type, but only a few studies have attempted to quantify this. Estimating the contribution of genetic factors to clustering of leprosy within families is difficult since these persons often share the same environment. The first aim of this study was to test which correlation structure (genetic, household or spatial) gives the best explanation for the distribution of leprosy patients and seropositive persons and second to quantify the role of genetic factors in the occurrence of leprosy and seropositivity. Methods: The three correlation structures were proposed for population data (n = 560), collected on a geographically isolated island highly endemic for leprosy, to explain the distribution of leprosy per se, leprosy type and persons harbouring Mycobacterium leprae-specific antibodies. Heritability estimates and risk ratios for siblings were calculated to quantify the genetic effect. Leprosy was clinically diagnosed and specific anti-M. leprae antibodies were measured using ELISA. Results: For leprosy per se in the total population the genetic correlation structure fitted best. In the population with relatively stable household status (persons under 21 years and above 39 years) all structures were significant. For multibacillary leprosy (MB) genetic factors seemed more important than for paucibacillary leprosy. Seropositivity could be explained best by the spatial model, but the genetic model was also significant. Heritability was 57% for leprosy per se and 31% for seropositivity. Conclusion: Genetic factors seem to play an important role in the clustering of patients with a more advanced form of leprosy, and they could explain more than half of the total phenotypic variance.
The TRAF1/C5 region is a risk factor for polyarthritis in juvenile idiopathic arthritis
Juvenile idiopathic arthritis (JIA) is a chronic disorder in which both genetic and environmental factors are involved. Recently, we identified the TRAF1/C5 region (located on chromosome 9q33-34) as a risk factor for rheumatoid arthritis (RA) (combined p = 1.4 × 10^-8). In the present study the association of the TRAF1/C5 region with the susceptibility to JIA was investigated. A case-control association study was performed in 338 Caucasian patients with JIA and 511 healthy individuals. We genotyped the single nucleotide polymorphism rs10818488 as a marker for the TRAF1/C5 region. The A allele was associated with the susceptibility to rheumatoid factor-negative polyarthritis with an 11% increase in allele frequency (OR 1.54, 95% CI 1.09 to 2.18; p = 0.012). This association was stronger when combining subtypes with a polyarticular phenotype (OR 1.46, 95% CI 1.12 to 1.90; p = 0.004). In addition, we observed a trend towards an increase in A allele frequency in patients with extended oligoarthritis versus persistent oligoarthritis (49% and 38%, respectively; p = 0.055). Apart from being a well replicated risk factor for RA, TRAF1/C5 also appears to be a risk factor for the rheumatoid factor-negative polyarthritis subtype of JIA and, more generally, seems to be associated with subtypes of JIA characterised by a polyarticular course.
Estimating Constraints for Protection Factors from HDX-MS Data
Hydrogen/deuterium exchange monitored by mass spectrometry is a promising technique for rapidly fingerprinting structural and dynamical properties of proteins. The time-dependent change in the mass of any fragment of the polypeptide chain depends uniquely on the rate of exchange of its amide hydrogens, but determining the latter from the former is generally not possible. Here, we show that, if time-resolved measurements are available for a number of overlapping peptides that cover the whole sequence, rate constants for each amide hydrogen exchange (or equivalently, their protection factors) may be extracted, with the uniqueness of the solutions obtained depending on the degree of peptide overlap. However, in most cases, the solution is not unique, and multiple alternatives must be considered. We provide a statistical method that clusters the solutions to further reduce their number. Such analysis always provides meaningful constraints on protection factors and can be used in situations in which obtaining more refined experimental data is impractical. It also provides a systematic way to improve data collection strategies to obtain unambiguous information at single-residue level (e.g., for assessing protein structure predictions at atomistic level).
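The non-uniqueness mentioned above can be illustrated with a short sketch. It assumes the simplest forward model, in which each amide exchanges independently with first-order kinetics, full deuteration is reached at long times, and back exchange is ignored. The rates, function names and peptide are hypothetical and are not taken from the paper.

```python
import math


def peptide_uptake(rates, t):
    """Deuterium uptake of one peptide at time t: sum of 1 - exp(-k_i * t)."""
    return sum(1.0 - math.exp(-k * t) for k in rates)


def protection_factor(k_intrinsic, k_observed):
    """Protection factor as the ratio of intrinsic to observed exchange rate."""
    return k_intrinsic / k_observed


# Hypothetical exchange rates (per second) for three amides in one peptide.
assignment_a = [10.0, 0.1, 0.001]
assignment_b = [0.001, 10.0, 0.1]  # same rates, assigned to different residues

times = [1.0, 10.0, 100.0, 1000.0]
curve_a = [peptide_uptake(assignment_a, t) for t in times]
curve_b = [peptide_uptake(assignment_b, t) for t in times]

# Both assignments give the same uptake curve for this single peptide,
# so the per-residue rates cannot be recovered from this peptide alone.
same_curve = all(math.isclose(a, b) for a, b in zip(curve_a, curve_b))
print(same_curve)
```

Under these assumptions a single peptide only constrains the set of rates, not which residue carries which rate. Overlapping peptides that differ by a few residues are what narrow down the assignment.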
The mixed model for the analysis of a repeated‐measurement multivariate count data
Clustered overdispersed multivariate count data are challenging to model due to the presence of correlation within and between samples. Typically, the first source of correlation needs to be addressed but its quantification is of less interest. Here, we focus on the correlation between time points. In addition, the effects of covariates on the multivariate counts distribution need to be assessed. To fulfill these requirements, a regression model based on the Dirichlet‐multinomial distribution for association between covariates and the categorical counts is extended by using random effects to deal with the additional clustering. This model is the Dirichlet‐multinomial mixed regression model. Alternatively, a negative binomial regression mixed model can be deployed where the corresponding likelihood is conditioned on the total count. It appears that these two approaches are equivalent when the total count is fixed and independent of the random effects. We consider both subject‐specific and categorical‐specific random effects. However, the latter has a larger computational burden when the number of categories increases. Our work is motivated by microbiome data sets obtained by sequencing of the amplicon of the bacterial 16S rRNA gene. These data have a compositional structure and are typically overdispersed. The microbiome data set is from an epidemiological study carried out in a helminth‐endemic area in Indonesia. The conclusions are as follows: time has no statistically significant effect on microbiome composition, the correlation between subjects is statistically significant, and treatment has a significant effect on the microbiome composition only in infected subjects who remained infected
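For orientation, the following sketch evaluates the log-probability of a count vector under a plain Dirichlet-multinomial distribution with fixed concentration parameters. It covers only the distribution, not the mixed regression model of the abstract, where the parameters would additionally depend on covariates and random effects. The counts and parameter values are hypothetical.

```python
import math


def dirichlet_multinomial_logpmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial."""
    n = sum(counts)
    a = sum(alpha)
    # Normalising terms that depend only on the totals.
    log_p = math.lgamma(a) + math.lgamma(n + 1) - math.lgamma(n + a)
    # Per-category terms.
    for x, al in zip(counts, alpha):
        log_p += math.lgamma(x + al) - math.lgamma(al) - math.lgamma(x + 1)
    return log_p


# Hypothetical taxon counts for one sample and concentration parameters.
counts = [12, 3, 0, 5]
alpha = [2.0, 1.0, 0.5, 1.5]
print(dirichlet_multinomial_logpmf(counts, alpha))
```

Smaller values of the summed concentration parameters correspond to stronger overdispersion relative to a multinomial with the same expected proportions.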
The role of hemochromatosis C282Y and H63D gene mutations in type 2 diabetes: findings from the Rotterdam Study and meta-analysis
A genome-wide search for linkage-disequilibrium with type I diabetes in a recent genetically isolated population from the Netherlands
