204 research outputs found

    On Comparing the Clustering of Regression Models Method with K-means Clustering

    Gene clustering is a common question addressed with microarray data. Previous methods, such as K-means clustering and hierarchical clustering, base gene clustering directly on the observed measurements. A new model-based clustering method, the clustering of regression models (CORM) method, bases the clustering of genes on their relationship to covariates. It explicitly models different sources of variation and bases gene clustering solely on the systematic variation. Because both are partitional clustering methods, CORM is closely related to K-means clustering. In this paper, we discuss the relationship between the two clustering methods in terms of both model formulation and implications for other important aspects of cluster analysis. We show that the two methods can both be considered as solutions to a least squares problem with missing data, but they each concern a different type of least squares. We also show that CORM tends to provide stable clusters across samples and is particularly useful if the cluster averages are used as predictors for sample classification. Finally, we illustrate the application of CORM to a set of time course data measured on four yeast samples, which has a complicated experimental design and is difficult for K-means to handle.
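
    As a rough illustration of the least squares view of K-means mentioned in the abstract above, the following minimal sketch runs Lloyd's algorithm on a few made-up expression profiles and reports the within-cluster sum of squares it minimizes. It covers ordinary K-means only and is not an implementation of CORM; the function names, the naive initialization, and the toy data are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids (assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each profile joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def wcss(points, labels, centroids):
    """Within-cluster sum of squares: the least squares criterion of K-means."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


# Hypothetical two-sample expression profiles for four genes.
profiles = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.1, 4.9)]
labels, centroids = kmeans(profiles, k=2)
print(labels, wcss(profiles, labels, centroids))
```

    K-means here clusters directly on the observed measurements. According to the abstract, CORM instead clusters genes on their relationship to covariates, which this sketch does not attempt to reproduce.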

    Finding gene clusters for a replicated time course study

    BACKGROUND: Finding genes that share similar expression patterns across samples is an important question that is frequently asked in high-throughput microarray studies. Traditional clustering algorithms such as K-means clustering and hierarchical clustering base gene clustering directly on the observed measurements and do not take into account the specific experimental design under which the microarray data were collected. A new model-based clustering method, the clustering of regression models method, takes into account the specific design of the microarray study and bases the clustering on how genes are related to sample covariates. It can find useful gene clusters for studies from complicated study designs such as replicated time course studies. FINDINGS: In this paper, we applied the clustering of regression models method to data from a time course study of yeast on two genotypes, wild type and YOX1 mutant, each with two technical replicates, and compared the clustering results with K-means clustering. We identified gene clusters that have similar expression patterns in wild type yeast, two of which were missed by K-means clustering. We further identified gene clusters whose expression patterns were changed in YOX1 mutant yeast compared to wild type yeast. CONCLUSIONS: The clustering of regression models method can be a valuable tool for identifying genes that are coordinately transcribed by a common mechanism

    Direct Inference of SNP Heterozygosity Rates and Resolution of LOH Detection

    Single nucleotide polymorphisms (SNPs) have been increasingly utilized to investigate somatic genetic abnormalities in premalignancy and cancer. LOH is a common alteration observed during cancer development, and SNP assays have been used to identify LOH at specific chromosomal regions. The design of such studies requires consideration of the resolution for detecting LOH throughout the genome and identification of the number and location of SNPs required to detect genetic alterations in specific genomic regions. Our study evaluated SNP distribution patterns and used probability models, Monte Carlo simulation, and real human subject genotype data to investigate the relationships between the number of SNPs, SNP HET rates, and the sensitivity (resolution) for detecting LOH. We report that variances of SNP heterozygosity rate in dbSNP are high for a large proportion of SNPs. Two statistical methods proposed for directly inferring SNP heterozygosity rates require much smaller sample sizes (intermediate sizes) and are feasible for practical use in SNP selection or verification. Using HapMap data, we showed that a region of LOH greater than 200 kb can be reliably detected, with losses smaller than 50 kb having a substantially lower detection probability when using all SNPs currently in the HapMap database. Higher densities of SNPs may exist in certain local chromosomal regions that provide some opportunities for reliably detecting LOH of segment sizes smaller than 50 kb. These results suggest that the interpretation of the results from genome-wide scans for LOH using commercial arrays needs to consider the relationships among inter-SNP distance, detection probability, and sample size for a specific study. New experimental designs for LOH studies would also benefit from considering the power of detection and sample sizes required to accomplish the proposed aims.

    The Clustering of Regression Models Method with Applications in Gene Expression Data

    Identification of differentially expressed genes and clustering of genes are two important and complementary objectives addressed with gene expression data. For the differential expression question, many per-gene analytic methods have been proposed. These methods can generally be characterized as using a regression function to independently model the observations for each gene; various adjustments for multiplicity are then used to interpret the statistical significance of these per-gene regression models over the collection of genes analyzed. Motivated by this common structure of per-gene models, we propose a new model-based clustering method -- the clustering of regression models method, which groups genes that share a similar relationship to the covariate(s). This method provides a unified approach for a family of clustering procedures and can be applied for data collected with various experimental designs. In addition, when combined with per-gene methods for assessing differential expression that employ the same regression modeling structure, an integrated framework for the analysis of microarray data is obtained. The proposed methodology was applied to two real microarray datasets, one from a breast cancer study and the other from a yeast cell cycle study

    Power to Detect the Effects of HIV Vaccination in Repeated Low‐Dose Challenge Experiments

    Simulation studies were conducted to estimate the statistical power of repeated low-dose challenge experiments in non-human primates to detect a candidate HIV vaccine’s effect. The effect of various design parameters on power was explored. Simulation results indicate repeated low-dose challenge studies with total sample size 50 (25 per arm) typically provide adequate power to detect a 50% reduction in the per-exposure probability of infection due to vaccination. Power generally increases with the maximum number of allowable challenges per animal, the per-exposure risk of infection in controls, and the proportion susceptible to infection
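
    The kind of simulation described in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: every animal is treated as susceptible, the per-exposure infection probability is constant, and the arms are compared with a one-sided Fisher exact test on the number infected by the end of the challenge series. The function names and default parameter values (for example a per-exposure risk of 0.2 in controls and 10 challenges) are hypothetical.

```python
import random
from math import comb


def simulate_arm(n, p_infect, max_challenges, rng):
    """Count animals infected within max_challenges independent exposures."""
    infected = 0
    for _ in range(n):
        for _ in range(max_challenges):
            if rng.random() < p_infect:
                infected += 1
                break
    return infected


def fisher_one_sided(x_ctrl, x_vacc, n):
    """P(vaccine-arm infections <= observed) under the hypergeometric null,
    for two arms of equal size n."""
    total = x_ctrl + x_vacc
    lowest = max(0, total - n)
    tail = sum(comb(n, k) * comb(n, total - k) for k in range(lowest, x_vacc + 1))
    return tail / comb(2 * n, total)


def estimate_power(n_per_arm=25, p_ctrl=0.2, efficacy=0.5,
                   max_challenges=10, alpha=0.05, n_sims=2000, seed=0):
    """Fraction of simulated experiments in which the null is rejected."""
    rng = random.Random(seed)
    p_vacc = p_ctrl * (1 - efficacy)  # reduced per-exposure risk (assumption)
    rejections = 0
    for _ in range(n_sims):
        x_c = simulate_arm(n_per_arm, p_ctrl, max_challenges, rng)
        x_v = simulate_arm(n_per_arm, p_vacc, max_challenges, rng)
        if fisher_one_sided(x_c, x_v, n_per_arm) < alpha:
            rejections += 1
    return rejections / n_sims


print(estimate_power(efficacy=0.5), estimate_power(efficacy=0.0))
```

    Setting efficacy to 0 gives an estimate of the type I error rate, which offers a quick sanity check on the test. An analysis that uses the number of challenges until infection, or that allows a non-susceptible fraction, would need a different model and test.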

    Equivalence of ELISpot Assays Demonstrated between Major HIV Network Laboratories

    The Comprehensive T Cell Vaccine Immune Monitoring Consortium (CTC-VIMC) was created to provide standardized immunogenicity monitoring services for HIV vaccine trials. The ex vivo interferon-gamma (IFN-γ) ELISpot is used extensively as a primary immunogenicity assay to assess T cell-based vaccine candidates in trials for infectious diseases and cancer. Two independent, GCLP-accredited central laboratories of CTC-VIMC routinely use their own standard operating procedures (SOPs) for ELISpot within two major networks of HIV vaccine trials. Studies are urgently needed to assess the comparability of ELISpot measurements across laboratories to support the optimal advancement of vaccine candidates. We describe an equivalence study of the two independently qualified IFN-γ ELISpot SOPs. The study design, data collection and subsequent analysis were managed by independent statisticians to avoid subjectivity. The equivalence of both response rates and positivity calls to a given stimulus was assessed based on pre-specified acceptance criteria derived from a separate pilot study. Detection of positive responses was found to be equivalent between both laboratories. The 95% C.I. on the difference in response rates, for CMV (-1.5%, 1.5%) and CEF (-0.4%, 7.8%) responses, were both contained in the pre-specified equivalence margin of interval [-15%, 15%]. The lower bound of the 95% C.I. on the proportion of concordant positivity calls for CMV (97.2%) and CEF (89.5%) were both greater than the pre-specified margin of 70%. A third CTC-VIMC central laboratory already using one of the two SOPs also showed comparability when tested in a smaller sub-study. The described study procedure provides a prototypical example for the comparison of bioanalytical methods in HIV vaccine and other disease fields. This study also provides valuable and unprecedented information for future vaccine candidate evaluations on the comparison and pooling of ELISpot results generated by the CTC-VIMC central core laboratories.
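
    The acceptance logic reported in the abstract above can be checked with a few lines of code. The sketch below only re-applies the stated criteria to the stated intervals; it does not reproduce how the confidence intervals were computed. The function and variable names are assumptions, and treating the margin as inclusive is also an assumption.

```python
def ci_within_margin(ci, margin):
    """True if the whole interval lies inside [-margin, +margin]."""
    low, high = ci
    return -margin <= low and high <= margin


def lower_bound_exceeds(lower_bound, threshold):
    """True if the interval's lower bound is above the required threshold."""
    return lower_bound > threshold


# 95% C.I. on the difference in response rates, in percentage points (from the abstract).
response_rate_cis = {"CMV": (-1.5, 1.5), "CEF": (-0.4, 7.8)}
# Lower 95% bounds on the proportion of concordant positivity calls, in percent.
concordance_lower_bounds = {"CMV": 97.2, "CEF": 89.5}

rates_ok = all(ci_within_margin(ci, 15.0) for ci in response_rate_cis.values())
calls_ok = all(lower_bound_exceeds(lb, 70.0)
               for lb in concordance_lower_bounds.values())
equivalent = rates_ok and calls_ok
print(equivalent)
```

    Requiring the entire confidence interval to sit inside the margin, rather than testing for a difference, is what makes this an equivalence criterion: a wide interval from a small study fails the check even if the point estimate is near zero.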

    Multivariate selection and intersexual genetic constraints in a wild bird population

    When selection differs between the sexes for traits that are genetically correlated between the sexes, there is potential for the effect of selection in one sex to be altered by indirect selection in the other sex, a situation commonly referred to as intralocus sexual conflict (ISC). While potentially common, ISC has rarely been studied in wild populations. Here, we studied ISC over a set of morphological traits (wing length, tarsus length, bill depth, and bill length) in a wild population of great tits (Parus major) from Wytham Woods, UK. Specifically, we quantified the microevolutionary impacts of ISC by combining intra- and inter-sex additive genetic (co)variances and sex-specific selection estimates in a multivariate framework. Large genetic correlations between homologous male and female traits combined with evidence for sex-specific multivariate survival selection suggested that ISC could play an appreciable role in the evolution of this population. Together, multivariate sex-specific selection and additive genetic (co)variance for the traits considered accounted for additive genetic variance in fitness that was uncorrelated between the sexes (cross-sex genetic correlation = -0.003, 95% CI = -0.83, 0.83). Gender load, defined as the reduction in a population’s rate of adaptation due to sex-specific effects, was estimated at 50% (95% CI = 13%, 86%). This study provides novel insights into the evolution of sexual dimorphism in wild populations and illustrates how quantitative genetics and selection analyses can be combined in a multivariate framework to quantify the microevolutionary impacts of ISC.

    Mapping HIV-1 Vaccine Induced T-Cell Responses: Bias towards Less-Conserved Regions and Potential Impact on Vaccine Efficacy in the Step Study

    T cell directed HIV vaccines are based upon the induction of CD8+ T cell memory responses that would be effective in inhibiting infection and subsequent replication of an infecting HIV-1 strain, a process that requires a match or near-match between the epitope induced by vaccination and the infecting viral strain. We compared the frequency and specificity of the CTL epitope responses elicited by the replication-defective Ad5 gag/pol/nef vaccine used in the Step trial with the likelihood of encountering those epitopes among recently sequenced Clade B isolates of HIV-1. Among vaccinees with detectable 15-mer peptide pool ELISpot responses, there was a median of four (one Gag, one Nef and two Pol) CD8 epitopes per vaccinee detected by 9-mer peptide ELISpot assay. Importantly, frequency analysis of the mapped epitopes indicated that there was a significant skewing of the T cell response; variable epitopes were detected more frequently than would be expected from an unbiased sampling of the vaccine sequences. Correspondingly, the most highly conserved epitopes in Gag, Pol, and Nef (defined by presence in >80% of sequences currently in the Los Alamos database www.hiv.lanl.gov) were detected at a lower frequency than unbiased sampling, similar to the frequency reported for responses to natural infection, suggesting potential epitope masking of these responses. This may be a generic mechanism used by the virus in both contexts to escape effective T cell immune surveillance. The disappointing results of the Step trial raise the bar for future HIV vaccine candidates. This report highlights the bias towards less-conserved epitopes present in the same vaccine used in the Step trial. Development of vaccine strategies that can elicit a greater breadth of responses, and towards conserved regions of the genome in particular, are critical requirements for effective T-cell based vaccines against HIV-1

    Creation of an Open-Access, Mutation-Defined Fibroblast Resource for Neurological Disease Research

    Our understanding of the molecular mechanisms of many neurological disorders has been greatly enhanced by the discovery of mutations in genes linked to familial forms of these diseases. These have facilitated the generation of cell and animal models that can be used to understand the underlying molecular pathology. Recently, there has been a surge of interest in the use of patient-derived cells, due to the development of induced pluripotent stem cells and their subsequent differentiation into neurons and glia. Access to patient cell lines carrying the relevant mutations is a limiting factor for many centres wishing to pursue this research. We have therefore generated an open-access collection of fibroblast lines from patients carrying mutations linked to neurological disease. These cell lines have been deposited in the National Institute for Neurological Disorders and Stroke (NINDS) Repository at the Coriell Institute for Medical Research and can be requested by any research group for use in in vitro disease modelling. There are currently 71 mutation-defined cell lines available for request from a wide range of neurological disorders and this collection will be continually expanded. This represents a significant resource that will advance the use of patient cells as disease models by the scientific community

    Ontogeny of Toll-Like Receptor Mediated Cytokine Responses of Human Blood Mononuclear Cells

    Newborns and young infants suffer increased infectious morbidity and mortality as compared to older children and adults. Morbidity and mortality due to infection are highest during the first weeks of life, decreasing over several years. Furthermore, most vaccines are not administered around birth, but over the first few years of life. A more complete understanding of the ontogeny of the immune system over the first years of life is thus urgently needed. Here, we applied the most comprehensive analysis focused on the innate immune response following TLR stimulation over the first 2 years of life in the largest such longitudinal cohort studied to date (35 subjects). We found that innate TLR responses (i) known to support Th17 adaptive immune responses (IL-23, IL-6) peaked around birth and declined over the following 2 years only to increase again by adulthood; (ii) potentially supporting antiviral defense (IFN-α) reached adult level function by 1 year of age; (iii) known to support Th1 type immunity (IL-12p70, IFN-γ) slowly rose from a low at birth but remained far below adult responses even at 2 years of age; (iv) inducing IL-10 production steadily declined from a high around birth to adult levels by 1 or 2 years of age; and (v) leading to production of TNF-α or IL-1β varied by stimuli. Our data contradict the notion of a linear progression from an ‘immature’ neonatal to a ‘mature’ adult pattern, but instead indicate the existence of qualitative and quantitative age-specific changes in innate immune reactivity in response to TLR stimulation.