124 research outputs found
On Comparing the Clustering of Regression Models Method with K-means Clustering
Gene clustering is a common question addressed with microarray data. Previous methods, such as K-means clustering and hierarchical clustering, base gene clustering directly on the observed measurements. A new model-based clustering method, the clustering of regression models (CORM) method, bases the clustering of genes on their relationship to covariates. It explicitly models different sources of variation and bases gene clustering solely on the systematic variation. Because both are partitional clustering methods, CORM is closely related to K-means clustering. In this paper, we discuss the relationship between the two clustering methods in terms of both model formulation and implications for other important aspects of cluster analysis. We show that the two methods can both be considered as solutions to a least squares problem with missing data, but they each concern a different type of least squares. We also show that CORM tends to provide stable clusters across samples and is particularly useful if the cluster averages are used as predictors for sample classification. Finally, we illustrate the application of CORM to a set of time course data measured on four yeast samples, which has a complicated experimental design and is difficult for K-means to handle.
Finding gene clusters for a replicated time course study
BACKGROUND: Finding genes that share similar expression patterns across samples is an important question that is frequently asked in high-throughput microarray studies. Traditional clustering algorithms such as K-means clustering and hierarchical clustering base gene clustering directly on the observed measurements and do not take into account the specific experimental design under which the microarray data were collected. A new model-based clustering method, the clustering of regression models method, takes into account the specific design of the microarray study and bases the clustering on how genes are related to sample covariates. It can find useful gene clusters for studies from complicated study designs such as replicated time course studies. FINDINGS: In this paper, we applied the clustering of regression models method to data from a time course study of yeast on two genotypes, wild type and YOX1 mutant, each with two technical replicates, and compared the clustering results with K-means clustering. We identified gene clusters that have similar expression patterns in wild type yeast, two of which were missed by K-means clustering. We further identified gene clusters whose expression patterns were changed in YOX1 mutant yeast compared to wild type yeast. CONCLUSIONS: The clustering of regression models method can be a valuable tool for identifying genes that are coordinately transcribed by a common mechanism
Direct Inference of SNP Heterozygosity Rates and Resolution of LOH Detection
Single nucleotide polymorphisms (SNPs) have been increasingly utilized to investigate somatic genetic abnormalities in premalignancy and cancer. LOH is a common alteration observed during cancer development, and SNP assays have been used to identify LOH at specific chromosomal regions. The design of such studies requires consideration of the resolution for detecting LOH throughout the genome and identification of the number and location of SNPs required to detect genetic alterations in specific genomic regions. Our study evaluated SNP distribution patterns and used probability models, Monte Carlo simulation, and real human subject genotype data to investigate the relationships between the number of SNPs, SNP HET rates, and the sensitivity (resolution) for detecting LOH. We report that variances of SNP heterozygosity rate in dbSNP are high for a large proportion of SNPs. Two statistical methods proposed for directly inferring SNP heterozygosity rates require much smaller sample sizes (intermediate sizes) and are feasible for practical use in SNP selection or verification. Using HapMap data, we showed that a region of LOH greater than 200 kb can be reliably detected, with losses smaller than 50 kb having a substantially lower detection probability when using all SNPs currently in the HapMap database. Higher densities of SNPs may exist in certain local chromosomal regions that provide some opportunities for reliably detecting LOH of segment sizes smaller than 50 kb. These results suggest that the interpretation of the results from genome-wide scans for LOH using commercial arrays needs to consider the relationships among inter-SNP distance, detection probability, and sample size for a specific study. New experimental designs for LOH studies would also benefit from considering the power of detection and sample sizes required to accomplish the proposed aims.
Power to Detect the Effects of HIV Vaccination in Repeated Low-Dose Challenge Experiments
Simulation studies were conducted to estimate the statistical power of repeated low-dose challenge experiments in non-human primates to detect a candidate HIV vaccine's effect. The effect of various design parameters on power was explored. Simulation results indicate repeated low-dose challenge studies with total sample size 50 (25 per arm) typically provide adequate power to detect a 50% reduction in the per-exposure probability of infection due to vaccination. Power generally increases with the maximum number of allowable challenges per animal, the per-exposure risk of infection in controls, and the proportion susceptible to infection.
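The kind of simulation described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the abstract does not name the test statistic, so a likelihood-ratio test for a constant per-challenge infection risk is assumed here, and the control risk (0.3) and challenge cap (12) are hypothetical values chosen only for the example. Only the 25-per-arm sample size and the 50% risk reduction come from the abstract. The sketch also assumes every animal is susceptible.

```python
import math
import random


def _xlogy(x, y):
    """Return x * log(y), treating 0 * log(0) as 0."""
    return 0.0 if x == 0 else x * math.log(y)


def _loglik(infections, challenges):
    """Maximized log-likelihood under a constant per-challenge infection risk."""
    if challenges == 0:
        return 0.0
    p_hat = infections / challenges
    return _xlogy(infections, p_hat) + _xlogy(challenges - infections, 1.0 - p_hat)


def simulate_arm(n_animals, p_infect, max_challenges, rng):
    """Challenge each animal until infection or the cap; return (infections, challenges)."""
    infections = 0
    challenges = 0
    for _ in range(n_animals):
        for _ in range(max_challenges):
            challenges += 1
            if rng.random() < p_infect:
                infections += 1
                break
    return infections, challenges


def lrt_p_value(control, vaccine):
    """Two-sided likelihood-ratio test of equal per-challenge risk (assumed test)."""
    (i_c, n_c), (i_v, n_v) = control, vaccine
    stat = 2.0 * (_loglik(i_c, n_c) + _loglik(i_v, n_v)
                  - _loglik(i_c + i_v, n_c + n_v))
    # Survival function of a chi-square with 1 degree of freedom.
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))


def estimate_power(n_per_arm=25, p_control=0.3, risk_reduction=0.5,
                   max_challenges=12, alpha=0.05, n_sims=500, seed=1):
    """Fraction of simulated studies in which the test rejects at level alpha."""
    rng = random.Random(seed)
    p_vaccine = p_control * (1.0 - risk_reduction)
    rejections = 0
    for _ in range(n_sims):
        control = simulate_arm(n_per_arm, p_control, max_challenges, rng)
        vaccine = simulate_arm(n_per_arm, p_vaccine, max_challenges, rng)
        if lrt_p_value(control, vaccine) < alpha:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    print("estimated power:", estimate_power())
    print("rejection rate with no vaccine effect:", estimate_power(risk_reduction=0.0))
```

Varying `max_challenges` or `p_control` in this sketch gives a rough feel for how those design parameters move the estimated power.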
The Clustering of Regression Models Method with Applications in Gene Expression Data
Identification of differentially expressed genes and clustering of genes are two important and complementary objectives addressed with gene expression data. For the differential expression question, many per-gene analytic methods have been proposed. These methods can generally be characterized as using a regression function to independently model the observations for each gene; various adjustments for multiplicity are then used to interpret the statistical significance of these per-gene regression models over the collection of genes analyzed. Motivated by this common structure of per-gene models, we propose a new model-based clustering method -- the clustering of regression models method, which groups genes that share a similar relationship to the covariate(s). This method provides a unified approach for a family of clustering procedures and can be applied for data collected with various experimental designs. In addition, when combined with per-gene methods for assessing differential expression that employ the same regression modeling structure, an integrated framework for the analysis of microarray data is obtained. The proposed methodology was applied to two real microarray datasets, one from a breast cancer study and the other from a yeast cell cycle study
Equivalence of ELISpot Assays Demonstrated between Major HIV Network Laboratories
The Comprehensive T Cell Vaccine Immune Monitoring Consortium (CTC-VIMC) was created to provide standardized immunogenicity monitoring services for HIV vaccine trials. The ex vivo interferon-gamma (IFN-γ) ELISpot is used extensively as a primary immunogenicity assay to assess T cell-based vaccine candidates in trials for infectious diseases and cancer. Two independent, GCLP-accredited central laboratories of CTC-VIMC routinely use their own standard operating procedures (SOPs) for ELISpot within two major networks of HIV vaccine trials. Studies are imperatively needed to assess the comparability of ELISpot measurements across laboratories to benefit optimal advancement of vaccine candidates. We describe an equivalence study of the two independently qualified IFN-γ ELISpot SOPs. The study design, data collection and subsequent analysis were managed by independent statisticians to avoid subjectivity. The equivalence of both response rates and positivity calls to a given stimulus was assessed based on pre-specified acceptance criteria derived from a separate pilot study. Detection of positive responses was found to be equivalent between both laboratories. The 95% C.I. on the difference in response rates, for CMV (-1.5%, 1.5%) and CEF (-0.4%, 7.8%) responses, were both contained in the pre-specified equivalence margin of interval [-15%, 15%]. The lower bound of the 95% C.I. on the proportion of concordant positivity calls for CMV (97.2%) and CEF (89.5%) were both greater than the pre-specified margin of 70%. A third CTC-VIMC central laboratory already using one of the two SOPs also showed comparability when tested in a smaller sub-study. The described study procedure provides a prototypical example for the comparison of bioanalytical methods in HIV vaccine and other disease fields. This study also provides valuable and unprecedented information for future vaccine candidate evaluations on the comparison and pooling of ELISpot results generated by the CTC-VIMC central core laboratories.
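The acceptance logic stated in this abstract (a confidence interval lying wholly inside an equivalence margin, and a lower confidence bound exceeding a threshold) can be written out as a small check. This is a sketch only: the function and variable names are hypothetical, and the intervals are taken as reported rather than recomputed, since the abstract does not say how they were derived.

```python
def ci_within_margin(ci, margin):
    """True if the confidence interval lies entirely inside the equivalence margin."""
    ci_low, ci_high = ci
    margin_low, margin_high = margin
    return margin_low <= ci_low and ci_high <= margin_high


def lower_bound_exceeds(lower_bound, threshold):
    """True if the lower confidence bound is above the pre-specified threshold."""
    return lower_bound > threshold


# Values as reported in the abstract, in percentage points.
EQUIVALENCE_MARGIN = (-15.0, 15.0)
RESPONSE_RATE_DIFF_CI = {"CMV": (-1.5, 1.5), "CEF": (-0.4, 7.8)}
CONCORDANCE_LOWER_BOUND = {"CMV": 97.2, "CEF": 89.5}
CONCORDANCE_THRESHOLD = 70.0

if __name__ == "__main__":
    for stimulus, ci in RESPONSE_RATE_DIFF_CI.items():
        print(stimulus, "response rates equivalent:",
              ci_within_margin(ci, EQUIVALENCE_MARGIN))
    for stimulus, bound in CONCORDANCE_LOWER_BOUND.items():
        print(stimulus, "positivity calls concordant:",
              lower_bound_exceeds(bound, CONCORDANCE_THRESHOLD))
```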
Mapping HIV-1 Vaccine Induced T-Cell Responses: Bias towards Less-Conserved Regions and Potential Impact on Vaccine Efficacy in the Step Study
T cell directed HIV vaccines are based upon the induction of CD8+ T cell memory responses that would be effective in inhibiting infection and subsequent replication of an infecting HIV-1 strain, a process that requires a match or near-match between the epitope induced by vaccination and the infecting viral strain. We compared the frequency and specificity of the CTL epitope responses elicited by the replication-defective Ad5 gag/pol/nef vaccine used in the Step trial with the likelihood of encountering those epitopes among recently sequenced Clade B isolates of HIV-1. Among vaccinees with detectable 15-mer peptide pool ELISpot responses, there was a median of four (one Gag, one Nef and two Pol) CD8 epitopes per vaccinee detected by 9-mer peptide ELISpot assay. Importantly, frequency analysis of the mapped epitopes indicated that there was a significant skewing of the T cell response; variable epitopes were detected more frequently than would be expected from an unbiased sampling of the vaccine sequences. Correspondingly, the most highly conserved epitopes in Gag, Pol, and Nef (defined by presence in >80% of sequences currently in the Los Alamos database www.hiv.lanl.gov) were detected at a lower frequency than unbiased sampling, similar to the frequency reported for responses to natural infection, suggesting potential epitope masking of these responses. This may be a generic mechanism used by the virus in both contexts to escape effective T cell immune surveillance. The disappointing results of the Step trial raise the bar for future HIV vaccine candidates. This report highlights the bias towards less-conserved epitopes present in the same vaccine used in the Step trial. Development of vaccine strategies that can elicit a greater breadth of responses, and towards conserved regions of the genome in particular, are critical requirements for effective T-cell based vaccines against HIV-1
Viral Protein Fragmentation May Broaden T-Cell Responses to HIV Vaccines
High mutation rates of human immunodeficiency virus (HIV) allow escape from T cell recognition, preventing development of effective T cell vaccines. Vaccines that induce diverse T cell immune responses would help overcome this problem. Using SIV gag as a model vaccine, we investigated two approaches to increase the breadth of the CD8 T cell response: fusion of vaccine genes to ubiquitin to target the proteasome and increase levels of MHC class I peptide complexes, and gene fragmentation to overcome competition between epitopes for presentation and recognition. Three vaccines were compared: full-length unmodified SIV-mac239 gag, full-length gag fused at the N-terminus to ubiquitin, and 7 gag fragments of equal size spanning the whole of gag with ubiquitin fused to the N-terminus of each fragment. Genes were cloned into a replication defective adenovirus vector and immunogenicity assessed in an in vitro human priming system. The breadth of the CD8 T cell response, defined by the number of distinct epitopes, was assessed by IFN-γ-ELISPOT and memory phenotype and cytokine production evaluated by flow cytometry. We observed an increase of two- to six-fold in the number of epitopes recognised in the ubiquitin-fused fragments compared to the ubiquitin-fused full-length gag. In contrast, although proteasomal targeting was achieved, there was a marked reduction in the number of epitopes recognised in the ubiquitin-fused full-length gag compared to the full-length unmodified gene, but there were no differences in the number of epitope responses induced by non-ubiquitinated full-length gag and the ubiquitin-fused mini genes. Fragmentation and ubiquitination did not affect T cell memory differentiation and polyfunctionality, though most responses were directed against the Ad5 vector. Fragmentation but not fusion with ubiquitin increases the breadth of the CD8 T vaccine response against SIV-mac239 gag. Thus gene fragmentation of HIV vaccines may maximise responses.
Ontogeny of Toll-Like Receptor Mediated Cytokine Responses of Human Blood Mononuclear Cells
Newborns and young infants suffer increased infectious morbidity and mortality as compared to older children and adults. Morbidity and mortality due to infection are highest during the first weeks of life, decreasing over several years. Furthermore, most vaccines are not administered around birth, but over the first few years of life. A more complete understanding of the ontogeny of the immune system over the first years of life is thus urgently needed. Here, we applied the most comprehensive analysis focused on the innate immune response following TLR stimulation over the first 2 years of life in the largest such longitudinal cohort studied to-date (35 subjects). We found that innate TLR responses (i) known to support Th17 adaptive immune responses (IL-23, IL-6) peaked around birth and declined over the following 2 years only to increase again by adulthood; (ii) potentially supporting antiviral defense (IFN-α) reached adult level function by 1 year of age; (iii) known to support Th1 type immunity (IL-12p70, IFN-γ) slowly rose from a low at birth but remained far below adult responses even at 2 years of age; (iv) inducing IL-10 production steadily declined from a high around birth to adult levels by 1 or 2 years of age, and; (v) leading to production of TNF-α or IL-1β varied by stimuli. Our data contradict the notion of a linear progression from an “immature” neonatal to a “mature” adult pattern, but instead indicate the existence of qualitative and quantitative age-specific changes in innate immune reactivity in response to TLR stimulation.
Hox10 Genes Function in Kidney Development in the Differentiation and Integration of the Cortical Stroma
Organogenesis requires the differentiation and integration of distinct populations of cells to form a functional organ. In the kidney, reciprocal interactions between the ureter and the nephrogenic mesenchyme are required for organ formation. Additionally, the differentiation and integration of stromal cells are also necessary for the proper development of this organ. Much remains to be understood regarding the origin of cortical stromal cells and the pathways involved in their formation and function. By generating triple mutants in the Hox10 paralogous group genes, we demonstrate that Hox10 genes play a critical role in the developing kidney. Careful examination of control kidneys shows that Foxd1-expressing stromal precursor cells are first observed in a cap-like pattern anterior to the metanephric mesenchyme and these cells subsequently integrate posteriorly into the kidney periphery as development proceeds. While the initial cap-like pattern of Foxd1-expressing cortical stromal cells is unaffected in Hox10 mutants, these cells fail to become properly integrated into the kidney, and do not differentiate to form the kidney capsule. Consistent with loss of cortical stromal cell function, Hox10 mutant kidneys display reduced and aberrant ureter branching and decreased nephrogenesis. These data therefore provide critical novel insights into the cellular and genetic mechanisms governing cortical cell development during kidney organogenesis. These results, combined with previous evidence demonstrating that Hox11 genes are necessary for patterning the metanephric mesenchyme, support a model whereby distinct populations in the nephrogenic cord are regulated by unique Hox codes, and that differential Hox function along the AP axis of the nephrogenic cord is critical for the differentiation and integration of these cell types during kidney organogenesis.