112 research outputs found

    Internal Rotation Effects and Nuclear Hyperfine Structure in the Microwave Spectrum of Propyne-HF

    The microwave spectrum of the weakly bound propyne-HF/DF complex in the region between 6 and 16 GHz was analyzed. The spectrum was characteristic of a distorted T-shaped asymmetric top exhibiting torsional splitting caused by a low barrier to internal rotation of the methyl top relative to the propyne-HF frame. Deuterium substitution of HF confirms that the acid proton of HF is located between the F atom and the propyne triple bond. The spectroscopic constants are consistent with the fluorine atom being displaced toward the methyl group from a line perpendicular to and bisecting the propyne triple bond, suggesting a weak hydrogen-bond interaction between fluorine and the methyl protons.


    Microwave and tunable far-infrared laser spectroscopy of the ammonia–water dimer

    Microwave and far-infrared spectra of the H3N–HOH dimer have been recorded from 36 to 86 GHz and 520 to 800 GHz with a planar supersonic jet/tunable laser sideband spectrometer. The a-type pure rotational microwave data extend the previous m=0, K=0 A symmetry manifold measurements of Herbine and Dyke [J. Chem. Phys. 83, 3768 (1985)] to higher frequency and also provide an additional set of microwave transitions in the mK=+1 E symmetry manifold. Two sets of five b-type rotation–tunneling bands, one set shifted from the other by an approximately constant 113 MHz, have been observed in the far infrared. The splitting into two sets arises from water tunneling, while the overall band structure is due to internal rotation of the ammonia top. Nonlinear least-squares fits to an internal rotor Hamiltonian provided rotational constants and an estimate of V3 = 10.5 ± 5.0 cm^–1 for the barrier height to internal rotation of the NH3 monomer. A nonlinear equilibrium hydrogen bond is most consistent with the vibrationally averaged rotational constants; with the angle cos^–1[·] determined from the projection of the ammonia's angular momentum onto the framework; and with the nitrogen quadrupole coupling constants of Herbine and Dyke. The water tunneling splitting and observed selection rules place constraints on the barrier height for proton exchange of the water as well as the most feasible water tunneling path along the intermolecular potential energy surface. An estimated barrier of ~700 cm^–1 is derived for the water tunneling motion about its c axis.

    Multiclass classification of microarray data with repeated measurements: application to cancer

    Prediction of the diagnostic category of a tissue sample from its gene-expression profile and selection of relevant genes for class prediction have important applications in cancer research. We have developed the uncorrelated shrunken centroid (USC) and error-weighted, uncorrelated shrunken centroid (EWUSC) algorithms that are applicable to microarray data with any number of classes. We show that removing highly correlated genes typically improves classification results using a small set of genes.
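The centroid-shrinkage idea underlying USC can be illustrated with a generic nearest-shrunken-centroid sketch. This is the textbook shrunken-centroid step, not the authors' USC/EWUSC implementation; the function names, toy data, and threshold are illustrative:

```python
import numpy as np

def shrink_centroids(X, y, delta):
    """Soft-threshold each class centroid toward the overall centroid.

    X     : (n_samples, n_genes) expression matrix
    y     : (n_samples,) integer class labels
    delta : shrinkage amount; larger values zero out more genes
    """
    overall = X.mean(axis=0)
    shrunken = {}
    for c in np.unique(y):
        d = X[y == c].mean(axis=0) - overall                 # per-gene class offset
        d = np.sign(d) * np.maximum(np.abs(d) - delta, 0.0)  # soft threshold
        shrunken[c] = overall + d
    return shrunken

def classify(x, centroids):
    """Assign sample x to the class with the nearest shrunken centroid."""
    return min(centroids, key=lambda c: np.sum((x - centroids[c]) ** 2))
```

Genes whose class offsets fall below the threshold contribute nothing to the distance, which is how this family of methods classifies with a small set of genes.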

    Sample size for detecting differentially expressed genes in microarray experiments

    BACKGROUND: Microarray experiments are often performed with a small number of biological replicates, resulting in low statistical power for detecting differentially expressed genes and concomitant high false positive rates. While increasing sample size can increase statistical power and decrease error rates, with too many samples, valuable resources are not used efficiently. The issue of how many replicates are required in a typical experimental system needs to be addressed. Of particular interest is the difference in required sample sizes for similar experiments in inbred vs. outbred populations (e.g. mouse and rat vs. human). RESULTS: We hypothesize that if all other factors (assay protocol, microarray platform, data pre-processing) were equal, fewer individuals would be needed for the same statistical power using inbred animals as opposed to unrelated human subjects, as genetic effects on gene expression will be removed in the inbred populations. We apply the same normalization algorithm and estimate the variance of gene expression for a variety of cDNA data sets (humans, inbred mice and rats) comparing two conditions. Using one-sample, paired-sample or two-independent-sample t-tests, we calculate the sample sizes required to detect 1.5-, 2-, and 4-fold changes in expression level as a function of false positive rate, power and percentage of genes that have a standard deviation below a given percentile. CONCLUSIONS: Factors that affect power and sample size calculations include variability of the population, the desired detectable differences, the power to detect the differences, and an acceptable error rate. In addition, experimental design, technical variability and data pre-processing play a role in the power of the statistical tests in microarrays. We show that the number of samples required for detecting a 2-fold change with 90% probability and a p-value of 0.01 in humans is much larger than the number of samples commonly used in present day studies, and that far fewer individuals are needed for the same statistical power when using inbred animals rather than unrelated human subjects.
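The sample-size calculation described above, as a function of fold change, variability, false positive rate, and power, can be sketched with the standard normal-approximation formula for a two-independent-sample comparison. The gene standard deviation of 0.7 and the other inputs below are illustrative, not values from the paper:

```python
from math import ceil, log2
from statistics import NormalDist

def two_sample_n(sigma, delta, alpha=0.01, power=0.90):
    """Per-group sample size for a two-sided, two-independent-sample test,
    normal approximation:

        n = 2 * (z_{1-alpha/2} + z_{power})**2 * (sigma / delta)**2

    sigma : per-gene standard deviation on the log2 scale
    delta : difference to detect on the log2 scale (the log2 fold change)
    """
    z = NormalDist().inv_cdf
    return ceil(2.0 * (z(1.0 - alpha / 2.0) + z(power)) ** 2
                * (sigma / delta) ** 2)

# samples per group to detect a 2-fold change (delta = log2(2) = 1)
# for a gene with SD 0.7, at alpha = 0.01 and 90% power
n = two_sample_n(sigma=0.7, delta=log2(2.0))
```

The formula makes the abstract's conclusion concrete: doubling the population standard deviation roughly quadruples the required number of samples, which is why more variable outbred human populations need far larger studies than inbred animals.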

    Clustering gene-expression data with repeated measurements

    Clustering is a common methodology for the analysis of array data, and many research laboratories are generating array data with repeated measurements. We evaluated several clustering algorithms that incorporate repeated measurements, and show that algorithms that take advantage of repeated measurements yield more accurate and more stable clusters. In particular, we show that the infinite mixture model-based approach with a built-in error model produces superior results.
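One simple way replicate information can enter a clustering, sketched here generically (this is not the infinite mixture model evaluated in the study; the error weighting is a hypothetical illustration), is to down-weight conditions whose replicate measurements are noisy when computing the distance between two genes:

```python
import numpy as np

def errorweighted_distance(reps_a, reps_b):
    """Distance between two genes, each given as (n_reps, n_conditions).

    Per-condition differences between the replicate means are divided by
    the estimated variance of those means, so conditions with noisy
    replicates contribute less to the distance.
    """
    ma, mb = reps_a.mean(axis=0), reps_b.mean(axis=0)
    va = reps_a.var(axis=0, ddof=1) / reps_a.shape[0]  # variance of the mean
    vb = reps_b.var(axis=0, ddof=1) / reps_b.shape[0]
    return np.sqrt(np.sum((ma - mb) ** 2 / (va + vb + 1e-12)))
```

A distance of this form can then be plugged into any distance-based clustering algorithm, so that unreliable measurements do not split genes into different clusters.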

    From co-expression to co-regulation: how many microarray experiments do we need?

    BACKGROUND: Cluster analysis is often used to infer regulatory modules or biological function by associating unknown genes with other genes that have similar expression patterns and known regulatory elements or functions. However, clustering results may not have any biological relevance. RESULTS: We applied various clustering algorithms to microarray datasets with different sizes, and we evaluated the clustering results by determining the fraction of gene pairs from the same clusters that share at least one known common transcription factor. We used both yeast transcription factor databases (SCPD, YPD) and chromatin immunoprecipitation (ChIP) data to evaluate our clustering results. We showed that the ability to identify co-regulated genes from clustering results is strongly dependent on the number of microarray experiments used in cluster analysis, and that the accuracy of these associations plateaus at between 50 and 100 experiments on yeast data. Moreover, the model-based clustering algorithm MCLUST consistently outperforms more traditional methods in accurately assigning co-regulated genes to the same clusters on standardized data. CONCLUSIONS: Our results are consistent across independent evaluation criteria, which strengthens our confidence in them. However, when one compares ChIP data to YPD, the false-negative rate is approximately 80% using the recommended p-value of 0.001. In addition, we showed that even with large numbers of experiments, the false-positive rate may exceed the true-positive rate. In particular, even when all experiments are included, the best results produce clusters with only a 28% true-positive rate using known gene transcription factor interactions.
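The pair-counting evaluation described above can be sketched as follows. The cluster contents and the gene-to-transcription-factor map are toy data, not entries from SCPD, YPD, or the ChIP datasets:

```python
from itertools import combinations

def true_positive_rate(clusters, tf_map):
    """Fraction of same-cluster gene pairs sharing >= 1 known transcription factor.

    clusters : dict cluster_id -> list of gene names
    tf_map   : dict gene -> set of transcription factors known to regulate it
    """
    shared = total = 0
    for genes in clusters.values():
        for g1, g2 in combinations(genes, 2):
            total += 1
            if tf_map.get(g1, set()) & tf_map.get(g2, set()):
                shared += 1
    return shared / total if total else 0.0
```

Running this over clusterings built from increasing numbers of experiments is what produces the accuracy-versus-experiment-count curve the abstract describes.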

    Iterative Bayesian Model Averaging: a method for the application of survival analysis to high-dimensional microarray data

    BACKGROUND: Microarray technology is increasingly used to identify potential biomarkers for cancer prognostics and diagnostics. Previously, we have developed the iterative Bayesian Model Averaging (BMA) algorithm for use in classification. Here, we extend the iterative BMA algorithm for application to survival analysis on high-dimensional microarray data. The main goal in applying survival analysis to microarray data is to determine a highly predictive model of patients' time to event (such as death, relapse, or metastasis) using a small number of selected genes. Our multivariate procedure combines the effectiveness of multiple contending models by calculating the weighted average of their posterior probability distributions. Our results demonstrate that our iterative BMA algorithm for survival analysis achieves high prediction accuracy while consistently selecting a small and cost-effective number of predictor genes. RESULTS: We applied the iterative BMA algorithm to two cancer datasets: breast cancer and diffuse large B-cell lymphoma (DLBCL) data. On the breast cancer data, the algorithm selected a total of 15 predictor genes across 84 contending models from the training data. The maximum likelihood estimates of the selected genes and the posterior probabilities of the selected models from the training data were used to divide patients in the test (or validation) dataset into high- and low-risk categories. Using the genes and models determined from the training data, we assigned patients from the test data into highly distinct risk groups (as indicated by a p-value of 7.26e-05 from the log-rank test). Moreover, we achieved comparable results using only the 5 top selected genes with 100% posterior probabilities. On the DLBCL data, our iterative BMA procedure selected a total of 25 genes across 3 contending models from the training data. Once again, we assigned the patients in the validation set to significantly distinct risk groups (p-value = 0.00139). CONCLUSION: The strength of the iterative BMA algorithm for survival analysis lies in its ability to account for model uncertainty. The results from this study demonstrate that our procedure selects a small number of genes while eclipsing other methods in predictive performance, making it a highly accurate and cost-effective prognostic tool in the clinical setting.
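The averaging step, a posterior-probability-weighted combination of per-model predictions, can be sketched generically. The linear risk scores, model weights, and gene names below are illustrative, not the fitted models or coefficients from the paper:

```python
def bma_risk_score(sample, models):
    """Posterior-weighted risk score for one patient.

    sample : dict gene -> expression value
    models : list of (posterior_probability, coefficients) pairs, where
             coefficients maps each of that model's selected genes to its
             regression coefficient
    """
    return sum(
        prob * sum(beta * sample[g] for g, beta in coefs.items())
        for prob, coefs in models
    )

def assign_risk_group(sample, models, cutoff):
    """Dichotomize the averaged score into high/low risk at a cutoff."""
    return "high" if bma_risk_score(sample, models) > cutoff else "low"
```

Averaging over several contending models, rather than committing to one, is what lets the procedure account for model uncertainty while still using only the union of each model's small gene set.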

    What Is the Best Reference RNA? And Other Questions Regarding the Design and Analysis of Two-Color Microarray Experiments

    The reference design is a practical and popular choice for microarray studies using two-color platforms. In the reference design, the reference RNA uses half of all array resources, leading investigators to ask: What is the best reference RNA? We propose a novel method for evaluating reference RNAs and present the results of an experiment that was specially designed to evaluate three common choices of reference RNA. We found no compelling evidence in favor of any particular reference. In particular, a commercial reference showed no advantage in our data. Our experimental design also enabled a new way to test the effectiveness of pre-processing methods for two-color arrays. Our results favor using an intensity normalization and forgoing background subtraction. Finally, we evaluate the sensitivity and specificity of data quality filters, and propose a new filter that can be applied to any experimental design and does not rely on replicate hybridizations.
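A simplified version of intensity-dependent normalization without background subtraction can be sketched with a binned-median fit in place of the loess fit commonly used for two-color arrays. This is a generic illustration, not the authors' pipeline:

```python
import numpy as np

def intensity_normalize(red, green, n_bins=10):
    """MA-style normalization of a two-color array without background
    subtraction: subtract the median log-ratio within intensity bins.

    red, green : raw foreground intensities for each spot (same length)
    Returns normalized log2 ratios M = log2(red) - log2(green).
    """
    m = np.log2(red) - np.log2(green)           # log ratio per spot
    a = 0.5 * (np.log2(red) + np.log2(green))   # average log intensity
    edges = np.quantile(a, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, a, side="right") - 1, 0, n_bins - 1)
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            m[mask] -= np.median(m[mask])       # remove intensity-dependent bias
    return m
```

Because the correction is estimated per intensity bin, a dye bias that varies with spot brightness is removed without ever subtracting a background estimate from the foreground signals.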