23 research outputs found

    Torus principal component analysis with applications to RNA structure

    Several cutting-edge applications need PCA methods for data on tori. We propose a novel torus-PCA method that adaptively favors low-dimensional representations while preventing overfitting by a new test; both features can be applied generally and address shortcomings in two previously proposed PCA methods. Unlike tangent space PCA, our torus-PCA features structure fidelity by honoring the cyclic topology of the data space and, unlike geodesic PCA, produces nonwinding, nondense descriptors. These features are achieved by deforming tori into spheres with self-gluing and then using a variant of the recently developed principal nested spheres analysis. This analysis involves a step of subsphere fitting, and we provide a new test to avoid overfitting. We validate our torus-PCA by application to an RNA benchmark data set. Further, using a larger RNA data set, torus-PCA recovers previously found structure, now globally at the one-dimensional representation, which is not accessible via tangent space PCA.

    Distribution of Hβ hyperfine couplings in a tyrosyl radical revealed by 263 GHz ENDOR spectroscopy

    1H ENDOR spectra of tyrosyl radicals (Y∙) have been the subject of numerous EPR spectroscopic studies due to their importance in biology. Nevertheless, assignment of all internal 1H hyperfine couplings has been challenging because of substantial spectral overlap. Recently, using 263 GHz ENDOR in conjunction with statistical analysis, we identified the signature of the Hβ2 coupling in the essential Y122 radical of Escherichia coli ribonucleotide reductase and modeled it with a distribution of radical conformations. Here, we demonstrate that this analysis can be extended to the full-width 1H ENDOR spectra that contain the larger Hβ1 coupling. The Hβ2 and Hβ1 couplings are related to each other through the ring dihedral and report on the amino acid conformation. Acquiring the 263 GHz ENDOR data in batches instead of averaging, and processing them with a new “drift model”, allows the ENDOR spectra to be reconstructed with statistically meaningful confidence intervals and separated from baseline distortions. Spectral simulations using a distribution of ring dihedral angles confirm the presence of a conformational distribution, consistent with the previous analysis of the Hβ2 coupling. The analysis was corroborated by 94 GHz 2H ENDOR of deuterated Y∙122. These studies provide a starting point to investigate low-populated states of tyrosyl radicals in greater detail.
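
    To illustrate why keeping individual batches (rather than a single running average) makes confidence intervals possible, here is a minimal sketch. It is not the "drift model" from the abstract: it assumes independent Gaussian noise, no baseline drift, and a normal approximation for the interval. The function name `batch_mean_with_ci` and the synthetic spectrum are hypothetical.

```python
import numpy as np

def batch_mean_with_ci(batches, z=1.96):
    """Pointwise mean and approximate 95% confidence band across batches.

    batches: array of shape (n_batches, n_points), one spectrum per batch.
    Assumes independent noise between batches and no drift (a simplification).
    """
    batches = np.asarray(batches, dtype=float)
    n = batches.shape[0]
    mean = batches.mean(axis=0)
    # Standard error of the mean at each spectral point.
    sem = batches.std(axis=0, ddof=1) / np.sqrt(n)
    return mean, mean - z * sem, mean + z * sem

# Synthetic stand-in for a spectrum: one Gaussian line plus noise per batch.
rng = np.random.default_rng(0)
freq = np.linspace(-1.0, 1.0, 200)
true_spectrum = np.exp(-freq**2 / 0.02)
batches = true_spectrum + rng.normal(scale=0.5, size=(50, freq.size))

mean, lower, upper = batch_mean_with_ci(batches)
```

    With only the averaged spectrum stored, the per-point spread across batches would be lost and no such band could be computed.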

    Statistical analysis of ENDOR spectra

    Electron–nuclear double resonance (ENDOR) measures the hyperfine interaction of magnetic nuclei with paramagnetic centers and is hence a powerful tool for spectroscopic investigations extending from biophysics to materials science. Progress in microwave technology and the recent availability of commercial electron paramagnetic resonance (EPR) spectrometers up to an electron Larmor frequency of 263 GHz now open the opportunity for a more quantitative spectral analysis. Using representative spectra of a prototype amino acid radical in a biologically relevant enzyme, the Y∙122 in Escherichia coli ribonucleotide reductase, we developed a statistical model for ENDOR data and conducted statistical inference on the spectra, including uncertainty estimation and hypothesis testing. Our approach, in conjunction with 1H/2H isotopic labeling of Y∙122 in the protein, unambiguously established new unexpected spectral contributions. Density functional theory (DFT) calculations and ENDOR spectral simulations indicated that these features result from the beta-methylene hyperfine coupling and are caused by a distribution of molecular conformations, likely important for the biological function of this essential radical. The results demonstrate that model-based statistical analysis in combination with state-of-the-art spectroscopy accesses information hitherto beyond standard approaches.

    Bayesian optimization to estimate hyperfine couplings from 19F ENDOR spectra

    ENDOR spectroscopy is a fundamental method to detect nuclear spins in the vicinity of paramagnetic centers and their mutual hyperfine interaction. Recently, site-selective introduction of 19F as nuclear labels has been proposed as a tool for ENDOR-based distance determination in biomolecules, complementing pulsed dipolar spectroscopy in the range of angstrom to nanometer. Nevertheless, a main challenge of ENDOR remains its spectral analysis, which is aggravated by a large parameter space and broad resonances from hyperfine interactions. Additionally, at high EPR frequencies and fields (⩾94 GHz/3.4 Tesla), chemical shift anisotropy might contribute to broadening and asymmetry in the spectra. Here, we use two nitroxide-fluorine model systems to examine a statistical approach to finding the best parameter fit to experimental 263 GHz 19F ENDOR spectra. We propose Bayesian optimization for a rapid, global parameter search with little prior knowledge, followed by a refinement by more standard gradient-based fitting procedures. Indeed, the latter are prone to finding local rather than global minima of a suitably defined loss function. Using a new and accelerated simulation procedure, results for the semi-rigid nitroxide-fluorine two- and three-spin systems lead to physically reasonable solutions, if minima of similar loss can be distinguished by DFT predictions. The approach also delivers the stochastic error of the obtained parameter estimates. Future developments and perspectives are discussed.

    Principal component analysis and clustering on manifolds

    Big data, high-dimensional data, sparse data, large-scale data, and imaging data are all becoming new frontiers of statistics. Changing technologies have created this flood of data and have led to a real hunger among scientists for new modeling strategies and data analysis methods. In many cases the data are not Euclidean; in molecular biology, for example, the data sit on manifolds. Even on a simple non-Euclidean manifold such as the circle, summarizing angles by their arithmetic average does not make sense, so more care is needed. Non-Euclidean settings thus pose major challenges, both mathematical and statistical. This paper focuses on PCA and clustering methods for some manifolds; PCA and clustering are, of course, core topics of multivariate analysis. We deal with two manifolds that are key from a practical point of view, namely spheres and tori. Dimension reduction on non-Euclidean manifolds with PCA-like methods has long been a challenging task, but there have been recent breakthroughs. One is the idea of nested spheres; another is to transform a torus into a sphere and then apply nested spheres PCA. We also provide a new clustering method for multivariate analysis with a fundamental property required in molecular biology: it penalizes wrong assignments so as to avoid chemically no-go areas. We give various examples to illustrate these methods, including an important one dealing with COVID-19 data.
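
    The remark about averaging angles on a circle can be made concrete with a small sketch. It uses the standard circular mean (the direction of the summed unit vectors), which is a textbook construction and not a method taken from the paper; the function name `circular_mean` is hypothetical.

```python
import math

def circular_mean(angles_deg):
    """Mean direction of angles in degrees, computed via summed unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    # atan2 gives the direction of the resultant vector; wrap into [0, 360).
    return math.degrees(math.atan2(s, c)) % 360.0

# Two angles that lie 10 degrees on either side of 0 degrees.
angles = [350.0, 10.0]

# The arithmetic average points in the opposite direction (180 degrees).
arithmetic = sum(angles) / len(angles)

# The circular mean lies at 0 degrees (equivalently 360), up to rounding.
circular = circular_mean(angles)
```

    The arithmetic average ignores that 350 and 10 degrees are neighbors across the wrap-around point, which is exactly the kind of topology that torus-valued data inherit in every coordinate.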

    Anisotropic x-ray scattering and orientation fields in cardiac tissue cells

    X-ray diffraction from biomolecular assemblies is a powerful technique which can provide structural information about complex architectures such as the locomotor systems underlying muscle contraction. However, in its conventional form, macromolecular diffraction averages over large ensembles. Progress in x-ray optics has now made it possible to probe structures on sub-cellular scales, with the beam confined to a distinct organelle. Here, we use scanning small angle x-ray scattering (scanning SAXS) to probe the diffraction from cytoskeleton networks in cardiac tissue cells. In particular, we focus on actin-myosin composites, which we identify as the dominating contribution to the anisotropic diffraction patterns, by correlation with optical fluorescence microscopy. To this end, we use a principal component analysis approach to quantify direction, degree of orientation, nematic order, and the second moment of the scattering distribution in each scan point. We compare the fiber orientation from micrographs of fluorescently labeled actin fibers to the structure orientation of the x-ray dataset and thus correlate signals of two different measurements: the native electron density distribution of the local probing area versus specifically labeled constituents of the sample. Further, we develop a robust and automated fitting approach based on a power law expansion, in order to describe the local structure factor in each scan point over a broad range of the momentum transfer $q_{\mathrm{r}}$. Finally, we demonstrate how the methodology shown for freeze-dried cells in the first part of the paper can be translated to live-cell recordings.
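
    As a rough illustration of how a PCA-style analysis can yield a direction and a degree of orientation from one 2D scattering pattern, the sketch below diagonalizes the intensity-weighted second-moment matrix of the detector coordinates. The exact definitions used in the paper are not given in the abstract, so the anisotropy measure (difference over sum of the eigenvalues), the function name `orientation_from_pattern`, and the synthetic pattern are all assumptions.

```python
import numpy as np

def orientation_from_pattern(intensity, qx, qy):
    """Major-axis angle (degrees, in [0, 180)) and anisotropy of a 2D pattern.

    Second moments are taken about the origin, assuming a pattern that is
    point-symmetric around the beam center (an assumption of this sketch).
    """
    w = intensity / intensity.sum()
    cxx = np.sum(w * qx * qx)
    cxy = np.sum(w * qx * qy)
    cyy = np.sum(w * qy * qy)
    cov = np.array([[cxx, cxy], [cxy, cyy]])
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    evals, evecs = np.linalg.eigh(cov)
    major = evecs[:, 1]
    angle = np.degrees(np.arctan2(major[1], major[0])) % 180.0
    anisotropy = (evals[1] - evals[0]) / (evals[1] + evals[0])
    return angle, anisotropy

# Synthetic pattern: a Gaussian streak elongated along 30 degrees.
theta = np.radians(30.0)
qx, qy = np.meshgrid(np.linspace(-1.0, 1.0, 201), np.linspace(-1.0, 1.0, 201))
u = qx * np.cos(theta) + qy * np.sin(theta)    # coordinate along the streak
v = -qx * np.sin(theta) + qy * np.cos(theta)   # coordinate across the streak
pattern = np.exp(-u**2 / (2 * 0.3**2) - v**2 / (2 * 0.1**2))

angle, anisotropy = orientation_from_pattern(pattern, qx, qy)
```

    For this synthetic streak the recovered angle is close to 30 degrees. In a scanning experiment the same computation would be repeated per scan point to build an orientation field, and note that the streak in reciprocal space is typically perpendicular to the real-space fiber axis.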
