60 research outputs found

    GBC: a parallel toolkit based on highly addressable byte-encoding blocks for extremely large-scale genotypes of species

    Whole-genome sequencing projects covering millions of subjects produce enormous genotype datasets, entailing a heavy memory burden and long computation times. Here, we present GBC, a toolkit for rapidly compressing large-scale genotypes into highly addressable byte-encoding blocks under an optimized parallel framework. We demonstrate that GBC is up to 1000 times faster than state-of-the-art methods at accessing and managing compressed large-scale genotypes while maintaining a competitive compression ratio. We also show that conventional analysis would be substantially sped up if built on GBC to access genotypes of a large population. GBC's data structure and algorithms are valuable for accelerating large-scale genomic research.

    Emergent electric field control of phase transformation in oxide superlattices.

    Electric fields can transform materials with respect to their structure and properties, enabling various applications ranging from batteries to spintronics. Recently, electrolytic gating, which can generate large electric fields and voltage-driven ion transfer, has been identified as a powerful means to achieve electric-field-controlled phase transformations. The class of transition metal oxides provides many potential candidates that present a strong response under electrolytic gating. However, very few show a reversible structural transformation at room temperature. Here, we report the realization of a digitally synthesized transition metal oxide that shows a reversible, electric-field-controlled transformation between distinct crystalline phases at room temperature. In superlattices composed of alternating one-unit-cell layers of SrIrO3 and La0.2Sr0.8MnO3, we find a reversible phase transformation with a 7% lattice change and dramatic modulation in chemical, electronic, magnetic and optical properties, mediated by the reversible transfer of oxygen and hydrogen ions. Strikingly, this phase transformation is absent in the constituent oxides, solid solutions and larger-period superlattices. Our findings open up this class of materials for voltage-controlled functionality.

    XCloud-VIP: Virtual Peak Enables Highly Accelerated NMR Spectroscopy and Faithful Quantitative Measures

    Background: Nuclear Magnetic Resonance (NMR) spectroscopy is an important bio-engineering tool for determining metabolite concentrations, molecular structures and so on. The data acquisition time, however, is very long in multi-dimensional NMR. Non-uniform sampling is an effective way to accelerate data acquisition, but it may suffer severe spectral distortions and unfaithful quantitative measures when the acceleration factor is high. Objective: To reconstruct high-fidelity spectra from highly accelerated NMR and achieve much better quantitative measures. Methods: A virtual peak (VIP) approach is proposed to self-learn the prior spectral information, such as the central frequency and peak lineshape, and then feed this information into the reconstruction. The proposed method is further implemented with cloud computing to facilitate online, open, and easy access. Results: Results on synthetic and experimental data demonstrate that, compared with the state-of-the-art method, the new approach provides much better reconstruction of low-intensity peaks and significantly improves the quantitative measures, including the regression of peak intensity, the distances between nuclear pairs, and the concentrations of metabolites in mixtures. Conclusion: Self-learning prior peak information can improve the reconstruction and quantitative measures of spectra. Significance: This approach enables highly accelerated NMR and may promote time-consuming applications such as quantitative and time-resolved NMR experiments.

    Crystal structure of the N domain of Lon protease from Mycobacterium avium complex.

    Lon protease is evolutionarily conserved in prokaryotes and eukaryotic organelles. The primary function of Lon is to selectively degrade abnormal and certain regulatory proteins to maintain homeostasis in vivo. Lon mainly consists of three functional domains, and the N-terminal domain is required for substrate selection and recognition. However, the precise contribution of the N-terminal domain remains elusive. Here, we determined the crystal structure of the N-terminal 192-residue construct of Lon protease from Mycobacterium avium complex at 2.4 Å resolution, and measured NMR-relaxation parameters of the backbone. This structure consists of two subdomains, the β-strand-rich N-terminal subdomain and the five-helix bundle of the C-terminal subdomain, connected by a flexible linker, and is similar to the overall structure of the N domain of Escherichia coli Lon even though their sequence identity is only 26%. The obtained NMR-relaxation parameters reveal two stabilized loops involved in the structural packing of the compact N domain and the formation of a turn structure. The homology comparison suggests that structural and sequence variations in the N domain may be closely related to the substrate selectivity of Lon variants. Our results provide a structural and dynamic characterization of a new Lon N domain, and will help to define the precise contribution of the Lon N-terminal domain to substrate recognition.

    The effect of turbulent intermittency on the deflagration to detonation transition in SN Ia explosions

    We examine the effects of turbulent intermittency on the deflagration to detonation transition (DDT) in Type Ia supernovae. The Zel'dovich mechanism for DDT requires the formation of a nearly isothermal region of mixed ash and fuel that is larger than a critical size. We primarily consider the hypothesis by Khokhlov et al. and Niemeyer and Woosley that the nearly isothermal, mixed region is produced when the flame makes the transition to the distributed regime. We use two models for the distribution of the turbulent velocity fluctuations to estimate the probability as a function of the density in the exploding white dwarf that a given region of critical size is in the distributed regime due to strong local turbulent stretching of the flame structure. We also estimate lower limits on the number of such regions as a function of density. We find that the distributed regime, and hence perhaps DDT, occurs in a local region of critical size at a density at least a factor of 2-3 larger than predicted for mean conditions that neglect intermittency. This factor brings the transition density to be much larger than the empirical value from observations in most situations. We also consider the intermittency effect on the more stringent conditions for DDT by Lisewski et al. and Woosley. We find that a turbulent velocity of $10^8$ cm/s in a region of size $10^6$ cm, required by Lisewski et al., is rare. We expect that intermittency gives a weaker effect on the Woosley model with stronger criterion. The predicted transition density from this criterion remains below $10^7$ g/cm$^3$ after accounting for intermittency using our intermittency models. Comment: 31 pages, accepted for publication in Ap

    Genetic variability and population divergence of Rhododendron platypodum Diels in China in the context of conservation

    Genetic diversity in endangered species is of special significance in the face of escalating global climate change and alarming biodiversity declines. Rhododendron platypodum Diels, an endangered species endemic to China, is distinguished by its restricted geographical range. This study aimed to explore genetic diversity and differentiation among its populations, gathering samples from all four distribution sites: Jinfo Mountain (JFM), Zhaoyun Mountain (ZYM), Baima Mountain (BMM), and Mao’er Mountain (MEM). We employed 18 pairs of Simple Sequence Repeat (SSR) primers to ascertain the genetic diversity and structural characteristics of these samples and further utilized 19 phenotypic data points to corroborate the differentiation observed among the populations. These primers detected 52 alleles, with the average number of observed alleles (Na) being 2.89, the average number of effective alleles (Ne) being 2.12, the average observed heterozygosity (Ho) being 0.57, and the expected heterozygosity (He) being 0.50. This array of data demonstrates the efficacy of the primers in reflecting R. platypodum’s genetic diversity. SSR-based genetic analysis of the populations yielded Ho, He, and Shannon index (I) values ranging from 0.47 to 0.65, 0.36 to 0.46, and 0.53 to 0.69, respectively. Notably, the ZYM population emerged as the most genetically diverse. Further analysis, incorporating molecular variance, principal component analysis, UPGMA cluster analysis, and structure analysis, highlighted significant genetic differentiation between the Chongqing (BMM, JFM, ZYM) and Guangxi (MEM) populations. Morphological data analysis corroborated these findings. Additionally, marked genetic and morphological distinctions were evident among the three Chongqing populations (BMM, JFM, and ZYM). This suggests that, despite the observed regional differentiation, R. platypodum’s overall genetic diversity is relatively constrained compared to other species within the Rhododendron genus. 
    Consequently, R. platypodum conservation hinges critically on preserving its genetic diversity and protecting its distinct populations.
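    The diversity indices quoted in this abstract (Ne, Ho, He, I) have standard textbook definitions, and a minimal sketch of them for a single locus may help when reading the reported values. The abstract does not name the software used, so the uncorrected formulas below are an assumption, and the function names and inputs are hypothetical.

```python
import math


def diversity_stats(allele_freqs):
    """Return (Ne, He, I) for one locus from allele frequencies that sum to 1.

    Assumed textbook definitions (not confirmed by the abstract):
      Ne = 1 / sum(p_i^2)         effective number of alleles
      He = 1 - sum(p_i^2)         expected heterozygosity (no sample-size correction)
      I  = -sum(p_i * ln(p_i))    Shannon information index
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    ne = 1.0 / homozygosity
    he = 1.0 - homozygosity
    shannon = -sum(p * math.log(p) for p in allele_freqs if p > 0)
    return ne, he, shannon


def observed_heterozygosity(genotypes):
    """Ho: fraction of diploid individuals carrying two different alleles."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)
```

    Population-level figures such as those in the abstract would then be averages of these per-locus values over all SSR loci.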

    OCRDetector: Accurately Detecting Open Chromatin Regions via Plasma Cell-Free DNA Sequencing Data

    Open chromatin regions (OCRs) are special regions of the human genome that can be accessed by DNA regulatory elements. Several studies have reported that a series of OCRs are associated with mechanisms involved in human diseases, such as cancers. Identifying OCRs using ATAC-seq or DNase-seq is often expensive. It has become popular to detect OCRs from plasma cell-free DNA (cfDNA) sequencing data, because both the fragmentation modes of cfDNA and the sequencing coverage in OCRs are significantly different from those in other regions. However, it is a challenging computational problem to accurately detect OCRs from plasma cfDNA-seq data, as multiple factors (e.g., sequencing and mapping bias, insufficient read depth) often mislead the computational model. In this paper, we propose a novel bioinformatics pipeline, OCRDetector, for detecting OCRs from whole-genome cfDNA sequencing data. The pipeline calculates the window protection score (WPS) waveform and the cfDNA sequencing coverage. To validate the proposed pipeline, we compared the percentage overlap of our OCRs with those obtained by other methods. The experimental results show that 81% of the TSS regions of housekeeping genes are detected, and our results have obvious tissue specificity. In addition, the overlap percentage between our OCRs and the high-confidence OCRs obtained by ATAC-seq or DNase-seq is greater than 70%.
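    For readers unfamiliar with the window protection score mentioned above, the sketch below shows the commonly used definition: at each position, the number of fragments fully spanning a window centred there, minus the number of fragments with an endpoint inside that window. This is not OCRDetector's own code; the 120 bp default window, the inclusive coordinate convention, and the function name are assumptions made for illustration.

```python
def window_protection_score(fragments, position, window=120):
    """WPS at `position` for fragments given as (start, end) pairs, inclusive.

    Counts fragments that fully span the window centred on `position`
    and subtracts fragments that start or end inside that window.
    """
    half = window // 2
    win_start, win_end = position - half, position + half
    spanning = 0
    ending_inside = 0
    for start, end in fragments:
        if start <= win_start and end >= win_end:
            # Fragment covers the whole window: evidence of protection.
            spanning += 1
        elif win_start <= start <= win_end or win_start <= end <= win_end:
            # Fragment has an endpoint inside the window: evidence of cleavage.
            ending_inside += 1
    return spanning - ending_inside


def wps_waveform(fragments, region_start, region_end, window=120):
    """WPS evaluated at every position of an inclusive region."""
    return [window_protection_score(fragments, pos, window)
            for pos in range(region_start, region_end + 1)]
```

    Under this definition, low or negative scores mark positions where fragment endpoints cluster, which is the kind of signal the abstract associates with open chromatin.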