23 research outputs found

    A Multimodal Dataset of 21,412 Recorded Nights for Sleep and Respiratory Research

    This study introduces a novel, rich dataset obtained from home sleep apnea tests using the FDA-approved WatchPAT-300 device, collected from 7,077 participants over 21,412 nights. The dataset comprises three levels of sleep data: raw multi-channel time-series from sensors, annotated sleep events, and computed summary statistics, which include 447 features related to sleep architecture, sleep apnea, and heart rate variability (HRV). We present reference values for Apnea/Hypopnea Index (AHI), sleep efficiency, Wake After Sleep Onset (WASO), and HRV sample entropy, stratified by age and sex. Moreover, we demonstrate that the dataset improves the predictive capability for various health-related traits, including body composition, bone density, blood sugar levels, and cardiovascular health. These results illustrate the dataset's potential to advance sleep research, personalized healthcare, and machine learning applications in biomedicine.
    Comment: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 14 pages

    Improving 3D Genome Reconstructions Using Orthologous and Functional Constraints

    The study of the 3D architecture of chromosomes has been advancing rapidly in recent years. While a number of methods for 3D reconstruction of genomic models based on Hi-C data were proposed, most of the analyses in the field have been performed on different 3D representation forms (such as graphs). Here, we reproduce most of the previous results on the 3D genomic organization of the eukaryote Saccharomyces cerevisiae using analysis of 3D reconstructions. We show that many of these results can be reproduced in sparse reconstructions, generated from a small fraction of the experimental data (5% of the data), and study the properties of such models. Finally, we propose a novel approach for improving the accuracy of 3D reconstructions by introducing additional predicted physical interactions to the model, based on orthologous interactions in an evolutionarily related organism and on predicted functional interactions between genes. We demonstrate that this approach indeed leads to the reconstruction of improved models.

    Additional file 1: of Estimation of ribosome profiling performance and reproducibility at various levels of resolution

    Comparison of mapped RP profiles in this study with previously published ones. (A) Scatter plot for all yeast genes, where the x-axis represents the RPKM of a gene in the profiles generated in this study from a replicate of the McManus-2014 dataset (GSM1259974), while the y-axis represents the RPKM of a gene in the profiles published by the authors as bedGraph files in sacCer3 strand-specific genomic coordinates. Since the bedGraph profiles were smoothed by the authors by assigning values to all bases covered by the aligned ribosome protected fragment, we performed similar smoothing to our profiles using a 30 nt window. Spearman’s rho, p-value and the number of points are denoted above the plot. (B) Histogram of the position-specific correlations for yeast genes between the mapped profiles in this study and the ones provided by McManus et al. (median correlation r = 0.90). (C) Same as (A), for the Bazzini-2012 dataset based on smoothed profiles provided by the authors in GSM854439 in zv9 genomic coordinates (not strand-specific). (D) Same as (C), for the Bazzini-2012 dataset (median correlation r = 0.75). (PNG 839 kb)
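The 30 nt smoothing mentioned in the caption above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes each read count is stored at the first base of its footprint and that the footprint covers the 30 bases starting there. The name `smooth_profile` and its parameters are hypothetical.

```python
def smooth_profile(counts, window=30):
    """Spread each position's read count over the `window` bases it is assumed to cover.

    Assumption (not from the source): counts[i] is the number of footprints
    whose first base is position i, and each footprint spans `window` bases.
    """
    n = len(counts)
    # Difference array: +c where coverage starts, -c just past where it ends.
    diff = [0] * (n + 1)
    for start, c in enumerate(counts):
        if c:
            end = min(start + window, n)  # clip coverage at the end of the profile
            diff[start] += c
            diff[end] -= c
    # A running sum of the difference array gives per-base coverage.
    smoothed = []
    running = 0
    for i in range(n):
        running += diff[i]
        smoothed.append(running)
    return smoothed
```

The difference-array form keeps the cost linear in the profile length regardless of the window size, which matters when the same smoothing is applied to every gene.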

    Sparse 3D reconstructions.

    The figure contains boxplots of the benchmark test results for each of the 5 model categories (H1-4, R1). Twenty models were generated for each type. Random models were generated by shuffling the coordinates of the original S. cerevisiae Hi-C distance map. Results that are distributed significantly above or below the random reference models (their median marked with a line) according to a one-tailed Wilcoxon rank-sum test are denoted with one or more stars according to their significance level (one star for p < 0.05, two for p < 0.01, three for p < 0.001). We observe that on most tests all Hi-C model types obtain similar results, with some tests showing a gradual increase in signal gain with the number of Hi-C interactions. (A) Optimization objective function of the reconstructed solution, normalized with respect to random models with similar properties. (B) Average Spearman’s correlation between the pairwise distances in each model (9.1 × 10^5 points) and the other reconstructions generated in its category. (C) Centromere co-localization, measured in normalized set distance (NSD), expected to be lower/greater than 1 for co-localized/dispersed sets, respectively. (D) Telomere radius from the center of the nucleus. (E) Ratio of the average cis (intra-chromosomal) distances between chromosome arms and trans (inter-chromosomal) distances. (F)-(L) Co-localization results for various sets of functional loci. Where the set comprises several co-localized subsets (such as each GO term, tRNA clusters 1 and 2, etc.), the result presented is the mean of the subsets’ mean distances. (M) Spearman's correlation between pairwise distances of genes and their coefficient of correlation of expression (n = 2,000 bins). (N) Spearman's correlation between pairwise distances of genes and their distances on a protein-protein interaction (PPI) graph (n = 2,000 bins). (O) Spearman's correlation between pairwise distances of genes and their distances according to the codon usage frequency similarity (CUFS) (n = 2,000 bins).
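The star notation in the caption above (a one-tailed rank-sum test followed by thresholds at 0.05, 0.01 and 0.001) can be sketched as follows. This is an illustrative sketch only: it uses the normal approximation to the rank-sum statistic without a tie correction, which is an assumption and not necessarily the procedure the authors ran. All function names are hypothetical.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_greater(x, y):
    """One-tailed p-value for 'x tends to be larger than y'.

    Assumption: normal approximation to the rank-sum statistic,
    with no correction for ties.
    """
    n1, n2 = len(x), len(y)
    ranks = rank_with_ties(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2            # expected rank sum under the null
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability


def stars(p):
    """Map a p-value to the star notation used in the caption."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""
```

For the opposite direction (results significantly below the reference), the same function can be called with the two samples swapped.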

    3D reconstruction schemes.

    The figure depicts the 4 reconstruction schemes employed in the study. (A) Sparse Hi-C models were generated by converting Hi-C contact maps to spatial nanometric distances and uniformly sampling from this map according to the desired sparseness. The non-linear program solved by Duan et al. [8] was utilized to generate the 3D model. Finally, the quality of the resultant models was assessed in a series of tests based on previously published results on yeast genomic organization. (B) Orthologous-integrated models were generated by converting the S. pombe (SP) Hi-C contact map to spatial nanometric distances, projecting it on the S. cerevisiae (SC) genome through orthologous genes and normalizing it according to SC distances. Integration into the Hi-C based map of distances was done by sampling non-overlapping (unknown) distances. (C) Gene interaction-integrated models were generated by utilizing the codon usage frequency similarity (CUFS) to predict the functional distance between genes. Distances were normalized according to S. cerevisiae distances, and non-overlapping distances were sampled and integrated into the model. (D) Random models were generated by shuffling the coordinates of the Hi-C distance map and integrating them into the model in the same manner.
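The uniform sampling step in scheme (A) can be sketched as below. This is a minimal sketch under assumptions: the distance map is represented as a dictionary from locus pairs to distances, and the sparseness is expressed as a fraction of pairs to keep. The data layout, the function name `sample_sparse_distances`, and the `seed` parameter are all hypothetical and not taken from the study.

```python
import random


def sample_sparse_distances(distances, fraction, seed=0):
    """Keep a uniformly sampled subset of the pairwise distances.

    distances: dict mapping (i, j) locus pairs to a spatial distance
               (an assumed representation of the distance map).
    fraction:  share of pairs to keep, e.g. 0.05 for a 5% model.
    """
    rng = random.Random(seed)          # seeded for reproducible models
    pairs = sorted(distances)          # fixed order so the seed fully determines the sample
    k = max(1, round(fraction * len(pairs)))
    kept = rng.sample(pairs, k)        # uniform sampling without replacement
    return {pair: distances[pair] for pair in kept}
```

Varying `seed` would give the independent replicate models for one sparseness level; the retained distances would then be passed to whatever solver performs the 3D reconstruction.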

    Orthologous-integrated 3D reconstructions.

    The figure contains benchmark test results for 2 types of S. cerevisiae 3D genomic models incorporating additional S. pombe Hi-C interactions (orto-Hi-C) and 2 types of models incorporating additional random interactions (random). Twenty models were generated for each type. Random models were generated by permuting the coordinates of the original S. cerevisiae Hi-C map (see Materials and Methods). Results in each panel compare the improved reconstructions with the baseline model, Hi-C-0.5%. Arrows denote the expected direction for an improved model (a stronger signal than the one appearing in Fig 2). We observe that in most tests the addition of orthologous interactions shows a significant improvement over the baseline model (marked by the horizontal line H1) and over models containing additional random interactions. Moreover, some models show stronger signals than 100% models (marked by the horizontal line H2). Results that are distributed significantly above or below the baseline according to a one-tailed Wilcoxon signed-rank test are denoted with one or more stars according to their significance level (one star for p < 0.05, two for p < 0.01, three for p < 0.001). (A) Optimization objective function of the reconstructed solution, normalized with respect to random models with similar properties. (B) Average Spearman’s correlation between the pairwise distances in each model (9.1 × 10^5 points) and the other reconstructions generated in its category. (C) Centromere co-localization, measured in normalized set distance (NSD), expected to be lower/greater than 1 for co-localized/dispersed sets, respectively. (D) Telomere radius from the center of the nucleus. (E) Ratio of the average cis (intra-chromosomal) distances between chromosome arms and trans (inter-chromosomal) distances. (F)-(L) Co-localization results for various sets of functional loci. Where the set comprises several co-localized subsets (such as each GO term, tRNA clusters 1 and 2, etc.), the result presented is the mean of the subsets’ mean distances. (M) Spearman's correlation between pairwise distances of genes and their coefficient of correlation of expression (n = 2,000 bins). (N) Spearman's correlation between pairwise distances of genes and their distances on a protein-protein interaction (PPI) graph (n = 2,000 bins). (O) Spearman's correlation between pairwise distances of genes and their protein abundance (PA) distances, which measure the similarity in expression levels (n = 2,000 bins).

    Minimal reproduction of the complete dataset.

    The bars in the figure denote the minimal amount of data required to reproduce the results obtained using the complete dataset in Fig 2. The minimal degree of sparseness is the smallest data fraction such that, for all models with equal or larger datasets, we were unable to reject the null hypothesis that the median of values was equal to that of the complete dataset (two-tailed Wilcoxon rank-sum test at the 0.01 significance level). In most cases (10 out of 15), 5% of the data sufficed to reproduce the result observed for 100%. Choosing a significance threshold of 0.05 affected only model similarity and Irr1p (minimal data: 100% and 50%, respectively).
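The selection rule for the minimal degree of sparseness can be written out as a short Python sketch. It assumes the two-tailed p-values against the complete dataset have already been computed for each data fraction. The function name `minimal_fraction` and the dictionary layout are hypothetical, and the p-values in the usage example are made-up illustrations, not results from the study.

```python
def minimal_fraction(p_values, alpha=0.01):
    """Smallest data fraction whose result, and that of every larger
    fraction, is not significantly different from the complete dataset.

    p_values: dict mapping a data fraction (e.g. 0.05) to the two-tailed
              p-value of its comparison with the complete dataset.
    """
    minimal = 1.0  # the complete dataset trivially reproduces itself
    # Walk from the largest fraction downwards and stop at the first rejection.
    for fraction in sorted(p_values, reverse=True):
        if p_values[fraction] < alpha:
            break
        minimal = fraction
    return minimal


# Illustrative (made-up) p-values for one benchmark test.
example = {0.005: 0.0001, 0.05: 0.3, 0.1: 0.5, 0.5: 0.8}
print(minimal_fraction(example))  # prints 0.05
```

Scanning downwards from the largest fraction enforces the "all equal or larger datasets" condition: a single rejection at an intermediate fraction blocks any smaller fraction from being reported, even if that smaller fraction happens not to be rejected.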

    The extent of ribosome queuing in budding yeast

    Ribosome queuing is a fundamental phenomenon suggested to be related to topics such as genome evolution, synthetic biology, gene expression regulation, intracellular biophysics, and more. However, this phenomenon has not yet been quantified at a genomic level. Moreover, methodologies for studying translation (e.g. ribosome footprints) are usually calibrated to capture only single ribosome protected footprints (mRPFs) and are thus limited in their ability to detect ribosome queuing. On the other hand, most of the models in the field assume and analyze a certain level of queuing. Here we present an experimental-computational approach for studying ribosome queuing based on sequencing of RNA footprints extracted from pairs of ribosomes (dRPFs) using a modified ribosome profiling protocol. We combine our approach with traditional ribosome profiling to generate a detailed profile of ribosome traffic. The data are analyzed using computational models of translation dynamics. The approach was implemented on the Saccharomyces cerevisiae transcriptome. Our data show that ribosome queuing is more frequent than previously thought: the measured ratio of ribosomes within dRPFs to mRPFs is 0.2–0.35, suggesting that at least one to five translating ribosomes is in a traffic jam; these queued ribosomes cannot be captured by traditional methods. We found that specific regions are enriched with queued ribosomes, such as the 5’-end of ORFs, and regions upstream of mRPF peaks, among others. While queuing is related to higher density of ribosomes on the transcript (characteristic of highly translated genes), we report cases where traffic jams are relatively more severe in lowly expressed genes and possibly even selected for. In addition, our analysis demonstrates that higher adaptation of the coding region to the intracellular tRNA levels is associated with lower queuing levels. Our analysis also suggests that the Saccharomyces cerevisiae transcriptome undergoes selection for eliminating traffic jams. Thus, our proposed approach is an essential tool for high resolution analysis of ribosome traffic during mRNA translation and understanding its evolution.

    Sucrose gradient fractions.

    (A) Quantification of ribosome protected footprints (RPFs) according to size. The plot shows the estimated components underlying the observed distribution, by means of Gaussian mixture. Each component reflects a set of footprints with a typical size, originating from a single ribosome (mono-RPF, mRPF), a pair of ribosomes (di-RPF, dRPF), etc. The ratio of dRPFs to mRPFs is reported in the caption. (B) Same for Ingolia 2009 data, profile reproduced from [23]. (C) Same for Guydosh 2014 data, profile reproduced from [33] (Fig 1 in the original paper). (D) Same for Shirokikh 2017 data, profile reproduced from [35] (Figure 5 in the original paper). See also S1 Fig.