122 research outputs found

    Genome-Wide Identification of Early-Firing Human Replication Origins by Optical Replication Mapping [preprint]

    The timing of DNA replication is largely regulated by the location and timing of replication origin firing. Therefore, much effort has been invested in identifying and analyzing human replication origins. However, the heterogeneous nature of eukaryotic replication kinetics and the low efficiency of individual origins in metazoans has made mapping the location and timing of replication initiation in human cells difficult. We have mapped early-firing origins in HeLa cells using Optical Replication Mapping, a high-throughput single-molecule approach based on Bionano Genomics genomic mapping technology. The single-molecule nature and 290-fold coverage of our dataset allowed us to identify origins that fire with as little as 1% efficiency. We find sites of human replication initiation in early S phase are not confined to well-defined efficient replication origins, but are instead distributed across broad initiation zones consisting of many inefficient origins. These early-firing initiation zones co-localize with initiation zones inferred from Okazaki-fragment-mapping analysis and are enriched in ORC1 binding sites. Although most early-firing origins fire in early-replication regions of the genome, a significant number fire in late-replicating regions, suggesting that the major difference between origins in early and late replicating regions is their probability of firing in early S-phase, as opposed to qualitative differences in their firing-time distributions. This observation is consistent with stochastic models of origin timing regulation, which explain the regulation of replication timing in yeast

    BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes

    The assembly of a reference genome sequence of bread wheat is challenging due to its specific features such as the genome size of 17 Gbp, polyploid nature and prevalence of repetitive sequences. BAC-by-BAC sequencing based on chromosomal physical maps, adopted by the International Wheat Genome Sequencing Consortium as the key strategy, reduces problems caused by the genome complexity and polyploidy, but the repeat content still hampers the sequence assembly. Availability of a high-resolution genomic map to guide sequence scaffolding and validate physical map and sequence assemblies would be highly beneficial to obtaining an accurate and complete genome sequence. Here, we chose the short arm of chromosome 7D (7DS) as a model to demonstrate for the first time that it is possible to couple chromosome flow sorting with genome mapping in nanochannel arrays and create a de novo genome map of a wheat chromosome. We constructed a high-resolution chromosome map composed of 371 contigs with an N50 of 1.3 Mb. Long DNA molecules achieved by our approach facilitated chromosome-scale analysis of repetitive sequences and revealed a ~800-kb array of tandem repeats intractable to current DNA sequencing technologies. Anchoring 7DS sequence assemblies obtained by clone-by-clone sequencing to the 7DS genome map provided a valuable tool to improve the BAC-contig physical map and validate sequence assembly on a chromosome-arm scale. Our results indicate that creating genome maps for the whole wheat genome in a chromosome-by-chromosome manner is feasible and that they will be an affordable tool to support the production of improved pseudomolecules
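
The N50 figure quoted above is a standard contiguity summary: the contig length at which contigs of that size or longer cover at least half of the total assembled length. A minimal sketch of the calculation follows, assuming contig lengths are available as a plain list of numbers; the function name `n50` and the example lengths are hypothetical and are not the 7DS data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in kb (illustrative only, not the 7DS map).
example_lengths_kb = [2000, 1500, 1300, 900, 400, 200]
print(n50(example_lengths_kb))
```

Sorting from longest to shortest and stopping at the first contig that pushes the running sum past half the total is the usual way to compute this statistic; it depends only on the length distribution, not on contig order along the chromosome.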

    Genome maps across 26 human populations reveal population-specific patterns of structural variation.

    Large structural variants (SVs) in the human genome are difficult to detect and study by conventional sequencing technologies. With long-range genome analysis platforms, such as optical mapping, one can identify large SVs (>2 kb) across the genome in one experiment. Analyzing optical genome maps of 154 individuals from the 26 populations sequenced in the 1000 Genomes Project, we find that phylogenetic population patterns of large SVs are similar to those of single nucleotide variations in 86% of the human genome, while ~2% of the genome has high structural complexity. We are able to characterize SVs in many intractable regions of the genome, including segmental duplications and subtelomeric, pericentromeric, and acrocentric areas. In addition, we discover ~60 Mb of non-redundant genome content missing in the reference genome sequence assembly. Our results highlight the need for a comprehensive set of alternate haplotypes from different populations to represent SV patterns in the genome

    Genome-Wide Mapping of Human DNA Replication by Optical Replication Mapping Supports a Stochastic Model of Eukaryotic Replication Timing [preprint]

    DNA replication is regulated by the location and timing of replication initiation. Therefore, much effort has been invested in identifying and analyzing the sites of human replication initiation. However, the heterogeneous nature of eukaryotic replication kinetics and the low efficiency of individual initiation site utilization in metazoans has made mapping the location and timing of replication initiation in human cells difficult. A potential solution to the problem of human replication mapping is single-molecule analysis. However, current approaches do not provide the throughput required for genome-wide experiments. To address this challenge, we have developed Optical Replication Mapping (ORM), a high-throughput single-molecule approach to map newly replicated DNA, and used it to map early initiation events in human cells. The single-molecule nature of our data, and a total of more than 2000-fold coverage of the human genome on 27 million fibers averaging ~300 kb in length, allow us to identify initiation sites and their firing probability with high confidence. In particular, for the first time, we are able to measure genome-wide the absolute efficiency of human replication initiation. We find that the distribution of human replication initiation is consistent with inefficient, stochastic initiation of heterogeneously distributed potential initiation complexes enriched in accessible chromatin. In particular, we find sites of human replication initiation are not confined to well-defined replication origins but are instead distributed across broad initiation zones consisting of many initiation sites. Furthermore, we find no correlation of initiation events between neighboring initiation zones. Although most early initiation events occur in early-replicating regions of the genome, a significant number occur in late-replicating regions. 
The fact that initiation sites in typically late-replicating regions have some probability of firing in early S phase suggests that the major difference between initiation events in early and late replicating regions is their intrinsic probability of firing, as opposed to a qualitative difference in their firing-time distributions. Moreover, modeling of replication kinetics demonstrates that measuring the efficiency of initiation-zone firing in early S phase suffices to predict the average firing time of such initiation zones throughout S phase, further suggesting that the differences between the firing times of early and late initiation zones are quantitative, rather than qualitative. These observations are consistent with stochastic models of initiation-timing regulation and suggest that stochastic regulation of replication kinetics is a fundamental feature of eukaryotic replication, conserved from yeast to humans

    Comparison of the CDC Backpack aspirator and the Prokopack aspirator for sampling indoor- and outdoor-resting mosquitoes in southern Tanzania.

    BACKGROUND: Resting mosquitoes can easily be collected using an aspirating device. The most commonly used mechanical aspirator is the CDC Backpack aspirator. Recently, a simple, low-cost aspirator called the Prokopack has been devised and proved to have comparable performance. The following study evaluates the Prokopack aspirator compared to the CDC Backpack aspirator when sampling resting mosquitoes in rural Tanzania. METHODS: Mosquitoes were sampled in- and outdoors of 48 typical rural African households using both aspirators. The aspirators were rotated between collectors and households in a randomized, Latin Square design. Outdoor collections were performed using artificial resting places (large barrel and car tyre), underneath the outdoor kitchen (kibanda) roof and from a drop-net. Data were analysed with generalized linear models. RESULTS: The numbers of mosquitoes collected using the CDC Backpack and the Prokopack aspirator were not significantly different either indoors or outdoors (indoors p = 0.735; large barrel p = 0.867; car tyre p = 0.418; kibanda p = 0.519). The Prokopack was superior for sampling of drop-nets due to its smaller size. The number of mosquitoes collected per technician was more consistent when using the Prokopack aspirator. The Prokopack was more user-friendly: technicians preferred it over the CDC Backpack aspirator as it weighs considerably less, retains its charge for longer and is easier to manoeuvre. CONCLUSIONS: The Prokopack proved in the field to be more advantageous than the CDC Backpack aspirator. It can be self-assembled using simple, low-cost and easily attainable materials. This device is a useful tool for researchers or vector-control surveillance programs operating in rural Africa, as it is far simpler and quicker than traditional means of sampling resting mosquitoes. Further longitudinal evaluations of the Prokopack aspirator versus the gold standard pyrethrum spray catch for indoor resting catches are recommended

    Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    BACKGROUND: Structural variants (SVs) are less common than single nucleotide polymorphisms and indels in the population, but collectively account for a significant fraction of genetic polymorphism and diseases. Base pair differences arising from SVs are on a much higher order (>100 fold) than point mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger than 1 kb. Excluding the 59 SVs (54 insertions/deletions, 5 inversions) that overlap with N-base gaps in the reference assembly hg19, 666 non-gap SVs remained, and 396 of them (60%) were verified by paired-end data from whole-genome sequencing-based re-sequencing or de novo assembly sequence from fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides valuable information for complex regions with haplotypes in a straightforward fashion. In addition, with long single-molecule labeling patterns, exogenous viral sequences were mapped on a whole-genome scale, and sample heterogeneity was analyzed at a new level. 
CONCLUSION: Our study highlights genome mapping technology as a comprehensive and cost-effective method for detecting structural variation and studying complex regions in the human genome, as well as deciphering viral integration into the host genome. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/2047-217X-3-34) contains supplementary material, which is available to authorized users
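
The counts reported in the RESULTS of this abstract can be cross-checked with a short tally. The sketch below uses only the numbers stated in the abstract; the function name `tally_svs` and the dictionary keys are hypothetical, and it assumes that the 213 database-supported SVs are all drawn from the 270 that were not verified by sequencing, as the abstract's wording implies.

```python
def tally_svs():
    """Recompute the SV counts quoted in the abstract from its stated inputs."""
    insertions_deletions = 708
    inversions = 17
    gap_overlapping = 59          # SVs overlapping N-base gaps in hg19
    verified_by_sequencing = 396  # paired-end or fosmid support
    known_in_dgv = 213            # among the SVs not verified by sequencing

    total = insertions_deletions + inversions
    non_gap = total - gap_overlapping
    remaining = non_gap - verified_by_sequencing
    supported = verified_by_sequencing + known_in_dgv
    return {
        "total": total,
        "non_gap": non_gap,
        "remaining": remaining,
        "supported": supported,
        "verified_fraction": verified_by_sequencing / non_gap,
        "supported_fraction": supported / non_gap,
    }


counts = tally_svs()
for name, value in counts.items():
    print(name, value)
```

The integer totals reproduce the 666 non-gap SVs, the 270 remaining SVs and the 609 supported SVs given in the abstract; the two fractions come out close to the 60% and 90% it quotes.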

    Identifier mapping performance for integrating transcriptomics and proteomics experimental results

    Background: Studies integrating transcriptomic data with proteomic data can illuminate the proteome more clearly than either separately. Integromic studies can deepen understanding of the dynamic complex regulatory relationship between the transcriptome and the proteome. Integrating these data dictates a reliable mapping between the identifier nomenclature resultant from the two high-throughput platforms. However, this kind of analysis is well known to be hampered by lack of standardization of identifier nomenclature among proteins, genes, and microarray probe sets. Therefore data integration may also play a role in critiquing the fallible gene identifications that both platforms emit. Results: We compared three freely available internet-based identifier mapping resources for mapping UniProt accessions (ACCs) to Affymetrix probesets identifications (IDs): DAVID, EnVision, and NetAffx. Liquid chromatography-tandem mass spectrometry analyses of 91 endometrial cancer and 7 noncancer samples generated 11,879 distinct ACCs. For each ACC, we compared the retrieval sets of probeset IDs from each mapping resource. We confirmed a high level of discrepancy among the mapping resources. On the same samples, mRNA expression was available. Therefore, to evaluate the quality of each ACC-to-probeset match, we calculated proteome-transcriptome correlations, and compared the resources presuming that better mapping of identifiers should generate a higher proportion of mapped pairs with strong inter-platform correlations. A mixture model for the correlations fitted well and supported regression analysis, providing a window into the performance of the mapping resources. The resources have added and dropped matches over two years, but their overall performance has not changed. Conclusions: The methods presented here serve to achieve concrete context-specific insight, to support well-informed decisions in choosing an ID mapping strategy for "omic" data merging

    When Does Diversity Trump Ability (and Vice Versa) in Group Decision Making? A Simulation Study

    It is often unclear which factor plays a more critical role in determining a group's performance: the diversity among members of the group or their individual abilities. In this study, we addressed this “diversity vs. ability” issue in a decision-making task. We conducted three simulation studies in which we manipulated agents' individual ability (or accuracy, in the context of our investigation) and group diversity by varying (1) the heuristics agents used to search task-relevant information (i.e., cues); (2) the size of their groups; (3) how much they had learned about a good cue search order; and (4) the magnitude of errors in the information they searched. In each study, we found that a manipulation reducing agents' individual accuracy simultaneously increased their group's diversity, leading to a conflict between the two. These conflicts enabled us to identify certain conditions under which diversity trumps individual accuracy, and vice versa. Specifically, we found that individual accuracy is more important in task environments in which cues differ greatly in the quality of their information, and diversity matters more when such differences are relatively small. Changing the size of a group and the amount of learning by an agent had a limited impact on this general effect of task environment. Furthermore, we found that a group achieves its highest accuracy when there is an intermediate amount of errors in the cue information, regardless of the environment and the heuristic used, an effect that we believe has not been previously reported and warrants further investigation