
Identifying Inverted Repeat Structure in DNA Sequences using Correlation Framework

Author: Gupta Ravi
Mittal Ankush
Singh Kuldip

Publication in the conference proceedings of EUSIPCO, Florence, Italy, 200

ZENODO

Multi-platform discovery of haplotype-resolved structural variation in human genomes

Author: Ding Li
Publication venue: Digital Commons@Becker
Publication date: 01/01/2019

Digital Commons@Becker

Retrotransposons are the major contributors to the expansion of the Drosophila ananassae Muller F element

Author: et al
Mardis Elaine R
Publication venue: Digital Commons@Becker
Publication date: 01/01/2017

Digital Commons@Becker

Recommended from our members

Motif-informed analysis of phenotype heterogeneity in cancer

Author: Xu Qi, Ph. D.
Publication date: 08/05/2024

The landscape of cancer genomics harbors a wealth of DNA motifs, whose thorough analysis and integration provide a pivotal method to decipher the complex molecular interactions underlying cancer. This dissertation delineates novel computational methodologies for robust DNA motif analysis and data integration, aiming to elucidate the implications of DNA motifs on cancer heterogeneity and clinical outcomes. Chapter 1 lays the groundwork by showing the significance of DNA motifs in the genomic framework and delineating the current biomarkers in cancer. It highlights the opportunity that DNA motif analysis presents in unveiling a nuanced understanding of genomic interactions. It also states the motivations and specific aims for both DNA motif quantification and co-localization analysis. In Chapter 2, a foundational marker for quantifying the prevalence of DNA repetitive motifs, termed “Non-B DNA Burden”, is introduced. A user-centric platform is also developed to facilitate the efficient computation and visualization of this metric across various genomic scales. Together, they offer a novel perspective for analyzing DNA motif heterogeneity. Chapter 3 shifts the focus toward an integrated marker approach. By integrating the prevalence analysis of DNA motifs with the frequency of co-localized mutations, the novel markers mlTNB (mutation-localized total non-B burden) and nbTMB (non-B informed tumor mutation burden) are proposed. Their potential in predicting cancer prognosis and treatment responses is specifically explored. Chapter 4 broadens the analytical foundation by defining MoCoLo (Motif Co-Localization), a robust statistical framework for testing multi-modal DNA motif co-localization. Through this framework, we are able to explore the complex interplay of genomic features and provide a methodical approach to investigate their co-localization in a multi-modal data integration context. 
Case studies are employed to showcase the utility of MoCoLo in examining the co-localization of genomic features, thus facilitating the understanding of genomic interactions that are pivotal to cancer biology. Chapter 5 synthesizes the findings from the preceding explorations, outlining the contributions of the developed methodologies to the field of cancer genomics and bioinformatics. It demonstrates the potential impact of DNA motif analysis and data integration on understanding phenotype heterogeneity in cancer and shows the prospective avenues it provides for impactful future research. Overall, this work is structured to contribute to the bioinformatics community by weaving together innovative tools and analyses focused on DNA motif analysis and data integration. It strives to pave the way toward a deeper understanding of the cancer genome, thereby enhancing potential diagnostic and therapeutic strategies. Cellular and Molecular Biolog

Texas ScholarWorks

Efficient Algorithms for Prokaryotic Whole Genome Assembly and Finishing

Author: Biswas Abhishek
Publication venue: ODU Digital Commons
Publication date: 01/10/2015

De novo genome assembly from DNA fragments is primarily based on sequence overlap information. In addition, mate-pair reads or paired-end reads provide linking information for joining gaps and bridging repeat regions. Genome assemblers in general assemble long contiguous sequences (contigs) using both overlapping reads and linked reads until the assembly runs into an ambiguous repeat region. These contigs are further bridged into scaffolds using linked read information. However, errors can be made in both phases of assembly due to a high error threshold for overlap acceptance and linking based on too few mate reads. Identical as well as similar repeat regions can often cause errors in overlap and mate-pair evidence. In addition, the problem of setting the correct threshold to minimize errors and optimize assembly of reads is not trivial and often requires a time-consuming trial-and-error process to obtain optimal results. The typical trial-and-error process with multiple assemblers can be computationally intensive and is very inefficient, especially when users must learn how to use a wide variety of assemblers, many of which may be serial, require long execution times, and may not return usable or accurate results. Further, we show that the comparison of assembly results may not provide the users with a clear winner under all circumstances. Therefore, we propose a novel scaffolding tool, Correlative Algorithm for Repeat Placement (CARP), capable of joining short, low-error contigs using mate-pair reads, computationally resolved repeat structures and synteny with one or more reference organisms. The CARP tool requires a set of repeat sequences, such as insertion sequences (IS), that can be found computationally without assembling the genome. Development of methods to identify such repeating regions directly from raw sequence reads or draft genomes led to the development of the ISQuest software package. 
ISQuest identifies bacterial ISs and their sequence elements (inverted and direct repeats) in raw read data or contigs using flexible search parameters. ISQuest is capable of finding ISs in hundreds of partially assembled genomes within hours, making it a valuable high-throughput tool for a global search of IS and repeat elements. The CARP tool matches very low-error contigs with strong overlap using the ambiguous partial repeat sequence at the ends of the contig, annotated using the repeat sequences discovered by ISQuest. These matches are verified by synteny with genomes of one or more reference organisms. We show that the CARP tool can be used to verify regions with low mate-pair evidence, independently find new joins and significantly reduce the number of scaffolds. Finally, we demonstrate a novel viewer that presents to the user the computationally derived joins along with the evidence used to make the joins. The viewer allows the user to independently assess their confidence in the joins made by the finishing tools and make an informed decision on whether to invest the resources necessary to confirm a particular portion of the assembly. Further, we allow users to manually record join evidence, re-order contigs, and track the assembly finishing process.

Old Dominion University

Comparative chloroplast genomics and phylogenetics of Fagopyrum esculentum ssp. ancestrale – A wild ancestor of cultivated buckwheat

Author: Dhingra Amit
Logacheva Maria D
Penin Aleksey A
Samigullin Tahir H
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Chloroplast genome sequences are extremely informative about species interrelationships owing to their non-meiotic and often uniparental inheritance over generations. The subject of our study, Fagopyrum esculentum, is a member of the family Polygonaceae belonging to the order Caryophyllales. An uncertainty remains regarding the affinity of Caryophyllales and the asterids that could be due to undersampling of the taxa. With that background, having access to the complete chloroplast genome sequence for Fagopyrum becomes quite pertinent. Results: We report the complete chloroplast genome sequence of a wild ancestor of cultivated buckwheat, Fagopyrum esculentum ssp. ancestrale. The sequence was rapidly determined using a previously described approach that utilized a PCR-based method and employed universal primers, designed on the scaffold of multiple sequence alignment of chloroplast genomes. The gene content and order in the buckwheat chloroplast genome are similar to those of Spinacia oleracea. However, some unique structural differences exist: the presence of an intron in the rpl2 gene, a frameshift mutation in the rpl23 gene and extension of the inverted repeat region to include the ycf1 gene. Phylogenetic analysis of 61 protein-coding gene sequences from 44 complete plastid genomes provided strong support for the sister relationship of Caryophyllales (including Polygonaceae) to asterids. Further, our analysis also provided support for Amborella as sister to all other angiosperms, but interestingly, in the Bayesian phylogeny inference based on the first two codon positions, Amborella united with Nymphaeales. Conclusion: Comparative genomics analyses revealed that the Fagopyrum chloroplast genome harbors the characteristic gene content and organization described for several other chloroplast genomes. 
However, it has some unique structural features distinct from previously reported complete chloroplast genome sequences. Phylogenetic analysis of the dataset, including this new sequence from non-core Caryophyllales, supports the sister relationship between Caryophyllales and asterids.

Crossref

Directory of Open Access Journals

PubMed Central

Recurrent inversion polymorphisms in humans associate with genetic instability and genomic disorders.

Author: Antonacci Francesca
Ashraf Hufsah
Audano Peter A
Beck Christine R
Benito Eva
Ebert Peter
Ebler Jana
Eichler Evan E
Gordon David S
Hallast Pille
Harvey William T
Hasenfeld Patrick
Henning Barbara
Hsieh PingHsun
Höps Wolfram
Korbel Jan O
Lee Charles
Maria Maggiolini Flavia Angela
Marschall Tobias
Porubsky David
Rodriguez-Martin Bernardo
Sanders Ashley D
Steinrücken Matthias
Yilmaz Feyza
Zhu Qihui
Publication venue: The Mouseion at the JAXlibrary
Publication date: 01/01/2022

Unlike copy number variants (CNVs), inversions remain an underexplored genetic variation class. By integrating multiple genomic technologies, we discover 729 inversions in 41 human genomes. Approximately 85% of inversions [...] retrotransposition; 80% of the larger inversions are balanced and affect twice as many nucleotides as CNVs. Balanced inversions show an excess of common variants, and 72% are flanked by segmental duplications (SDs) or retrotransposons. Since flanking repeats promote non-allelic homologous recombination, we developed complementary approaches to identify recurrent inversion formation. We describe 40 recurrent inversions encompassing 0.6% of the genome, showing inversion rates up to 2.7 × 1

The Jackson Laboratory: The Mouseion at the JAXlibrary

Archivio istituzionale della ricerca - Università di Bari

MDC Repository