
1,043 research outputs found


Inference of single-cell phylogenies from lineage tracing data using Cassiopeia.

Author: Chan Michelle M
Hussmann Jeffrey A
Jones Matthew G
Khodaverdian Alex
Quinn Jeffrey J
Wang Robert
Weissman Jonathan S
Xu Chenling
Yosef Nir
Publication venue: eScholarship, University of California
Publication date: 01/04/2020

The pairing of CRISPR/Cas9-based gene editing with massively parallel single-cell readouts now enables large-scale lineage tracing. However, the rapid growth in complexity of data from these assays has outpaced our ability to accurately infer phylogenetic relationships. First, we introduce Cassiopeia, a suite of scalable maximum parsimony approaches for tree reconstruction. Second, we provide a simulation framework for evaluating algorithms and exploring lineage tracer design principles. Finally, we generate the most complex experimental lineage tracing dataset to date, 34,557 human cells continuously traced over 15 generations, and use it for benchmarking phylogenetic inference approaches. We show that Cassiopeia outperforms traditional methods by several metrics and under a wide variety of parameter regimes, and provide insight into the principles for the design of improved Cas9-enabled recorders. Together, these should broadly enable large-scale mammalian lineage tracing efforts. Cassiopeia and its benchmarking resources are publicly available at www.github.com/YosefLab/Cassiopeia.
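
To make the maximum parsimony criterion mentioned in this abstract concrete, the following is a minimal sketch of the textbook Fitch small-parsimony count for a single character on a fixed binary tree. It is not Cassiopeia's algorithm or API; the function name `fitch_score`, the nested-tuple tree encoding, and the toy cell labels are assumptions made purely for illustration.

```python
def fitch_score(tree, states):
    """Return (candidate_states, mutation_count) for one character.

    tree   -- a leaf name (str) or a 2-tuple of subtrees (hypothetical encoding)
    states -- dict mapping each leaf name to its observed state
    """
    if isinstance(tree, str):
        # A leaf contributes its observed state and no mutations.
        return {states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra mutation needed.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one mutation.
    return left_set | right_set, left_cost + right_cost + 1


# Toy data: four cells, one edited site (1 = edited, 0 = unedited).
states = {"c1": 1, "c2": 1, "c3": 0, "c4": 0}
tree_a = (("c1", "c2"), ("c3", "c4"))  # groups cells that share the edit
tree_b = (("c1", "c3"), ("c2", "c4"))  # splits cells that share the edit

score_a = fitch_score(tree_a, states)[1]
score_b = fitch_score(tree_b, states)[1]
print(score_a, score_b)
```

Under the parsimony criterion the tree needing fewer mutations is preferred, so `tree_a` would be chosen over `tree_b` in this toy case.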


Quantitative evaluation of chromosomal rearrangements in gene-edited human stem cells by CAST-Seq

Author: Andrieux G
Blattner G
Boerries M
Cathomen T
Cornu T
El Gaz M
Klermund J
Monaco G
Mussolino C
Pennucci V
Poddar S
Turchiano G
Publication venue: CELL PRESS
Publication date: 03/06/2021

Genome editing has shown great promise for clinical translation but also revealed the risk of genotoxicity caused by off-target effects of programmable nucleases. Here we describe chromosomal aberrations analysis by single targeted linker-mediated PCR sequencing (CAST-Seq), a preclinical assay to identify and quantify chromosomal aberrations derived from on-target and off-target activities of CRISPR-Cas nucleases or transcription activator-like effector nucleases (TALENs), respectively, in human hematopoietic stem cells (HSCs). Depending on the employed designer nuclease, CAST-Seq detected translocations in 0%–0.5% of gene-edited human CD34+ HSCs, and up to 20% of on-target loci harbored gross rearrangements. Moreover, CAST-Seq detected distinct types of chromosomal aberrations, such as homology-mediated translocations, that are mediated by homologous recombination and not off-target activity. CAST-Seq is a sensitive assay able to identify and quantify unintended chromosomal rearrangements in addition to the more typical mutations at off-target sites. CAST-Seq analyses may be particularly relevant for therapeutic genome editing to enable thorough risk assessment before clinical application of gene-edited products.

Multispecies Genomic Sex Identification Using DDX3 Gene Polymorphisms

Author: Felts Jessica
Publication venue: DigitalCommons@USU
Publication date: 01/08/2023

PCR sex determination assays must be reliable and cost effective because they are used frequently and integrally in biological research and the animal production industry. The design and proof of a primer pair with a built-in control is therefore warranted, both to bypass the extra cost of a multiplex reaction and to prevent the anomalous results that have been documented with other primer pairs. The objective of this study was to design primer pairs with a built-in PCR amplification control to identify sex in Equus caballus (domestic horse), Homo sapiens (humans), Macaca mulatta (rhesus macaque), and Sus scrofa (domestic pig) DNA samples. The procedure was first to align the DDX3X gene with its Y chromosome homolog and then to create primer pairs that flank Y chromosome-specific indels within each species. PCR with gel electrophoresis confirmed the hypothesis that species-specific single primer pairs can accurately identify separate X and Y chromosome-specific sequences in Equus caballus, Homo sapiens, Macaca mulatta, and Sus scrofa genomic samples. Additionally, in silico and qualitative gel analyses were completed to assess the efficacy of the Equus caballus-specific primers in alternative species, which yielded no results. In summary, the results show that there are suitable indel regions for PCR amplification in Equus caballus, Homo sapiens, Macaca mulatta, and Sus scrofa. The results presented herein represent a meaningful contribution to the field: with this methodology, these indel regions can be identified via the designed primers and used in an efficient, quick, and cost-effective way to identify sex without the need for multiplexing or the risk of false identification.

Computational analysis of human genomic variants and lncRNAs from sequence data

Author: Wang Ning
Publication venue: University of Turku (Turun yliopisto)
Publication date: 01/06/2023

High-throughput sequencing technologies have been developed and applied to human genome studies for nearly 20 years. These technologies have provided numerous research applications and have significantly expanded our knowledge about the human genome. In this thesis, computational methods that utilize sequence data to study human genomic variants and transcripts were evaluated and developed. Indels, that is insertions and deletions, are two types of common genomic variants that are widespread in the human genome. Detecting indels in human genomes is the crucial step for diagnosing indel-related genomic disorders and may identify novel indel markers for studying certain diseases. Compared with previous techniques, high-throughput sequencing technologies, especially next-generation sequencing (NGS), enable accurate and efficient indel detection across wide ranges of the genome. In the first part of the thesis, tools with indel calling abilities are evaluated with an assortment of indels and different NGS settings. The results show that the selection of tools and NGS settings significantly affects indel detection, which provides suggestions for tool selection and future developments. In bioinformatics analysis, an indel's position can be marked inconsistently on the reference genome, which may result in an indel having different but equivalent representations and cause problems for downstream analysis. This problem is related to the complex sequence context of the indels, for example short tandem repeats (STRs), where the same short stretch of nucleotides is repeated in tandem. In the second part of the thesis, a novel computational tool, VarSCAT, is described, which has various functions for annotating the sequence context of variants, including ambiguous positions, STRs, and other sequence context features. Analysis of several high-confidence human variant sets with VarSCAT reveals that a large number of genomic variants, especially indels, have sequence features associated with STRs. In the human genome, not all genes and their transcripts are translated into proteins. Long non-coding ribonucleic acid (lncRNA) is a typical example. Sequence recognition methods built on machine learning models have improved significantly in recent years. In the last part of the thesis, several machine learning-based lncRNA prediction tools were evaluated on their predictions of the coding potential of transcripts. The results suggest that tools based on deep learning identify lncRNAs best.

Computational analysis of human genomic variants and lncRNAs from sequence data

High-throughput sequencing technologies have been developed and applied to human genome research for nearly 20 years. These technologies have enabled wide-ranging study of the human genome and have significantly increased our knowledge of it. In this dissertation, computational methods that utilize sequence data were evaluated and developed for studying human genomic variants and transcripts. Indel is a collective term for insertion variants and deletion variants, which occur throughout the genome. Identifying indels is crucial for diagnosing genetic abnormalities and for finding new indel markers associated with various diseases. Compared with earlier technologies, high-throughput sequencing technologies, especially next-generation sequencing (NGS), make it possible to detect indels more accurately and efficiently across wider genomic regions. In the first part of the dissertation, computational tools intended for indel calling were evaluated using a wide assortment of indels and different NGS settings. The results showed that the choice of tools and the NGS settings significantly affected indel detection, and they can thus guide tool selection and development. In bioinformatic analysis, the position of the same indel can be marked at different places on the reference genome, which can cause problems for downstream analysis, such as the evaluation of indel calls. This problem is related to sequence context, because variants can lie within short tandem repeats (STRs), where the same short nucleotide stretch has been duplicated. In the second part of the dissertation, the computational tool VarSCAT was developed, which has various functions for examining, among other things, ambiguous position information, flanking regions, and STR regions. Analysis of high-confidence human variant sets with VarSCAT revealed that the properties of many genetic variants, and especially indels, are associated with STR regions. Not all human genes and their gene products, such as long non-coding RNAs (lncRNAs), are translated into protein. Machine learning methods and sequence recognition have improved considerably in recent years. In the last part of the dissertation, the predictions of several machine learning-based lncRNA prediction tools were evaluated. The results suggest that tools based on deep learning identify lncRNAs best.
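
The ambiguity in indel positions that this abstract describes can be illustrated with a small sketch: inside a tandem repeat, deleting any one repeat unit yields the same sequence, so the same deletion can be reported at several coordinates. The code below is not VarSCAT and does not reflect its interface; the function names and the toy reference sequence are assumptions chosen only to show the common left-alignment idea.

```python
def apply_deletion(ref: str, pos: int, length: int) -> str:
    """Return `ref` with `length` bases removed starting at 0-based `pos`."""
    return ref[:pos] + ref[pos + length:]


def left_align_deletion(ref: str, pos: int, length: int) -> int:
    """Shift a deletion as far left as possible without changing the result.

    Moving the deleted window one base to the left leaves the resulting
    sequence unchanged exactly when the base entering the window on the
    left equals the base leaving it on the right.
    """
    while pos > 0 and ref[pos - 1] == ref[pos + length - 1]:
        pos -= 1
    return pos


# Toy reference containing a (CA)x4 short tandem repeat at positions 2..9.
ref = "GGCACACACATT"

# Two different reported positions for a 2-base deletion in the repeat.
alt_from_6 = apply_deletion(ref, 6, 2)
alt_from_2 = apply_deletion(ref, 2, 2)

# Both representations normalize to the same leftmost position.
norm_6 = left_align_deletion(ref, 6, 2)
norm_8 = left_align_deletion(ref, 8, 2)
print(alt_from_6 == alt_from_2, norm_6, norm_8)
```

Normalizing every call to one convention (here the leftmost position) is one way to make equivalent representations comparable before downstream evaluation.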

Integrative computational approaches to study protein-nucleic acid interactions

Author: Chakrabarti Anob Mauli
Publication venue: UCL (University College London)
Publication date: 28/02/2020

Interactions between proteins and nucleic acid molecules are central to cellular regulation and homeostasis. To study them, I employ a wide range of computational analysis methods to integrate genomic data from many types of experiment. This thesis has three parts. In the first part, I explore the patterns of indels created by CRISPR-Cas9 genome editing. By thorough characterisation of the precision of editing at thousands of genomic target sites, we identify simple sequence rules that can help predict these outcomes. Furthermore, we examine the role of the structural chromatin context in fine-tuning Cas9-DNA interactions. In the second part, I explore methods to study protein-RNA interactions. I use comparative computational analyses to assess both the data quality of, and data analysis methods for, different crosslinking and immunoprecipitation (CLIP) technologies. I then develop new methods to analyse data generated by hybrid individual-nucleotide resolution CLIP (hiCLIP). By tailoring computational solutions to an understanding of experimental conditions, I improve the overall sensitivity of hiCLIP, and ultimately feed back to drive ongoing experimental development. In the third part, I focus on the Staufen family of double-stranded RNA binding proteins and use hiCLIP data to define transcriptome-wide atlases of RNA duplexes bound by these proteins, both in a cell line and in rat brain tissue. Through integration with other data sets, both publicly available and newly generated, I derive insights into their function in RNA metabolism and into how these interactions change during the course of mammalian brain development, with putative roles in ribonucleoprotein complex formation. In summary, I present a range of tailored computational methods and analyses developed to understand interactions between proteins and nucleic acids, aiming to link these interactions to functional outcomes.

Haploinsufficiency for DNA methyltransferase 3A predisposes hematopoietic cells to myeloid malignancies

Author: Allegra A. Petti
Amanda M. Smith
Angela M. Verdoni
Catrina Fronick
Celia V. Bangert
Christopher A. Miller
Christopher B. Cole
David A. Russler-Germain
Frith
Gue Su Chang
Jeffery M. Klco
Jeong
Mindy Guo
Nichole M. Helton
Poitras
Robert Fulton
Shamika Ketkar
Shelly O’Laughlin
Timothy J. Ley
Walter
Xi
Zheng
Publication venue: Digital Commons@Becker
Publication date: 01/01/2017

IMPROVED ALLELE SPECIFIC EXPRESSION (ASE) DETECTION AND THE IMPLICATIONS FOR UNDERSTANDING REGULATORY AND DISEASE GENETICS

Author: Saukkonen Anna
Publication venue
Publication date: 01/05/2023

King's Research Portal