BATCH-GE : batch analysis of next-generation sequencing data for genome editing assessment
Targeted mutagenesis by the CRISPR/Cas9 system is currently revolutionizing genetics. The ease of this technique has enabled genome engineering in vitro and in a range of model organisms and has pushed experimental dimensions to unprecedented proportions. Due to its tremendous progress in terms of speed, read length, throughput and cost, Next-Generation Sequencing (NGS) has been increasingly used for the analysis of CRISPR/Cas9 genome editing experiments. However, the current tools for genome editing assessment lack flexibility and fall short in the analysis of large amounts of NGS data. Therefore, we designed BATCH-GE, an easy-to-use bioinformatics tool for batch analysis of NGS-generated genome editing data, available from https://github.com/WouterSteyaert/BATCH-GE.git. BATCH-GE detects and reports indel mutations and other precise genome editing events and calculates the corresponding mutagenesis efficiencies for a large number of samples in parallel. Furthermore, this new tool provides flexibility by allowing the user to adapt a number of input variables. The performance of BATCH-GE was evaluated in two genome editing experiments, aiming to generate knock-out and knock-in zebrafish mutants. This tool will not only contribute to the evaluation of CRISPR/Cas9-based experiments, but will be of use in any genome editing experiment and has the ability to analyze data from every organism with a sequenced genome.
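The abstract does not say how a mutagenesis efficiency is computed, so the following is only a minimal sketch of one common definition: the fraction of reads spanning a target window whose alignment carries an insertion or deletion inside that window. The function names, the `(pos, cigar)` input format, and the window convention are assumptions made for illustration; they are not taken from BATCH-GE.

```python
import re

# A CIGAR string is a run of <length><operation> pairs, e.g. "50M2D48M".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def has_indel_in_window(pos, cigar, window_start, window_end):
    """Return True if the alignment has an insertion or deletion touching
    the half-open reference window [window_start, window_end).

    pos is the 0-based reference coordinate of the first aligned base.
    """
    ref = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "D":
            # A deletion removes reference bases [ref, ref + length).
            if ref < window_end and ref + length > window_start:
                return True
            ref += length
        elif op == "I":
            # An insertion sits between reference bases ref - 1 and ref.
            if window_start <= ref <= window_end:
                return True
        elif op in "MN=X":
            ref += length
        # S, H and P do not consume reference bases.
    return False


def mutagenesis_efficiency(alignments, window_start, window_end):
    """Fraction of window-spanning reads that carry an indel in the window.

    alignments is an iterable of (pos, cigar) tuples (hypothetical format).
    Returns 0.0 when no read spans the window.
    """
    covering = 0
    edited = 0
    for pos, cigar in alignments:
        ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")
        if pos <= window_start and pos + ref_len >= window_end:
            covering += 1
            if has_indel_in_window(pos, cigar, window_start, window_end):
                edited += 1
    return edited / covering if covering else 0.0
```

Reads that do not span the whole window are excluded from the denominator here; a real tool may instead filter on mapping quality or allow partial overlap, which would change the reported efficiency.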
Comparative genome analysis of Wolbachia strain wAu
BACKGROUND:
Wolbachia intracellular bacteria can manipulate the reproduction of their arthropod hosts, including inducing sterility between populations known as cytoplasmic incompatibility (CI). Certain strains have been identified that are unable to induce or rescue CI, including wAu from Drosophila. Genome sequencing and comparison with CI-inducing related strain wMel was undertaken in order to better understand the molecular basis of the phenotype.
RESULTS:
Although the genomes were broadly similar, several rearrangements were identified, particularly in the prophage regions. Many orthologous genes contained single nucleotide polymorphisms (SNPs) between the two strains, but a subset containing major differences that would likely cause inactivation in wAu was identified, including the absence of the wMel ortholog of a gene recently identified as a CI candidate in a proteomic study. The comparative analyses also focused on a family of transcriptional regulator genes implicated in CI in previous work, and revealed numerous differences between the strains, including those that would have major effects on predicted function.
CONCLUSIONS:
The study provides support for existing candidates and novel genes that may be involved in CI, and provides a basis for further functional studies to examine the molecular basis of the phenotype.
Fast Genome-Wide QTL Analysis Using Mendel
Pedigree GWAS (Option 29) in the current version of the Mendel software is an
optimized subroutine for performing large scale genome-wide QTL analysis. This
analysis (a) works for random sample data, pedigree data, or a mix of both, (b)
is highly efficient in both run time and memory requirement, (c) accommodates
both univariate and multivariate traits, (d) works for autosomal and X-linked
loci, (e) correctly deals with missing data in traits, covariates, and
genotypes, (f) allows for covariate adjustment and constraints among
parameters, (g) uses either theoretical or SNP-based empirical kinship matrix
for additive polygenic effects, (h) allows extra variance components such as
dominant polygenic effects and household effects, (i) detects and reports
outlier individuals and pedigrees, and (j) allows for robust estimation via the
t-distribution. The current paper assesses these capabilities on the Genetic
Analysis Workshop 19 (GAW19) sequencing data. We analyzed simulated and real
phenotypes for both family and random sample data sets. For instance, when
jointly testing the 8 longitudinally measured systolic blood pressure (SBP) and
diastolic blood pressure (DBP) traits, it takes Mendel 78 minutes on a standard
laptop computer to read, quality check, and analyze a data set with 849
individuals and 8.3 million SNPs. Genome-wide eQTL analysis of 20,643
expression traits on 641 individuals with 8.3 million SNPs takes 30 hours using
20 parallel runs on a cluster. Mendel is freely available at
http://www.genetics.ucla.edu/software.
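Item (g) mentions a theoretical kinship matrix. As a hedged illustration of what that means, the sketch below computes kinship coefficients from a pedigree with the standard recursion, in which a child's kinship with any earlier-listed individual is the average of its parents' kinships with that individual. This is not Mendel's code; the function name and the `(id, father, mother)` input format are assumptions, and the sketch requires parents to be listed before their children and both parents to be either known or unknown.

```python
def kinship_matrix(pedigree):
    """Theoretical kinship coefficients for a pedigree.

    pedigree: list of (id, father, mother) tuples, with parents listed
    before their children and None for the parents of founders
    (hypothetical input format).

    Returns (phi, index), where phi[i][j] is the kinship coefficient
    between the i-th and j-th individuals and index maps id -> row.
    """
    n = len(pedigree)
    phi = [[0.0] * n for _ in range(n)]
    index = {}
    for i, (ind, father, mother) in enumerate(pedigree):
        index[ind] = i
        if father is None or mother is None:
            # Founders are assumed non-inbred and unrelated to everyone
            # listed before them.
            phi[i][i] = 0.5
            continue
        f, m = index[father], index[mother]
        for j in range(i):
            # Kinship with an earlier individual is the average of the
            # two parents' kinships with that individual.
            phi[i][j] = phi[j][i] = 0.5 * (phi[f][j] + phi[m][j])
        # Self-kinship is (1 + inbreeding) / 2, and the inbreeding
        # coefficient equals the kinship between the parents.
        phi[i][i] = 0.5 * (1.0 + phi[f][m])
    return phi, index
```

For a nuclear family of two unrelated parents and two children, this gives 0.5 on the diagonal and 0.25 for both parent-child and sibling pairs. An empirical kinship matrix would instead be estimated from SNP genotypes, which this sketch does not cover.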
Genome maps across 26 human populations reveal population-specific patterns of structural variation.
Large structural variants (SVs) in the human genome are difficult to detect and study by conventional sequencing technologies. With long-range genome analysis platforms, such as optical mapping, one can identify large SVs (>2 kb) across the genome in one experiment. Analyzing optical genome maps of 154 individuals from the 26 populations sequenced in the 1000 Genomes Project, we find that phylogenetic population patterns of large SVs are similar to those of single nucleotide variations in 86% of the human genome, while ~2% of the genome has high structural complexity. We are able to characterize SVs in many intractable regions of the genome, including segmental duplications and subtelomeric, pericentromeric, and acrocentric areas. In addition, we discover ~60 Mb of non-redundant genome content missing in the reference genome sequence assembly. Our results highlight the need for a comprehensive set of alternate haplotypes from different populations to represent SV patterns in the genome.
Pericentromeric heterochromatin is hierarchically organized and spatially contacts H3K9me2 islands in euchromatin.
Membraneless pericentromeric heterochromatin (PCH) domains play vital roles in chromosome dynamics and genome stability. However, our current understanding of 3D genome organization does not include PCH domains because of technical challenges associated with repetitive sequences enriched in PCH genomic regions. We investigated the 3D architecture of Drosophila melanogaster PCH domains and their spatial associations with the euchromatic genome by developing a novel analysis method that incorporates genome-wide Hi-C reads originating from PCH DNA. Combined with cytogenetic analysis, we reveal a hierarchical organization of the PCH domains into distinct territories. Strikingly, H3K9me2-enriched regions embedded in the euchromatic genome show prevalent 3D interactions with the PCH domain. These spatial contacts require H3K9me2 enrichment, are likely mediated by liquid-liquid phase separation, and may influence organismal fitness. Our findings have important implications for how PCH architecture influences the function and evolution of both repetitive heterochromatin and the gene-rich euchromatin.
Odyssey: a semi-automated pipeline for phasing, imputation, and analysis of genome-wide genetic data
BACKGROUND:
Genome imputation, admixture resolution and genome-wide association analyses are time-consuming and computationally intensive processes with many composite and requisite steps. Analysis time increases further when building and installing the programs required for these analyses. For scientists who are less versed in programming but want to perform these operations hands-on, there is a steep learning curve to using the vast number of programs available for these analyses.
RESULTS:
In an effort to streamline the entire process with easy-to-use steps for scientists working with big data, the Odyssey pipeline was developed. Odyssey is a simplified, efficient, semi-automated genome-wide imputation and analysis pipeline, which prepares raw genetic data, performs pre-imputation quality control, phasing, imputation, post-imputation quality control, population stratification analysis, and genome-wide association with statistical data analysis, including result visualization. Odyssey is a pipeline that integrates programs such as PLINK, SHAPEIT, Eagle, IMPUTE, Minimac, and several R packages, to create a seamless, easy-to-use, and modular workflow controlled via a single user-friendly configuration file. Odyssey was built with compatibility in mind, and thus utilizes the Singularity container solution, which can be run on Linux, MacOS, and Windows platforms. It is also easily scalable from a simple desktop to a High-Performance System (HPS).
CONCLUSION:
Odyssey facilitates efficient and fast genome-wide association analysis automation and can go from raw genetic data to genome-phenome association visualization and analysis results in 3-8 h on average, depending on the input data, choice of programs within the pipeline, and available computer resources. Odyssey was built to be flexible, portable, compatible, scalable, and easy to set up. Biologists less familiar with programming can now work hands-on with their own big data using this easy-to-use pipeline.
