19 research outputs found
The draft genome and transcriptome of Cannabis sativa
Background: Cannabis sativa has been cultivated throughout human history as a source of fiber, oil and food, and for its medicinal and intoxicating properties. Selective breeding has produced cannabis plants for specific uses, including high-potency marijuana strains and hemp cultivars for fiber and seed production. The molecular biology underlying cannabinoid biosynthesis and other traits of interest is largely unexplored. Results: We sequenced genomic DNA and RNA from the marijuana strain Purple Kush using short-read approaches. We report a draft haploid genome sequence of 534 Mb and a transcriptome of 30,000 genes. Comparison of the transcriptome of Purple Kush with that of the hemp cultivar 'Finola' revealed that many genes encoding proteins involved in cannabinoid and precursor pathways are more highly expressed in Purple Kush than in 'Finola'. The exclusive occurrence of Δ9-tetrahydrocannabinolic acid synthase in the Purple Kush transcriptome, and its replacement by cannabidiolic acid synthase in 'Finola', may explain why the psychoactive cannabinoid Δ9-tetrahydrocannabinol (THC) is produced in marijuana but not in hemp. Resequencing the hemp cultivars 'Finola' and 'USO-31' showed little difference in gene copy numbers of cannabinoid pathway enzymes. However, single nucleotide variant analysis uncovered a relatively high level of variation among four cannabis types, and supported a separation of marijuana and hemp. Conclusions: The availability of the Cannabis sativa genome enables the study of a multifunctional plant that occupies a unique role in human culture. Its availability will aid the development of therapeutic marijuana strains with tailored cannabinoid profiles and provide a basis for the breeding of hemp with improved agronomic characteristics. Peer reviewed: Yes. NRC publication: Yes.
In Vivo Analysis of Cruciform Extrusion and Resolution of DNA Palindromes in Eukaryotes
DNA palindromes are implicated in several examples of gross chromosomal aberrations in the human genome; however, the molecular mechanism(s) that govern palindrome instability remain largely under-investigated. Because of their propensity for intrastrand base pairing, it is suspected that the acquisition of a secondary structure, such as a hairpin or cruciform, instigates the rearrangement process. A significant hurdle in defining palindrome-provoked instability is that reliable methods for examining in vivo cruciform extrusion remain underdeveloped. A challenge is to provide straightforward evidence for cruciform extrusion in eukaryotic cells. Here, I present a plasmid system for use in Saccharomyces cerevisiae that enables the detection of cruciforms in vivo. Cruciform extrusion, of either an in vitro-prepared palindrome or a near-palindrome from the human genome, is monitored by scoring for the product of cruciform resolution, a dually hairpin-capped linear DNA molecule. These results not only provide evidence for the occurrence of cruciform extrusion in eukaryotic chromatin but also identify a novel source of endogenous double-strand break formation. A screen for candidate genes that are required for resolution revealed that the Mus81 endonuclease, a candidate Holliday junction (HJ) resolvase, provides the majority of cruciform resolution activity in mitotic cells, validating the notion that cellular HJ resolvases can misrecognize a cruciform for a Holliday junction. A second screen identified a requirement for the Sgs1-Top3-Rmi1 complex in the prevention of double-strand break formation, including cruciform resolution, of DNA palindromes. These results uncover a new role for the RecQ helicase in prevention of palindrome-provoked instability, possibly through the intrusion of cruciform structures. Together, this work contributes significantly to our understanding of cruciform metabolism in eukaryotes and supports suggestions that cruciform extrusion instigates instability in the human genome.
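For readers unfamiliar with the term, a DNA palindrome is a sequence that equals its own reverse complement, which is what allows the intrastrand base pairing described above. The following is a minimal sketch of that definition only, not the thesis's method; the function names and the mismatch count used to flag a near-palindrome are illustrative assumptions.

```python
# Illustrative sketch: function names are hypothetical, not from the thesis.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """A DNA palindrome reads the same as its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def mismatches_to_palindrome(seq: str) -> int:
    """Count positions where the sequence differs from its reverse complement.

    Zero means a perfect palindrome; a small count relative to the length
    is one simple (assumed) way to describe a near-palindrome.
    """
    return sum(a != b for a, b in zip(seq.upper(), reverse_complement(seq)))


if __name__ == "__main__":
    print(is_palindrome("GAATTC"))             # EcoRI site, a perfect palindrome
    print(mismatches_to_palindrome("GAATTA"))  # one altered end base
```

Because each half of a perfect palindrome can pair with the other half on the same strand, such a sequence can in principle fold into a hairpin or, in duplex DNA, a cruciform.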
satmut_utils: a simulation and variant calling package for multiplexed assays of variant effect
Abstract: The impact of millions of individual genetic variants on molecular phenotypes in coding sequences remains unknown. Multiplexed assays of variant effect (MAVEs) are scalable methods to annotate relevant variants, but existing software lacks standardization, requires cumbersome configuration, and does not scale to large targets. We present satmut_utils as a flexible solution for simulation and variant quantification. We then benchmark MAVE software using simulated and real MAVE data. We finally determine mRNA abundance for thousands of cystathionine beta-synthase variants using two experimental methods. The satmut_utils package enables high-performance analysis of MAVEs and reveals the capability of variants to alter mRNA abundance.
Sequence specificity is obtained from the majority of modular C2H2 zinc-finger arrays
zinc-finger array
The draft genome and transcriptome of Cannabis sativa
Abstract
Background
Cannabis sativa has been cultivated throughout human history as a source of fiber, oil and food, and for its medicinal and intoxicating properties. Selective breeding has produced cannabis plants for specific uses, including high-potency marijuana strains and hemp cultivars for fiber and seed production. The molecular biology underlying cannabinoid biosynthesis and other traits of interest is largely unexplored.
Results
We sequenced genomic DNA and RNA from the marijuana strain Purple Kush using short-read approaches. We report a draft haploid genome sequence of 534 Mb and a transcriptome of 30,000 genes. Comparison of the transcriptome of Purple Kush with that of the hemp cultivar 'Finola' revealed that many genes encoding proteins involved in cannabinoid and precursor pathways are more highly expressed in Purple Kush than in 'Finola'. The exclusive occurrence of Δ9-tetrahydrocannabinolic acid synthase in the Purple Kush transcriptome, and its replacement by cannabidiolic acid synthase in 'Finola', may explain why the psychoactive cannabinoid Δ9-tetrahydrocannabinol (THC) is produced in marijuana but not in hemp. Resequencing the hemp cultivars 'Finola' and 'USO-31' showed little difference in gene copy numbers of cannabinoid pathway enzymes. However, single nucleotide variant analysis uncovered a relatively high level of variation among four cannabis types, and supported a separation of marijuana and hemp.
Conclusions
The availability of the Cannabis sativa genome enables the study of a multifunctional plant that occupies a unique role in human culture. Its availability will aid the development of therapeutic marijuana strains with tailored cannabinoid profiles and provide a basis for the breeding of hemp with improved agronomic characteristics.
Structural basis for recognition of AT-rich DNA by unrelated xenogeneic silencing proteins
H-NS and Lsr2 are nucleoid-associated proteins from Gram-negative bacteria and Mycobacteria, respectively, that play an important role in the silencing of horizontally acquired foreign DNA that is more AT-rich than the resident genome. Although Lsr2 and H-NS proteins are dissimilar in sequence and structure, they serve apparently similar functions and can functionally complement one another. The mechanism by which these xenogeneic silencers selectively target AT-rich DNA has been enigmatic. We performed high-resolution protein binding microarray analysis to simultaneously assess the binding preference of H-NS and Lsr2 for all possible 8-base sequences. Concurrently, we performed a detailed structure-function relationship analysis of their C-terminal DNA binding domains by NMR. Unexpectedly, we found that H-NS and Lsr2 use a common DNA binding mechanism in which a short loop containing a "Q/RGR" motif selectively interacts with the DNA minor groove, with the highest affinity for AT-rich sequences that lack A-tracts. Mutations of the Q/RGR motif abolished DNA binding activity. Netropsin, a DNA minor groove-binding molecule, effectively outcompeted H-NS and Lsr2 for binding to AT-rich sequences. These results provide a unified molecular mechanism to explain findings related to xenogeneic silencing proteins, including their lack of apparent sequence specificity but preference for AT-rich sequences. Our findings also suggest that structural information contained within the DNA minor groove is deciphered by xenogeneic silencing proteins to distinguish genetic material that is self from nonself.
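To give a sense of the sequence space behind "all possible 8-base sequences", the sketch below enumerates every 8-mer, collapses each with its reverse complement, and picks out fully AT sequences with no A-tract. It is an illustration under stated assumptions, not the authors' analysis pipeline: the helper names are hypothetical, and the A-tract rule used here (a run of four or more A's or four or more T's on one strand) is a simplifying assumption rather than the paper's definition.

```python
# Illustrative sketch: helper names and the A-tract rule are assumptions.
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def all_kmers(k: int = 8):
    """Yield every DNA sequence of length k (4**k sequences in total)."""
    return ("".join(bases) for bases in product("ACGT", repeat=k))


def at_fraction(seq: str) -> float:
    """Fraction of positions that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


def has_a_tract(seq: str, min_run: int = 4) -> bool:
    """Assumed rule: a run of >= min_run A's, or of T's, on this strand."""
    return "A" * min_run in seq or "T" * min_run in seq


kmers = list(all_kmers(8))

# On double-stranded DNA a sequence and its reverse complement describe
# the same site, so keep one representative of each pair.
canonical = {min(s, reverse_complement(s)) for s in kmers}

# Fully AT 8-mers that contain no A-tract under the assumed rule.
at_only_no_tract = sorted(
    s for s in canonical if at_fraction(s) == 1.0 and not has_a_tract(s)
)

if __name__ == "__main__":
    print(len(kmers), len(canonical), len(at_only_no_tract))
```

There are 4^8 = 65,536 ordered 8-mers; collapsing reverse complements leaves (4^8 + 4^4) / 2 = 32,896 distinct double-stranded sites, since the 256 palindromic 8-mers pair with themselves.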
A proactive genotype-to-patient-phenotype map for cystathionine beta-synthase
Background: For the majority of rare clinical missense variants, pathogenicity status cannot currently be classified. Classical homocystinuria, characterized by elevated homocysteine in plasma and urine, is caused by variants in the cystathionine beta-synthase (CBS) gene, most of which are rare. With early detection, existing therapies are highly effective. Methods: Damaging CBS variants can be detected based on their failure to restore growth in yeast cells lacking the yeast ortholog CYS4. This assay has only been applied reactively, after first observing a variant in patients. Using saturation codon-mutagenesis, en masse growth selection, and sequencing, we generated a comprehensive, proactive map of CBS missense variant function. Results: Our CBS variant effect map far exceeds the performance of computational predictors of disease variants. Map scores correlated strongly with both disease severity (Spearman's rho = 0.9) and human clinical response to vitamin B-6 (rho = 0.93). Conclusions: We demonstrate that highly multiplexed cell-based assays can yield proactive maps of variant function and patient response to therapy, even for rare variants not previously seen in the clinic.
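The correlations above are reported as Spearman's rho, which is the Pearson correlation computed on ranks rather than raw values. The sketch below shows that calculation in plain Python, with tied values given their average rank. It is not the authors' code, the function names are hypothetical, and the numbers in the usage lines are made-up placeholders, not data from the study.

```python
# Illustrative sketch: function names and example values are hypothetical.
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Placeholder values only: hypothetical map scores and severity grades.
    map_scores = [0.05, 0.20, 0.45, 0.80, 0.95]
    severity = [5, 4, 4, 2, 1]
    print(spearman_rho(map_scores, severity))
```

Because only ranks enter the calculation, rho measures whether two quantities rise and fall together monotonically, without assuming the relationship is linear.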
A framework for exhaustively mapping functional missense variants
Abstract: Although we now routinely sequence human genomes, we can confidently identify only a fraction of the sequence variants that have a functional impact. Here, we developed a deep mutational scanning (DMS) framework that produces exhaustive maps for human missense variants by combining random codon mutagenesis and multiplexed functional variation assays with computational imputation and refinement. We applied this framework to four proteins corresponding to six human genes: UBE2I (encoding SUMO E2 conjugase), SUMO1 (small ubiquitin-like modifier), TPK1 (thiamin pyrophosphokinase), and CALM1/2/3 (three genes encoding the protein calmodulin). The resulting maps recapitulate known protein features and confidently identify pathogenic variation. Assays potentially amenable to deep mutational scanning are already available for 57% of human disease genes, suggesting that DMS could ultimately map functional variation for all human disease genes.