134 research outputs found
GenomeVIP: A cloud platform for genomic variant discovery and interpretation
Identifying genomic variants is a fundamental first step toward the understanding of the role of inherited and acquired variation in disease. The accelerating growth in the corpus of sequencing data that underpins such analysis is making the data-download bottleneck more evident, placing substantial burdens on the research community to keep pace. As a result, the search for alternative approaches to the traditional “download and analyze” paradigm on local computing resources has led to a rapidly growing demand for cloud-computing solutions for genomics analysis. Here, we introduce the Genome Variant Investigation Platform (GenomeVIP), an open-source framework for performing genomics variant discovery and annotation using cloud- or local high-performance computing infrastructure. GenomeVIP orchestrates the analysis of whole-genome and exome sequence data using a set of robust and popular task-specific tools, including VarScan, GATK, Pindel, BreakDancer, Strelka, and Genome STRiP, through a web interface. GenomeVIP has been used for genomic analysis in large-data projects such as the TCGA PanCanAtlas and in other projects, such as the ICGC Pilots, CPTAC, ICGC-TCGA DREAM Challenges, and the 1000 Genomes SV Project. Here, we demonstrate GenomeVIP's ability to provide high-confidence annotated somatic, germline, and de novo variants of potential biological significance using publicly available data sets.
GINS motion reveals replication fork progression is remarkably uniform throughout the yeast genome
Time-resolved ChIP-chip can be utilized to monitor the genome-wide dynamics of the GINS complex, yielding quantitative information on replication fork movement. Replication forks progress at remarkably uniform rates across the genome, regardless of location. GINS progression appears to be arrested, albeit with very low frequency, at sites of highly transcribed genes. Comparison of simulation with data leads to novel biological insights regarding the dynamics of replication fork progression.
A Fluid Dynamics Calculation of Sputtering from a Cylindrical Thermal Spike
The sputtering yield, Y, from a cylindrical thermal spike is calculated using
a two dimensional fluid dynamics model which includes the transport of energy,
momentum and mass. The results show that the high pressure built-up within the
spike causes the hot core to perform a rapid expansion both laterally and
upwards. This expansion appears to play a significant role in the sputtering
process. It is responsible for the ejection of mass from the surface and causes
fast cooling of the cascade. The competition between these effects accounts for
the nearly linear dependence of Y on the deposited energy per unit depth
that was observed in recent Molecular Dynamics simulations. Based on this we
describe the conditions for attaining a linear yield at high excitation
densities and give a simple model for this yield.
Comment: 10 pages, 9 pages (including 9 figures), submitted to PR
Crater formation by fast ions: comparison of experiment with Molecular Dynamics simulations
An incident fast ion in the electronic stopping regime produces a track of
excitations which can lead to particle ejection and cratering. Molecular
Dynamics simulations of the evolution of the deposited energy were used to
study the resulting crater morphology as a function of the excitation density
in a cylindrical track for large angle of incidence with respect to the surface
normal. Surprisingly, the overall behavior is shown to be similar to that seen
in the experimental data for crater formation in polymers. However, the
simulations give greater insight into the cratering process. The threshold for
crater formation occurs when the excitation density approaches the cohesive
energy density, and a crater rim is formed at about six times that energy
density. The crater length scales roughly as the square root of the electronic
stopping power, and the crater width and depth seem to saturate for the largest
energy densities considered here. The number of ejected particles, the
sputtering yield, is shown to be much smaller than simple estimates based on
crater size unless the full crater morphology is considered. Therefore, crater
size cannot easily be used to estimate the sputtering yield.
Comment: LaTeX, 7 pages, 5 EPS figures. For related figures/movies, see:
http://dirac.ms.virginia.edu/~emb3t/craters/craters.html New version uploaded
5/16/01, with minor text changes + new figure
Community Assessment of the Predictability of Cancer Protein and Phosphoprotein Levels from Genomics and Transcriptomics.
Cancer is driven by genomic alterations, but the processes causing this disease are largely performed by proteins. However, proteins are harder and more expensive to measure than genes and transcripts. To catalyze developments of methods to infer protein levels from other omics measurements, we leveraged crowdsourcing via the NCI-CPTAC DREAM proteogenomic challenge. We asked for methods to predict protein and phosphorylation levels from genomic and transcriptomic data in cancer patients. The best performance was achieved by an ensemble of models, including as predictors transcript level of the corresponding genes, interaction between genes, conservation across tumor types, and phosphosite proximity for phosphorylation prediction. Proteins from metabolic pathways and complexes were the best and worst predicted, respectively. The performance of even the best-performing model was modest, suggesting that many proteins are strongly regulated through translational control and degradation. Our results set a reference for the limitations of computational inference in proteogenomics. A record of this paper's transparent peer review process is included in the Supplemental Information
W-Curve Alignments for HIV-1 Genomic Comparisons
The W-curve was originally developed as a graphical visualization technique for viewing DNA and RNA sequences. Its ability to render features of DNA also makes it suitable for computational studies. Its main advantage in this area is utilizing a single-pass algorithm for comparing the sequences. Avoiding recursion during sequence alignments offers advantages for speed and in-process resources. The graphical technique also allows for multiple models of comparison to be used depending on the nucleotide patterns embedded in similar whole genomic sequences. The W-curve approach allows us to compare large numbers of samples quickly. We are currently tuning the algorithm to accommodate quirks specific to HIV-1 genomic sequences so that it can be used to aid in diagnostic and vaccine efforts. Tracking the molecular evolution of the virus has been greatly hampered by gap associated problems predominantly embedded within the envelope gene of the virus. Gaps and hypermutation of the virus slow conventional string based alignments of the whole genome. This paper describes the W-curve algorithm itself, and how we have adapted it for comparison of similar HIV-1 genomes. A treebuilding method is developed with the W-curve that utilizes a novel Cylindrical Coordinate distance method and gap analysis method. HIV-1 C2-V5 env sequence regions from a Mother/Infant cohort study are used in the comparison. The output distance matrix and neighbor results produced by the W-curve are functionally equivalent to those from Clustal for C2-V5 sequences in the mother/infant pairs infected with CRF01_AE. Significant potential exists for utilizing this method in place of conventional string based alignment of HIV-1 genomes, such as Clustal X. With W-curve heuristic alignment, it may be possible to obtain clinically useful results in a short time, short enough to affect clinical choices for acute treatment.
A description of the W-curve generation process, including a comparison technique of aligning extremes of the curves to effectively phase-shift them past the HIV-1 gap problem, is presented. Besides yielding similar neighbor-joining phenogram topologies, most Mother and Infant C2-V5 sequences in the cohort pairs geometrically map closest to each other, indicating that W-curve heuristics overcame any gap problem.
Proteogenomics connects somatic mutations to signalling in breast cancer
Somatic mutations have been extensively characterized in breast cancer, but the effects of these genetic alterations on the proteomic landscape remain poorly understood. We describe quantitative mass spectrometry-based proteomic and phosphoproteomic analyses of 105 genomically annotated breast cancers of which 77 provided high-quality data. Integrated analyses allowed insights into the somatic cancer genome including the consequences of chromosomal loss, such as the 5q deletion characteristic of basal-like breast cancer. The 5q trans effects were interrogated against the Library of Integrated Network-based Cellular Signatures, thereby connecting CETN3 and SKP1 loss to elevated expression of EGFR, and SKP1 loss also to increased SRC. Global proteomic data confirmed a stromal-enriched group in addition to basal and luminal clusters and pathway analysis of the phosphoproteome identified a G Protein-coupled receptor cluster that was not readily identified at the mRNA level. Besides ERBB2, other amplicon-associated, highly phosphorylated kinases were identified, including CDK12, PAK1, PTK2, RIPK2 and TLK2. We demonstrate that proteogenomic analysis of breast cancer elucidates functional consequences of somatic mutations, narrows candidate nominations for driver genes within large deletions and amplified regions, and identifies therapeutic targets