234 research outputs found
The Kontsevich constants for the volume of the moduli of curves and topological recursion
We give an Eynard-Orantin type topological recursion formula for the
canonical Euclidean volume of the combinatorial moduli space of pointed smooth
algebraic curves. The recursion comes from the edge removal operation on the
space of ribbon graphs. As an application we obtain a new proof of the
Kontsevich constants for the ratio of the Euclidean and the symplectic volumes
of the moduli space of curves. Comment: 37 pages with 20 figures
CloudMan as a platform for tool, data, and analysis distribution
Background: Cloud computing provides an infrastructure that facilitates large-scale computational analysis in a scalable, democratized fashion. However, in this context it is difficult to ensure sharing of an analysis environment and associated data in a scalable and precisely reproducible way. Results: CloudMan (usecloudman.org) enables individual researchers to easily deploy, customize, and share their entire cloud analysis environment, including data, tools, and configurations. Conclusions: With the enabled customization and sharing of instances, CloudMan can be used as a platform for collaboration. The presented solution improves accessibility of cloud resources, tools, and data to the level of an individual researcher and contributes toward reproducibility and transparency of research solutions.
GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations
Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics
Bio.Phylo: A Unified Toolkit for Processing, Analyzing and Visualizing Phylogenetic Trees in Biopython
Background: Ongoing innovation in phylogenetics and evolutionary biology has been accompanied by a proliferation of software tools, data formats, analytical techniques and web servers. This brings with it the challenge of integrating phylogenetic and other related biological data found in a wide variety of formats, and underlines the need for reusable software that can read, manipulate and transform this information into the various forms required to build computational pipelines. Results: We built a Python software library for working with phylogenetic data that is tightly integrated with Biopython, a broad-ranging toolkit for computational biology. Our library, Bio.Phylo, is highly interoperable with existing libraries, tools and standards, and is capable of parsing common file formats for phylogenetic trees, performing basic transformations and manipulations, attaching rich annotations, and visualizing trees. We unified the modules for working with the standard file formats Newick, NEXUS and phyloXML behind a consistent and simple API, providing a common set of functionality independent of the data source. Conclusions: Bio.Phylo meets a growing need in bioinformatics for working with heterogeneous types of phylogenetic data. By supporting interoperability with multiple file formats and leveraging existing Biopython features, this library simplifies the construction of phylogenetic workflows. We also provide examples of the benefits of building a community around a shared open-source project. Bio.Phylo is included with Biopython, available through the Biopython website, http://biopython.org
VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research
Accurate variant calling in next generation sequencing (NGS) is critical to understand cancer genomes better. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls SNV, MNV, InDels, complex and structural variants, expanding the detected genetic driver landscape of tumors. It performs local realignments on the fly for more accurate allele frequency estimation. VarDict performance scales linearly to sequencing depth, enabling ultra-deep sequencing used to explore tumor evolution or detect tumor DNA circulating in blood. In addition, VarDict performs amplicon aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing often used in diagnostic settings, and is able to detect PCR artifacts. Finally, VarDict also detects differences in somatic and loss of heterozygosity variants between paired samples. VarDict reprocessing of The Cancer Genome Atlas (TCGA) Lung Adenocarcinoma dataset called known driver mutations in KRAS, EGFR, BRAF, PIK3CA and MET in 16% more patients than previously published variant calls. We believe VarDict will greatly facilitate application of NGS in clinical cancer research
Cloud BioLinux: pre-configured and on-demand bioinformatics computing for the genomics community
Background: A steep drop in the cost of next-generation sequencing during recent years has made the technology affordable to the majority of researchers, but downstream bioinformatic analysis still poses a resource bottleneck for smaller laboratories and institutes that do not have access to substantial computational resources. Sequencing instruments are typically bundled with only the minimal processing and storage capacity required for data capture during sequencing runs. Given the scale of sequence datasets, scientific value cannot be obtained from acquiring a sequencer unless it is accompanied by an equal investment in informatics infrastructure. Results: Cloud BioLinux is a publicly accessible Virtual Machine (VM) that enables scientists to quickly provision on-demand infrastructures for high-performance bioinformatics computing using cloud platforms. Users have instant access to a range of pre-configured command line and graphical software applications, including a full-featured desktop interface, documentation and over 135 bioinformatics packages for applications including sequence alignment, clustering, assembly, display, editing, and phylogeny. Each tool's functionality is fully described in the documentation directly accessible from the graphical interface of the VM. Besides the Amazon EC2 cloud, we have started instances of Cloud BioLinux on a private Eucalyptus cloud installed at the J. Craig Venter Institute, and demonstrated access to the bioinformatic tools interface through a remote connection to EC2 instances from a local desktop computer. Documentation for using Cloud BioLinux on EC2 is available from our project website, while a Eucalyptus cloud image and VirtualBox Appliance is also publicly available for download and use by researchers with access to private clouds. Conclusions: Cloud BioLinux provides a platform for developing bioinformatics infrastructures on the cloud. 
An automated and configurable process builds Virtual Machines, allowing the development of highly customized versions from a shared code base. This shared community toolkit enables application specific analysis platforms on the cloud by minimizing the effort required to prepare and maintain them
Vibration Fault Detection for Steam Generator Tubing
The detection of flaws within steam generator tubing is an important part of safety in a nuclear plant, as an unchecked flaw could potentially lead to the release of radioactive material. The current method for testing these tubes is expensive and time-consuming. Because sound has been used successfully to detect flaws in other applications, this project explored an alternative method that uses acoustics and accelerometers. Preliminary testing with a simple hollow steel tube gave promising results, detecting a hole as small as 7.66% of the tube diameter. Testing of a model steam generator with four tubes, using a motor to vibrate the system, also showed promising results.
AWARE@Home: Management Tool for Monitoring At-Home Costs and Environmental Impact
Capstone Design and Manufacturing Experience: Winter 2006. The goal of the AWARE@Home system is to enable households to monitor their electricity, natural gas, and water consumption. Before we took over the project, the system had been implemented for electricity but not for natural gas or water. Our goals were to: (1) expand the AWARE@Home system to have natural gas and water monitoring capabilities, (2) improve the system by resolving current flaws, (3) validate the updated system and its components, and (4) add automobile fuel consumption monitoring to the system if time permits. http://deepblue.lib.umich.edu/bitstream/2027.42/49580/2/proj11_report.pd
Timber Mountain Precipitation Monitoring Station
A precipitation monitoring station was placed on the west flank of Timber Mountain during the year 2010. It is located in an isolated highland area near the western border of the Nevada National Security Site (NNSS), south of Pahute Mesa. The cost of the equipment, permitting, and installation was provided by the Environmental Monitoring Systems Initiative (EMSI) project. Data collection, analysis, and maintenance of the station during fiscal year 2011 were funded by the U.S. Department of Energy, National Nuclear Security Administration, Nevada Site Office Environmental Restoration, Soils Activity. The station is located near the western headwaters of Forty Mile Wash on the Nevada Test and Training Range (NTTR). Overland flows from precipitation events that occur in the Timber Mountain high elevation area cross several of the contaminated Soils project CAU (Corrective Action Unit) sites located in the Forty Mile Wash watershed. Rain-on-snow events in the early winter and spring around Timber Mountain have contributed to several significant flow events in Forty Mile Wash. The data from the new precipitation gauge at Timber Mountain will provide important information for determining runoff response to precipitation events in this area of the NNSS. Timber Mountain is also a groundwater recharge area, and estimation of recharge from precipitation was important for the EMSI project in determining groundwater flowpaths and designing effective groundwater monitoring for Yucca Mountain. Recharge estimation additionally benefits the Underground Test Area Sub-project analysis of groundwater flow direction and velocity from nuclear test areas on Pahute Mesa. This site also provides data that has been used during wildfire events, and it provided a singular monitoring location for the extreme precipitation events during December 2010 (see data section for more details).
This letter report provides a summary of the site location, equipment, and data collected in fiscal year 2011