Log::ProgramInfo: A Perl module to collect and log data for bioinformatics pipelines.
Background: To reproduce and report a bioinformatics analysis, it is important to be able to determine the environment in which a program was run. It can also be valuable when trying to debug why different executions are giving unexpectedly different results. Results: Log::ProgramInfo is a Perl module that writes a log file at the termination of execution of the enclosing program, to document useful execution characteristics. This log file can be used to re-create the environment in order to reproduce an earlier execution. It can also be used to compare the environments of two executions to determine whether there were any differences that might affect (or explain) their operation. Availability: The source is available on CPAN (Macdonald and Boutros, Log-ProgramInfo. http://search.cpan.org/~boutroslb/Log-ProgramInfo/). Conclusion: Using Log::ProgramInfo in programs creating result data for publishable research, and including the Log::ProgramInfo output log as part of the publication of that research, is a valuable method to assist others to duplicate the programming environment as a precursor to validating and/or extending that research.
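The idea of writing an environment log when a program terminates can be sketched in a few lines. The example below is a Python analogue written for illustration only; it is not the Log::ProgramInfo API, and the function names, the chosen fields, and the default file name `run_info.log.json` are all assumptions.

```python
import atexit
import json
import os
import platform
import sys
import time


def collect_run_info(start_time):
    """Gather a few execution characteristics (hypothetical field selection)."""
    return {
        "program": sys.argv[0],
        "args": sys.argv[1:],
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "cwd": os.getcwd(),
        "elapsed_seconds": round(time.time() - start_time, 3),
        # Top-level modules loaded so far, as a rough record of dependencies.
        "modules": sorted(
            name for name in sys.modules
            if "." not in name and not name.startswith("_")
        ),
    }


def write_run_log(path, start_time):
    """Write the collected information to `path` as JSON."""
    with open(path, "w") as handle:
        json.dump(collect_run_info(start_time), handle, indent=2)


def enable_run_log(path="run_info.log.json"):
    """Register the log writer so it runs when the interpreter exits."""
    atexit.register(write_run_log, path, time.time())
```

Calling `enable_run_log()` near the top of a script would leave a JSON file behind at exit; two such files from different runs could then be compared field by field to look for environment differences.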
Irregular Turbo Codes in Block-Fading Channels
We study irregular binary turbo codes over non-ergodic block-fading channels.
We first propose an extension of channel multiplexers initially designed for
regular turbo codes. We then show that, using these multiplexers, irregular
turbo codes that exhibit a small decoding threshold over the ergodic
Gaussian-noise channel perform very close to the outage probability on
block-fading channels, from both density evolution and finite-length
perspectives. Comment: to be presented at the IEEE International Symposium on Information
Theory, 201
VennDiagramWeb: a web application for the generation of highly customizable Venn and Euler diagrams.
Background: Visualization of data generated by high-throughput, high-dimensionality experiments is rapidly becoming a rate-limiting step in computational biology. There is an ongoing need to quickly develop high-quality visualizations that can be easily customized or incorporated into automated pipelines. This often requires an interface for manual plot modification, rapid cycles of tweaking visualization parameters, and the generation of graphics code. To facilitate this process for the generation of highly customizable, high-resolution Venn and Euler diagrams, we introduce VennDiagramWeb: a web application for the widely used VennDiagram R package. VennDiagramWeb is hosted at http://venndiagram.res.oicr.on.ca/. Results: VennDiagramWeb allows real-time modification of Venn and Euler diagrams, with parameter setting through a web interface and immediate visualization of results. It allows customization of essentially all aspects of figures, but also supports integration into computational pipelines via download of R code. Users can upload data and download figures in a range of formats, and there is exhaustive support documentation. Conclusions: VennDiagramWeb allows the easy creation of Venn and Euler diagrams for computational biologists, and indeed many other fields. Its ability to support real-time graphics changes that are linked to downloadable code that can be integrated into automated pipelines will greatly facilitate the improved visualization of complex datasets. For application support please contact [email protected]
HCF-1 amino- and carboxy-terminal subunit association through two separate sets of interaction modules: Involvement of fibronectin type 3 repeats
When herpes simplex virus infects permissive cells, the viral regulatory protein VP16 forms a specific complex with HCF-1, a preexisting nuclear protein involved in cell proliferation. The majority of HCF-1 in the cell is a complex of associated amino (HCF-1(N))- and carboxy (HCF-1(C))-terminal subunits that result from an unusual proteolytic processing of a large precursor polypeptide. Here, we have characterized the structure and function of sequences required for HCF-1(N) and HCF-1(C) subunit association. HCF-1 contains two matched pairs of self-association sequences called SAS1 and SAS2. One of these matched association sequences, SAS1, consists of a short 43-amino-acid region of the HCF-1(N) subunit, which associates with a carboxy-terminal region of the HCF-1(C) subunit that is composed of a tandem pair of fibronectin type 3 repeats, a structural motif known to promote protein-protein interactions. Unexpectedly, the related protein HCF-2, which is not proteolyzed, also contains a functional SAS1 association element, suggesting that this element does not function solely to maintain HCF-1(N) and HCF-1(C) subunit association. HCF-1(N) subunits do not possess a nuclear localization signal. We show that, owing to a carboxy-terminal HCF-1 nuclear localization signal, HCF-1(C) subunits can recruit HCF-1(N) subunits to the nucleus.
Multidimensional reconciliation for continuous-variable quantum key distribution
We propose a method for extracting an errorless secret key in a
continuous-variable quantum key distribution protocol, which is based on
Gaussian modulation of coherent states and homodyne detection. The crucial
feature is an eight-dimensional reconciliation method, based on the algebraic
properties of octonions. Since the protocol does not use any postselection, it
can be proven secure against arbitrary collective attacks, by using
well-established theorems on the optimality of Gaussian attacks. By using this
new coding scheme with an appropriate signal to noise ratio, the distance for
secure continuous-variable quantum key distribution can be significantly
extended. Comment: 8 pages, 3 figures
Kronos: a workflow assembler for genome analytics and informatics.
Background: The field of next-generation sequencing informatics has matured to a point where algorithmic advances in sequence alignment and individual feature detection methods have stabilized. Practical and robust implementation of complex analytical workflows (where such tools are structured into "best practices" for automated analysis of next-generation sequencing datasets) still requires significant programming investment and expertise. Results: We present Kronos, a software platform for facilitating the development and execution of modular, auditable, and distributable bioinformatics workflows. Kronos obviates the need for explicit coding of workflows by compiling a text configuration file into executable Python applications. Making analysis modules would still require programming. The framework of each workflow includes a run manager to execute the encoded workflows locally (or on a cluster or cloud), parallelize tasks, and log all runtime events. The resulting workflows are highly modular and configurable by construction, facilitating flexible and extensible meta-applications that can be modified easily through configuration file editing. The workflows are fully encoded for ease of distribution and can be instantiated on external systems, a step toward reproducible research and comparative analyses. We introduce a framework for building Kronos components that function as shareable, modular nodes in Kronos workflows. Conclusions: The Kronos platform provides a standard framework for developers to implement custom tools, reuse existing tools, and contribute to the community at large. Kronos is shipped with both Docker and Amazon Web Services Machine Images. It is free, open source, and available through the Python Package Index and at https://github.com/jtaghiyar/kronos
Preferred analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset
BACKGROUND: As more methods are developed to analyze RNA-profiling data, assessing their performance using control datasets becomes increasingly important. RESULTS: We present a 'spike-in' experiment for Affymetrix GeneChips that provides a defined dataset of 3,860 RNA species, which we use to evaluate analysis options for identifying differentially expressed genes. The experimental design incorporates two novel features. First, to obtain accurate estimates of false-positive and false-negative rates, 100-200 RNAs are spiked in at each fold-change level of interest, ranging from 1.2 to 4-fold. Second, instead of using an uncharacterized background RNA sample, a set of 2,551 RNA species is used as the constant (1x) set, allowing us to know whether any given probe set is truly present or absent. Application of a large number of analysis methods to this dataset reveals clear variation in their ability to identify differentially expressed genes. False-negative and false-positive rates are minimized when the following options are chosen: subtracting nonspecific signal from the PM probe intensities; performing an intensity-dependent normalization at the probe set level; and incorporating a signal intensity-dependent standard deviation in the test statistic. CONCLUSIONS: A best-route combination of analysis methods is presented that allows detection of approximately 70% of true positives before reaching a 10% false-discovery rate. We highlight areas in need of improvement, including better estimates of false-discovery rates and decreased false-negative rates.
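The summary figure quoted in the conclusions (fraction of true positives detected before a given false-discovery rate is reached) can be computed from any ranked call list when the truth is known, as it is in a spike-in design. The sketch below is a minimal illustration under assumptions: the function name, the input layout, and the reading of "before reaching" as "the best cutoff whose running FDR stays at or below the limit" are choices made here, not the authors' procedure.

```python
def sensitivity_at_fdr(ranked_is_true, total_true, max_fdr=0.10):
    """Fraction of true positives recovered at the best cutoff with FDR <= max_fdr.

    ranked_is_true: booleans for calls ordered from most to least significant,
                    True where the call is a genuinely changed RNA.
    total_true:     number of genuinely changed RNAs in the whole dataset.
    """
    best = 0.0
    true_hits = 0
    for n_called, is_true in enumerate(ranked_is_true, start=1):
        if is_true:
            true_hits += 1
        # Running false-discovery rate: false calls among all calls so far.
        fdr = (n_called - true_hits) / n_called
        if fdr <= max_fdr:
            best = max(best, true_hits / total_true)
    return best
```

For example, with calls ranked as `[True, True, True, False, False, True]` and four true changes in total, the first three calls are all correct, so 75% of true positives are recovered before any false call pushes the running FDR above 10%.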
Low-Temperature Growth of High Resistivity GaAs by Photoassisted Metalorganic Chemical Vapor Deposition
We report the photoassisted low-temperature (LT) metalorganic chemical vapor deposition (MOCVD) of high resistivity GaAs. The undoped as-grown GaAs exhibits a resistivity of ∼10^6 Ω-cm, which is the highest reported for undoped material grown in the MOCVD environment. Photoassisted growth of doped and undoped device quality GaAs has been achieved at a substrate temperature of 400 °C in a modified atmospheric pressure MOCVD reactor. By using silane as a dopant gas, the LT photoassisted doped films have high levels of doping and electron mobilities comparable to those achieved by MOCVD for growth temperatures, Tg ≳ 600 °C.