Efficient Pattern Matching in Python
Pattern matching is a powerful tool for symbolic computations. Applications
include term rewriting systems, as well as the manipulation of symbolic
expressions, abstract syntax trees, and XML and JSON data. It also allows for
an intuitive description of algorithms in the form of rewrite rules. We present
the open source Python module MatchPy, which offers functionality and
expressiveness similar to the pattern matching in Mathematica. In particular,
it includes syntactic pattern matching, as well as matching for commutative
and/or associative functions, sequence variables, and matching with
constraints. MatchPy uses new and improved algorithms to efficiently find
matches for large pattern sets by exploiting similarities between patterns. The
performance of MatchPy is investigated on several real-world problems.
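To make the ideas in the abstract above concrete, here is a minimal sketch of syntactic matching with variables, extended to commutative operators by trying argument permutations. It does not use MatchPy's API or its algorithms: the `Var` class, the `COMMUTATIVE` set, and the tuple encoding `(operator, arg1, arg2, ...)` are all assumptions made for illustration.

```python
from itertools import permutations


class Var:
    """A named pattern variable (hypothetical helper, not MatchPy's API)."""

    def __init__(self, name):
        self.name = name


# Assumed set of operator names whose arguments may be reordered.
COMMUTATIVE = {"add", "mul"}


def match(pattern, subject, bindings=None):
    """Yield every substitution (a dict) that makes pattern equal to subject."""
    bindings = {} if bindings is None else bindings
    if isinstance(pattern, Var):
        if pattern.name not in bindings:
            # First occurrence: bind the variable to the subject.
            yield {**bindings, pattern.name: subject}
        elif bindings[pattern.name] == subject:
            # Repeated occurrence: the earlier binding must agree.
            yield bindings
        return
    if isinstance(pattern, tuple) and isinstance(subject, tuple):
        if len(pattern) != len(subject):
            return
        if not pattern:
            yield bindings
            return
        if pattern[0] != subject[0]:
            return
        args = subject[1:]
        # For commutative operators, try every ordering of the arguments.
        orders = permutations(args) if pattern[0] in COMMUTATIVE else [args]
        for order in orders:
            yield from _match_args(pattern[1:], order, bindings)
        return
    if pattern == subject:
        # Constants match only themselves.
        yield bindings


def _match_args(patterns, subjects, bindings):
    """Match argument lists pairwise, threading the bindings through."""
    if not patterns:
        yield bindings
        return
    for extended in match(patterns[0], subjects[0], bindings):
        yield from _match_args(patterns[1:], subjects[1:], extended)


pattern = ("add", Var("x"), ("mul", 2, Var("y")))
subject = ("add", ("mul", 2, "b"), "a")
results = list(match(pattern, subject))
```

Trying all permutations is exponential in the number of arguments, which is the kind of cost that a dedicated matcher for large pattern sets is meant to avoid; this sketch only shows what a commutative match means.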
A Peer-to-Peer Approach to Content-Based Publish/Subscribe
Publish/subscribe systems are successfully used to decouple distributed applications. However, their efficiency is closely tied to the topology of the underlying network, the design of which has been neglected. Peer-to-peer network topologies can offer inherently bounded delivery depth, load sharing, and self-organisation. In this paper, we present a content-based publish/subscribe system routed over a peer-to-peer topology graph. The implications of combining these approaches are explored and a particular implementation using elements from Rebeca and Chord is proven correct.
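As a rough illustration of what "content-based" means in the abstract above, the sketch below delivers an event to every subscriber whose filter is satisfied by the event's attributes. It is a single in-process broker with no peer-to-peer routing, and it is not based on Rebeca or Chord: the `Broker` class, the `(attribute, operator, value)` filter encoding, and the `OPS` table are assumptions made for illustration.

```python
import operator

# Assumed comparison operators that a filter constraint may use.
OPS = {
    "==": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
}


def matches(constraints, event):
    """Return True if the event satisfies every (attribute, op, value) constraint."""
    return all(
        attr in event and OPS[op](event[attr], value)
        for attr, op, value in constraints
    )


class Broker:
    """Hypothetical single-node broker: stores filters and matches events."""

    def __init__(self):
        self.subscriptions = []  # list of (subscriber, constraints) pairs

    def subscribe(self, subscriber, constraints):
        self.subscriptions.append((subscriber, constraints))

    def publish(self, event):
        """Return the subscribers whose filters match, in subscription order."""
        return [s for s, c in self.subscriptions if matches(c, event)]


broker = Broker()
broker.subscribe("alice", [("type", "==", "quote"), ("price", ">", 100)])
broker.subscribe("bob", [("type", "==", "quote")])

high = broker.publish({"type": "quote", "price": 120})
low = broker.publish({"type": "quote", "price": 90})
other = broker.publish({"type": "news"})
```

Subscribers never name a publisher or a topic here; delivery is decided purely by event content, which is what decouples the two sides.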
QuTiP: An open-source Python framework for the dynamics of open quantum systems
We present an object-oriented open-source framework for solving the dynamics
of open quantum systems written in Python. Arbitrary Hamiltonians, including
time-dependent systems, may be built up from operators and states defined by a
quantum object class, and then passed on to a choice of master equation or
Monte-Carlo solvers. We give an overview of the basic structure for the
framework before detailing the numerical simulation of open system dynamics.
Several examples are given to illustrate the build up to a complete
calculation. Finally, we measure the performance of our library against that of
current implementations. The framework described here is particularly
well-suited to the fields of quantum optics, superconducting circuit devices,
nanomechanics, and trapped ions, while also being ideal for use in classroom
instruction. Comment: 16 pages, 12 figures
Reduced mRNA Secondary-Structure Stability Near the Start Codon Indicates Functional Genes in Prokaryotes
Several recent studies have found that selection acts on synonymous mutations at the beginning of genes to reduce mRNA secondary-structure stability, presumably to aid in translation initiation. This observation suggests that a metric of relative mRNA secondary-structure stability, ZΔG, could be used to test whether putative genes are likely to be functionally important. Using the Escherichia coli genome, we compared the mean ZΔG of genes with known functions, genes with known orthologs, genes where function and orthology are unknown, and pseudogenes. Genes in the first two categories demonstrated similar levels of selection for reduced stability (increased ZΔG), whereas for pseudogenes stability did not differ from our null expectation. Surprisingly, genes where function and orthology were unknown were also not different from the null expectation, suggesting that many of these open reading frames are not functionally important. We extended our analysis by constructing a Bayesian phylogenetic mixed model based on data from 145 prokaryotic genomes. As in E. coli, genes with no known function had consistently lower ZΔG, even though we expect that many of the currently unannotated genes will ultimately have their functional utility discovered. Our findings suggest that functional genes tend to evolve increased ZΔG, whereas nonfunctional ones do not. Therefore, ZΔG may be a useful metric for identifying genes of potentially important function and could be used to target genes for further functional study.
ISAMBARD: An open-source computational environment for biomolecular analysis, modelling and design
Motivation: The rational design of biomolecules is becoming a reality. However, further computational tools are needed to facilitate and accelerate this, and to make it accessible to more users.
Results: Here we introduce ISAMBARD, a tool for structural analysis, model building and rational design of biomolecules. ISAMBARD is open-source, modular, computationally scalable and intuitive to use. These features allow non-experts to explore biomolecular design in silico. ISAMBARD addresses a standing issue in protein design, namely, how to introduce backbone variability in a controlled manner. This is achieved through the generalization of tools for parametric modelling, describing the overall shape of proteins geometrically, and without input from experimentally determined structures. This will allow backbone conformations for entire folds and assemblies not observed in nature to be generated de novo, that is, to access the "dark matter of protein-fold space". We anticipate that ISAMBARD will find broad applications in biomolecular design, biotechnology and synthetic biology.
Availability and implementation: A current stable build can be downloaded from the Python Package Index (https://pypi.python.org/pypi/isambard/), with development builds available on GitHub (https://github.com/woolfson-group/) along with documentation, tutorial material and all the scripts used to generate the data described in this paper.
Supplementary information: Supplementary data are available at Bioinformatics online.
Large-eddy simulation in an anelastic framework with closed water and entropy balances
A large-eddy simulation (LES) framework is developed for simulating the dynamics of clouds and boundary layers with closed water and entropy balances. The framework is based on the anelastic equations in a formulation that remains accurate for deep convection. As prognostic variables, it uses total water and entropy, which are conserved in adiabatic and reversible processes, including reversible phase changes of water. This has numerical advantages for modeling clouds, in which reversible phase changes of water occur frequently. The equations of motion are discretized using higher-order weighted essentially nonoscillatory (WENO) discretization schemes with strong stability preserving time stepping. Numerical tests demonstrate that the WENO schemes yield simulations superior to centered schemes, even when the WENO schemes are used at coarser resolution. The framework is implemented in a new LES code written in Python and Cython, which makes the code transparent and easy to use for a wide user group.
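For readers unfamiliar with WENO schemes, the sketch below shows the core idea for one interface value: three low-order candidate reconstructions are blended with weights that shrink on stencils containing a sharp gradient. It uses the classical fifth-order formulation of Jiang and Shu purely as an illustration; the abstract above does not state which WENO order or variant the LES code uses, and the function name `weno5_reconstruct` and the default `eps` are assumptions.

```python
def weno5_reconstruct(stencil, eps=1e-6):
    """Left-biased fifth-order WENO value at the interface x_{i+1/2}.

    stencil holds five cell averages: (v[i-2], v[i-1], v[i], v[i+1], v[i+2]).
    """
    vm2, vm1, v0, vp1, vp2 = stencil

    # Third-order candidate reconstructions on the three sub-stencils.
    p0 = (2 * vm2 - 7 * vm1 + 11 * v0) / 6
    p1 = (-vm1 + 5 * v0 + 2 * vp1) / 6
    p2 = (2 * v0 + 5 * vp1 - vp2) / 6

    # Smoothness indicators: large where a sub-stencil crosses a jump.
    b0 = 13 / 12 * (vm2 - 2 * vm1 + v0) ** 2 + 0.25 * (vm2 - 4 * vm1 + 3 * v0) ** 2
    b1 = 13 / 12 * (vm1 - 2 * v0 + vp1) ** 2 + 0.25 * (vm1 - vp1) ** 2
    b2 = 13 / 12 * (v0 - 2 * vp1 + vp2) ** 2 + 0.25 * (3 * v0 - 4 * vp1 + vp2) ** 2

    # Nonlinear weights built from the linear weights 1/10, 6/10, 3/10.
    a0 = 0.1 / (eps + b0) ** 2
    a1 = 0.6 / (eps + b1) ** 2
    a2 = 0.3 / (eps + b2) ** 2
    total = a0 + a1 + a2

    return (a0 * p0 + a1 * p1 + a2 * p2) / total


smooth = weno5_reconstruct([-2.0, -1.0, 0.0, 1.0, 2.0])  # linear data
jump = weno5_reconstruct([0.0, 0.0, 0.0, 1.0, 1.0])      # step just right of cell i
```

On smooth data the three candidates agree and the result matches the underlying linear profile. Next to the step, almost all weight goes to the sub-stencil that does not cross the jump, so the reconstructed value stays close to 0, whereas the same combination with fixed linear weights would give 0.4.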
Lessons learnt on the analysis of large sequence data in animal genomics
The 'omics revolution has made a large amount of sequence data available to researchers and the industry. This has had a profound impact in the field of bioinformatics, stimulating unprecedented advancements in this discipline. This is usually viewed from the perspective of human 'omics, in particular human genomics. Plant and animal genomics, however, have also been deeply influenced by next-generation sequencing technologies, with several genomics applications now popular among researchers and the breeding industry. Genomics tends to generate huge amounts of data, and genomic sequence data account for an increasing proportion of big data in biological sciences, due largely to decreasing sequencing and genotyping costs and to large-scale sequencing and resequencing projects. The analysis of big data poses a challenge to scientists, as data gathering currently takes place at a faster pace than does data processing and analysis, and the associated computational burden is increasingly taxing, making even simple manipulation, visualization and transfer of data a cumbersome operation. The time consumed by the processing and analysing of huge data sets may be at the expense of data quality assessment and critical interpretation. Additionally, when analysing lots of data, something is likely to go awry (the software may crash or stop), and it can be very frustrating to track the error. We herein review the most relevant issues related to tackling these challenges and problems, from the perspective of animal genomics, and provide researchers who lack extensive computing experience with guidelines that will help when processing large genomic data sets.