An Improved Nuclear Vector Replacement Algorithm for Nuclear Magnetic Resonance Assignment
We report an improvement to the Nuclear Vector Replacement (NVR) algorithm for high-throughput Nuclear Magnetic Resonance (NMR) resonance assignment. The new algorithm improves upon our earlier result in terms of accuracy and computational complexity. In particular, the new NVR algorithm assigns backbone resonances without error (100% accuracy) on the same test suite examined in [Langmead and Donald, J. Biomol. NMR 2004], and its running time is bounded in terms of the number of amino acids in the primary sequence of the protein and the maximum edge weight in an integer-weighted bipartite graph.
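The abstract refers to an integer-weighted bipartite graph but does not spell out how it is used. One common way to pose an assignment problem on such a graph is as a maximum-weight perfect matching between peaks and residues; the sketch below assumes that formulation. The function name `best_assignment` and the weight matrix are hypothetical, and the exhaustive search is for illustration on a tiny graph only, not the polynomial-time method the abstract describes.

```python
from itertools import permutations


def best_assignment(weights):
    """Exhaustively find the perfect matching with maximum total weight.

    weights[i][j] is the (hypothetical) integer weight of pairing
    peak i with residue j. Runs in O(n!) time, so tiny inputs only.
    """
    n = len(weights)
    best_total, best_perm = None, None
    for perm in permutations(range(n)):
        # perm[i] is the residue assigned to peak i
        total = sum(weights[i][perm[i]] for i in range(n))
        if best_total is None or total > best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Illustrative 3x3 weight matrix (made up for this sketch)
weights = [
    [4, 9, 2],
    [8, 3, 6],
    [5, 7, 9],
]
total, assignment = best_assignment(weights)
print(total, assignment)
```

For realistic protein sizes, a polynomial-time matching algorithm would replace the exhaustive loop.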
High-Throughput 3D Homology Detection via NMR Resonance Assignment
One goal of the structural genomics initiative is the identification of new protein folds. Sequence-based structural homology prediction methods are an important means for prioritizing unknown proteins for structure determination. However, an important challenge remains: two highly dissimilar sequences can have similar folds. How can we detect this rapidly, in the context of structural genomics? High-throughput NMR experiments, coupled with novel algorithms for data analysis, can address this challenge. We report an automated procedure, called HD, for detecting 3D structural homologies from sparse, unassigned protein NMR data. Our method identifies 3D models in a protein structural database whose geometries best fit the unassigned experimental NMR data. HD does not use, and is thus not limited by, sequence homology. The method can also be used to confirm or refute structural predictions made by other techniques such as protein threading or homology modelling. The running time of the algorithm is bounded in terms of the number of proteins in the database, the number of residues in the target protein, and the maximum edge weight in an integer-weighted bipartite graph. Our experiments on real NMR data from 3 different proteins against a database of 4,500 representative folds demonstrate that the method identifies closely related protein folds, including sub-domains of larger proteins, with as little as 10-30% sequence homology between the target protein (or sub-domain) and the computed model. In particular, we report no false negatives or false positives despite significant percentages of missing experimental data.
3D-Structural Homology Detection via Unassigned Residual Dipolar Couplings
Recognition of a protein's fold provides valuable information about its function. While many sequence-based homology prediction methods exist, an important challenge remains: two highly dissimilar sequences can have similar folds. How can we detect this rapidly, in the context of structural genomics? High-throughput NMR experiments, coupled with novel algorithms for data analysis, can address this challenge. We report an automated procedure for detecting 3D-structural homologies from sparse, unassigned protein NMR data. Our method identifies the 3D-structural models in a protein structural database whose geometries best fit the unassigned experimental NMR data. It does not use sequence information and is thus not limited by sequence homology. The method can also be used to confirm or refute structural predictions made by other techniques such as protein threading or sequence homology. The algorithm runs in O(pnk^3) time, where p is the number of proteins in the database, n is the number of residues in the target protein, and k is the resolution of a rotation search. The method requires only uniform 15N-labelling of the protein and processes unassigned 1H-15N residual dipolar couplings, which can be acquired in a couple of hours. Our experiments on NMR data from 5 different proteins demonstrate that the method identifies closely related protein folds, despite low sequence homology between the target protein and the computed model.
Exploring behaviors of stochastic differential equation models of biological systems using change of measures
Stochastic Differential Equations (SDE) are often used to model the stochastic dynamics of biological systems. Unfortunately, rare but biologically interesting behaviors (e.g., oncogenesis) can be difficult to observe in stochastic models. Consequently, the analysis of behaviors of SDE models using numerical simulations can be challenging. We introduce a method for solving the following problem: given an SDE model and a high-level behavioral specification about the dynamics of the model, algorithmically decide whether the model satisfies the specification. While there are a number of techniques for addressing this problem for discrete-state stochastic models, the analysis of SDE and other continuous-state models has received less attention. Our proposed solution uses a combination of Bayesian sequential hypothesis testing, non-identically distributed samples, and Girsanov's theorem for change of measures to examine rare behaviors. We use our algorithm to analyze two SDE models of tumor dynamics. Our use of non-identically distributed samples contributes to the state of the art in statistical verification and model checking of stochastic models by providing an effective means for exposing rare events in SDEs, while retaining the ability to compute bounds on the probability that those events occur.
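To illustrate how a change of measure can expose a rare event while still yielding a probability estimate, here is a minimal sketch on a toy model that is not from the paper: standard Brownian motion dX = dW with X_0 = 0, and the rare event X_T > level. Paths are simulated by Euler-Maruyama under a tilted measure with a constant added drift, and each path that hits the event is reweighted by the Girsanov likelihood ratio exp(-drift * X_T + drift^2 * T / 2). The function names and parameter values are assumptions, and the Bayesian sequential hypothesis test described in the abstract is not included.

```python
import math
import random


def rare_event_probability(level, drift, n_paths=5000, n_steps=50,
                           horizon=1.0, seed=0):
    """Estimate P(X_T > level) for dX = dW, X_0 = 0, by importance sampling.

    Paths are simulated under a tilted measure Q where dX = drift*dt + dW,
    then reweighted by dP/dQ so the estimate refers to the original model.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x = 0.0
        for _ in range(n_steps):
            # Euler-Maruyama step under the tilted measure Q
            x += drift * dt + sqrt_dt * rng.gauss(0.0, 1.0)
        if x > level:
            # Girsanov likelihood ratio dP/dQ for a constant drift shift
            total += math.exp(-drift * x + 0.5 * drift * drift * horizon)
    return total / n_paths


def exact_tail(level, horizon=1.0):
    """Closed-form P(X_T > level) for X_T ~ Normal(0, horizon)."""
    return 0.5 * math.erfc(level / math.sqrt(2.0 * horizon))


estimate = rare_event_probability(level=4.0, drift=4.0)
reference = exact_tail(4.0)
print(estimate, reference)
```

With `drift=0.0` the weights are all 1 and the estimator reduces to plain Monte Carlo, which would need far more paths to observe an event this rare.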
Peptide Binding Classification on Quantum Computers
We conduct an extensive study on using near-term quantum computers for a task in the domain of computational biology. By constructing quantum models based on parameterised quantum circuits we perform sequence classification on a task relevant to the design of therapeutic proteins, and find competitive performance with classical baselines of similar scale. To study the effect of noise, we run some of the best-performing quantum models with favourable resource requirements on emulators of state-of-the-art noisy quantum processors. We then apply error mitigation methods to improve the signal. We further execute these quantum models on the Quantinuum H1-1 trapped-ion quantum processor and observe very close agreement with noiseless exact simulation. Finally, we perform feature attribution methods and find that the quantum models indeed identify sensible relationships, at least as well as the classical baselines. This work constitutes the first proof-of-concept application of near-term quantum computing to a task critical to the design of therapeutic proteins, opening the route toward larger-scale applications in this and related fields, in line with the hardware development roadmaps of near-term quantum technologies.
Evaluation of expression and function of the H+/myo-inositol transporter HMIT
BACKGROUND:
The phosphoinositide (PIns) signalling pathway regulates a series of neuronal processes, such as neurotransmitter release, that are thought to be altered in mood disorders. Furthermore, mood-stabilising drugs have been shown to inhibit key enzymes that regulate PIns production and alter neuronal growth cone morphology in an inositol-reversible manner. Here, we describe analyses of expression and function of the recently identified H+/myo-inositol transporter (HMIT) investigated as a potential regulator of PIns signalling.
RESULTS:
We show that HMIT is primarily a neuronal transporter widely expressed in the rat and human brain, with particularly high levels in the hippocampus and cortex, as shown by immunohistochemistry. The transporter is localised at the Golgi apparatus in primary cultured neurones. No HMIT-mediated electrophysiological responses were detected in rat brain neurones or slices; in addition, inositol transport and homeostasis were unaffected in HMIT targeted null-mutant mice.
CONCLUSION:
Together, these data do not support a role for HMIT as a neuronal plasma membrane inositol transporter, as previously proposed. However, we observed that HMIT can transport inositol triphosphate, indicating unanticipated intracellular functions for this transporter that may be relevant to mood control.
GENE-Counter: A Computational Pipeline for the Analysis of RNA-Seq Data for Gene Expression Differences
GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable to prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays and the substantial overlap observed in results from each of the three statistics packages demonstrate that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.
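As background for the negative binomial models mentioned above, the sketch below evaluates the probability mass function in the mean/dispersion parameterisation often used for count data, where the variance is mean + dispersion * mean^2. This is the standard distribution only, not GENE-counter's over-parameterized test; the function name and example values are assumptions, and Python is used for illustration even though the pipeline itself is Perl-based.

```python
import math


def neg_binomial_pmf(count, mean, dispersion):
    """P(Y = count) for a negative binomial with the given mean and
    dispersion, so that Var(Y) = mean + dispersion * mean**2."""
    size = 1.0 / dispersion  # the "number of failures" parameter r
    log_p = (
        math.lgamma(count + size) - math.lgamma(size) - math.lgamma(count + 1)
        + size * math.log(size / (size + mean))
        + count * math.log(mean / (size + mean))
    )
    return math.exp(log_p)


# Hypothetical gene with mean count 10 and dispersion 0.5
probs = [neg_binomial_pmf(y, mean=10.0, dispersion=0.5) for y in range(2000)]
total_mass = sum(probs)
empirical_mean = sum(y * p for y, p in enumerate(probs))
empirical_var = sum((y - empirical_mean) ** 2 * p for y, p in enumerate(probs))
print(total_mass, empirical_mean, empirical_var)
```

The extra dispersion term is what lets the model absorb the high variability in gene counts that a Poisson model (variance equal to the mean) cannot.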
- …