An Exact Algorithm for Side-Chain Placement in Protein Design
Computational protein design aims at constructing novel or improved functions
on the structure of a given protein backbone and has important applications in
the pharmaceutical and biotechnical industry. The underlying combinatorial
side-chain placement problem consists of choosing a side-chain placement for
each residue position such that the resulting overall energy is minimum. The
choice of the side-chain then also determines the amino acid for this position.
Many algorithms for this NP-hard problem have been proposed in the context of
homology modeling, which, however, reach their limits when faced with large
protein design instances.
In this paper, we propose a new exact method for the side-chain placement
problem that works well even for large instance sizes as they appear in protein
design. Our main contribution is a dedicated branch-and-bound algorithm that
combines tight upper and lower bounds resulting from a novel Lagrangian
relaxation approach for side-chain placement. Our experimental results show
that our method outperforms alternative state-of-the-art exact approaches and
makes it possible to optimally solve large protein design instances routinely.
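To make the combinatorial problem concrete, the following is a minimal branch-and-bound sketch for side-chain placement, assuming the usual decomposition of the energy into per-position self terms and pairwise terms. It is not the authors' algorithm: the lower bound here is a simple rotamer-wise minimum rather than the Lagrangian relaxation described in the abstract, and all names (`branch_and_bound`, `self_energy`, `pair_energy`, `brute_force`) are hypothetical.

```python
from itertools import product


def total_energy(assignment, self_energy, pair_energy):
    """Exact energy of a full assignment (one rotamer index per position)."""
    n = len(assignment)
    e = sum(self_energy[i][assignment[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e += pair_energy[(i, j)][assignment[i]][assignment[j]]
    return e


def branch_and_bound(self_energy, pair_energy):
    """Return (minimum energy, assignment) by depth-first branch and bound.

    self_energy[i][r]        : energy of rotamer r at position i
    pair_energy[(i, j)][r][s]: interaction energy for i < j (assumed layout)
    """
    n = len(self_energy)
    best = {"energy": float("inf"), "assignment": None}

    def bound(partial):
        k = len(partial)
        # Exact contribution of the positions fixed so far.
        e = 0.0
        for i in range(k):
            e += self_energy[i][partial[i]]
            for j in range(i + 1, k):
                e += pair_energy[(i, j)][partial[i]][partial[j]]
        # Optimistic contribution of each open position: best rotamer given
        # the fixed positions, with later open pairs taken at their minimum.
        for j in range(k, n):
            options = []
            for r in range(len(self_energy[j])):
                t = self_energy[j][r]
                t += sum(pair_energy[(i, j)][partial[i]][r] for i in range(k))
                t += sum(min(pair_energy[(j, l)][r]) for l in range(j + 1, n))
                options.append(t)
            e += min(options)
        return e

    def recurse(partial):
        b = bound(partial)
        if b >= best["energy"]:
            return  # prune: this subtree cannot beat the incumbent
        if len(partial) == n:
            # For a full assignment the bound equals the exact energy.
            best["energy"], best["assignment"] = b, tuple(partial)
            return
        for r in range(len(self_energy[len(partial)])):
            recurse(partial + [r])

    recurse([])
    return best["energy"], best["assignment"]


def brute_force(self_energy, pair_energy):
    """Reference solution by full enumeration; only viable for tiny inputs."""
    ranges = [range(len(rotamers)) for rotamers in self_energy]
    return min(total_energy(a, self_energy, pair_energy) for a in product(*ranges))
```

On small inputs the result can be checked against `brute_force`. A tighter bound, such as the Lagrangian relaxation the abstract refers to, would prune more of the search tree; the weak bound here is chosen only for brevity.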
Big data opportunities and challenges for assessing multiple stressors across scales in aquatic ecosystems
Aquatic ecosystems are under threat from multiple stressors, which vary in distribution and intensity across temporal and spatial scales. Monitoring and assessment of these ecosystems have historically focussed on collection of physical and chemical information and increasingly include associated observations on biological condition. However, ecosystem assessment is often lacking because the scale and quality of biological observations frequently fail to match those available from physical and chemical measurements. The advent of high-performance computing, coupled with new earth observation platforms, has accelerated the adoption of molecular and remote sensing tools in ecosystem assessment. To assess how emerging science and tools can be applied to study multiple stressors on a large (ecosystem) scale and to facilitate greater integration of approaches among different scientific disciplines, a workshop was held on 10-12 September 2014 at the Sydney Institute of Marine Sciences, Australia. Here we introduce a conceptual framework for assessing multiple stressors across ecosystems using emerging sources of big data and critique a range of available big-data types that could support models for multiple stressors. We define big data as any set or series of data that is so large or complex that it becomes difficult to analyse using traditional data analysis methods.
A Bayesian approach for determining protein side-chain rotamer conformations using unassigned NOE data
Abstract. A major bottleneck in protein structure determination via nuclear magnetic resonance (NMR) is the lengthy and laborious process of assigning resonances and nuclear Overhauser effect (NOE) cross peaks. Recent studies have shown that accurate backbone folds can be determined using sparse NMR data, such as residual dipolar couplings (RDCs) or backbone chemical shifts. This raises the question of whether we can also determine accurate protein side-chain conformations using sparse or unassigned NMR data. We attack this question by using unassigned nuclear Overhauser effect spectroscopy (NOESY) data, which record the through-space dipolar interactions between protons nearby in 3D space. We propose a Bayesian approach with a Markov random field (MRF) model to integrate the likelihood function derived from observed experimental data with prior information (i.e., empirical molecular mechanics energies) about the protein structures. We unify the side-chain structure prediction problem with the side-chain structure determination problem using unassigned NMR data, and apply the deterministic dead-end elimination (DEE) and A* search algorithms t