Recommended from our members
Predicting multibody assembly of proteins
This thesis addresses the multi-body assembly (MBA) problem in the context of protein assemblies. [...] In this thesis, we chose the protein assembly domain because accurate and reliable computational modeling, simulation and prediction of such assemblies would clearly accelerate discoveries in understanding the complexities of metabolic pathways, identifying the molecular basis for normal health and diseases, and designing new drugs and other therapeutics. [...] [We developed] F²Dock (Fast Fourier Docking), whose multi-term scoring function includes both a statistical thermodynamic approximation of molecular free energy and several knowledge-based terms. Parameters of the scoring model were learned from a large set of positive/negative examples, and when tested on 176 protein complexes of various types, showed excellent accuracy in ranking correct configurations higher (F²Dock ranks the correct solution as the top-ranked one in 22/176 cases, which is better than other unsupervised prediction software on the same benchmark). Most of the protein-protein interaction scoring terms can be expressed as integrals, over the occupied volume, the boundary, or a set of discrete points (atom locations), of distance-dependent decaying kernels. We developed a dynamic adaptive grid (DAG) data structure which computes smooth surface and volumetric representations of a protein complex in O(m log m) time, where m is the number of atoms, assuming that the smallest feature size h is Θ(r_max), where r_max is the radius of the largest atom; it supports updates in O(log m) time and uses O(m) memory. We also developed the dynamic packing grids (DPG) data structure, which supports quasi-constant time updates (O(log w)) and spherical neighborhood queries (O(log log w)), where w is the word size of the RAM. DPG and DAG together result in an O(k) time approximation of scoring terms, where k << m is the size of the contact region between proteins.
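The spherical neighborhood queries described above can be illustrated with a much simpler structure than DPG. The following is a minimal sketch using a plain uniform hash grid; it does not reproduce the thesis's DPG or its O(log log w) bounds, and the class name `AtomGrid`, its methods, and the cell size are assumptions made for illustration only.

```python
import math
from collections import defaultdict


class AtomGrid:
    """Uniform hash grid over atom centers (a simplified stand-in, not DPG)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        # Maps an integer cell index (i, j, k) to a list of (atom_id, position).
        self.cells = defaultdict(list)

    def _key(self, p):
        # Integer cell index containing point p.
        return tuple(math.floor(c / self.cell_size) for c in p)

    def insert(self, atom_id, p):
        self.cells[self._key(p)].append((atom_id, p))

    def remove(self, atom_id, p):
        key = self._key(p)
        self.cells[key] = [(a, q) for a, q in self.cells[key] if a != atom_id]

    def neighbors(self, center, radius):
        """Return sorted ids of atoms whose centers lie within radius of center."""
        lo = self._key(tuple(c - radius for c in center))
        hi = self._key(tuple(c + radius for c in center))
        found = []
        # Visit only the cells overlapping the bounding box of the query sphere.
        for i in range(lo[0], hi[0] + 1):
            for j in range(lo[1], hi[1] + 1):
                for k in range(lo[2], hi[2] + 1):
                    for atom_id, q in self.cells.get((i, j, k), ()):
                        if math.dist(center, q) <= radius:
                            found.append(atom_id)
        return sorted(found)


grid = AtomGrid(cell_size=2.0)
grid.insert("a", (0.0, 0.0, 0.0))
grid.insert("b", (1.5, 0.0, 0.0))
grid.insert("c", (5.0, 5.0, 5.0))
near_origin = grid.neighbors((0.0, 0.0, 0.0), 2.0)
```

With the cell size chosen close to the query radius, each query touches only a small, bounded block of cells, which is the intuition behind restricting scoring work to the contact region rather than to all m atoms.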
[...] [W]e consider the symmetric spherical shell assembly case, where multiple copies of identical proteins tile the surface of a sphere. Though this is a restricted subclass of MBA, it is an important one since it would accelerate development of drugs and antibodies to prevent viruses from forming capsids, which have such spherical symmetry in nature. We proved that it is possible to characterize the space of possible symmetric spherical layouts using a small number of representative local arrangements (called tiles) and their global configurations (tilings). We further show that the tilings, and the mapping of proteins to tilings on arbitrarily sized shells, are parameterized by 3 discrete parameters and 6 continuous degrees of freedom, and that the 3 discrete DOF can be restricted to a constant number of cases if the size of the shell is known (in terms of the number of proteins n). We also consider the case where a coarse model of the whole complex of proteins is available. We show that even when such coarse models do not show atomic positions, they can be sufficient to identify a general location for each protein and its neighbors, and thereby restrict the configurational space. We developed an iterative refinement search protocol that leverages such multi-resolution structural data to predict accurate high-resolution models of protein complexes, and successfully applied the protocol to model gp120, a protein on the spike of HIV and currently the most feasible target for anti-HIV drug design.
Computer Science
2022 Review of Data-Driven Plasma Science
Data-driven science and technology offer transformative tools and methods to science. This review article highlights the latest developments and progress in the interdisciplinary field of data-driven plasma science (DDPS), i.e., plasma science whose progress is driven strongly by data and data analyses. Plasma is considered to be the most ubiquitous form of observable matter in the universe. Data associated with plasmas can, therefore, cover extremely large spatial and temporal scales, and often provide essential information for other scientific disciplines. Thanks to the latest technological developments, plasma experiments, observations, and computation now produce a large amount of data that can no longer be analyzed or interpreted manually. This trend now necessitates a highly sophisticated use of high-performance computers for data analyses, making artificial intelligence and machine learning vital components of DDPS. This article contains seven primary sections, in addition to the introduction and summary. Following an overview of fundamental data-driven science, five other sections cover widely studied topics of plasma science and technologies, i.e., basic plasma physics and laboratory experiments, magnetic confinement fusion, inertial confinement fusion and high-energy-density physics, space and astronomical plasmas, and plasma technologies for industrial and other applications. The final section before the summary discusses plasma-related databases that could significantly contribute to DDPS. Each primary section starts with a brief introduction to the topic, discusses the state-of-the-art developments in the use of data and/or data-scientific approaches, and presents a summary and outlook. Despite the recent impressive signs of progress, DDPS is still in its infancy. This article attempts to offer a broad perspective on the development of this field and identify where further innovations are required.
Silicon photovoltaics: experimental testing and modelling of fracture across scales
The properties of materials can be studied through a multi-scale approach, so that the peculiar aspects of each level of analysis can be grasped. Tracing a path through the state of the art in the available literature, historically in the field of mechanics studies are found in which the material is studied through nonlocal theories based on continuous or discrete local approaches. More recently, with the advent of computers with great computational power, analytical methodologies based on very complex theories from chemistry and physics have been developed, capable of discretizing the material at the atomic scale and studying its behavior by applying energy approaches. Starting from the analysis of some of these theories at the nano- and micro-scales, it is possible to investigate the separation mechanisms at the molecular level, which may be considered as cracking processes within the material according to the adopted scale of analysis. Applying theories of this kind to large portions of material, involving up to some millions of particles, is not a practical solution, since it would require a huge effort in terms of computation time. To work around this problem and find a method suitable for the study of cracking mechanisms, a mixed method (MDFEM) was developed by combining pure molecular dynamics (MD) and the finite element method (FEM), in which the material is discretized by means of one-dimensional elements whose mechanical characteristics are derived from MD. This approach is based on the application of a nonlocal theory in which the contribution of a portion of material placed within a certain distance from the point of fracture is taken into account by means of a nonlocality parameter. Moreover, the evolution of cracking is addressed at the meso-scale by applying a nonlinear cohesive model.
Moving towards the macroscale, the theories put forward so far have been applied to the study of breakage phenomena inside Silicon cells embedded in rigid or semi-flexible photovoltaic modules. By performing various laboratory tests, useful for characterizing the material and for understanding the evolution of the cracking process due to multiple causes, a study was conducted on the main issues that may compromise the durability of photovoltaic panels and the maintenance of their expected service levels. Experimental results have been interpreted by using appropriate macroscopic continuum models. The research aimed to provide an introduction to the correct design of these energy production systems in order to increase their durability and resistance to cracking.
MS FT-2-2 7 Orthogonal polynomials and quadrature: Theory, computation, and applications
Quadrature rules find many applications in science and engineering. Their analysis is a classical area of applied mathematics and continues to attract considerable attention. This seminar brings together speakers with expertise in a large variety of quadrature rules. The aim of the seminar is to provide an overview of recent developments in the analysis of quadrature rules. The computation of error estimates and novel applications are also described.
Generalized averaged Gaussian quadrature and applications
A simple numerical method for constructing the optimal generalized averaged Gaussian quadrature formulas will be presented. These formulas exist in many cases in which real positive Gauss-Kronrod formulas do not exist, and can be used as an adequate alternative in order to estimate the error of a Gaussian rule. We also investigate the conditions under which the optimal averaged Gaussian quadrature formulas and their truncated variants are internal.
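To make the error-estimation idea concrete, here is a minimal sketch that uses the classical averaged rule (the mean of an n-point Gaussian rule and its (n+1)-point anti-Gaussian companion), not the optimal generalized averaged formulas of the talk. It assumes the Legendre weight on [-1, 1] and n = 2; the anti-Gaussian nodes and weights are hard-coded from a hand calculation on the Jacobi matrix with its last off-diagonal coefficient doubled, and all function names are hypothetical.

```python
import math

# 2-point Gauss-Legendre rule on [-1, 1], as (node, weight) pairs.
GAUSS_2 = [(-1 / math.sqrt(3), 1.0), (1 / math.sqrt(3), 1.0)]

# 3-point anti-Gaussian rule paired with GAUSS_2 (Legendre weight).
# Its quadrature error is the negative of the Gauss error for
# polynomials of degree up to 2n + 1 = 5.
ANTI_GAUSS_3 = [
    (-math.sqrt(13 / 15), 5 / 13),
    (0.0, 16 / 13),
    (math.sqrt(13 / 15), 5 / 13),
]


def apply_rule(rule, f):
    """Evaluate a quadrature rule given as (node, weight) pairs."""
    return sum(w * f(x) for x, w in rule)


def averaged_estimate(f):
    """Return (Gauss value, averaged value, estimated error of the Gauss value)."""
    gauss = apply_rule(GAUSS_2, f)
    anti = apply_rule(ANTI_GAUSS_3, f)
    averaged = 0.5 * (gauss + anti)
    # The averaged rule is exact to higher degree, so the difference
    # serves as an estimate of the error of the Gauss rule.
    return gauss, averaged, averaged - gauss


gauss_val, averaged_val, error_est = averaged_estimate(math.exp)
```

For the integrand `math.exp`, `error_est` can be compared against the true error `(math.e - 1 / math.e) - gauss_val` to see how closely the averaged rule tracks it.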
Mass & secondary structure propensity of amino acids explain their mutability and evolutionary replacements
Why is an amino acid replacement in a protein accepted during evolution? The answer given by bioinformatics relies on the frequency of change of each amino acid by another one and the propensity of each to remain unchanged. We propose that these replacement rules are recoverable from the secondary structural trends of amino acids. A distance measure between high-resolution Ramachandran distributions reveals that structurally similar residues coincide with those found in substitution matrices such as BLOSUM: Asn ↔ Asp, Phe ↔ Tyr, Lys ↔ Arg, Gln ↔ Glu, Ile ↔ Val, Met → Leu; with Ala, Cys, His, Gly, Ser, Pro, and Thr as structurally idiosyncratic residues. We also found a high average correlation ($\overline{R} = 0.85$) between thirty amino acid mutability scales and the mutational inertia ($I_X$), which measures the energetic cost weighted by the number of observations at the most probable amino acid conformation. These results indicate that amino acid substitutions follow two optimally efficient principles: (a) amino acid interchangeability privileges secondary structural similarity, and (b) amino acid mutability depends directly on biosynthetic energy cost and inversely on frequency. These two principles are the underlying rules governing the observed amino acid substitutions. © 2017 The Author(s)
Atomistic modelling of precipitation in Ni-base superalloys
The presence of the ordered phase ($\gamma'$) in Ni-base superalloys is fundamental to the performance of engineering components such as turbine disks and blades, which operate at high temperatures and loads. Hence for these alloys it is important to optimize their microstructure and phase composition. This is typically done by varying their chemistry and heat treatment to achieve an appropriate balance between $\gamma'$ content and other constituents such as carbides, borides, oxides and topologically close packed phases. In this work we have set out to investigate the onset of $\gamma'$ ordering in Ni-Al single crystals and in Ni-Al bicrystals containing coincidence site lattice grain boundaries (GBs), and we do this at high temperatures, which are representative of typical heat treatment schedules including quenching and annealing. For this we use the atomistic simulation methods of molecular dynamics (MD) and density functional theory (DFT).
In the first part of this work we develop robust Bayesian classifiers to identify the $\gamma'$ phase in large-scale simulation boxes at high temperatures around 1500 K. We observe significant $\gamma'$ ordering in the simulations in the form of clusters of $\gamma'$-like ordered atoms embedded in a host solid solution, and this happens within 100 ns. Single crystals are found to exhibit the expected homogeneous ordering with slight indications of chemical composition change and a positive correlation between the Al concentration and the concentration of the $\gamma'$ phase. In general, the ordering is found to take place faster in systems with GBs and preferentially adjacent to the GBs. The sole exception to this is the tilt GB, which is a coherent twin. An analysis of the ensemble and time lag average displacements of the GBs reveals mostly 'anomalous diffusion' behaviour. Increasing the Al content from pure Ni to Ni 20 at.% Al was found to either consistently increase or decrease the mobility of the GB, as seen from the changing slope of the time lag displacement average. The movement of the GB can then be characterized as either 'super-diffusive' or 'sub-diffusive' and is interpreted in terms of diffusion induced grain boundary migration, which is posited as a possible precursor to the appearance of serrated edge grain boundaries.
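The classification into super- and sub-diffusive motion from the slope of a time-lag displacement average can be sketched as follows. This is a minimal illustration for a single one-dimensional trajectory (no ensemble average), not the analysis code of the thesis; the function names, the tolerance `tol`, and the synthetic trajectory are assumptions.

```python
import math


def time_averaged_msd(positions, max_lag):
    """Mean squared displacement averaged over all time origins, for lags 1..max_lag."""
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [
            (positions[t + lag] - positions[t]) ** 2
            for t in range(len(positions) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


def diffusion_exponent(msd, dt=1.0):
    """Least-squares slope of log(MSD) versus log(lag time); requires all MSD > 0."""
    xs = [math.log((i + 1) * dt) for i in range(len(msd))]
    ys = [math.log(m) for m in msd]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def classify(alpha, tol=0.1):
    """Label the motion from the exponent alpha in MSD ~ t**alpha."""
    if alpha > 1 + tol:
        return "super-diffusive"
    if alpha < 1 - tol:
        return "sub-diffusive"
    return "normal"


# Synthetic boundary position drifting at constant velocity (ballistic motion).
trajectory = [0.5 * t for t in range(200)]
alpha = diffusion_exponent(time_averaged_msd(trajectory, max_lag=20))
label = classify(alpha)
```

An exponent near 1 corresponds to ordinary diffusion; the constant-velocity trajectory used here is the limiting super-diffusive case, with the mean squared displacement growing as the square of the lag.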
In the second part of this work we develop a method for the training of empirical interatomic potentials to capture more elements in the alloy system. We focus on the embedded atom method (EAM) and use the Ni-Al system as a test case. Recently, empirical potentials have been developed based on results from DFT which utilize energies and forces, but neglect the electron densities, which are also available. Noting the importance of electron densities, we propose a route to include them into the training of EAM-type potentials via Bayesian linear regression. Electron density models obtained for structures with a range of bonding types are shown to accurately reproduce the electron densities from DFT. Also, the resulting empirical potentials accurately reproduce DFT energies and forces of all the phases considered within the Ni-Al system. Properties not included in the training process, such as stacking fault energies, are sometimes not reproduced with the desired accuracy and the reasons for this are discussed. General regression issues, known to the machine learning community, are identified as the main difficulty facing further development of empirical potentials using this approach.
EPSRC, Rolls-Royc