
83 research outputs found

Universality and predictability in molecular quantitative genetics

Author: Held Torsten
Lässig Michael
Nourmohammad Armita
Publication date: 01/01/2013

Molecular traits, such as gene expression levels or protein binding affinities, are increasingly accessible to quantitative measurement by modern high-throughput techniques. Such traits measure molecular functions and, from an evolutionary point of view, are important as targets of natural selection. We review recent developments in evolutionary theory and experiments that are expected to become building blocks of a quantitative genetics of molecular traits. We focus on universal evolutionary characteristics: these are largely independent of a trait's genetic basis, which is often at least partially unknown. We show that universal measurements can be used to infer selection on a quantitative trait, which determines its evolutionary mode of conservation or adaptation. Furthermore, universality is closely linked to predictability of trait evolution across lineages. We argue that universal trait statistics extends over a range of cellular scales and opens new avenues of quantitative evolutionary systems biology.

arXiv.org e-Print Archive

Kölner UniversitätsPublikationsServer

Adaptive evolution of molecular phenotypes

Author: Armita Nourmohammad
Michael Lässig
Torsten Held
Publication venue: IOP Publishing
Publication date: 01/01/2014

Molecular phenotypes link genomic information with organismic functions, fitness, and evolution. Quantitative traits are complex phenotypes that depend on multiple genomic loci. In this paper, we study the adaptive evolution of a quantitative trait under time-dependent selection, which arises from environmental changes or through fitness interactions with other co-evolving phenotypes. We analyze a model of trait evolution under mutations and genetic drift in a single-peak fitness seascape. The fitness peak performs a constrained random walk in the trait amplitude, which determines the time-dependent trait optimum in a given population. We derive analytical expressions for the distribution of the time-dependent trait divergence between populations and of the trait diversity within populations. Based on this solution, we develop a method to infer adaptive evolution of quantitative traits. Specifically, we show that the ratio of the average trait divergence and the diversity is a universal function of evolutionary time, which predicts the stabilizing strength and the driving rate of the fitness seascape. From an information-theoretic point of view, this function measures the macro-evolutionary entropy in a population ensemble, which determines the predictability of the evolutionary process. Our solution also quantifies two key characteristics of adapting populations: the cumulative fitness flux, which measures the total amount of adaptation, and the adaptive load, which is the fitness cost due to a population's lag behind the fitness peak.

arXiv.org e-Print Archive

Crossref

Kölner UniversitätsPublikationsServer

The size of the immune repertoire of bacteria

Author: Balasubramanian Vijay
Bradde Serena
Goyal Sidhartha
Nourmohammad Armita
Publication date: 01/03/2019

Some bacteria and archaea possess an immune system, based on the CRISPR-Cas mechanism, that confers adaptive immunity against phage. In such species, individual bacteria maintain a "cassette" of viral DNA elements called spacers as a memory of past infections. The typical cassette contains a few dozen spacers. Given that bacteria can have very large genomes, and since having more spacers should confer a better memory, it is puzzling that so little genetic space would be devoted by bacteria to their adaptive immune system. Here, we identify a fundamental trade-off between the size of the bacterial immune repertoire and effectiveness of response to a given threat, and show how this trade-off imposes a limit on the optimal size of the CRISPR cassette.

arXiv.org e-Print Archive

MPG.PuRe

Holographic-(V)AE: an end-to-end SO(3)-Equivariant (Variational) Autoencoder in Fourier Space

Author: Nourmohammad Armita
Pun Michael N.
Visani Gian Marco
Publication date: 30/09/2022

Group-equivariant neural networks have emerged as a data-efficient approach to solve classification and regression tasks, while respecting the relevant symmetries of the data. However, little work has been done to extend this paradigm to the unsupervised and generative domains. Here, we present Holographic-(V)AE (H-(V)AE), a fully end-to-end SO(3)-equivariant (variational) autoencoder in Fourier space, suitable for unsupervised learning and generation of data distributed around a specified origin. H-(V)AE is trained to reconstruct the spherical Fourier encoding of data, learning in the process a latent space with a maximally informative invariant embedding alongside an equivariant frame describing the orientation of the data. We extensively test the performance of H-(V)AE on diverse datasets and show that its latent space efficiently encodes the categorical features of spherical images and structural features of protein atomic environments. Our work can further be seen as a case study for equivariant modeling of a data distribution by reconstructing its Fourier encoding.

arXiv.org e-Print Archive

H-Packer: Holographic Rotationally Equivariant Convolutional Neural Network for Protein Side-Chain Packing

Author: Galvin William
Nourmohammad Armita
Pun Michael Neal
Visani Gian Marco
Publication date: 28/11/2023

Accurately modeling protein 3D structure is essential for the design of functional proteins. An important sub-task of structure modeling is protein side-chain packing: predicting the conformation of side-chains (rotamers) given the protein's backbone structure and amino-acid sequence. Conventional approaches for this task rely on expensive sampling procedures over hand-crafted energy functions and rotamer libraries. Recently, several deep learning methods have been developed to tackle the problem in a data-driven way, albeit with vastly different formulations (from image-to-image translation to directly predicting atomic coordinates). Here, we frame the problem as a joint regression over the side-chains' true degrees of freedom: the dihedral $\chi$ angles. We carefully study possible objective functions for this task, while accounting for the underlying symmetries of the task. We propose Holographic Packer (H-Packer), a novel two-stage algorithm for side-chain packing built on top of two light-weight rotationally equivariant neural networks. We evaluate our method on CASP13 and CASP14 targets. H-Packer is computationally efficient and shows favorable performance against conventional physics-based algorithms and is competitive against alternative deep learning solutions.

Comment: Accepted as a conference paper at MLCB 2023.

arXiv.org e-Print Archive

Deep generative selection models of T and B cell receptor repertoires with soNNia

Author: Isacchini Giulio
Mora Thierry
Nourmohammad Armita
Walczak Aleksandra M
Publication date: 06/11/2020

Subclasses of lymphocytes carry different functional roles to work together to produce an immune response and lasting immunity. In addition to these functional roles, T and B-cell lymphocytes rely on the diversity of their receptor chains to recognize different pathogens. The lymphocyte subclasses emerge from common ancestors generated with the same diversity of receptors during selection processes. Here we leverage biophysical models of receptor generation with machine learning models of selection to identify specific sequence features characteristic of functional lymphocyte repertoires and subrepertoires. Specifically, using only repertoire-level sequence information, we classify CD4$^+$ and CD8$^+$ T-cells, find correlations between receptor chains arising during selection and identify T-cell subsets that are targets of pathogenic epitopes. We also show examples of when simple linear classifiers do as well as more complex machine learning methods.

arXiv.org e-Print Archive

Hal-Diderot

MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories

Author: Isacchini Giulio
Mora Thierry
Nourmohammad Armita
Spisak Natanael
Walczak Aleksandra M.
Publication date: 16/12/2021

Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice. One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio, or equivalently the posterior function. We show that this approach can be formulated in terms of mutual information maximization between model parameters and simulated data. We use this equivalence to reinterpret existing approaches for amortized inference and propose two new methods that rely on lower bounds of the mutual information. We apply our framework to the inference of parameters of stochastic processes and chaotic dynamical systems from sampled trajectories, using artificial neural networks for posterior prediction. Our approach provides a unified framework that leverages the power of mutual information estimators for inference.

arXiv.org e-Print Archive

MPG.PuRe