98,376 research outputs found

On the entropy of protein families

Author: Barton John
Chakraborty Arup
Cocco Simona
Jacquin Hugo
Monasson Rémi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 26/12/2015

Proteins are essential components of living systems, capable of performing a huge variety of tasks at the molecular level, such as recognition, signalling, copying and transport. The protein sequences realizing a given function may vary widely across organisms, giving rise to a protein family. Here, we estimate the entropy of those families based on different approaches, including Hidden Markov Models used for protein databases and inferred statistical models reproducing the low-order (1- and 2-point) statistics of multi-sequence alignments. We also compute the entropic cost, that is, the loss in entropy resulting from a constraint acting on the protein, such as the fixation of one particular amino acid on a specific site, and relate this notion to the escape probability of the HIV virus. The case of lattice proteins, for which the entropy can be computed exactly, allows us to provide another illustration of the concept of cost, due to the competition of different folds. The relevance of the entropy in relation to directed evolution experiments is stressed. Comment: to appear in Journal of Statistical Physics
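As a rough illustration of the "entropic cost" idea in the abstract above, the sketch below computes per-site Shannon entropies from a toy alignment and treats the cost of fixing one site as the entropy lost at that site. This assumes an independent-site model, which is much simpler than the Hidden Markov Models and pairwise models the abstract mentions; the function names and the toy alignment are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def site_entropies(alignment):
    """Return the Shannon entropy (in nats) of each alignment column."""
    length = len(alignment[0])
    entropies = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        entropies.append(
            -sum((c / total) * math.log(c / total) for c in counts.values())
        )
    return entropies


def entropic_cost_of_fixing(alignment, site):
    """Entropy lost when `site` is fixed to a single residue.

    Assumption: sites are independent, so the family entropy is the sum of
    the per-site entropies and fixing a site removes exactly its own term.
    """
    return site_entropies(alignment)[site]


# Hypothetical toy alignment (4 sequences, 3 sites), for illustration only.
toy_alignment = ["ACD", "ACE", "GCD", "GCE"]
total_entropy = sum(site_entropies(toy_alignment))
cost = entropic_cost_of_fixing(toy_alignment, 0)
print(f"total entropy = {total_entropy:.4f} nats, cost of fixing site 0 = {cost:.4f} nats")
```

In a model with couplings between sites, the cost of a constraint would generally differ from the single-site entropy, because fixing one residue also restricts the residues that co-vary with it.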

arXiv.org e-Print Archive

DSpace@MIT

Hal-Diderot

The Roles of Gene Duplication, Gene Conversion and Positive Selection in Rodent Esp and Mup Pheromone Gene Families with Comparison to the Abp Family

Author: Karn Robert C.
Laukaitis Christina M.
Publication venue: Digital Commons @ Butler University
Publication date: 01/01/2012

Three proteinaceous pheromone families, the androgen-binding proteins (ABPs), the exocrine-gland secreting peptides (ESPs) and the major urinary proteins (MUPs), are encoded by large gene families in the genomes of Mus musculus and Rattus norvegicus. We studied the evolutionary histories of the Mup and Esp genes and compared them with what is known about the Abp genes. Apparently gene conversion has played little if any role in the expansion of the mouse Class A and Class B Mup genes and pseudogenes, and the rat Mups. By contrast, we found evidence of extensive gene conversion in many Esp genes, although not in all of them. Our studies of selection identified at least two amino acid sites in β-sheets as having evolved under positive selection in the mouse Class A and Class B MUPs and in rat MUPs. We show that selection may have acted on the ESPs by determining Ka/Ks for Exon 3 sequences with and without the converted sequence segment. While it appears that purifying selection acted on the ESP signal peptides, the secreted portions of the ESPs probably have undergone much more rapid evolution. When the inner gene converted fragment sequences were removed, eleven Esp paralogs were present in two or more pairs with Ka/Ks > 1.0, and thus we propose that positive selection is detectable by this means in at least some mouse Esp paralogs. We compare and contrast the evolutionary histories of all three mouse pheromone gene families in light of their proposed functions in mouse communication.

CiteSeerX

Directory of Open Access Journals

PubMed Central

Digital Commons @ Butler University

FigShare

Family-specific scaling laws in bacterial genomes

Author: de Lazzari Eleonora
Grilli Jacopo
Lagomarsino Marco Cosentino
Maslov Sergei
Publication date: 01/01/2017

Among several quantitative invariants found in evolutionary genomics, one of the most striking is the scaling of the overall abundance of proteins, or protein domains, sharing a specific functional annotation across genomes of given size. The sizes of these functional categories change, on average, as power laws in the total number of protein-coding genes. Here, we show that such regularities are not restricted to the overall behavior of high-level functional categories, but also exist systematically at the level of single evolutionary families of protein domains. Specifically, the number of proteins within each family follows family-specific scaling laws with genome size. Functionally similar sets of families tend to follow similar scaling laws, but this is not always the case. To understand this systematically, we provide a comprehensive classification of families based on their scaling properties. Additionally, we develop a quantitative score for the heterogeneity of the scaling of families belonging to a given category or predefined group. Under the common reasonable assumption that selection is driven solely or mainly by biological function, these findings point to fine-tuned and interdependent functional roles of specific protein domains, beyond our current functional annotations. This analysis provides a deeper view on the links between evolutionary expansion of protein families and the functional constraints shaping the gene repertoire of bacterial genomes. Comment: 41 pages, 16 figures
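A scaling law of the kind described above, family size ≈ prefactor × (genome size)^exponent, can be estimated with an ordinary least-squares fit in log-log space. The sketch below is only a minimal illustration of that generic technique: the function name is hypothetical, the data are synthetic, and the fitting procedure actually used in the paper may differ.

```python
import math


def fit_power_law(genome_sizes, family_counts):
    """Fit counts ~ prefactor * size**exponent by least squares on logs.

    Returns (exponent, prefactor). All inputs must be strictly positive.
    """
    xs = [math.log(x) for x in genome_sizes]
    ys = [math.log(y) for y in family_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx                      # slope of the log-log line
    prefactor = math.exp(mean_y - exponent * mean_x)  # exp(intercept)
    return exponent, prefactor


# Synthetic data (not from the paper): an exact power law with exponent 2.
sizes = [1000, 2000, 4000, 8000]
counts = [0.001 * s ** 2 for s in sizes]
exponent, prefactor = fit_power_law(sizes, counts)
print(f"exponent = {exponent:.3f}, prefactor = {prefactor:.5f}")
```

On real genome data, families with zero members in some genomes would need separate handling before taking logarithms, since the log of zero is undefined.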

arXiv.org e-Print Archive

AIR Universita degli studi di Milano

The regulation of differentiation in mesenchymal stem cells

Author: Augello Andrea
De Bari Cosimo
Publication venue: 'Mary Ann Liebert Inc'
Publication date: 30/08/2010

Peer reviewed. Publisher PDF

Aberdeen University Research

Inverse Statistical Physics of Protein Sequences: A Key Issues Review

Author: Cocco Simona
Feinauer Christoph
Figliuzzi Matteo
Monasson Remi
Weigt Martin
Publication venue: 'IOP Publishing'
Publication date: 03/03/2017

In the course of evolution, proteins undergo important changes in their amino acid sequences, while their three-dimensional folded structure and their biological function remain remarkably conserved. Thanks to modern sequencing techniques, sequence data accumulate at an unprecedented pace. This provides large sets of so-called homologous, i.e. evolutionarily related, protein sequences, to which methods of inverse statistical physics can be applied. Using sequence data as the basis for the inference of Boltzmann distributions from samples of microscopic configurations or observables, it is possible to extract information about evolutionary constraints and thus protein function and structure. Here we give an overview of some biologically important questions, and how statistical-mechanics inspired modeling approaches can help to answer them. Finally, we discuss some open questions, which we expect to be addressed over the next years. Comment: 18 pages, 7 figures

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Bocconi

HAL-Inserm

HAL-INSU

High-resolution temporal profiling of transcripts during Arabidopsis leaf senescence reveals a distinct chronology of processes and regulation

Author: Beynon Jim
Breeze Emily
Buchanan-Wollaston Vicky
Denby Katherine J.
Harrison Elizabeth
Hickman Richard J.
Hill Claire
Hughes Linda Karen
Jackson Stephen D.
Jenkins Dafyd J.
Jenner Carol E.
Kiddle Steven J.
Kim Youn-sung
Legaie Roxane
McHattie Stuart
Mead A.
Moore Jonathan D.
Morris Karl
Ott Sascha
Penfold Christopher A.
Rand D. A.
Tabrett Alex
Thomas Brian
Wild David L.
Zhang Cunjin
Publication venue: 'American Society of Plant Biologists (ASPB)'
Publication date: 01/03/2011

Leaf senescence is an essential developmental process that impacts dramatically on crop yields and involves altered regulation of thousands of genes and many metabolic and signaling pathways, resulting in major changes in the leaf. The regulation of senescence is complex, and although senescence regulatory genes have been characterized, there is little information on how these function in the global control of the process. We used microarray analysis to obtain a high-resolution time-course profile of gene expression during development of a single leaf over a 3-week period to senescence. A complex experimental design approach and a combination of methods were used to extract high-quality replicated data and to identify differentially expressed genes. The multiple time points enable the use of highly informative clustering to reveal distinct time points at which signaling and metabolic pathways change. Analysis of motif enrichment, as well as comparison of transcription factor (TF) families showing altered expression over the time course, identifies clear groups of TFs active at different stages of leaf development and senescence. These data enable connection of metabolic processes, signaling pathways, and specific TF activity, which will underpin the development of network models to elucidate the process of senescence.

Crossref

PubMed Central

Warwick Research Archives Portal Repository

King's Research Portal

Skewed Factor Models Using Selection Mechanisms

Author: Arellano-Valle Reinaldo B.
Genton Marc G.
Kim Hyoung-Moon
Maadooliat Mehdi
Publication venue: e-Publications@Marquette
Publication date: 01/03/2016

Traditional factor models explicitly or implicitly assume that the factors follow a multivariate normal distribution; that is, only moments up to order two are involved. However, it may happen in real data problems that the first two moments cannot explain the factors. Based on this motivation, here we devise three new skewed factor models, the skew-normal, the skew-t, and the generalized skew-normal factor models, depending on a selection mechanism on the factors. The ECME algorithms are adopted to estimate related parameters for statistical inference. Monte Carlo simulations validate our new models, and we demonstrate the need for skewed factor models using the classic open/closed book exam scores dataset.

epublications@Marquette

FOXL2 mutations in Indian families with blepharophimosis-ptosis-epicanthus inversus syndrome

Author: De Baere Elfride
Jitendra Jethani
Nallathambi Jeyabalan
Neethirajan Guruswamy
Sundaresan Periasamy
Usha Kim
Publication date: 01/01/2007

Ghent University Academic Bibliography

Spinocerebellar Ataxia Type 2

Author: Auburger Georg
Freund Hans-Joachim
Rodríguez-Labrada Roberto
Velázquez-Pérez Luis
Publication date: 01/01/2012

1. Introduction: The autosomal dominant cerebellar ataxias (ADCA) are a clinically, pathologically and genetically heterogeneous group of neurodegenerative disorders caused by degeneration of the cerebellum and its afferent and efferent connections. The degenerative process may additionally involve the ponto-medullar systems, pyramidal tracts, basal ganglia, cerebral cortex, peripheral nerves (ADCA I) and the retina (ADCA II), or can be limited to the cerebellum (ADCA III) (Harding et al., 1993). The most common of these dominantly inherited autosomal ataxias, ADCA I, includes many Spinocerebellar Ataxia (SCA) subtypes, some of which are caused by pathological CAG trinucleotide repeat expansion in the coding region of the mutated gene. Such is the case for SCA1, SCA2, SCA3/MJD, SCA6, SCA7, SCA17 and Dentatorubral-pallidoluysian atrophy (DRPLA) (Matilla et al., 2006). Among the almost 30 SCAs, the variant SCA2 is the second most prevalent subtype worldwide, only surpassed by SCA3 (Schöls et al., 2004; Matilla et al., 2006; Auburger, 2011).

IntechOpen

Crossref

Hochschulschriftenserver - Universität Frankfurt am Main