5 research outputs found

Computing and visually analyzing mutual information in molecular co-evolution

Author: B Shneiderman
CM Buslje
D MacKay
DA Keim
F Hoffgaard
FM Codoñer
G Gloor
H Levkowitz
HM Wu
J Thomas
K Hamacher
Kay Hamacher
P Weil
Patrick Boba
R Gouveia-Oliveira
SC Card
Sebastian Bremm
Stephanie Held
T Lengauer
Tobias Schreck
W Humphrey
Y Bourne
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract. Background: Selective pressure in molecular evolution leads to uneven distributions of amino acids and nucleotides. In fact, one observes correlations among such constituents due to a large number of biophysical mechanisms (folding properties, electrostatics, ...). To quantify these correlations, the mutual information, after proper normalization, has proven most effective. The challenge is to navigate the large amount of data, which in a study of a typical protein cannot simply be plotted. Results: To visually analyze mutual information, we developed a matrix visualization tool that offers different views of the mutual information matrix, among them filtering, sorting, and weighting. The user can interactively navigate a huge matrix in real time and search, e.g., for patterns and unusually high or low values. The mutual information matrix can be computed for a sequence alignment in FASTA format. The respective stand-alone program additionally computes proper normalizations for a null model of neutral evolution and maps the mutual information to Z-scores with respect to the null model. Conclusions: The new tool makes it possible to compute and visually analyze sequence data for possible co-evolutionary signals. The tool has already been successfully employed in evolutionary studies on HIV1 protease and acetylcholinesterase. The functionality of the tool was defined by users using the tool in real-world research. The software can also be used for visual analysis of other matrix-like data, such as information obtained by DNA microarray experiments. The package is implemented platform-independently in Java and is free for academic use under a GPL license.
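
To make the quantities named in this abstract concrete, the following is a minimal Python sketch of mutual information between two alignment columns and of a Z-score against a null distribution. It is not the authors' Java tool. The function names and the shuffling-based null model are assumptions chosen for illustration; the abstract only states that a null model of neutral evolution is used, without specifying it.

```python
import math
import random
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equally long alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi


def z_score(col_a, col_b, n_shuffles=200, seed=0):
    """Z-score of the observed MI against a shuffled-column null.

    Assumption: shuffling one column stands in for the null model;
    the tool described in the abstract may use a different one.
    """
    rng = random.Random(seed)
    observed = mutual_information(col_a, col_b)
    shuffled = list(col_b)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        null.append(mutual_information(col_a, shuffled))
    mean = sum(null) / len(null)
    var = sum((x - mean) ** 2 for x in null) / (len(null) - 1)
    std = math.sqrt(var)
    # Guard against a degenerate null with zero spread
    return (observed - mean) / std if std > 0 else 0.0


if __name__ == "__main__":
    # Two toy columns that co-vary perfectly (hypothetical data)
    col_i = "AACC" * 5
    col_j = "GGTT" * 5
    print(mutual_information(col_i, col_j))
    print(z_score(col_i, col_j))
```

Applying `mutual_information` to every pair of columns in an alignment would yield the kind of matrix that the visualization tool is described as displaying.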

KOPS - The Institutional Repository of the University of Konstanz

TUbiblio

Crossref

Springer

Springer - Publisher Connector

Directory of Open Access Journals

Fraunhofer-ePrints

PubMed Central

Structural Organization of DNA in Chlorella Viruses

Author: A Cordova
A Ore
Anna Moroni
BN Ames
Christian Polzer
CJ Funk
D Raoult
David D. Dunigan
EV Koonin
F Frohns
G Thiel
Gerhard Thiel
Gregor Hiepler
H Pfister
I Agarkova
J Kindt
James L. Van Etten
Jianming Qiu
JL Van Etten
K Hamacher
KA Tweeten
Kay Hamacher
KB Tan
LM Iyer
LP Villarreal
M Neupärtl
M Wang
Marco Lolicato
MC Cavalcanti
MV Cherrier
P Grayson
P Serwer
PJ Cock
R Hass
R Hoshina
RH Meints
Robert Shoeman
S Bremm
S Mangenot
SR Casjens
SW Schneider
T Greiner
T Yamada
Timo Wulfmeyer
Tobias Meckel
VA Bloomfield
WC Earnshaw
X Yan
X Zhang
Y Ishihama
Y Kuznetsov
YG Kuznetsov
Z Li
Publication venue: Public Library of Science
Publication date: 01/01/2012

Chlorella viruses have icosahedral capsids with an internal membrane enclosing their large dsDNA genomes and associated proteins. Their genomes are packaged in the particles with a predicted DNA density of ca. 0.2 bp nm⁻³. Occasionally, infection of an algal cell by an individual particle fails and the viral DNA is dynamically ejected from the capsid. This shows that the release of the DNA generates a force, which can aid in the transfer of the genome into the host in a successful infection. Imaging of ejected viral DNA indicates that it is intimately associated with proteins in a periodic fashion. The bulk of the protein particles detected by atomic force microscopy have a size of ∼60 kDa, and two proteins (A278L and A282L) of about this size are among six basic putative DNA binding proteins found in a proteomic analysis of DNA binding proteins packaged in the virion. A combination of fluorescence images of ejected DNA and a bioinformatics analysis of the DNA reveals periodic patterns in the viral DNA. The periodic distribution of GC-rich regions in the genome provides potential binding sites for basic proteins. This DNA/protein aggregation could be responsible for the periodic concentration of fluorescently labeled DNA observed in ejected viral DNA. Collectively, the data indicate that the large chlorella viruses have a DNA packaging strategy that differs from that of bacteriophages; it involves proteins and shares similarities with chromatin structure in eukaryotes.

CiteSeerX

Public Library of Science (PLOS)

TUbiblio

Crossref

DigitalCommons@University of Nebraska

AIR Universita degli studi di Milano

Archivio Istituzionale della Ricerca - Università degli Studi di Pavia

Directory of Open Access Journals

PubMed Central

MPG.PuRe

FigShare

Coupled mutation finder: A new entropy-based method quantifying phylogenetic noise for the detection of compensatory mutations

Author: Gültas M. (Mehmet)
Haubrock M. (Martin)
Tüysüz N. (Nesrin)
Waack S. (Stephan)
Publication venue: Springer Science and Business Media LLC
Publication date: 11/09/2012

Background: The detection of significant compensatory mutation signals in multiple sequence alignments (MSAs) is often complicated by noise. A challenging problem in bioinformatics remains the separation of significant signals between two or more non-conserved residue sites from the phylogenetic noise and unrelated pair signals. Determining these non-conserved residue sites is as important as recognizing strictly conserved positions for understanding the structural basis of protein functions and identifying functionally important residue regions. In this study, we developed a new method, the Coupled Mutation Finder (CMF), which quantifies the phylogenetic noise for the detection of compensatory mutations. Results: To demonstrate the effectiveness of this method, we analyzed essential sites of two human proteins: epidermal growth factor receptor (EGFR) and glucokinase (GCK). Our results suggest that the CMF is able to separate significant compensatory mutation signals from the phylogenetic noise and unrelated pair signals. The vast majority of compensatory mutation sites found by the CMF are related to essential sites of both proteins, and they are likely to affect protein stability or functionality. Conclusions: The CMF is a new method that includes an MSA-specific statistical model, based on multiple testing procedures that quantify the error made in terms of the false discovery rate, and a novel entropy-based metric to upscale BLOSUM62-dissimilar compensatory mutations. Therefore, it is a helpful tool to predict and investigate compensatory mutation sites of structural or functional importance in proteins. We suggest that the CMF could be used as a novel automated function prediction tool that is required for a better understanding of the structural basis of proteins. The CMF server is freely accessible at http://cmf.bioinf.med.uni-goettingen.de
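
As a rough illustration of what controlling the false discovery rate across many residue-pair tests can look like, here is a minimal Python sketch of the standard Benjamini-Hochberg step-up procedure. This is an assumption for illustration only: the abstract does not say which multiple testing procedure the CMF uses, and the function name and p-values below are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level alpha.

    Standard Benjamini-Hochberg step-up procedure; not taken from the CMF paper.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below that k
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for four residue pairs
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Because the procedure is step-up, a pair whose p-value misses its own rank threshold can still be rejected when a higher-ranked pair meets its threshold.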

Crossref

Springer - Publisher Connector

PubMed Central

Erasmus University Digital Repository

MIA: Mutual Information Analyzer, a graphic user interface program that calculates entropy, vertical and horizontal mutual information of molecular sequence sets

Author: AD Johnson
C Adami
CE Shannon
DM Endres
Fernando Antoneli
Flavio Lichtenstein
G Santos
H Herzel
I Grosse
K Linde van der
K Tamura
LC Martin
M Gouy
M Kearse
Marcelo R. S. Briones
MS Roulston
P Carpena
P Weil
R Gouveia-Oliveira
R Steuer
RC Edgar
RC Ferreira
S Bremm
TA Hall
Publication venue: Springer Science and Business Media LLC

Crossref

Development of novel Classical and Quantum Information Theory Based Methods for the Detection of Compensatory Mutations in MSAs

Author: Gültas Mehmet
Publication date: 18/09/2013

Multiple sequence alignments (MSAs) of homologous proteins are useful tools for characterizing compensatory mutations between non-conserved residues. Identifying these residues in MSAs is an important task for better understanding the structural basis and molecular mechanisms of protein functions. Despite the large body of literature on compensatory mutations and on sequence conservation analysis for detecting important residues, previous methods have mostly not taken into account the biochemical properties of amino acids, which can be decisive for detecting compensatory mutation signals. Moreover, compensatory mutation signals in MSAs are often distorted by noise. For this reason, a further problem in bioinformatics is the separation of significant signals from phylogenetic noise and unrelated pair signals. The aim of this work is to develop methods that integrate biochemical properties, such as similarities and dissimilarities of amino acids, into the identification of compensatory mutations and that deal with the noise. To this end, we develop different methods based on classical and quantum information theory as well as multiple testing procedures. Our first method is based on classical information theory. It mainly considers BLOSUM62-dissimilar pairs of amino acids as a model of compensatory mutations and integrates them into the identification of important residues. To complement this method, we develop our second method using the foundations of quantum information theory. This new method differs from the first in that it simultaneously models similar and dissimilar signals in the compensatory mutation analysis.
Furthermore, to separate significant signals from noise, we develop an MSA-specific statistical model based on multiple testing procedures. We apply our method to two human proteins, namely epidermal growth factor receptor (EGFR) and glucokinase (GCK). The results show that the MSA-specific statistical model can separate the significant signals from phylogenetic noise and from unrelated pair signals. Considering only BLOSUM62-dissimilar pairs of amino acids, the first method successfully identifies the disease-associated important residues of both proteins. In contrast, by simultaneously modeling similar and dissimilar signals of amino acid pairs, the second method is more sensitive in identifying catalytic and allosteric residues.

Georg-August-University Göttingen