21 research outputs found

Quantitative Assessment of Molecular Dynamics Sampling for Flexible Systems

Author: Daniel Hoffmann (377414)
Mike Nemec (3698137)

Molecular dynamics (MD) simulation is a natural method for the study of flexible molecules but at the same time is limited by the large size of the conformational space of these molecules. We ask by how much the MD sampling quality for flexible molecules can be improved by two means: first, the use of diverse sets of trajectories starting from different initial conformations to detect deviations between samples; second, sampling with enhanced methods such as accelerated MD (aMD) or scaled MD (sMD) that distort the energy landscape in controlled ways. To this end, we test the effects of these approaches on MD simulations of two flexible biomolecules in aqueous solution, Met-Enkephalin (5 amino acids) and HIV-1 gp120 V3 (a cycle of 35 amino acids). We assess the convergence of the sampling quantitatively with known, extensive measures of cluster number Nc and cluster distribution entropy Sc and with two new quantities, conformational overlap Oconf and density overlap Odens, both conveniently ranging from 0 to 1. These new overlap measures quantify self-consistency of sampling in multitrajectory MD experiments, a necessary condition for converged sampling. A comprehensive assessment of sampling quality of MD experiments identifies the combination of diverse trajectory sets and aMD as the most efficient approach among those tested. However, analysis of Odens between conventional and aMD trajectories also reveals that we have not completely corrected aMD sampling for the distorted energy landscape. Moreover, for V3, the courses of Nc and Odens indicate that much higher resources than those generally invested today will probably be needed to achieve convergence. The comparative analysis also shows that conventional MD simulations with insufficient sampling can be easily misinterpreted as being converged.
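
The abstract names the cluster number Nc and the cluster distribution entropy Sc but does not give formulas. As a minimal sketch, assuming that each MD frame has already been assigned a cluster label and that Sc is the Shannon entropy (natural logarithm) of the cluster populations, the two measures could be computed as follows. The function name `cluster_stats` and the toy labels are illustrative and not taken from the paper; the overlap measures Oconf and Odens are not sketched here because the abstract does not define them.

```python
import math
from collections import Counter


def cluster_stats(labels):
    """Return (n_c, s_c) for a sequence of per-frame cluster labels.

    n_c: number of distinct clusters visited.
    s_c: Shannon entropy of the cluster populations (natural log).
    Assumption: this is one common definition, not confirmed by the abstract.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    n_c = len(counts)
    s_c = -sum((c / total) * math.log(c / total) for c in counts.values())
    return n_c, s_c


# Toy trajectory: four frames split evenly between two clusters.
n_c, s_c = cluster_stats(["a", "a", "b", "b"])
print(n_c, s_c)
```

Tracking both values as functions of simulation time is one way to see whether new clusters are still being discovered, which is the kind of "course of Nc" the abstract refers to.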

FigShare

Comparison of statistical indicators of association.

Author: Bettina Budeus (820725)
Daniel Hoffmann (377414)
Jörg Timm (247565)

200 random contingency tables with total count N = 100, a typical order of magnitude for analyses of sequence-feature association in practice, are analyzed by Fisher’s exact test, yielding p values for the rejection of independence (horizontal axis, not corrected for multiple testing), and by four different BF models, namely K = 1, K = 100, KD, and uniform model, with corresponding BFs on vertical axis. Solid horizontal black line at BF = 1 and dashed vertical line at p = 0.05 for orientation.
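
The frequentist half of this comparison can be sketched with the standard library alone. The example below assumes 2x2 tables and fills each table by dropping N = 100 counts uniformly at random into the four cells; the caption states neither the table dimensions nor how the random tables were drawn, so both are assumptions. The helper names are illustrative. The Bayes factor models (K = 1, K = 100, KD, uniform) are not defined in the caption and are therefore left out.

```python
import random
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are at most as probable as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def random_table(n=100, rng=random):
    """Assumed generator: n counts dropped uniformly into four cells."""
    cells = [0, 0, 0, 0]
    for _ in range(n):
        cells[rng.randrange(4)] += 1
    return cells


rng = random.Random(0)
p_values = [fisher_exact_two_sided(*random_table(100, rng)) for _ in range(200)]
n_significant = sum(p < 0.05 for p in p_values)  # uncorrected, as in the caption
```

Because the tables are generated under independence, only a small fraction of the uncorrected p values should fall below 0.05.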

FigShare

Odds-ratio plot and Tartan plot for visualization of statistical associations.

Author: Bettina Budeus (820725)
Daniel Hoffmann (377414)
Jörg Timm (247565)

(A) Odds-ratio plot, based on an alignment of a region of HIV-1 gp120 around the V3 loop (C296-C331). Here, the feature is the predicted co-receptor tropism of HIV-1 [17] (R5 vs. X4 tropic). Bar heights and colors indicate logarithms of odds ratios and negative logarithms of p values, respectively. A reference sequence and sequence positions can be added in the top and bottom rows for orientation. (B) Tartan plot for the synopsis of two alignment pair association measures, here: −log p from association test between alignment position pairs (upper right triangle) vs. Direct Information between these pairs (lower left triangle). Association strengths are color coded (color legend on the right). For orientation, axes can be annotated and sequence substructures can be indicated by lines.

FigShare

Comparison of frequentist approach and Bayes factors (BF).

Author: Bettina Budeus (820725)
Daniel Hoffmann (377414)
Jörg Timm (247565)

Discovery of association of alignment positions of HBV core proteins with patient HLA types, here: A*01 (top row) and B*44 (bottom row). Sequence numbers in panel titles are feature-carrying fractions of the total of 148 sequences included in the alignment. Associations of sequences with the HLA feature were analyzed by Fisher’s exact test (panels A, D), BF with K = 1 (panels B, E), and BF with KD (panels C, F). Alignment positions with association above certain thresholds (horizontal dashed lines) are marked by red stars and vertical dashed lines, namely p < 0.01 (A, D), or BF > 10 (B, C, E, F). The p values and BFs shown are the best for each alignment position (lowest p values, highest BFs).

FigShare

Phylogenetic distribution of feature-carrying sequences and phylogenetic bias indicator B.

Author: Bettina Budeus (820725)
Daniel Hoffmann (377414)
Jörg Timm (247565)

The distance-based phylogenetic tree in all six panels was computed for the same set of 788 East Asian HIV-1 gag protein sequences obtained from the HIV sequence database at http://www.hiv.lanl.gov. In each panel, those branches are colored red that correspond to sequences that carry an amino acid substitution apparently associated with a certain HLA type. The numbers to the upper right of each tree are the corresponding values of the bias indicator B, Eq (4).

FigShare

Broken stick distribution (solid line) and NRADs of IgG+CD27+ fractions (points).

Author: Anja Lange (820723)
Astrid M. Westendorf (360708)
Bettina Budeus (820725)
Daniel Hoffmann (377414)
Farnoush Farahpour (3692227)
Marc Seifert (260083)
Mohammadkarim Saeedghalati (3692224)
Ralf Küppers (260072)

Inset: section of hierarchical clustering dendrogram where broken stick distribution appears. This plot adopts the usual presentation of the broken stick distribution in the literature with linear horizontal axis and logarithmic vertical axis. Therefore the boomerang shapes of the log-log Fig 4 appear horizontally stretched.
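
The caption does not state the formula behind the solid line. As a hedged sketch, assuming it refers to MacArthur's broken stick model, the expected relative abundance of the species at rank i among S species is (1/S) times the sum of 1/k for k from i to S. The function name `broken_stick` is illustrative, and how the authors fitted or scaled the curve against their NRADs is not shown here.

```python
def broken_stick(s):
    """Expected relative abundances under MacArthur's broken stick model.

    Returns a list of length s, ordered from rank 1 (most abundant)
    to rank s (least abundant). Entry i is (1/s) * sum_{k=i}^{s} 1/k.
    """
    out = [0.0] * s
    tail = 0.0
    # Accumulate the harmonic tail from the rarest rank upward.
    for rank in range(s, 0, -1):
        tail += 1.0 / rank
        out[rank - 1] = tail / s
    return out


abundances = broken_stick(10)
print(abundances[0], abundances[-1])
```

Plotting these values against rank with a logarithmic vertical axis gives the presentation described in the caption: linear rank axis, logarithmic abundance axis.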

FigShare

Averaged NRADs of gut microbiome data in six age groups.

Author: Anja Lange (820723)
Astrid M. Westendorf (360708)
Bettina Budeus (820725)
Daniel Hoffmann (377414)
Farnoush Farahpour (3692227)
Marc Seifert (260083)
Mohammadkarim Saeedghalati (3692224)
Ralf Küppers (260072)

The numbers of NRADs per group, from youngest to oldest, were 9, 18, 55, 64, 34, and 309, respectively. Solid lines are mean NRADs; shaded areas are 90% confidence intervals for the means.

FigShare

Robustness of NRADs against varying sampling depth.

Author: Anja Lange (820723)
Astrid M. Westendorf (360708)
Bettina Budeus (820725)
Daniel Hoffmann (377414)
Farnoush Farahpour (3692227)
Marc Seifert (260083)
Mohammadkarim Saeedghalati (3692224)
Ralf Küppers (260072)

(A) Original RAD of first sample of [31] (black) and down-sampled RAD (red). (B) The two NRADs obtained by MaxRank normalization to R = 1000 of the RADs in panel A are almost indistinguishable. (C) Comparison of NRAD distances of the first 50 samples of the data set of [31]. Left violin plot: density of distances between NRADs computed by MaxRank normalization to R = 1000 of the original RADs; middle violin plot: same for down-sampled RADs; right violin plot: distances between corresponding original and down-sampled NRADs. The biologically meaningful NRAD distance distributions are robust against differences in sample size (left and middle violin). In comparison, the distances related to differences in sample size are negligible (right violin).

FigShare

Diversity of the VH region of BCRs.

Author: Anja Lange (820723)
Astrid M. Westendorf (360708)
Bettina Budeus (820725)
Daniel Hoffmann (377414)
Farnoush Farahpour (3692227)
Marc Seifert (260083)
Mohammadkarim Saeedghalati (3692224)
Ralf Küppers (260072)

(A) The human genome contains sets of VH, DH, and JH gene segments. (B) The “variable” VH segments can be grouped into seven VH families based on sequence similarity. (C) A genetically diverse pool of B cells is generated by V(D)J recombination. (D) Exposure to antigens induces an adaptation of the BCR repertoire, generating genetic variants and changing the usage pattern of VH gene segments.

FigShare

General process employed in this work.

Author: Anja Lange (820723)
Astrid M. Westendorf (360708)
Bettina Budeus (820725)
Daniel Hoffmann (377414)
Farnoush Farahpour (3692227)
Marc Seifert (260083)
Mohammadkarim Saeedghalati (3692224)
Ralf Küppers (260072)

Flowchart of procedure from original species/abundances or sequence/reads data (top box) to original RADs, then to NRADs, and analyses based on NRADs.

FigShare
