Gpachov at CheckThat! 2023: A Diverse Multi-Approach Ensemble for Subjectivity Detection in News Articles
The widespread use of social networks has given rise to subjective,
misleading, and even false information on the Internet. Thus, subjectivity
detection can play an important role in ensuring the objectiveness and the
quality of a piece of information. This paper presents the solution built by
the Gpachov team for the CLEF-2023 CheckThat! lab Task 2 on subjectivity
detection. Three different research directions are explored. The first one is
based on fine-tuning a sentence embeddings encoder model and dimensionality
reduction. The second one explores a sample-efficient few-shot learning model.
The third one evaluates fine-tuning a multilingual transformer on an altered
dataset, using data from multiple languages. Finally, the three approaches are
combined in a simple majority voting ensemble, resulting in 0.77 macro F1 on
the test set and achieving 2nd place on the English subtask.
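The final ensembling step and the reported metric can be illustrated with a minimal sketch. It assumes each of the three approaches emits one label per sentence; the function names `majority_vote` and `macro_f1`, the label strings, and the toy predictions are hypothetical and are not taken from the Gpachov system.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample by majority vote."""
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must label the same samples")
    combined = []
    for labels in zip(*predictions):
        # Pick the most frequent label; with three models and two classes
        # there is always a strict majority.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy predictions from three hypothetical models (not real system output).
model_a = ["SUBJ", "OBJ", "OBJ", "SUBJ"]
model_b = ["SUBJ", "SUBJ", "OBJ", "OBJ"]
model_c = ["OBJ", "OBJ", "OBJ", "SUBJ"]
gold = ["SUBJ", "OBJ", "SUBJ", "SUBJ"]

ensemble = majority_vote([model_a, model_b, model_c])
score = macro_f1(gold, ensemble)
```

Macro F1 weights both classes equally, which matters when subjective and objective sentences are imbalanced.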
Characterizing RNA ensembles from NMR data with kinematic models
Functional mechanisms of biomolecules often manifest themselves precisely in transient conformational substates. Researchers have long sought to structurally characterize dynamic processes in non-coding RNA, combining experimental data with computer algorithms. However, adequate exploration of conformational space for these highly dynamic molecules, starting from static crystal structures, remains challenging. Here, we report a new conformational sampling procedure, KGSrna, which can efficiently probe the native ensemble of RNA molecules in solution. We found that KGSrna ensembles accurately represent the conformational landscapes of 3D RNA encoded by NMR proton chemical shifts. KGSrna resolves motionally averaged NMR data into structural contributions; when coupled with residual dipolar coupling data, a KGSrna ensemble revealed a previously uncharacterized transient excited state of the HIV-1 trans-activation response element stem-loop. Ensemble-based interpretations of averaged data can aid in formulating and testing dynamic, motion-based hypotheses of functional mechanisms in RNAs with broad implications for RNA engineering and therapeutic intervention.
Direction of displacement of the AH-domain.
<p><b>a)</b> Relative frequency of the angles between KGS and iMC average displacements for each C<sub><i>α</i></sub> of the AH-domain in the inactive (grey) and active (orange/yellow) states. For both states, the average AH-domain displacement is shifted by 50 to 60 degrees between the two methods. <b>b)</b> Differences in the directions of the mean displacement of the center of geometry of the G<i>α</i> AH-domain between the KGS (red) and iMC (blue) ensembles. <b>c)</b> Directionality of the CA displacements of the <i>α</i><sub>5</sub>-helix in the KGS (red) and iMC (blue) ensembles in the active state (top) and inactive state (bottom). The motion in the KGS ensemble is directed from the inactive conformation of <i>α</i><sub>5</sub> to the active conformation. <b>d)</b> Superimposed Ras-domains from the KGS ensembles for the active (yellow) and inactive (grey) states. The amplitude of the Ras-domain ensemble is limited, except for marked fluctuations of helices <i>α</i><sub>4</sub> and <i>α</i><sub>5</sub>.</p>
Nullspace Sampling with Holonomic Constraints Reveals Molecular Mechanisms of Protein Gαs
<div><p>Proteins perform their function or interact with partners by exchanging between conformational substates on a wide range of spatiotemporal scales. Structurally characterizing these exchanges is challenging, both experimentally and computationally. Large, diffusional motions are often on timescales that are difficult to access with molecular dynamics simulations, especially for large proteins and their complexes. The low frequency modes of normal mode analysis (NMA) report on molecular fluctuations associated with biological activity. However, NMA is limited to a second order expansion about a minimum of the potential energy function, which limits opportunities to observe diffusional motions. By contrast, kino-geometric conformational sampling (KGS) permits large perturbations while maintaining the exact geometry of explicit conformational constraints, such as hydrogen bonds. Here, we extend KGS and show that a conformational ensemble of the α subunit Gαs of heterotrimeric stimulatory protein Gs exhibits structural features implicated in its activation pathway. Activation of protein Gs by G protein-coupled receptors (GPCRs) is associated with GDP release and large conformational changes of its α-helical domain. Our method reveals a coupled α-helical domain opening motion while, simultaneously, Gαs helix α<sub>5</sub> samples an activated conformation. These motions are moderated in the activated state. The motion centers on a dynamic hub near the nucleotide-binding site of Gαs, and radiates to helix α<sub>4</sub>. We find that comparative NMA-based ensembles underestimate the amplitudes of the motion. Additionally, these NMA-based ensembles fall short in predicting the accepted direction of the full activation pathway. Taken together, our findings suggest that nullspace sampling with explicit, holonomic constraints yields ensembles that illuminate molecular mechanisms involved in GDP release and protein Gs activation, and further establish conformational coupling between key structural elements of Gαs.</p></div>
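The idea of perturbing a structure while keeping explicit constraints intact can be illustrated with a generic nullspace projection. This is a minimal sketch and not the KGS implementation: it assumes the constraints have been linearized as J·Δq = 0, where J is a hypothetical constraint Jacobian (one row per constraint, one column per degree of freedom), and it removes from a trial step every component that would violate them to first order. The helper names and the toy Jacobian are invented for illustration.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthonormal_rows(jacobian, tol=1e-12):
    """Gram-Schmidt over the constraint rows; (near-)dependent rows are dropped."""
    basis = []
    for row in jacobian:
        r = list(row)
        for b in basis:
            c = dot(r, b)
            r = [ri - c * bi for ri, bi in zip(r, b)]
        norm = dot(r, r) ** 0.5
        if norm > tol:
            basis.append([ri / norm for ri in r])
    return basis


def project_to_nullspace(jacobian, step):
    """Remove from `step` its components along the constraint rows,
    so that J applied to the result is zero (up to rounding)."""
    projected = list(step)
    for b in orthonormal_rows(jacobian):
        c = dot(projected, b)
        projected = [pi - c * bi for pi, bi in zip(projected, b)]
    return projected


# Toy example: two linear constraints on three degrees of freedom.
J = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
trial_step = [1.0, 0.0, 0.0]
admissible_step = project_to_nullspace(J, trial_step)
residuals = [dot(row, admissible_step) for row in J]
```

Because the projection is linear, it preserves the constraints only to first order in the step size. How a sampler corrects the remaining drift after a finite step is not covered by this sketch.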
Heatmaps of conformational variability.
<p><b>a)</b> RMS fluctuations of DoFs for KGS (red) and iMC (blue) for 20,000 samples starting from the activated conformation (top two panels, orange) and the inactive conformation (bottom two panels, grey). The DoFs corresponding to the AH-domain and helix <i>α</i><sub>5</sub> are colored in darker shades. Free DoFs are circles, cycle DoFs are squares. The horizontal lines correspond to the mean RMSF value of the DoFs plus 2<i>σ</i>. <b>b,c)</b> Heatmaps representing the contribution of DoFs to conformational variability for KGS (b) and iMC (c). Coupling in the GDP binding pocket (red circle, <i>α</i><sub>5</sub>, <i>α</i><sub>1</sub>, and the adjacent <i>β</i><sub>1</sub>–<i>α</i><sub>1</sub> loop (P-loop)) extends to include helix <i>α</i><sub><i>F</i></sub> (bottom of the red circle), Linker II (SW I), and the N-terminus of <i>α</i><sub><i>E</i></sub> (right side of the blue oval).</p>
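The threshold described in panel a) can be sketched as follows. This is a minimal illustration under stated assumptions: each sample is a list of DoF values, σ is taken to be the population standard deviation of the per-DoF RMSF values, and angular wrap-around is ignored. The caption does not specify these details, and the function names and toy data are hypothetical.

```python
from statistics import fmean, pstdev


def rmsf_per_dof(samples):
    """Root-mean-square fluctuation of each DoF about its mean over all samples."""
    n_dofs = len(samples[0])
    result = []
    for j in range(n_dofs):
        values = [s[j] for s in samples]
        mu = fmean(values)
        result.append(fmean([(v - mu) ** 2 for v in values]) ** 0.5)
    return result


def highly_mobile_dofs(rmsf):
    """Indices of DoFs whose RMSF exceeds mean(RMSF) + 2 * sigma(RMSF)."""
    threshold = fmean(rmsf) + 2 * pstdev(rmsf)
    return [j for j, value in enumerate(rmsf) if value > threshold], threshold


# Toy data: two samples of six DoFs; the last DoF fluctuates far more.
toy_samples = [
    [1.0, 1.0, 1.0, 1.0, 1.0, 10.0],
    [-1.0, -1.0, -1.0, -1.0, -1.0, -10.0],
]
rmsf = rmsf_per_dof(toy_samples)
flagged, threshold = highly_mobile_dofs(rmsf)
```

With only a handful of DoFs no value can exceed the mean by two population standard deviations, so the toy data uses six.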
Sampling trajectories on the constraint manifold encode collective motions.
<p><b>a)</b> KGS conformational distributions starting from three ligand-free holo crystal structures (leub, algi, osmo) are biased toward the apo structures. The polar plots show the distribution of the angles <i>θ</i> (along the radius), and <i>ϕ</i> (along the circumference) of the center of mass of domain two with respect to domain one (see <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004361#sec002" target="_blank">Methods</a>). The domains open, reorient, and deform upon adopting the apo conformation (red circle), affecting the relative position of the centers of mass of the domains. The orientation of domain two starts out at the origin. The colors of samples are red-shifted toward higher sampling number. The conformations diffuse toward the apo state as sampling progresses. <b>b)</b> The KGS conformational distribution along the reaction coordinates <i>θ</i> and RMSD (in Å) to the holo crystal structure of apo and holo human lysozyme. The holo distribution samples more broadly, and more towards smaller <i>θ</i> angles than the apo distribution, in agreement with the free-energy landscape observed from RDC-restrained simulations (left panel). The middle and right panels show the sampling distribution for apo and holo in more detail. (Inset: free-energy landscapes from RDC-restrained MD simulations. Images adapted from [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004361#pcbi.1004361.ref029" target="_blank">29</a>]). Weak local maxima approximately corresponding to the ‘unlocked’ and ‘locked’ states can be observed in the distribution starting from the holo structure.</p>
Architecture of G<i>α</i>s.
<p><b>a)</b> The G<i>α</i>s subunit consists of a Ras-domain and an <i>α</i>-helical domain. Linkers I and II are shown in yellow, and the binding site of the nucleotide in between the two domains is indicated by a blue circle. The locations of donor and acceptor atoms of the hydrogen bonding network used for KGS are indicated with white (donors) and red (acceptors) spheres. <b>b)</b> The activated state of Gs (pale cyan) superimposed onto the inactive state. Upon activation, the <i>α</i>-helical domain undergoes a large rotational motion. Helix <i>α</i><sub>5</sub> translates and rotates upward to interact with the cytoplasmic core of the receptor.</p>