mRNA Inventory of Extracellular Vesicles from Ustilago maydis
Extracellular vesicles (EVs) can transfer diverse RNA cargo for intercellular communication. EV-associated RNAs have been found in diverse fungi and were proposed to be relevant for pathogenesis in animal hosts. In plant-pathogen interactions, small RNAs are exchanged in a cross-kingdom RNAi warfare and EVs were considered to be a delivery mechanism. To extend the search for EV-associated molecules involved in plant-pathogen communication, we have characterised the repertoire of EV-associated mRNAs secreted by the maize smut pathogen, Ustilago maydis. For this initial survey, we examined EV-enriched fractions from axenic filamentous cultures that mimic infectious hyphae. EV-associated RNAs were resistant to degradation by RNases and the presence of intact mRNAs was evident. The set of mRNAs enriched inside EVs relative to the fungal cells is functionally distinct from the set depleted from EVs. mRNAs encoding metabolic enzymes are particularly enriched. Intriguingly, mRNAs of some known effectors and other proteins linked to virulence were also found in EVs. Furthermore, several mRNAs enriched in EVs are also upregulated during infection, suggesting that EV-associated mRNAs may participate in plant-pathogen interactions.
Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article(s). The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
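As a rough illustration of one of the baselines named above, the following minimal sketch ranks candidate abstracts against a seed abstract by TF-IDF cosine similarity. It is not the consortium's implementation: the whitespace tokenizer, the smoothed IDF formula, and the function names `tfidf_vectors`, `cosine` and `rank_candidates` are assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            # Smoothed IDF (assumption): log((1 + N) / (1 + df)) + 1
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(seed, candidates):
    """Return candidate indices, most similar to the seed first."""
    vectors = tfidf_vectors([seed] + candidates)
    seed_vec = vectors[0]
    scores = [(cosine(seed_vec, vec), idx) for idx, vec in enumerate(vectors[1:])]
    # Sort by descending score; ties are broken by original position.
    return [idx for _, idx in sorted(scores, key=lambda pair: (-pair[0], pair[1]))]
```

A hybrid recommender of the kind the abstract hints at could, for instance, merge the top-ranked indices from several such scorers, although the abstract does not specify how a hybrid should be built.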
De novo prediction of RNA 3D structures with deep generative models.
We present a Deep Learning approach to predict 3D folding structures of RNAs from their nucleic acid sequence. Our approach combines an autoregressive Deep Generative Model, Monte Carlo Tree Search, and a score model to find and rank the most likely folding structures for a given RNA sequence. We show that RNA de novo structure prediction by deep learning is possible at atom resolution, despite the low number of experimentally measured structures that can be used for training. We confirm the predictive power of our approach by achieving competitive results in a retrospective evaluation of the RNA-Puzzles prediction challenges, without using structural contact information from multiple sequence alignments or additional data from chemical probing experiments. Blind predictions for recent RNA-Puzzles challenges under the name "Dfold" further support the competitive performance of our approach.
VQ-VAE decoder architecture.
Results: Evaluation of the structural predictions.
A, Reconstruction error resulting from encoding and decoding RNA 3D structures of the test set as a function of the number of distance classes. B, C, Simulated blind tests of the RNA-Puzzles challenges in comparison to other approaches, with or without simulated SHAPE reactivity data and homologous sequences as additional input. The three rightmost predictions are real blind submissions from the latest puzzle round 33 (7mlx, 7eoj, 7elq). Puzzles 6ufm, 6pom, and 6pmo are large compounds of tRNA-riboswitch complexes. Puzzle predictions are sorted in descending order of sequence length (left to right). D, Reconstruction error of longer RNA-Puzzles that lack both structural and sequence homology (PDB 4R4V: 185 nt) in comparison to those (PDB 4QLM: 108 nt; PDB 6pom tBox: 75 nt; PDB 6ufm complex: 175 nt) for which structural and sequence homologs are available. Structures of length >175 nt were cut into substructures of length ≤100 nt, aligned with PyMOL, and the RMSD was calculated as the average over all substructures. E, Simulated chemical probing data in comparison with experimentally measured reactivities [52] (SHAPE) for PDB 1Y26. F, Most likely alternative structures, as predicted by the Score Model, for the glutamine riboswitch and the ZMP riboswitch (S3 Fig).
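The substructure-averaged RMSD described for panel D can be sketched roughly as follows. This is only an illustration under stated assumptions: each structure is reduced to one (x, y, z) point per nucleotide, substructures are taken as contiguous windows, and every window is assumed to have been superposed already (the caption uses PyMOL for that step, which is not reproduced here). The function names `rmsd` and `chunked_rmsd` are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) points.

    Assumes the two point sets are already superposed; no alignment is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


def chunked_rmsd(predicted, reference, max_len=100):
    """Average the RMSD over contiguous windows of at most max_len points."""
    if len(predicted) != len(reference):
        raise ValueError("structures must have the same number of points")
    values = [
        rmsd(predicted[start:start + max_len], reference[start:start + max_len])
        for start in range(0, len(predicted), max_len)
    ]
    return sum(values) / len(values)
```

Averaging per window weights a short trailing window as heavily as a full one, so the result can differ from the RMSD computed over the whole structure at once.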
Score model network: Optimization hyperparameters.
Generator network: Residual architecture.
MCTS: Search tree with two distinct leaves.
Starting with all target distance classes masked, the Generator places initial probabilities for every pixel in the distance class softmax prediction. From there, pixels were sampled iteratively using the MCTS search objective, which aimed for entropy reduction. We derived two terminal leaf nodes (A) for which the Generator network had seen enough distance pixels to fill in the remainder using its argmax prediction (B). From those filled-in leaves, the VQ-VAE could be applied to decode into real distance space (C). After further energy refinement, we showed for the ZMP riboswitch (PDB: 4XW7) that those two leaves indeed corresponded to two different structures. The riboswitch has a movable hinge part, which is stabilized by a small molecule. Hence, our best leaf prediction (red) is closer to the target solution (blue) in C, bottom. We also sampled an alternative structure. In this particular example, the stretched grey RNA structure in C was ranked with a lower score by the Score Model.
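The unmask-then-fill idea in this caption can be illustrated with a much simpler stand-in: a single greedy rollout that repeatedly samples the most uncertain masked pixel and, once the remaining entropy is low enough, fills everything else by argmax. This is not the authors' MCTS and not their Generator network. The function `toy_generator`, the entropy threshold, and the one-dimensional pixel layout are all assumptions made so that the sketch runs on its own.

```python
import math
import random


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def toy_generator(revealed, n_pixels, n_classes):
    """Hypothetical stand-in for the Generator (requires n_classes >= 2).

    A masked pixel next to a revealed one copies that class with probability
    0.9; a masked pixel with no revealed neighbour gets a uniform distribution.
    """
    dists = []
    for i in range(n_pixels):
        if i in revealed:
            dists.append([1.0 if c == revealed[i] else 0.0 for c in range(n_classes)])
            continue
        neighbour = revealed.get(i - 1, revealed.get(i + 1))
        if neighbour is None:
            dists.append([1.0 / n_classes] * n_classes)
        else:
            rest = 0.1 / (n_classes - 1)
            dists.append([0.9 if c == neighbour else rest for c in range(n_classes)])
    return dists


def unmask(n_pixels, n_classes, threshold, rng):
    """Greedy rollout: sample high-entropy pixels, then fill the rest by argmax."""
    revealed = {}
    while True:
        dists = toy_generator(revealed, n_pixels, n_classes)
        masked = [i for i in range(n_pixels) if i not in revealed]
        remaining = sum(entropy(dists[i]) for i in masked)
        if not masked or remaining <= threshold:
            # Terminal leaf: the remaining pixels are filled with the argmax class.
            for i in masked:
                revealed[i] = max(range(n_classes), key=lambda c: dists[i][c])
            return [revealed[i] for i in range(n_pixels)]
        # Otherwise sample a class for the most uncertain masked pixel.
        target = max(masked, key=lambda i: entropy(dists[i]))
        revealed[target] = rng.choices(range(n_classes), weights=dists[target])[0]
```

Calling `unmask(8, 4, threshold=1.0, rng=random.Random(0))` yields one filled-in leaf; a tree search would instead explore several sampling paths and keep distinct leaves, which is how two alternative structures can arise.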
Encoding of purine (G/A) and pyrimidine (C/U) coordinates.