Principles for Understanding the Accuracy of SHAPE-Directed RNA Structure Modeling

Abstract

Accurate RNA structure modeling is an important and incompletely solved challenge. Single-nucleotide resolution SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) yields an experimental measurement of local nucleotide flexibility that can be incorporated as pseudo-free energy change constraints to direct secondary structure predictions. Prior work from our laboratory has emphasized both the overall accuracy of this approach and the need for nuanced interpretation of some apparent discrepancies between modeled and accepted structures. Recent studies by Das and colleagues [Kladwang et al., Biochemistry 50:8049 (2011) and Nat. Chem. 3:954 (2011)], which analyzed six small RNAs, yielded poorer RNA secondary structure predictions than expected based on prior benchmarking efforts. To understand the features that led to these divergent results, we re-examined the four RNAs that yielded the poorest results in this recent work: tRNAPhe, the adenine and cyclic-di-GMP riboswitches, and 5S rRNA. Most of the errors reported by Das and colleagues reflected non-standard experimental and data processing choices and selective scoring rules. For two RNAs, tRNAPhe and the adenine riboswitch, secondary structure predictions were nearly perfect when no experimental information was included but were rendered inaccurate by the SHAPE data of Das and colleagues. When best practices were used, single-sequence SHAPE-directed secondary structure modeling recovered ~93% of individual base pairs and greater than 90% of helices in the four RNAs, essentially indistinguishable from the mutate-and-map approach with the exception of a single helix in the 5S rRNA. The field of experimentally directed RNA secondary structure prediction is entering a phase focused on the most difficult prediction challenges. We outline five constructive principles for guiding this field forward.
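The base-pair recovery figure quoted above can be illustrated with a minimal sketch. It assumes that structures are written in dot-bracket notation without pseudoknots and that a predicted pair counts as recovered only when it matches an accepted pair exactly. That scoring rule is an assumption made for illustration and may differ from the rules used in the studies discussed here. The function names and the example structures are hypothetical.

```python
def pairs_from_dot_bracket(structure):
    """Return the set of (i, j) base pairs (0-based, i < j) in a dot-bracket string."""
    stack = []
    pairs = set()
    for idx, ch in enumerate(structure):
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % idx)
            pairs.add((stack.pop(), idx))
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def base_pair_recovery(accepted, predicted):
    """Fraction of accepted base pairs that also appear in the prediction.

    Exact-match scoring is an illustrative assumption, not necessarily
    the rule used in any published benchmark.
    """
    accepted_pairs = pairs_from_dot_bracket(accepted)
    predicted_pairs = pairs_from_dot_bracket(predicted)
    if not accepted_pairs:
        raise ValueError("accepted structure contains no base pairs")
    return len(accepted_pairs & predicted_pairs) / len(accepted_pairs)


# Hypothetical toy hairpin: the prediction misses the innermost pair.
accepted = "((((....))))"
predicted = "(((......)))"
print(base_pair_recovery(accepted, predicted))  # 3 of 4 accepted pairs recovered
```

Counting recovered helices rather than individual pairs would require an additional rule for when a helix counts as found, which is one reason scoring choices can change reported accuracy.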
