Despite nearly two scores of research on RNA secondary structure and RNA-RNA
interaction prediction, the accuracy of the state-of-the-art algorithms are
still far from satisfactory. Researchers have proposed increasingly complex
energy models and improved parameter estimation methods in anticipation of
endowing their methods with enough power to solve the problem. The output has
disappointingly been only modest improvements, not matching the expectations.
Even recent massively featured machine learning approaches were not able to
break the barrier. In this paper, we introduce the notion of learnability of
the parameters of an energy model as a measure of its inherent capability. We
say that the parameters of an energy model are learnable iff there exists at
least one set of such parameters that renders every known RNA structure to date
the minimum free energy structure. We derive a necessary condition for the
learnability and give a dynamic programming algorithm to assess it. Our
algorithm computes the convex hull of the feature vectors of all feasible
structures in the ensemble of a given input sequence. Interestingly, that
convex hull coincides with the Newton polytope of the partition function as a
polynomial in energy parameters. We demonstrated the application of our theory
to a simple energy model consisting of a weighted count of A-U and C-G base
pairs. Our results show that this simple energy model satisfies the necessary
condition for less than one third of the input unpseudoknotted
sequence-structure pairs chosen from the RNA STRAND v2.0 database. For another
one third, the necessary condition is barely violated, which suggests that
augmenting this simple energy model with more features such as the Turner loops
may solve the problem. The necessary condition is severely violated for 8%,
which provides a small set of hard cases that require further investigation