Artificial intelligence (AI) and machine learning (ML) are increasingly
applied to challenging tasks in chemistry and materials science. Examples
include the prediction of properties, the discovery of new reaction pathways,
and the design of new molecules. The machine needs to
read and write fluently in a chemical language for each of these tasks. Strings
are a common tool to represent molecular graphs, and the most popular molecular
string representation, SMILES, has powered cheminformatics since the late
1980s. However, in the context of AI and ML in chemistry, SMILES has several
shortcomings, chief among them that most combinations of symbols yield
strings with no valid chemical interpretation. To overcome this issue, a new
language for molecules was introduced in 2020 that guarantees 100\% robustness,
meaning that every string corresponds to a valid molecule: SELFIES
(SELF-referencIng Embedded Strings). SELFIES has since simplified and
enabled numerous new applications in chemistry. In this manuscript, we look to
the future and discuss molecular string representations, along with their
respective opportunities and challenges. We propose 16 concrete Future Projects
for robust molecular representations. These involve extensions toward new
chemical domains, exciting questions at the interface of AI and robust
languages, and interpretability for both humans and machines. We hope that these
proposals will inspire several follow-up works exploiting the full potential of
molecular string representations for the future of AI in chemistry and
materials science.

Comment: 34 pages, 15 figures, comments and suggestions for additional
references are welcome