As the sophistication of machine learning force fields (MLFFs) increases to
match the complexity of extended molecules and materials, so does the need for
tools to properly analyze and assess the practical performance of MLFFs. To go
beyond average error metrics and into a complete picture of a model's
applicability and limitations, we develop FFAST (Force Field Analysis Software
and Tools): a cross-platform software package designed to provide detailed
insights into a model's performance, complete with an
easy-to-use graphical user interface. The software allows the user to gauge the
performance of many state-of-the-art MLFF models on a variety of widely used
dataset types, providing general prediction error overviews, outlier detection
mechanisms, atom-projected errors, and more. It also includes a 3D visualizer to
locate and inspect problematic configurations, atoms, or clusters in a large dataset. In
this paper, the MACE and NequIP models are applied to two datasets
of interest -- stachyose and docosahexaenoic acid (DHA) -- to illustrate the
use cases of the software. Using FFAST, we find that carbons and oxygens
involved in or near glycosidic bonds inside the stachyose molecule present
increased prediction errors. In addition, prediction errors on DHA rise as the
molecule folds, especially for the carboxyl group at the end of the
molecule. We emphasize the need for a systematic assessment of MLFF models to
ensure their successful application to the study of the dynamics of molecules and
materials.

Comment: 22 pages, 11 figures