Text simplification seeks to improve readability while retaining the original
content and meaning. Our study investigates whether pre-trained classifiers
also maintain such coherence by comparing their predictions on both original
and simplified inputs. We conduct experiments using 11 pre-trained models,
including BERT and OpenAI's GPT-3.5, across six datasets spanning three
languages. Additionally, we conduct a detailed analysis of the correlation
between prediction change rates and simplification types/strengths. Our
findings reveal alarming inconsistencies across all languages and models. If
not promptly addressed, simplified inputs can easily be exploited to craft
zero-iteration, model-agnostic adversarial attacks with success rates of up to
50%.

Published at the DeTermIt! Workshop at LREC-COLING 2024.
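The core measurement behind the study, the prediction change rate, can be sketched as follows. This is a minimal illustration, not the paper's implementation: `toy_classifier` is a hypothetical stand-in for a pre-trained model, and the sentence pairs are invented examples of original and simplified text.

```python
def prediction_change_rate(classify, pairs):
    """Fraction of (original, simplified) pairs whose predicted label flips."""
    changes = sum(classify(orig) != classify(simp) for orig, simp in pairs)
    return changes / len(pairs)

# Hypothetical stand-in classifier: labels a text by word count.
# A real experiment would call a pre-trained model here instead.
def toy_classifier(text):
    return "long" if len(text.split()) > 6 else "short"

# Invented (original, simplified) pairs for illustration only.
pairs = [
    ("The committee deliberated at considerable length before voting.",
     "They talked long, then voted."),
    ("Precipitation is expected tomorrow.",
     "It may rain tomorrow."),
]

rate = prediction_change_rate(toy_classifier, pairs)
```

In this toy setup the first pair flips the label while the second does not, so the rate is 0.5; the paper applies the same comparison with real classifiers over full datasets.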