Negation is a fundamental aspect of natural language, playing a critical role
in communication and comprehension. Our study assesses the negation detection
performance of Generative Pre-trained Transformer (GPT) models, specifically
GPT-2, GPT-3, GPT-3.5, and GPT-4. We focus on the identification of negation in
natural language using a zero-shot prediction approach applied to our custom
xNot360 dataset. Our approach examines sentence pairs labeled to indicate
whether the second sentence negates the first. Our findings expose a
considerable performance disparity among the GPT models, with GPT-4 surpassing
its counterparts and GPT-3.5 displaying a marked performance reduction. The
overall proficiency of the GPT models in negation detection remains relatively
modest, indicating that this task pushes the boundaries of their natural
language understanding capabilities. We not only highlight the constraints of
GPT models in handling negation but also emphasize the importance of logical
reliability in high-stakes domains such as healthcare, science, and law
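To make the evaluation protocol concrete, the sketch below shows one way a zero-shot negation-detection query over labeled sentence pairs could be posed to a GPT model. The prompt wording, the model identifier, and the tiny example pairs are illustrative assumptions, not the exact xNot360 prompt or data.

```python
# Minimal sketch of zero-shot negation detection over labeled sentence
# pairs, in the spirit of the setup described above. Prompt text, model
# name, and example pairs are assumptions for illustration only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical pairs in the xNot360 style: label 1 means the second
# sentence negates the first, label 0 means it does not.
pairs = [
    ("The door is open.", "The door is not open.", 1),
    ("The door is open.", "The door is wide open.", 0),
]

def predict_negation(sentence1: str, sentence2: str) -> int:
    """Ask the model, zero-shot, whether sentence 2 negates sentence 1."""
    prompt = (
        f"Sentence 1: {sentence1}\n"
        f"Sentence 2: {sentence2}\n"
        "Does Sentence 2 negate Sentence 1? Answer with 1 (yes) or 0 (no)."
    )
    response = client.chat.completions.create(
        model="gpt-4",  # assumed model identifier
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic output for evaluation
    )
    answer = response.choices[0].message.content.strip()
    return 1 if answer.startswith("1") else 0

correct = sum(predict_negation(s1, s2) == label for s1, s2, label in pairs)
print(f"Accuracy: {correct / len(pairs):.2f}")
```

Under this framing, accuracy is simply the fraction of pairs for which the model's binary answer matches the gold label, which is how a zero-shot comparison across GPT-2, GPT-3, GPT-3.5, and GPT-4 could be scored.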