This study explores the robustness of university assessments against the use
of OpenAI's Generative Pre-trained Transformer 4 (GPT-4) generated content and
evaluates the ability of academic staff to detect its use when supported by the
Turnitin Artificial Intelligence (AI) detection tool. Twenty-two GPT-4-generated
submissions were created and included in the assessment process, to be marked by
fifteen faculty members. The study
reveals that although the detection tool identified 91% of the experimental
submissions as containing some AI-generated content, the proportion of content
flagged as AI-generated was only 54.8% overall. This suggests that adversarial
prompt-engineering techniques are effective at evading AI detection tools and
highlights the need for improvements to AI detection software. Using the
Turnitin AI detection tool, faculty reported 54.5% of the experimental submissions
to the academic misconduct process, suggesting the need for increased awareness
of and training in these tools. Genuine submissions received a mean score of
54.4, whereas AI-generated submissions received a mean score of 52.3, indicating
that GPT-4 performs comparably to genuine student work under real-life assessment
conditions. Recommendations include adjusting
assessment strategies to make them more resistant to the use of AI tools, using
AI-inclusive assessment where possible, and providing comprehensive training
programs for faculty and students. This research contributes to understanding
the relationship between AI-generated content and academic assessment, urging
further investigation to preserve academic integrity.