Phishing attacks pose a significant threat to Internet users, with
cybercriminals elaborately replicating the visual appearance of legitimate
websites to deceive victims. Visual similarity-based detection systems have
emerged as an effective countermeasure, but their effectiveness and robustness
in real-world scenarios remain unexplored. In this paper, we comprehensively
scrutinize and evaluate state-of-the-art visual similarity-based anti-phishing
models using a large-scale dataset of 450K real-world phishing websites. Our
analysis reveals that while certain models maintain high accuracy, others
exhibit notably lower performance than on curated datasets,
highlighting the importance of real-world evaluation. In addition, we observe
real-world tactics in which phishing attackers manipulate visual components to
circumvent detection systems. To assess the resilience of existing models
against adversarial attacks, we apply visible
and perturbation-based manipulations to website logos, which adversaries
typically target. We then evaluate the models' robustness in handling these
adversarial samples. Our findings reveal vulnerabilities in several models,
emphasizing the need for more robust visual similarity techniques capable of
withstanding sophisticated evasion attempts. We provide actionable insights for
enhancing the security of phishing defense systems and encouraging proactive
measures. To the best of our knowledge, this work represents the first
large-scale, systematic evaluation of visual similarity-based models for
phishing detection in real-world settings, and its findings underscore the need for
more effective and robust defenses.

Comment: 12 pages