Web Phishing Detection In Machine Learning Using Heuristic Image Based Method
Phishing attacks are a significant threat to Internet users, causing tremendous economic loss every year. In combating phish, industry relies heavily on manual verification to achieve a low false positive rate, which however tends to be slow in responding to the huge volume of phish created by toolkits. The goal here is to combine the best aspects of human-verified blacklists and heuristic-based methods: the low false positive rate of the former and the broad coverage of the latter. The key insight behind our detection algorithm is to leverage existing human-verified blacklists and apply the shingling technique, a popular near-duplicate detection algorithm used by search engines, to detect phish in a probabilistic fashion with very high accuracy. The features introduced in the Carnegie Mellon Anti-Phishing and Network Analysis Tool (CANTINA) are combined with a similarity feature in a machine learning based phishing detection system. A preliminary experiment was run on a small set of 200 web pages, consisting of 100 phishing pages and 100 non-phishing pages. The evaluation result in terms of F-measure was up to 0.9250, with an error rate of 7.50%.
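As an illustration of the shingling idea mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes word-level shingles of length `k`, Jaccard resemblance between shingle sets, and a fixed decision threshold. The function names, `k=4`, and `threshold=0.6` are hypothetical choices made for this example and do not come from the paper.

```python
import re


def shingles(text, k=4):
    """Return the set of k-word shingles (contiguous word windows) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Very short texts yield one shingle holding all words (or none).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def resemblance(a, b):
    """Jaccard resemblance |A & B| / |A | B| between two shingle sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_blacklist_match(page_text, blacklist_texts, k=4):
    """Highest resemblance between a page and any blacklisted page text."""
    page = shingles(page_text, k)
    return max(
        (resemblance(page, shingles(t, k)) for t in blacklist_texts),
        default=0.0,  # an empty blacklist matches nothing
    )


def is_probable_phish(page_text, blacklist_texts, threshold=0.6, k=4):
    """Flag a page whose text nearly duplicates a human-verified phish.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return best_blacklist_match(page_text, blacklist_texts, k) >= threshold
```

In a system of the kind described, the resemblance score would more plausibly serve as one similarity feature alongside the CANTINA features in a learned classifier, rather than as a hard threshold on its own.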