Artificial Neural Networks For Content-based Web Spam Detection

Authors: Almeida T.A., Silva R.M., Yamakami A.
Publication date: 26/11/2015

Web spam has become a serious problem for Internet users, causing personal injury and economic losses. Although some approaches have been proposed to detect and avoid this problem automatically, the speed at which spammers improve their techniques requires classifiers to be more generic, efficient, and highly adaptive. Although it is widely accepted in the literature that neural-based techniques generalize and adapt well, as far as we know no work has explored such methods to avoid web spam. Given this scenario, and to fill this gap, this paper presents a performance evaluation of different artificial neural network models used to automatically classify and filter real samples of web spam based on their contents. The results indicate that some of the evaluated approaches have great potential, since they are well suited to the problem and clearly outperform the state-of-the-art techniques.

Repositorio da Producao Cientifica e Intelectual da Unicamp