8 research outputs found

    A Taxonomy of Hyperlink Hiding Techniques

    Hidden links are designed solely for search engines rather than visitors. Link hiding techniques are typically used to raise the search engine rankings of illicit but profitable businesses, such as illicit game servers, false medical services, illegal gambling, and other less attractive high-profit industries. This paper investigates hyperlink hiding techniques on the Web and gives a detailed taxonomy. We believe the taxonomy can help develop appropriate countermeasures. A study of the home pages of 5,583,451 Chinese sites indicates that link hiding techniques are very prevalent on the Web. We also explored Google's attitude towards link hiding spam by analyzing the PageRank values of the relevant links. The results show that more should be done to punish hidden link spam. Comment: 12 pages, 2 figures

    Fuzzy equivalence relation based clustering and its use to restructuring websites' hyperlinks and web pages

    Quality website design implies, among other factors, that the hyperlink structure should allow users to reach the information they seek with the minimum number of clicks. This paper utilises fuzzy equivalence relation based clustering to adapt a website's hyperlink structure so that the redesigned website meets users' informational and navigational requirements as effectively as possible. The fuzzy tolerance relation is calculated from the usage rate of hyperlinks in a website. The equivalence relation identifies clusters of hyperlinks. The clusters are then used to reallocate hyperlinks in webpages and to rearrange webpages within the website's structural hierarchy.
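
    The abstract does not say how the equivalence relation is derived from the tolerance relation. A common textbook route, assumed here rather than taken from the paper, is the max-min transitive closure followed by an alpha-cut. The sketch below illustrates that route on a hypothetical 4-hyperlink tolerance matrix; the matrix values, the function names, and the thresholds are all illustrative assumptions.

```python
def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relations given as nested lists."""
    n = len(a)
    return [
        [max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(r):
    """Square a reflexive, symmetric fuzzy relation until it stops changing.

    For a reflexive relation, each composition can only raise entries, so the
    loop ends at the max-min transitive closure (a fuzzy equivalence relation).
    """
    while True:
        nxt = max_min_compose(r, r)
        if nxt == r:
            return r
        r = nxt


def alpha_cut_clusters(r, alpha):
    """Partition indices into clusters: i and j share a cluster iff r[i][j] >= alpha.

    Assumes r is already a fuzzy equivalence relation, so the cut is a partition.
    """
    clusters, assigned = [], set()
    for i in range(len(r)):
        if i in assigned:
            continue
        cluster = [j for j in range(len(r)) if r[i][j] >= alpha]
        assigned.update(cluster)
        clusters.append(cluster)
    return clusters


# Hypothetical tolerance relation between 4 hyperlinks (not data from the paper).
tolerance = [
    [1.0, 0.8, 0.2, 0.1],
    [0.8, 1.0, 0.3, 0.1],
    [0.2, 0.3, 1.0, 0.7],
    [0.1, 0.1, 0.7, 1.0],
]

equivalence = transitive_closure(tolerance)
print(alpha_cut_clusters(equivalence, 0.5))  # [[0, 1], [2, 3]]
```

    Raising the threshold splits clusters and lowering it merges them, so the choice of alpha controls how coarse the regrouping of hyperlinks is.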

    Examining the Impact of a Reasoning Aid to Help People Evaluate the Evidentiary Weight of Consensus

    This item is only available electronically. Social media is a vortex of information and people may see distorted views of consensus, where the independence of information and sources is unclear. A tool that summarises consensus information might help people to navigate these important cues. This study examined whether a reasoning aid (in the form of a diagram) visually illustrating both the number of independent people supporting/disagreeing with a claim and the diversity of arguments would persuade people to change their original beliefs. Participants (n=605) were recruited through Amazon’s Mechanical Turk to evaluate 24 claims on a mock Twitter interface. Participants were randomly assigned to conditions with either tweets only, diagram only or tweets with a diagram. Participants rated their initial agreement level (0-100) with each claim and then saw the diagram and/or set of tweets, then were able to update their agreement level if their original opinion had now changed. The findings of this study show that without assistance, people mostly rely on cues of argument quantity, such as the number of tweets for a given stance. However, when presented with a diagram, people were able to utilise cues of argument quality, such as when there were different sources providing the information and when multiple arguments were used. Thesis (B.PsychSc(Hons)) -- University of Adelaide, School of Psychology, 202

    An Analysis Of Machine Learning Methods For Spam Host Detection

    The web is becoming an increasingly important source of entertainment, communication, research, news and trade. As a result, web sites compete to attract the attention of users, and many of them achieve visibility through malicious strategies that try to circumvent the search engines. Such sites are known as web spam and they are generally responsible for personal injury and economic losses. Given this scenario, this paper presents a comprehensive performance evaluation of several established machine learning techniques used to automatically detect and filter hosts that disseminate web spam. Our experiments were diligently designed to ensure statistically sound results, and they indicate that bagging of decision trees, multilayer perceptron neural networks, random forest and adaptive boosting of decision trees are promising in the task of web spam classification and, hence, can be used as a good baseline for further comparison. © 2012 IEEE.