
    Characterizing web pornography consumption from passive measurements

    Web pornography represents a large fraction of Internet traffic, with thousands of websites and millions of users. Studying web pornography consumption helps in understanding human behavior and is crucial for medical and psychological research. However, given the lack of public data, such studies typically build on surveys, which are limited by several factors, e.g., the unreliable answers that volunteers may (involuntarily) provide. In this work, we collect anonymized accesses to pornography websites using HTTP-level passive traces. Our dataset covers about 15,000 broadband subscribers over a period of 3 years. We use it to provide quantitative information about users' interactions with pornographic websites, focusing on time and frequency of use, habits, and trends. We distribute our anonymized dataset to the community to ease reproducibility and allow further studies.
    Comment: Passive and Active Measurements Conference 2019 (PAM 2019). 14 pages, 7 figures

    Agents, Bookmarks and Clicks: A topical model of Web traffic

    Analysis of aggregate and individual Web traffic has shown that PageRank is a poor model of how people navigate the Web. Using the empirical traffic patterns generated by a thousand users, we characterize several properties of Web traffic that cannot be reproduced by Markovian models. We examine both aggregate statistics capturing collective behavior, such as page and link traffic, and individual statistics, such as entropy and session size. No model currently explains all of these empirical observations simultaneously. We show that all of these traffic patterns can be explained by an agent-based model that takes into account several realistic browsing behaviors. First, agents maintain individual lists of bookmarks (a non-Markovian memory mechanism) that are used as teleportation targets. Second, agents can retreat along visited links, a branching mechanism that also allows us to reproduce behaviors such as the use of a back button and tabbed browsing. Finally, agents are sustained by visiting novel pages of topical interest, with adjacent pages being more topically related to each other than distant ones. This modulates the probability that an agent continues to browse or starts a new session, allowing us to recreate heterogeneous session lengths. The resulting model is capable of reproducing the collective and individual behaviors we observe in the empirical data, reconciling the narrowly focused browsing patterns of individual users with the extreme heterogeneity of aggregate traffic measurements. This result allows us to identify a few salient features that are necessary and sufficient to interpret the browsing patterns observed in our data. In addition to the descriptive and explanatory power of such a model, our results may lead the way to more sophisticated, realistic, and effective ranking and crawling algorithms.
    Comment: 10 pages, 16 figures, 1 table. Long version of a paper to appear in Proceedings of the 21st ACM Conference on Hypertext and Hypermedia
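    The browsing dynamics described above lend themselves to a compact simulation. The sketch below is a minimal, illustrative rendering of the three mechanisms named in the abstract (bookmark teleportation, back-button retreats, and topic-driven session termination); the graph construction, parameter values, and interest heuristic are assumptions made for the sake of a runnable example, not the paper's actual model.

```python
import random

def make_topical_web(n_pages=500, out_degree=5, locality=10):
    """Pages sit on a topical line; links prefer nearby (topically related) pages."""
    links = {}
    for p in range(n_pages):
        targets = set()
        while len(targets) < out_degree:
            q = p + random.randint(-locality, locality)
            if 0 <= q < n_pages and q != p:
                targets.add(q)
        links[p] = sorted(targets)
    return links

def browse(links, n_sessions=200, p_back=0.3, boost=0.6, cost=0.5):
    bookmarks = [0]   # non-Markovian memory: shared across sessions, teleport targets
    visits = {}
    for _ in range(n_sessions):
        page = random.choice(bookmarks)   # start a session by teleporting to a bookmark
        stack, seen, interest = [], set(), 1.0
        while interest > 0:               # session lasts while interest remains
            visits[page] = visits.get(page, 0) + 1
            if page not in seen:          # novel pages of topical interest sustain browsing
                seen.add(page)
                interest += boost
            interest -= cost
            if page not in bookmarks:
                bookmarks.append(page)
            if stack and random.random() < p_back:
                page = stack.pop()        # retreat along a visited link (back button / tabs)
            else:
                stack.append(page)
                page = random.choice(links[page])
    return visits

if __name__ == "__main__":
    random.seed(42)
    visits = browse(make_topical_web())
    top = sorted(visits.items(), key=lambda kv: kv[1], reverse=True)[:5]
    print("most visited pages:", top)
```

    Because bookmarks accumulate across sessions while each session wanders a narrow topical neighborhood, the toy model already exhibits a rough version of the reconciliation the abstract describes: focused individual trajectories with skewed aggregate page popularity.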

    Revealing User Behaviour on the World-Wide Web

    This paper presents the results of a qualitative study of user behaviour on the World-Wide Web. Eight participants were filmed whilst performing user-defined tasks and were then asked to review the videotaped session during prompted recall. These data form the basis for a series of descriptions of user behaviour and the postulation of a number of underlying cognitive mechanisms. Our results indicate that people: lack ready-made search strategies; prefer alternatives that are visible, immediately available, and familiar; choose the path of least resistance; exhibit social forms of behaviour; engage in parallel activities; object to misleadingly presented information; have trouble orienting themselves; are late in adopting appropriate strategies; are sensitive to matters of time; and are emotionally involved in the activity. The paper ends with a discussion of how these results can contribute to our understanding of hypermedia.

    Second-Level Digital Divide: Mapping Differences in People's Online Skills

    Much of the existing research on the digital divide suffers from an important limitation: it is based on a binary classification of Internet use, considering only whether someone is or is not an Internet user. To remedy this shortcoming, this project looks at differences in people's level of skill in finding information online. Findings suggest that people search for content in a myriad of ways, and there is large variance in how long people take to find various types of information online. Data are collected to examine how user demographics, users' social support networks, experience with the medium, and autonomy of use influence the level of user sophistication.
    Comment: 29th TPRC Conference, 2001

    Moving Usability Testing onto the Web

    In order to remotely obtain detailed usability data by tracking user behaviors within a given website, a server-based usability testing environment has been created. Web pages are annotated in such a way that arbitrary user actions (such as "mouse over link" or "click back button") can be selected for logging. In addition, the system allows the experiment designer to interleave interactive questions into the usability evaluation, which could, for instance, be triggered by a particular sequence of actions. The system works in conjunction with clustering and visualization algorithms that can be applied to the resulting log-file data. A first version of the system has been used successfully to carry out a web usability evaluation.
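    As a concrete illustration of the logging-plus-interleaved-questions idea, here is a minimal server-side sketch. The event names, trigger sequence, and question text are hypothetical; the abstract does not specify the system's page-annotation format or API, so this only shows the general mechanism.

```python
from collections import deque

# Hypothetical trigger: ask a question after two consecutive back-button clicks.
TRIGGER = ("click back button", "click back button")
QUESTION = "Did the previous page contain what you were looking for?"

def make_session_logger(log):
    """Return a handler that logs selected user actions and, when the
    configured action sequence occurs, returns a question to interleave."""
    recent = deque(maxlen=len(TRIGGER))
    def handle(event):
        log.append(event)             # persist the logged action for later analysis
        recent.append(event)
        if tuple(recent) == TRIGGER:  # sequence matched: interleave an interactive question
            return QUESTION
        return None
    return handle

log = []
handle = make_session_logger(log)
for event in ["mouse over link", "click link", "click back button", "click back button"]:
    question = handle(event)
    if question:
        print("ask user:", question)
```

    The `log` list here stands in for the log-file data that the clustering and visualization algorithms mentioned in the abstract would consume.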

    Determining WWW User's Next Access and Its Application to Pre-fetching

    World-Wide Web (WWW) services have grown to levels where significant delays are to be expected. Techniques like pre-fetching can reduce users' waiting times, but pre-fetching is only effective if the right documents are identified and the user's next move is correctly predicted; otherwise, it only wastes bandwidth. It is therefore productive to determine whether a revisit will occur before starting to pre-fetch. In this paper we develop two user models that help determine a user's next move. One model uses a Random Walk approximation and the other is based on Digital Signal Processing techniques. We also give hints on how to use such models with a simple pre-fetching technique that we are developing.
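    To make the prediction-gated pre-fetching decision concrete, the sketch below uses a first-order Markov (random-walk-style) estimate of the next page and recommends pre-fetching only when the estimate is confident. The confidence threshold and the first-order model are illustrative assumptions; the paper's actual Random Walk and DSP models are not reproduced here.

```python
from collections import defaultdict

class NextAccessPredictor:
    """Count observed page transitions and gate pre-fetching on confidence."""
    def __init__(self, threshold=0.6):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.threshold = threshold   # assumed cut-off, traded off against wasted bandwidth

    def observe(self, page, next_page):
        """Record one observed transition from `page` to `next_page`."""
        self.counts[page][next_page] += 1

    def prefetch_candidate(self, page):
        """Return the most likely next page, or None when pre-fetching
        would be a gamble (and likely a waste of bandwidth)."""
        nxt = self.counts.get(page)
        if not nxt:
            return None
        total = sum(nxt.values())
        best, best_n = max(nxt.items(), key=lambda kv: kv[1])
        return best if best_n / total >= self.threshold else None

predictor = NextAccessPredictor()
for a, b in [("/", "/news"), ("/", "/news"), ("/", "/about")]:
    predictor.observe(a, b)
print(predictor.prefetch_candidate("/"))   # "/news": 2 of 3 transitions, above threshold
```

    Returning None in the uncertain case captures the abstract's point that pre-fetching should only start once a revisit is deemed likely.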

    Characterizations of User Web Revisit Behavior

    In this article we update and extend earlier long-term studies of users' page revisit behavior.