14 research outputs found
Characterizing web pornography consumption from passive measurements
Web pornography represents a large fraction of the Internet traffic, with
thousands of websites and millions of users. Studying web pornography
consumption allows understanding human behaviors and it is crucial for medical
and psychological research. However, given the lack of public data, these works
typically build on surveys, limited by different factors, e.g. unreliable
answers that volunteers may (involuntarily) provide.
In this work, we collect anonymized accesses to pornography websites using
HTTP-level passive traces. Our dataset includes about 15,000 broadband
subscribers over a period of 3 years. We use it to provide quantitative
information about the interactions of users with pornographic websites,
focusing on time and frequency of use, habits, and trends. We distribute our
anonymized dataset to the community to ease reproducibility and allow further
studies.
Comment: Passive and Active Measurements Conference 2019 (PAM 2019). 14 pages,
7 figures
Characterizing Web Pornography Consumption from Passive Measurements
Web pornography represents a large fraction of the Internet traffic, with thousands of websites and millions of users. Studying web pornography consumption allows understanding human behaviors and it is crucial for medical and psychological research.
However, given the lack of public data, these works typically build on surveys, limited by different factors, e.g. unreliable answers that volunteers may (involuntarily) provide.
In this work, we collect anonymized accesses to pornography websites using HTTP-level passive traces. Our dataset includes about 15,000 broadband subscribers over a period of 3 years. We use it to provide quantitative information about the interactions of users with pornographic websites, focusing on time and frequency of use, habits, and trends. We distribute our anonymized dataset to the community to ease reproducibility and allow further studies.
A Workload Characterization Methodology for WWW Applications
With World Wide Web (WWW) traffic being the fastest growing portion of load on the Internet, describing and characterizing this workload is a central issue for any performance evaluation study. In this paper, we present an approach for generating a profile of requests submitted to a WWW server (GET, POST, ...) which explicitly takes into account user behavior when surfing the WWW (i.e. navigating through it via a WWW browser). We present the Probabilistic Attributed Context Free Grammar (PACFG) as a model for translating from this user-oriented view of the workload (namely the conversations made within browser windows) to the methods submitted to the Web servers (or, respectively, to a proxy server). The characterization at this lower level is essential for estimating the traffic on the net and is thus the starting point for evaluations of net traffic.
HyperScout: Darstellung erweiterter Typinformationen im World Wide Web — Konzepte und Auswirkungen
Evaluating the Performance of Navigation Prediction Model Based on Varied Session Length
Learning Web Request Patterns
Summary. Most requests on the Web are made on behalf of human users, and like other human-computer interactions, the actions of the user can be characterized by identifiable regularities. Many of these patterns of activity, both within a single user and between users, can be identified and exploited by intelligent mechanisms for learning Web request patterns. Our focus is on Markov-based probabilistic techniques, both for their predictive power and for their popularity in Web modeling and other domains. Although history-based mechanisms can provide strong performance in predicting future requests, performance can be improved by including predictions from additional sources. In this chapter we review the common approaches to learning and predicting Web request patterns. We provide a consistent description of various algorithms (often independently proposed), and compare the performance of those techniques on the same data sets. We also discuss concerns for accurate and realistic evaluation of these techniques.
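As an illustration of the Markov-based techniques mentioned in the summary above, the following is a minimal sketch of a first-order Markov predictor for Web requests. It is not taken from the chapter: the function names `train_first_order` and `predict_next`, the session format (a list of page paths per session), and the sample data are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_first_order(sessions):
    """Count page-to-page transitions across all sessions.

    Each session is assumed to be an ordered list of requested paths.
    """
    transitions = defaultdict(Counter)
    for session in sessions:
        # Pair every request with the request that immediately followed it.
        for current_page, next_page in zip(session, session[1:]):
            transitions[current_page][next_page] += 1
    return transitions


def predict_next(transitions, current_page, k=1):
    """Return up to k most frequently observed successors of current_page."""
    if current_page not in transitions:
        return []  # no history for this page, so no prediction
    return [page for page, _ in transitions[current_page].most_common(k)]


# Hypothetical request sessions, made up for illustration only.
sessions = [
    ["/", "/news", "/news/1"],
    ["/", "/news", "/news/2"],
    ["/", "/about"],
    ["/news", "/news/1"],
]

model = train_first_order(sessions)
print(predict_next(model, "/"))
print(predict_next(model, "/news", k=2))
```

A first-order model conditions only on the most recent request. Higher-order variants would condition on the last several requests, at the cost of sparser counts.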
