Heavy Tails, Generalized Coding, and Optimal Web Layout

Abstract

This paper considers Web layout design in the spirit of source coding for data compression and rate distortion theory, with the aim of minimizing the average size of files downloaded during Web browsing sessions. The novel aspect here is that the object of design is layout rather than codeword selection, and that the design is subject to navigability constraints. This produces statistics for file transfers that are heavy tailed, completely unlike standard Shannon theory, and provides a natural and plausible explanation for the origin of observed power laws in Web traffic. We introduce a series of theoretical and simulation models for optimal Web layout design with varying levels of analytic tractability and realism with respect to modeling of structure, hyperlinks, and user behavior. All models produce power laws that are striking both for their consistency with each other and with observed data and for their robustness to modeling assumptions. These results suggest that heavy tails are a permanent and ubiquitous feature of Internet traffic, and not an artifact of current applications or user behavior. They also suggest new ways of thinking about protocol design that combine insights from information and control theory with traditional networking.
