
    Zipf's Law for web surfers

    One of the main activities of Web users, known as 'surfing', is to follow links. Lengthy navigation often leads to disorientation when users lose track of the context in which they are navigating and are unsure how to proceed in terms of the goal of their original query. Studying navigation patterns of Web users is thus important, since it can lead us to a better understanding of the problems users face when they are surfing. We derive Zipf's rank frequency law (i.e., an inverse power law) from an absorbing Markov chain model of surfers' behavior, assuming that less probable navigation trails are, on average, longer than more probable ones. In our model the probability of a trail is interpreted as the relevance (or 'value') of the trail. We apply our model to two scenarios: in the first the probability of a user terminating the navigation session is independent of the number of links he has followed so far, and in the second the probability of a user terminating the navigation session increases by a constant each time the user follows a link. We analyze these scenarios using two experimental data sets, showing that, although the first scenario is only a rough approximation of surfers' behavior, the data is consistent with the second scenario and can thus provide an explanation of surfers' behavior.
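
The two termination scenarios described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' model or code: the function name `session_length_distribution`, the parameters `q0` and `delta`, and the cap of the stopping probability at 1 are assumptions made for illustration only.

```python
def session_length_distribution(q0, delta, max_links):
    """Return P(session ends after exactly k followed links) for k = 0..max_links.

    Assumption (not from the abstract): after k links the user stops with
    probability min(1, q0 + k * delta). Setting delta = 0 corresponds to the
    first scenario (constant stopping probability); delta > 0 corresponds to
    the second (stopping probability grows by a constant per link).
    """
    probs = []
    survive = 1.0  # probability the user is still surfing after k links
    for k in range(max_links + 1):
        stop = min(1.0, q0 + k * delta)
        probs.append(survive * stop)
        survive *= 1.0 - stop
    return probs


# First scenario: geometric decay of session lengths.
constant = session_length_distribution(q0=0.5, delta=0.0, max_links=3)

# Second scenario: sessions are cut off faster as more links are followed.
increasing = session_length_distribution(q0=0.2, delta=0.2, max_links=4)
```

With `delta = 0` the session lengths follow a geometric distribution; with `delta > 0` long sessions become rarer than the geometric case predicts.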

    Network as a computer: ranking paths to find flows

    We explore a simple mathematical model of network computation, based on Markov chains. Similar models apply to a broad range of computational phenomena, arising in networks of computers, as well as in genetic and neural nets, in social networks, and so on. The main problem of interaction with such spontaneously evolving computational systems is that the data are not uniformly structured. An interesting approach is to try to extract the semantic content of the data from their distribution among the nodes. A concept is then identified by finding the community of nodes that share it. The task of data structuring is thus reduced to the task of finding the network communities, as groups of nodes that together perform some non-local data processing. Towards this goal, we extend the ranking methods from nodes to paths. This allows us to extract some information about the likely flow biases from the available static information about the network. Comment: 12 pages, CSR 200

    Computing the entropy of user navigation in the web

    Navigation through the web, colloquially known as "surfing", is one of the main activities of users during web interaction. When users follow a navigation trail they often tend to get disoriented in terms of the goals of their original query and thus the discovery of typical user trails could be useful in providing navigation assistance. Herein, we give a theoretical underpinning of user navigation in terms of the entropy of an underlying Markov chain modelling the web topology. We present a novel method for online incremental computation of the entropy and a large deviation result regarding the length of a trail to realize the said entropy. We provide an error analysis for our estimation of the entropy in terms of the divergence between the empirical and actual probabilities. We then indicate applications of our algorithm in the area of web data mining. Finally, we present an extension of our technique to higher-order Markov chains by a suitable reduction of a higher-order Markov chain model to a first-order one.
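
For orientation, the entropy of a first-order Markov chain can be computed offline from its transition matrix using the textbook entropy-rate formula, H = -Σ_i π_i Σ_j P_ij log2 P_ij, where π is the stationary distribution. The sketch below is a batch computation under that standard definition, not the online incremental method the abstract refers to. The function names and the use of power iteration are assumptions for illustration.

```python
import math


def stationary_distribution(P, iterations=1000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration from the uniform distribution (illustrative choice)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def entropy_rate(P):
    """Entropy rate in bits per step: -sum_i pi_i * sum_j P_ij * log2(P_ij)."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log2(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0  # zero-probability transitions contribute nothing
    )


# Hypothetical two-page site where each page links to both pages equally.
uniform_chain = [[0.5, 0.5], [0.5, 0.5]]
bits_per_click = entropy_rate(uniform_chain)
```

A chain in which every transition is deterministic has entropy rate zero, while the uniform two-state chain above yields one bit per step.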

    Decolonising the Waters: Interspecies Encounters Between Sharks and Humans

    Often portrayed as ‘man-eaters’, sharks are one of the most maligned apex species on earth. Media representation has fuelled public imagination, perpetuating fear and negative stereotypes of sharks and hysteria around human-shark interactions, whilst government initiatives such as beach netting and drum-lines target sharks for elimination. This interdisciplinary article, written from the points of view of environmental science and cultural studies, proposes humans as simply another species when entering the ocean, presenting a decolonising shift in paradigm that supports an interspecies ethics of engagement in understanding shark-human interactions. The shifting environmental, political, social and cultural realities of shark-human interactions are examined from the point of view of an endangered species that is hunted by humans in the pursuit of making beaches ‘safe’ for human leisure activities. The human ‘right to leisure’ enshrined in the Universal Declaration of Human Rights (1948) raises philosophical and ethical implications in respect of human rights taking precedence over a species’ right to live in its environment. The article builds upon philosophical debates in environmental ethics, offering a point of cultural recognition of the profound imbalance that is being imposed upon Nature. The article proposes a shift in approaches to human attitudes and uses of the ocean, decentralising the anthropocentric, reinstating the ecological kinship of species.

    A matter of words: NLP for quality evaluation of Wikipedia medical articles

    Automatic quality evaluation of Web information is a task with many fields of application and of great relevance, especially in critical domains like the medical one. We start from the intuition that the quality of content of medical Web documents is affected by features related to the specific domain: first, the usage of a specific vocabulary (Domain Informativeness); then, the adoption of specific codes (like those used in the infoboxes of Wikipedia articles) and the type of document (e.g., historical and technical ones). In this paper, we propose to leverage specific domain features to improve the results of the evaluation of Wikipedia medical articles. In particular, we evaluate the articles adopting an "actionable" model, whose features are related to the content of the articles, so that the model can also directly suggest strategies for improving the quality of a given article. We rely on Natural Language Processing (NLP) and dictionary-based techniques in order to extract the bio-medical concepts in a text. We demonstrate the effectiveness of our approach by classifying the medical articles of the Wikipedia Medicine Portal, which have been previously manually labeled by the Wiki Project team. The results of our experiments confirm that, by considering domain-oriented features, it is possible to obtain appreciable improvements with respect to existing solutions, mainly for those articles that other approaches have classified less accurately. Besides being interesting in their own right, the results call for further research in the area of domain-specific features suitable for Web data quality assessment.

    The Think-Aloud approach: A Promising Tool for Online Reading Comprehension

    Despite its unquestionable interest from a theoretical and practical point of view, so far there has been little research on online reading, and little attention is paid to this topic in most European educational institutions. In particular, primary and secondary school teachers are not adequately trained on how and when to intervene to support students’ proficiency in online reading comprehension. After presenting a rationale demonstrating why students may struggle with online reading comprehension and the importance of adopting self-regulated reading, this study proposes a Teacher’s Guide that could support late primary and secondary school teachers in planning online reading lessons with the Think-Aloud (TA) metacognitive technique.

    A Taxonomy of Hyperlink Hiding Techniques

    Hidden links are designed solely for search engines rather than visitors. To obtain high search engine rankings, link hiding techniques are commonly used to increase the profits of black-market industries, such as illicit game servers, false medical services, illegal gambling, and other less attractive high-profit industries. This paper investigates hyperlink hiding techniques on the Web and gives a detailed taxonomy. We believe the taxonomy can help develop appropriate countermeasures. A study of the home pages of 5,583,451 Chinese sites indicates that link hiding techniques are very prevalent on the Web. We also explored the attitude of Google towards link hiding spam by analyzing the PageRank values of the relevant links. The results show that more should be done to punish hidden link spam. Comment: 12 pages, 2 figure

    Desk Set: Ready Reference on the Web


    Simulating the conflict between reputation and profitability for online rating portals

    We simulate the process of possible interactions between a set of competitive services and a set of portals that provide online rating for these services. We argue that to have a profitable business, these portals are forced to have subscribed services that are rated by the portals. To satisfy the subscribing services, we make the assumption that the portals improve the rating of a given service by one unit per transaction that involves payment. In this study we follow the 'what-if' methodology, analysing strategies that a service may choose from to select the best portal for it to subscribe to, and strategies for a portal to accept the subscription such that its reputation loss, in terms of the integrity of its ratings, is minimised. We observe that the behaviour of the simulated agents in accordance with our model is quite natural from the real-world perspective. One conclusion from the simulations is that under reasonable conditions, if most of the services and rating portals in a given industry do not accept a subscription policy similar to the one indicated above, they will lose, respectively, their ratings and reputations, and, moreover, the rating portals will have problems in making a profit. Our prediction is that the modern portal-rating-based economy sector will eventually evolve into a subscription process similar to the one we suggest in this study, as an alternative to a business model based purely on advertising.