
    A fine grained heuristic to capture web navigation patterns

    In previous work we have proposed a statistical model to capture user behaviour when browsing the web. The user navigation information obtained from web logs is modelled as a hypertext probabilistic grammar (HPG), which is within the class of regular probabilistic grammars. The set of highest-probability strings generated by the grammar corresponds to the user's preferred navigation trails. We have previously conducted experiments with a Breadth-First Search algorithm (BFS) to perform the exhaustive computation of all the strings with probability above a specified cut-point, which we call the rules. Although the algorithm's running time varies linearly with the number of grammar states, it has the drawbacks of returning a large number of rules when the cut-point is small and a small set of very short rules when the cut-point is high. In this work, we present a new heuristic that implements an iterative deepening search wherein the set of rules is incrementally augmented by first exploring trails with high probability. A stopping parameter is provided which measures the distance between the current rule-set and its corresponding maximal set obtained by the BFS algorithm. When the stopping parameter takes the value zero the heuristic corresponds to the BFS algorithm, and as the parameter takes values closer to one the number of rules obtained decreases accordingly. Experiments were conducted with both real and synthetic data, and the results show that for a given cut-point the number of rules induced increases smoothly as the stopping parameter decreases. Therefore, by setting the value of the stopping parameter the analyst can determine the number and quality of rules to be induced; the quality of a rule is measured by both its length and probability.
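    The exhaustive BFS computation described above can be pictured with a short sketch: every trail starting from a state whose initial probability clears the cut-point is extended one transition at a time, extensions are pruned as soon as their probability drops below the cut-point, and trails that can no longer be extended are emitted as rules. The Python sketch below is illustrative only, assuming the grammar is given as plain dictionaries of initial and transition probabilities; the names mine_trails, start_probs, trans_probs and the max_len safety bound are hypothetical and not from the paper.

    from collections import deque

    def mine_trails(start_probs, trans_probs, cut_point, max_len=20):
        """Collect maximal trails whose probability stays at or above cut_point.

        start_probs : dict mapping state -> initial probability
        trans_probs : dict mapping state -> {successor: transition probability}
        max_len     : illustrative bound to avoid endless deterministic cycles
        """
        rules = []
        # seed the frontier with every start state that clears the cut-point
        frontier = deque(([s], p) for s, p in start_probs.items() if p >= cut_point)
        while frontier:
            trail, prob = frontier.popleft()
            extended = False
            if len(trail) < max_len:
                for nxt, p in trans_probs.get(trail[-1], {}).items():
                    if prob * p >= cut_point:      # prune extensions below the cut-point
                        frontier.append((trail + [nxt], prob * p))
                        extended = True
            if not extended:                       # trail cannot grow further: emit it as a rule
                rules.append((trail, prob))
        return rules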

    Exploring scholarly data with Rexplore.

    Despite the large number and variety of tools and services available today for exploring scholarly data, current support is still very limited in the context of sensemaking tasks, which go beyond standard search and ranking of authors and publications and focus instead on i) understanding the dynamics of research areas, ii) relating authors ‘semantically’ (e.g., in terms of common interests or shared academic trajectories), or iii) performing fine-grained academic expert search along multiple dimensions. To address this gap we have developed a novel tool, Rexplore, which integrates statistical analysis, semantic technologies, and visual analytics to provide effective support for exploring and making sense of scholarly data. Here, we describe the main innovative elements of the tool and present the results of a task-centric empirical evaluation, which shows that Rexplore is highly effective at supporting the aforementioned sensemaking tasks. In addition, these results are robust both with respect to the background of the users (i.e., expert analysts vs. ‘ordinary’ users) and with respect to whether the tasks are selected by the evaluators or proposed by the users themselves.

    Layered evaluation of interactive adaptive systems: framework and formative methods

    Peer reviewed. Postprint.

    An average linear time algorithm for web data mining

    In this paper, we study the complexity of a data mining algorithm, proposed in previous work [3], for extracting patterns from user web navigation data. The user web navigation sessions are inferred from log data and modeled as a Markov chain. The chain's higher-probability trails correspond to the preferred trails on the web site. The algorithm implements a depth-first search that scans the Markov chain for the high-probability trails. We show that the average behaviour of the algorithm is linear time in the number of web pages accessed.
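    As a rough illustration of the depth-first scan described above (a minimal sketch, not the paper's exact algorithm), the chain below is a plain dictionary of transition probabilities; high_probability_trails, cut_point, and max_len are hypothetical names:

    def high_probability_trails(chain, start, cut_point, max_len=20):
        """Depth-first search for trails from `start` whose probability
        stays at or above cut_point.

        chain : dict mapping state -> {successor: transition probability}
        """
        out = []

        def dfs(trail, prob):
            extended = False
            if len(trail) < max_len:
                for nxt, p in chain.get(trail[-1], {}).items():
                    if prob * p >= cut_point:      # only follow high-probability branches
                        dfs(trail + [nxt], prob * p)
                        extended = True
            if not extended:                       # maximal high-probability trail
                out.append((trail, prob))

        dfs([start], 1.0)
        return out

    # toy usage with a three-page site
    chain = {"home": {"news": 0.6, "about": 0.4}, "news": {"home": 0.3}}
    print(high_probability_trails(chain, "home", cut_point=0.5))  # [(['home', 'news'], 0.6)]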

    A Trio Neural Model for Dynamic Entity Relatedness Ranking

    Measuring entity relatedness is a fundamental task for many natural language processing and information retrieval applications. Prior work often studies entity relatedness in static settings and in an unsupervised manner. However, entities in the real world are often involved in many different relationships; consequently, entity relations are highly dynamic over time. In this work, we propose a neural network-based approach for dynamic entity relatedness, leveraging collective attention as supervision. Our model is capable of learning rich and distinct entity representations in a joint framework. Through extensive experiments on large-scale datasets, we demonstrate that our method achieves better results than competitive baselines. Comment: In Proceedings of CoNLL 201
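    To make the ranking setup concrete, here is a minimal PyTorch sketch of a pairwise relatedness scorer; it is not the paper's trio architecture, and the names (RelatednessScorer, pairwise_loss) as well as the embedding-plus-MLP design are assumptions for illustration only.

    import torch
    import torch.nn as nn

    class RelatednessScorer(nn.Module):
        """Toy scorer: how related a candidate entity is to a query entity."""
        def __init__(self, num_entities, dim=128):
            super().__init__()
            self.emb = nn.Embedding(num_entities, dim)
            self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

        def forward(self, query_ids, candidate_ids):
            q, c = self.emb(query_ids), self.emb(candidate_ids)
            return self.mlp(torch.cat([q, c], dim=-1)).squeeze(-1)

    def pairwise_loss(scorer, query_ids, pos_ids, neg_ids, margin=1.0):
        """Margin ranking loss: the more related candidate should score higher."""
        return torch.relu(margin - scorer(query_ids, pos_ids) + scorer(query_ids, neg_ids)).mean()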

    Tackling mobile traffic critical path analysis with passive and active measurements

    Critical Path Analysis (CPA) studies the delivery of webpages to identify page resources, their interrelations, and their impact on page loading latency. Although CPA is a generic methodology, its mechanisms have so far been applied only to browsers and web traffic, and they do not directly apply to generic mobile apps; moreover, web browsing represents only a small fraction of the overall mobile traffic. In this paper, we take a first step towards filling this gap by exploring how CPA can be performed for generic mobile applications. We propose Mobile Critical Path Analysis (MCPA), a methodology based on passive and active network measurements that is applicable to a broad set of apps and exposes a fine-grained view of their traffic dynamics. We validate MCPA on popular apps across different categories and usage scenarios. We show that MCPA can identify user interactions with mobile apps based on traffic monitoring alone, as well as the network activities that constitute bottlenecks. Overall, we observe that apps spend on average 60% of time and 84% of bytes on critical traffic, corresponding to 22% more time and 13% more bytes than what is observed for web browsing.
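    The 60%/84% figures above are shares of time and bytes spent on traffic flagged as critical. As a toy illustration of that bookkeeping (the Activity data model and critical_share function below are assumptions, not part of MCPA):

    from dataclasses import dataclass

    @dataclass
    class Activity:
        start: float        # seconds since the beginning of the trace
        end: float
        nbytes: int
        critical: bool      # whether the activity lies on the critical path

    def critical_share(activities):
        """Fraction of total activity time and of total bytes on the critical path."""
        total_time = sum(a.end - a.start for a in activities)
        total_bytes = sum(a.nbytes for a in activities)
        crit = [a for a in activities if a.critical]
        time_share = sum(a.end - a.start for a in crit) / total_time if total_time else 0.0
        byte_share = sum(a.nbytes for a in crit) / total_bytes if total_bytes else 0.0
        return time_share, byte_share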