Search CORE

16 research outputs found

A differential semantic algorithm for query relevant web page recommendation

Author: Babu M.S.H.
Deepak G.
Priyadarshini J.S.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 17/03/2017

With the exponential growth of information on the World Wide Web, there is a need for a much more efficient algorithm for Web search. Traditional keyword matching and standard statistical techniques are insufficient because the Web pages they recommend are not highly relevant to the query. With the growth of the Semantic Web, an algorithm that semantically computes the most relevant Web pages is required. In this paper, a methodology that computes the semantic heterogeneity between keywords, content words and query words for Web page recommendation is incorporated. A Differential Adaptive PMI Algorithm with varied thresholds is formulated for recommending Web pages based on the input query. The proposed methodology yields an accuracy of 0.87, which is much better than the existing strategies. © 2016 IEEE
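
As a rough illustration of the idea of thresholded PMI scoring, the sketch below computes plain pointwise mutual information between word pairs from document co-occurrence and keeps pages whose mean PMI against the query meets a threshold. It is not the Differential Adaptive PMI Algorithm from the paper; the function names `pmi_table` and `recommend`, the document-level co-occurrence counting, and the single fixed threshold are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Return log2 PMI for every word pair that co-occurs in some document.

    documents: list of word lists. Counting is per document (assumption).
    """
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        words = set(doc)
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))

    table = {}
    for pair, n_xy in pair_counts.items():
        x, y = tuple(pair)
        p_xy = n_xy / n_docs
        p_x = word_counts[x] / n_docs
        p_y = word_counts[y] / n_docs
        table[pair] = math.log2(p_xy / (p_x * p_y))
    return table


def recommend(query_words, pages, table, threshold):
    """Keep pages whose mean query-keyword PMI is at least `threshold`.

    pages: dict mapping a page name to its keyword list (hypothetical input).
    Pairs never seen together score 0.0 here, which is a simplification.
    """
    results = []
    for name, keywords in pages.items():
        scores = [
            table.get(frozenset((q, k)), 0.0)
            for q in query_words
            for k in keywords
            if q != k
        ]
        mean = sum(scores) / len(scores) if scores else 0.0
        if mean >= threshold:
            results.append((name, mean))
    return sorted(results, key=lambda r: r[1], reverse=True)
```

Varying `threshold` per query class would be one way to approximate the "varied thresholds" the abstract mentions, but how the paper actually adapts them is not stated in this record.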

ePrints@Bangalore University

A taxonomy of web prediction algorithms

Author: Aciar
Ana Pont
Bernardo de la Ossa
Bestavros
Chen
Cohen
Davison
de la Ossa
Domenech
Dongshan
Georgakis
Jose A. Gil
Josep Domenech
Julio Sahuquillo
Mobasher
Nanopoulos
Padmanabhan
Pallis
Pons
Pons
Rabinovich
Rangarajan
Sidiropoulos
Teng
Venkataramani
Wu
Xu
Yang
Publication venue: 'Elsevier BV'
Publication date: 01/07/2012

Web prefetching techniques are an attractive solution to reduce user-perceived latency. These techniques are driven by a prediction engine or algorithm that guesses the following actions of web users. A large number of prediction algorithms have been proposed since the first prefetching approach was published, although only in the last two or three years have they begun to be successfully implemented in commercial products. These algorithms can be implemented in any element of the web architecture and can use a wide variety of information as input. This affects their structure, data system, computational resources and accuracy. Knowledge of the input information and an understanding of how it can be handled to make predictions can help to improve the design of current prediction engines, and consequently of prefetching techniques. This paper analyzes fifty of the most relevant algorithms proposed over 15 years of prefetching research and proposes a taxonomy in which the algorithms are classified according to the input data they use. For each group, the main advantages and shortcomings are highlighted. © 2012 Elsevier Ltd. All rights reserved.

This work has been partially supported by Spanish Ministry of Science and Innovation under Grant TIN2009-08201, Generalitat Valenciana under Grant GV/2011/002 and Universitat Politecnica de Valencia under Grant PAID-06-10/2424.

Domenech, J.; De La Ossa Perez, BA.; Sahuquillo Borrás, J.; Gil Salinas, JA.; Pont Sanjuan, A. (2012). A taxonomy of web prediction algorithms. Expert Systems with Applications. 39(9):8496-8502. https://doi.org/10.1016/j.eswa.2012.01.140

Crossref

RiuNet

Referrer Graph: A cost-effective algorithm and pruning method for predicting web accesses

Author: A. Pont
B. de la Ossa
Deshpande
Doménech
J. Sahuquillo
J.A. Gil
Nanopoulos
Yang
Publication venue: 'Elsevier BV'
Publication date: 01/05/2013

This paper presents the Referrer Graph (RG) web prediction algorithm and a pruning method for the associated graph as a low-cost solution for predicting users' next web accesses. RG is intended for use in a real web system with prefetching capabilities without degrading its performance. The algorithm learns from users' accesses and builds a Markov model. These kinds of algorithms use the sequence of user accesses to make predictions. Unlike previous Markov-model-based proposals, the RG algorithm differentiates dependencies among objects of the same page from dependencies among objects of different pages by using the object URI and the referrer in each request. Although its design permits a simple data structure that is easier to handle and, consequently, has a lower computational cost than other algorithms, a pruning mechanism has been devised to avoid the continuous growth of this data structure. Results show that, compared with the best prediction algorithms proposed in the open literature, the RG algorithm achieves similar precision values and page latency savings while requiring much less computational and memory resources. Furthermore, when pruning is applied, additional and notable savings in resource consumption can be achieved without degrading the original performance. To further reduce resource consumption, a mechanism to prune the graph has been devised, which reduces the resource consumption of the baseline system without degrading the latency savings. © 2013 Elsevier B.V. All rights reserved.

This work has been partially supported by Spanish Ministry of Science and Innovation under Grant TIN2009-08201. The authors would also like to thank the technical staff of the School of Computer Science at the Polytechnic University of Valencia for providing us recent and customized trace files logged by their web server.

De La Ossa Perez, BA.; Gil Salinas, JA.; Sahuquillo Borrás, J.; Pont Sanjuan, A. (2013). Referrer Graph: A cost-effective algorithm and pruning method for predicting web accesses. Computer Communications. 36(8):881-894. https://doi.org/10.1016/j.comcom.2013.02.005
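
To make the notion of a Markov model learned from access sequences concrete, here is a minimal sketch of a generic first-order predictor. It is not the Referrer Graph algorithm: it ignores the referrer field and the same-page versus different-page distinction the abstract describes, and it has no pruning. The names `train` and `predict` and the `min_confidence` parameter are assumptions for illustration.

```python
from collections import Counter, defaultdict


def train(sessions):
    """Count page-to-next-page transitions over a list of access sequences."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            transitions[current][nxt] += 1
    return transitions


def predict(transitions, current, min_confidence=0.5):
    """Return (page, probability) pairs likely to follow `current`.

    Only successors whose estimated transition probability reaches
    `min_confidence` are returned, most frequent first.
    """
    followers = transitions.get(current)
    if not followers:
        return []
    total = sum(followers.values())
    return [
        (page, count / total)
        for page, count in followers.most_common()
        if count / total >= min_confidence
    ]
```

A prefetching client could request the pages returned by `predict` ahead of time; raising `min_confidence` trades fewer wasted prefetches for fewer latency savings.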

Crossref

RiuNet

Time-weighted multi-touch attribution and channel relevance in the customer journey to online purchase

Author: Anderson J.M.
Wooff D.A.
Publication venue: Springer
Publication date: 22/05/2014

We address statistical issues in attributing revenue to marketing channels and inferring the importance of individual channels in customer journeys towards an online purchase. We describe the relevant data structures and introduce an example. We suggest an asymmetric bathtub shape as appropriate for time-weighted revenue attribution to the customer journey, provide an algorithm, and illustrate the method. We suggest a modification to this method when there is independent information available on the relative values of the channels. To infer channel importance, we employ sequential data analysis ideas and restrict attention to data that ends in a purchase. We propose metrics for source, intermediary, and destination channels based on two- and three-step transitions in fragments of the customer journey. We comment on the practicalities of formal hypothesis testing. We illustrate the ideas and computations using data from a major UK online retailer. Finally, we compare the revenue attributions suggested by the methods in this paper with several common attribution methods.
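
The abstract does not give the weighting function, so the following is only a sketch of what time-weighted attribution with an asymmetric bathtub shape could look like. The weight is assumed to be a sum of two exponentials, one decaying from the first touch and a larger one rising towards the last touch; the function names and the parameters `first`, `last` and `scale` are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def bathtub_weight(u, first=1.0, last=2.0, scale=0.2):
    """Assumed bathtub curve on u in [0, 1]: high at both ends, low in between.

    `last` > `first` makes the curve asymmetric, favouring late touches.
    """
    return first * math.exp(-u / scale) + last * math.exp(-(1.0 - u) / scale)


def attribute(journey, revenue):
    """Split `revenue` across channels in proportion to touch weights.

    journey: list of (channel, time) pairs ordered by time, ending with the
    last touch before purchase.
    """
    start, end = journey[0][1], journey[-1][1]
    span = end - start
    weights = []
    for _, t in journey:
        # Position of the touch within the journey, normalised to [0, 1].
        u = (t - start) / span if span > 0 else 1.0
        weights.append(bathtub_weight(u))
    total = sum(weights)
    shares = defaultdict(float)
    for (channel, _), w in zip(journey, weights):
        shares[channel] += revenue * w / total
    return dict(shares)
```

With these assumed defaults, a journey of three evenly spaced touches gives the largest share to the final channel, a smaller share to the first, and the least to the middle one.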

Durham Research Online

Crossref

An angle-based interest model for text recommendation

Author: Xu Bei
Zhuge Hai
Publication venue: 'Elsevier BV'
Publication date: 01/11/2016

Building an interest model is the key to realizing personalized text recommendation. Previous interest models neglect the fact that a user may have multiple angles of interest. Different angles of interest impose different requests and criteria for text recommendation. This paper proposes an interest model that consists of two kinds of angles, persistence and pattern, which can be combined to form complex angles. The model uses a new method to represent long-term and short-term interest, and distinguishes interest in objects from interest in the link structure of objects. Experiments with news-scale text data show that the interest on object and the interest on link structure have real requirements, and that it is effective to recommend texts according to the angles.

Crossref

Aston Publications Explorer

A study of viewing behavior and viewing household types through clustering of large-scale TV viewing logs

Author: 이태영
Publication venue: Seoul National University Graduate School of Convergence Science and Technology
Publication date: 01/08/2017

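Thesis (Master's) -- Seoul National University Graduate School of Convergence Science and Technology, Department of Convergence Science, 2017. 8. 서봉원.

TV viewing has recently become so complex that it can no longer be understood solely as the whole family gathering at the same time to watch a program as it airs. Viewers interact with a variety of media and content delivery services and show complex, intertwined viewing behavior. Because of changes in content platforms and devices, the TV viewing environment has become far more complex and harder to predict than in the past. Even as the environment surrounding TV grows more complex, understanding TV viewing is still considered important. Although the share of TV is declining as N-screen viewing becomes widespread, people still spend a great deal of time watching TV and it remains important in daily life, so TV continues to play an important role in content consumption. TV viewing remains a robust leisure activity in the changed environment, which suggests that an understanding of today's TV viewers and viewing behavior is needed all the more.

To overcome the limitations of earlier behavior-centered studies of TV viewing, such as scalability, and ultimately to move beyond the traditional perspective and broaden the understanding of TV viewing in a diversified viewing environment, this study presents a framework that categorizes TV viewing patterns by behavior, based on large-scale viewing logs obtained from digital cable TV set-top boxes, and then recombines and interprets them from the user's perspective. To this end, a session-clustering-based categorization technique previously used in web usage mining was applied to the TV viewing logs. The study also sought to demonstrate the validity of this approach by examining the correlation between the categorized viewing behaviors and the service cancellation rate.

Using the proposed analysis framework, the study derived a total of 7 viewing behavior types and 8 viewing household types composed from them. In addition, by deriving the correlation between the service cancellation rate within each viewing household type group and the composition ratio of viewing behavior types, it confirmed that the derived viewing behavior types can meaningfully explain service cancellation. Analyzing viewing patterns by behavior with the proposed framework is expected to extend prior studies conducted in a macroscopic context and to support a richer understanding of TV viewing behavior in the current media environment.

Contents: Chapter 1 Introduction (1. Background of the study; 2. Objectives of the study). Chapter 2 Prior research (1. Theoretical background; 2. Technical background). Chapter 3 Research questions (1. Research questions). Chapter 4 Research methods (1. Data overview and preprocessing; 2. Session categorization; 3. Viewing household categorization; 4. Association between session types and service cancellation). Chapter 5 Results (1. Session types; 2. Viewing household types; 3. Association between session types and service cancellation). Chapter 6 Conclusion (1. Summary of the study; 2. Implications of the study; 3. Limitations of the study). References.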

SNU Open Repository and Archive

A web page prediction model based on click-stream tree representation of user behavior

Author: Şule Gündüz
Publication venue
Publication date: 01/01/2003

Predicting the next request of a user as she visits Web pages has gained importance as Web-based activity increases. Markov models and their variations, or models based on sequence mining, have been found well suited for this problem. However, higher order Markov models are extremely complicated due to their large number of states, whereas lower order Markov models do not capture the entire behavior of a user in a session. The models that are based on sequential pattern mining only consider the frequent sequences in the data set, making it difficult to predict the next request following a page that is not in the sequential pattern. Furthermore, it is hard to find models for mining two different kinds of information of a user session. We propose a new model that considers both the order information of pages in a session and the time spent on them. We cluster user sessions based on their pair-wise similarity and represent the resulting clusters by a click-stream tree. The new user session is then assigned to a cluster based on a similarity measure. The click-stream tree of that cluster is used to generate the recommendation set. The model can be used as part of a cache prefetching system as well as a recommendation model.

Categories and Subject Descriptors: I.5.2 [Pattern Recognition]: Design Methodology - classifier design and evaluation

CiteSeerX