16 research outputs found
A differential semantic algorithm for query relevant web page recommendation
With the exponential rise in the amount of information on the World Wide Web, there is a need for a much more efficient algorithm for Web search. Traditional keyword matching and standard statistical techniques are insufficient, as the Web pages they recommend are not highly relevant to the query. With the growth of the Semantic Web, an algorithm that semantically computes the most relevant Web pages is required. This paper incorporates a methodology that computes the semantic heterogeneity between keywords, content words, and query words for Web page recommendation. A Differential Adaptive PMI Algorithm is formulated with varied thresholds for recommending Web pages based on the input query. The proposed methodology yields an accuracy of 0.87, which the authors report as much better than existing strategies. © 2016 IEEE
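To make the thresholding idea concrete, here is a minimal sketch that assumes PMI stands for pointwise mutual information and estimates it from document-level co-occurrence. It is plain PMI with a single fixed threshold, not the Differential Adaptive PMI Algorithm of the paper; the function names `pmi_table` and `related_terms` and the toy corpus are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(documents):
    """Return PMI (log base 2) for every word pair that co-occurs in a document."""
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        words = set(doc)  # document-level presence, not term frequency
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    table = {}
    for pair, n_xy in pair_counts.items():
        x, y = tuple(pair)
        p_xy = n_xy / n_docs
        p_x = word_counts[x] / n_docs
        p_y = word_counts[y] / n_docs
        table[pair] = math.log2(p_xy / (p_x * p_y))
    return table


def related_terms(query_word, table, threshold):
    """Words whose PMI with query_word is at least the threshold, best first."""
    found = []
    for pair, score in table.items():
        if query_word in pair and score >= threshold:
            (other,) = pair - {query_word}
            found.append((other, score))
    return sorted(found, key=lambda item: -item[1])


# Hypothetical toy corpus: each document is a list of words.
docs = [
    ["web", "search", "semantic"],
    ["web", "search"],
    ["semantic", "ontology"],
    ["ontology", "owl"],
]
table = pmi_table(docs)
print(related_terms("web", table, threshold=0.5))
```

Raising or lowering `threshold` trades precision against recall in the set of related terms, which is the kind of knob the abstract refers to when it mentions varied thresholds.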
A taxonomy of web prediction algorithms
Web prefetching techniques are an attractive solution for reducing user-perceived latency. These techniques are driven by a prediction engine or algorithm that guesses the next actions of web users. A large number of prediction algorithms have been proposed since the first prefetching approach was published, although only in the last two or three years have they begun to be successfully implemented in commercial products. These algorithms can be implemented in any element of the web architecture and can use a wide variety of information as input. This affects their structure, data system, computational resources, and accuracy. Knowing the input information and understanding how it can be handled to make predictions can help to improve the design of current prediction engines and, consequently, of prefetching techniques. This paper analyzes fifty of the most relevant algorithms proposed over 15 years of prefetching research and proposes a taxonomy in which the algorithms are classified according to the input data they use. For each group, the main advantages and shortcomings are highlighted.
© 2012 Elsevier Ltd. All rights reserved. This work has been partially supported by Spanish Ministry of Science and Innovation under Grant TIN2009-08201, Generalitat Valenciana under Grant GV/2011/002 and Universitat Politecnica de Valencia under Grant PAID-06-10/2424. Domenech, J.; De La Ossa Perez, BA.; Sahuquillo Borrás, J.; Gil Salinas, JA.; Pont Sanjuan, A. (2012). A taxonomy of web prediction algorithms. Expert Systems with Applications. 39(9):8496-8502. https://doi.org/10.1016/j.eswa.2012.01.140
Referrer Graph: A cost-effective algorithm and pruning method for predicting web accesses
This paper presents the Referrer Graph (RG) web prediction algorithm and a pruning method for the associated graph as a low-cost solution for predicting users' next web accesses. RG is intended for use in a real web system with prefetching capabilities without degrading its performance. The algorithm learns from users' accesses and builds a Markov model. Algorithms of this kind use the sequence of user accesses to make predictions. Unlike previous Markov-model-based proposals, the RG algorithm differentiates dependencies between objects of the same page from dependencies between objects of different pages by using the object URI and the referrer in each request. Although its design yields a simple data structure that is easier to handle and consequently has a lower computational cost than other algorithms, a pruning mechanism has been devised to prevent this data structure from growing continuously. Results show that, compared with the best prediction algorithms proposed in the open literature, the RG algorithm achieves similar precision and page latency savings while requiring far fewer computational and memory resources. Furthermore, when pruning is applied, notable additional resource savings can be achieved without degrading the original performance. To reduce resource consumption further, a mechanism to prune the graph has been devised, which reduces the resource consumption of the baseline system without degrading the latency savings.
© 2013 Elsevier B.V. All rights reserved. This work has been partially supported by Spanish Ministry of Science and Innovation under Grant TIN2009-08201. The authors would also like to thank the technical staff of the School of Computer Science at the Polytechnic University of Valencia for providing us recent and customized trace files logged by their web server. De La Ossa Perez, BA.; Gil Salinas, JA.; Sahuquillo Borrás, J.; Pont Sanjuan, A. (2013). Referrer Graph: A cost-effective algorithm and pruning method for predicting web accesses. Computer Communications. 36(8):881-894. https://doi.org/10.1016/j.comcom.2013.02.005
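The general idea of learning a first-order graph from (URI, referrer) pairs can be sketched as follows. This is a simplified illustration, not the RG algorithm as published: it does not reproduce the paper's distinction between same-page and cross-page dependencies, and the pruning rule (drop arcs seen fewer than `min_arc_hits` times) is an assumption. The class name `ReferrerGraphSketch` and all parameter names are hypothetical.

```python
from collections import defaultdict


class ReferrerGraphSketch:
    """Counts referrer -> URI transitions and predicts likely next accesses."""

    def __init__(self):
        self.node_hits = defaultdict(int)  # uri -> number of requests seen
        self.arc_hits = defaultdict(lambda: defaultdict(int))  # referrer -> uri -> count

    def observe(self, uri, referrer=None):
        """Record one request; the referrer links it to the object that led here."""
        self.node_hits[uri] += 1
        if referrer is not None:
            self.arc_hits[referrer][uri] += 1

    def predict(self, uri, min_confidence=0.5):
        """Return (next_uri, confidence) pairs with confidence >= min_confidence."""
        hits = self.node_hits.get(uri, 0)
        if hits == 0:
            return []
        scored = [(nxt, n / hits) for nxt, n in self.arc_hits.get(uri, {}).items()]
        kept = [item for item in scored if item[1] >= min_confidence]
        return sorted(kept, key=lambda item: -item[1])

    def prune(self, min_arc_hits=2):
        """Drop rarely used arcs so the graph does not grow without bound."""
        for referrer in list(self.arc_hits):
            kept = {u: n for u, n in self.arc_hits[referrer].items() if n >= min_arc_hits}
            if kept:
                self.arc_hits[referrer] = defaultdict(int, kept)
            else:
                del self.arc_hits[referrer]


graph = ReferrerGraphSketch()
for uri, referrer in [
    ("/index", None), ("/a", "/index"),
    ("/index", None), ("/a", "/index"),
    ("/index", None), ("/b", "/index"),
]:
    graph.observe(uri, referrer)
print(graph.predict("/index"))
```

Because each arc is keyed by the referrer already present in the request, no per-user session state is needed, which is one plausible reading of why the abstract describes the structure as cheap to maintain.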
Time-weighted multi-touch attribution and channel relevance in the customer journey to online purchase
We address statistical issues in attributing revenue to marketing channels and inferring the importance of individual channels in customer journeys towards an online purchase. We describe the relevant data structures and introduce an example. We suggest an asymmetric bathtub shape as appropriate for time-weighted revenue attribution to the customer journey, provide an algorithm, and illustrate the method. We suggest a modification to this method when independent information is available on the relative values of the channels. To infer channel importance, we employ ideas from sequential data analysis and restrict attention to journeys that end in a purchase. We propose metrics for source, intermediary, and destination channels based on two- and three-step transitions in fragments of the customer journey. We comment on the practicalities of formal hypothesis testing. We illustrate the ideas and computations using data from a major UK online retailer. Finally, we compare the revenue attributions suggested by the methods in this paper with several common attribution methods.
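A time-weighted attribution with an asymmetric bathtub shape might look like the sketch below. The abstract does not give the weighting function, so the sum of two exponentials used here, the parameter values, and the names `bathtub_weight` and `attribute_revenue` are all assumptions chosen only to produce a curve that is high at the first touch, low in the middle, and highest just before purchase.

```python
import math


def bathtub_weight(u, early=1.0, late=2.0, steepness=5.0):
    """Weight for a touch at normalised time u in [0, 1].

    u = 0 is the first touch and u = 1 is the purchase. Setting late > early
    makes the curve asymmetric in favour of touches close to the purchase.
    """
    return early * math.exp(-steepness * u) + late * math.exp(-steepness * (1.0 - u))


def attribute_revenue(journey, purchase_time, revenue):
    """Split revenue across channels in proportion to their time-based weights.

    journey: non-empty list of (channel, time) touches with time <= purchase_time.
    """
    start = min(t for _, t in journey)
    span = purchase_time - start
    weights = []
    for _, t in journey:
        # A journey with zero duration is treated as a single last-touch moment.
        u = (t - start) / span if span > 0 else 1.0
        weights.append(bathtub_weight(u))
    total = sum(weights)
    shares = {}
    for (channel, _), w in zip(journey, weights):
        shares[channel] = shares.get(channel, 0.0) + revenue * w / total
    return shares


# Hypothetical journey: times in days, purchase on day 10 worth 100 units.
shares = attribute_revenue(
    [("search", 0.0), ("email", 5.0), ("direct", 10.0)],
    purchase_time=10.0,
    revenue=100.0,
)
print(shares)
```

With these assumed parameters the last touch receives the largest share, the first touch the next largest, and the mid-journey touch the smallest, while the shares always sum to the revenue.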
An angle-based interest model for text recommendation
Building an interest model is the key to personalized text recommendation. Previous interest models neglect the fact that a user may have multiple angles of interest. Different angles of interest impose different requests and criteria on text recommendation. This paper proposes an interest model that consists of two kinds of angles, persistence and pattern, which can be combined to form complex angles. The model uses a new method to represent long-term and short-term interest, and distinguishes interest in objects from interest in the link structure of objects. Experiments with news-scale text data show that interest in objects and interest in link structure correspond to real requirements, and that recommending texts according to the angles is effective.
A Study on Viewing Behavior and Viewing Household Types through Clustering of Large-Scale TV Viewing Logs
Thesis (Master's) -- Seoul National University, Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies, 2017. 8. 서봉원 (Bongwon Suh). TV viewing has recently become so complex that it can no longer be understood solely as the whole family gathering at the same time to watch the original broadcast live, as in the past. Viewers interact with a variety of media and content delivery services and show complex, intertwined viewing behavior. Because of changes in content platforms and devices, the TV viewing environment has become far more complex and far harder to predict than before.
Even as the environment surrounding TV grows more complex, understanding TV viewing is still considered important. Although the share of TV is declining as N-screen viewing becomes widespread, people still spend a great deal of time watching TV and it remains important in daily life, so TV continues to play an important role in content consumption. TV viewing remains a robust leisure activity in the changed environment, which suggests that an understanding of how viewers' behavior has changed in this complex environment is needed more than ever.
This study seeks to overcome the limitations, such as scalability, of earlier behavior-centered research on TV viewing and, ultimately, to move beyond the traditional perspective and broaden the understanding of TV viewing in a diversified viewing environment. Based on large-scale TV viewing logs obtained from digital cable TV set-top boxes, it proposes a framework that categorizes TV viewing patterns by behavior and then recombines them at the user level for interpretation. To this end, a session-clustering-based typology technique previously used in web usage mining is applied to TV viewing logs. The study also examines the correlation between the categorized viewing behaviors and the service cancellation rate in order to demonstrate that the approach is valid.
Using the proposed analysis framework, the study derives a total of 7 viewing behavior types and 8 viewing household types composed from them. Correlations between the service cancellation rate within each household type group and the composition ratio of viewing behavior types confirm that the derived viewing behavior types explain service cancellation in a meaningful way.
By analyzing viewing patterns on a behavioral basis with the proposed framework, the study is expected to extend prior research conducted at a macro level and to support a richer understanding of TV viewing in the current media environment.
A web page prediction model based on click-stream tree representation of user behavior
Predicting the next request of a user as she visits Web pages has gained importance as Web-based activity increases. Markov models and their variations, and models based on sequence mining, have been found well suited to this problem. However, higher-order Markov models are extremely complicated because of their large number of states, whereas lower-order Markov models do not capture the entire behavior of a user in a session. Models based on sequential pattern mining consider only the frequent sequences in the data set, making it difficult to predict the next request following a page that is not in a sequential pattern. Furthermore, it is hard to find models that mine two different kinds of information from a user session. We propose a new model that considers both the order of pages in a session and the time spent on them. We cluster user sessions based on their pairwise similarity and represent the resulting clusters by a click-stream tree. A new user session is then assigned to a cluster based on a similarity measure. The click-stream tree of that cluster is used to generate the recommendation set. The model can be used as part of a cache prefetching system as well as a recommendation model. Categories and Subject Descriptors: I.5.2 [Pattern Recognition]: Design Methodology - classifier design and evaluation
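A tree over the sessions of one cluster could be sketched as a prefix tree whose nodes are (page, time bucket) pairs. This is only an illustration under stated assumptions: the clustering step and the paper's similarity measure are omitted, time spent is assumed to be already discretised into integer buckets, and the names `Node`, `build_tree`, and `recommend` are hypothetical.

```python
class Node:
    """One (page, time bucket) step in the tree, with a visit counter."""

    def __init__(self):
        self.count = 0
        self.children = {}  # (page, time_bucket) -> Node


def build_tree(sessions):
    """Insert every session of a cluster into a prefix tree, counting visits."""
    root = Node()
    for session in sessions:
        node = root
        for step in session:
            node = node.children.setdefault(step, Node())
            node.count += 1
    return root


def recommend(root, partial_session, top_n=2):
    """Follow the partial session down the tree and rank the pages that came next."""
    node = root
    for step in partial_session:
        node = node.children.get(step)
        if node is None:
            return []  # this path was never seen in the cluster
    ranked = sorted(node.children.items(), key=lambda kv: -kv[1].count)
    pages = []
    for (page, _bucket), _child in ranked:
        if page not in pages:  # the same page may appear under several buckets
            pages.append(page)
    return pages[:top_n]


# Hypothetical cluster: each step is (page, discretised time spent).
cluster_sessions = [
    [("home", 1), ("news", 2), ("sports", 1)],
    [("home", 1), ("news", 2), ("weather", 1)],
    [("home", 1), ("news", 2), ("sports", 1)],
    [("home", 2), ("shop", 1)],
]
tree = build_tree(cluster_sessions)
print(recommend(tree, [("home", 1), ("news", 2)]))
```

Keying each node on both the page and the time bucket is one simple way to let the tree reflect the two kinds of session information the abstract mentions, at the cost of splitting counts across buckets.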