9 research outputs found

    Towards Privacy Compliant and Anytime Recommender Systems

    The original publication is available at www.springerlink.com. International audience. Recommendation technologies have traditionally been used in domains such as e-commerce and Web navigation to recommend resources to customers and help them find pertinent ones. One possible approach is collaborative filtering, which does not take into account the content of the resources: only the usage traces of the resources are considered. State-of-the-art models that can be used under privacy constraints, such as sequential association rules and Markov models, are usually studied in terms of performance, state-space complexity and time complexity. Many of them have a high time complexity and take a long time to compute recommendations. However, there are application domains where recommendations may be required quickly. This paper studies how these state-of-the-art models can be adapted to be anytime: recommendations can be proposed to the user whatever computation time is available, and their quality increases with the computation time. We show that such models can be adapted to be anytime and we propose several strategies to compute recommendations iteratively. We also show that the computation time needed by these new models is not increased compared to classical ones; in some cases it even decreases.
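The anytime idea described in this abstract can be illustrated with a minimal sketch. It assumes a first-order Markov model built from usage sessions; the function name `anytime_recommend`, the parameters and the sample sessions are hypothetical, and the paper's own iterative strategies are not reproduced here.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Sequence


def anytime_recommend(
    sessions: Iterable[Sequence[str]], current: str, k: int = 3
) -> Iterator[List[str]]:
    """Yield a progressively refined top-k list after each session is scanned."""
    counts: Counter = Counter()
    for session in sessions:
        # Count first-order transitions current -> next inside this session.
        for prev, nxt in zip(session, session[1:]):
            if prev == current:
                counts[nxt] += 1
        # A usable answer exists at this point; the caller may stop consuming
        # the generator as soon as its time budget is exhausted.
        yield [item for item, _ in counts.most_common(k)]


# Hypothetical usage traces; only resource identifiers are needed, no content.
sessions = [["a", "b", "c"], ["a", "c"], ["a", "c", "b"]]
steps = list(anytime_recommend(sessions, "a"))
first_answer, final_answer = steps[0], steps[-1]
```

Each yielded list is a valid recommendation, so quality grows with the number of sessions processed while an answer is available from the first iteration onward.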

    Analysis of Users' Behavior in Structured e-Commerce Websites

    Online shopping is becoming more and more common in our daily lives. Understanding users' interests and behavior is essential to adapt e-commerce websites to customers' requirements. The information about users' behavior is stored in the Web server logs. The analysis of such information has focused on applying data mining techniques, where a rather static characterization is used to model users' behavior and the sequence of actions they perform is usually not considered. Therefore, incorporating a view of the process followed by users during a session can be of great interest to identify more complex behavioral patterns. To address this issue, this paper proposes a linear-temporal logic model checking approach for the analysis of structured e-commerce Web logs. By defining a common way of mapping log records according to the e-commerce structure, Web logs can be easily converted into event logs where the behavior of users is captured. Then, different predefined queries can be performed to identify behavioral patterns that consider the different actions performed by a user during a session. Finally, the usefulness of the proposed approach has been studied by applying it to a real case study of a Spanish e-commerce website. The results revealed interesting findings that made it possible to propose some improvements in the website design with the aim of increasing its efficiency.
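As a rough illustration of querying session event logs with temporal properties, the sketch below checks a finite-trace reading of the response pattern G(trigger -> F response). The event names, session identifiers and helper name are hypothetical, and this is a hand-written check, not the model checker used in the paper.

```python
from typing import Dict, List


def always_followed_by(trace: List[str], trigger: str, response: str) -> bool:
    """Finite-trace reading of G(trigger -> F response).

    Returns True when every occurrence of `trigger` is eventually followed
    by `response` before the session ends.
    """
    pending = False
    for event in trace:
        if event == trigger:
            pending = True      # an obligation is opened
        elif event == response:
            pending = False     # all open obligations are discharged
    return not pending


# Hypothetical event logs: one ordered list of actions per session.
event_log: Dict[str, List[str]] = {
    "s1": ["home", "add_to_cart", "checkout"],
    "s2": ["home", "add_to_cart", "search"],
    "s3": ["home", "search"],
}

# Sessions in which an item was added to the cart but never checked out.
abandoned = [
    sid for sid, trace in event_log.items()
    if not always_followed_by(trace, "add_to_cart", "checkout")
]
```

Session `s3` satisfies the property vacuously because the trigger never occurs, which matches the usual semantics of an implication under "always".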

    A Study on Analyzing Types of Viewing Behavior and Viewing Households through Clustering of Large-Scale TV Viewing Logs

    Thesis (Master's) -- Seoul National University, Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies, 2017. 8. 서봉원 (Seo Bong-won). TV viewing behavior has recently become so complex that it can no longer be understood solely as the whole family gathering at the same time to watch a program at its original broadcast time, as in the past. Viewers interact with a variety of media and content delivery services and show complex, intertwined viewing behavior. Owing to changes in content platforms and devices, the TV viewing environment has become far more complex and harder to predict than before. Even as the environment surrounding TV grows more complex, understanding TV viewing is still considered important. Although the share of TV is declining as N-screen viewing becomes widespread, people still spend a great deal of time watching TV and it remains important in daily life, so TV continues to play an important role in content consumption. TV viewing can therefore be regarded as a leisure activity that remains robust in the changed environment, which suggests that an understanding of the changed TV viewers and their viewing behavior is needed more than ever. To overcome limitations of prior studies that sought to understand TV viewing in terms of behavior, such as scalability, and ultimately to move beyond the traditional perspective and broaden the understanding of TV viewing in a diversified viewing environment, this study presents a framework that uses large-scale TV viewing logs obtained from digital cable TV set-top boxes to categorize TV viewing patterns by behavior and then recombines and interprets them by user.
To this end, the session-clustering-based categorization technique used in the field of web usage mining was applied to TV viewing logs. In addition, the study sought to demonstrate the validity of its approach by examining the correlation between the categorized viewing behaviors and the service cancellation rate. Using the proposed analysis framework, the study derived a total of 7 viewing behavior types and 8 viewing household types composed from them. By deriving the correlation between the service cancellation rate within each viewing household type group and the composition ratio of viewing behavior types, it also confirmed that the derived viewing behavior types can meaningfully explain service cancellation. Analyzing viewing patterns on a behavioral basis with the proposed framework is expected to extend prior research conducted in a macroscopic context and to support a richer understanding of TV viewing behavior in the current media environment. Contents: Chapter 1 Introduction (1. Research background; 2. Research objectives). Chapter 2 Prior research (1. Theoretical background; 2. Technical background). Chapter 3 Research questions (1. Research questions). Chapter 4 Research method (1. Data overview and preprocessing; 2. Session categorization; 3. Viewing household categorization; 4. Association between session types and service cancellation). Chapter 5 Results (1. Session types; 2. Viewing household types; 3. Association between session types and service cancellation). Chapter 6 Conclusion (1. Summary; 2. Implications; 3. Limitations). References.

    Understanding, Analyzing and Predicting Online User Behavior

    Due to the growing popularity of the Internet and smart mobile devices, massive amounts of data are produced every day; in particular, more and more of users' online behavior and activities have been digitized. Making better use of this data and better understanding user behavior have become central concerns for industry as well as academia. However, the large size and unstructured format of user behavioral data, together with the heterogeneous nature of individuals, make it harder to identify the SPECIFIC behavior that researchers are looking at, HOW to distinguish it, and WHAT results from it. Differences in user behavior have different causes; in my dissertation, I study three circumstances of behavior that potentially bring turbulent or detrimental effects, from precursory culture to preparatory strategy and delusory fraudulence. I draw on a versatile analysis toolkit: econometrics and quasi-experiments, together with machine learning techniques such as text mining, sentiment analysis, and predictive analytics. This study leverages the combined methodologies and applies them beyond individual-level data and network data. This dissertation takes a first step toward discovering user behavior in newly emerging contexts. My study conceptualizes theoretically and tests empirically the effect of cultural values on rating, and I find that an individualist cultural background is more likely to lead to deviation and more expression in review behaviors. I also find evidence of strategic behavior: users tend to leverage reporting to increase the likelihood of maximizing their benefits. Moreover, the study proposes the features that moderate the preparation behavior.
Finally, it introduces a unified and scalable framework for delusory behavior detection that meets the current need to fully utilize multiple data sources. Dissertation/Thesis. Doctoral Dissertation, Business Administration, 201

    Adaptive Resource Management Schemes for Web Services

    Web cluster systems provide cost-effective solutions when scalable and reliable web services are required. However, as the number of servers in a web cluster system increases, the system incurs long and unpredictable delays in managing its servers. This study presents efficient management schemes for web cluster systems. First, we propose an efficient request distribution scheme for web cluster systems. Distributor-based systems forward user requests to a balanced set of waiting servers in complete transparency to the users. The policy employed in forwarding requests from the frontend distributor to the backend servers plays an important role in overall system performance. In this study, we present proactive request distribution (ProRD) to provide intelligent distribution at the distributor. Second, we propose heuristic memory management schemes based on web prefetching. For this purpose, we design a Double Prediction-by-Partial-Match Scheme (DPS) that can be adapted to modern web frameworks. In addition, we present an Adaptive Rate Controller (ARC) that dynamically determines the prefetch rate depending on the memory status. To evaluate the prefetch gain in a server node, we implement an Apache module. Lastly, we design an adaptive web streaming system for wireless networks. The rapid growth of new wireless and mobile devices accessing the Internet has contributed to a whole new level of heterogeneity in web streaming systems. In particular, in-home networks have also grown more heterogeneous through the use of various devices such as laptops, cell phones and PDAs. In our study, a set-top box (STB) is the access point between the Internet and a home network. We design an ActiveSTB that is capable of buffering and quality adaptation based on an estimate of the available bandwidth in the wireless LAN.

    Mining Significant Usage Patterns from Clickstream Data

    Discovery of usage patterns from Web data is one of the primary purposes of Web Usage Mining. In this paper, a technique to generate Significant Usage Patterns (SUP) is proposed and used to acquire significant "user preferred navigational trails". The technique uses pipelined processing phases including sub-abstraction of sessionized Web clickstreams, clustering of the abstracted Web sessions, concept-based abstraction of the clustered sessions, and SUP generation. Using this technique, Web site practitioners can extract valuable customer behavior information. Experiments conducted using Web log data provided by J.C.Penney demonstrate that SUPs of different types of customers are distinguishable and interpretable. This technique is particularly suited for analysis of dynamic websites.

    Combating Attacks and Abuse in Large Online Communities

    Internet users today are connected more widely and ubiquitously than ever before. As a result, various online communities have formed, ranging from online social networks (Facebook, Twitter), to mobile communities (Foursquare, Waze), to content/interest-based networks (Wikipedia, Yelp, Quora). While users benefit from the ease of access to information and social interactions, there is growing concern for users' security and privacy against attacks such as spam, phishing, malware infection and identity theft. Combating attacks and abuse in online communities is challenging. First, today's online communities are increasingly dependent on users and user-generated content. Securing online systems demands a deep understanding of complex and often unpredictable human behaviors. Second, online communities can easily have millions or even billions of users, which requires the corresponding security mechanisms to be highly scalable. Finally, cybercriminals are constantly evolving to launch new types of attacks, which further demands highly robust security defenses. In this thesis, we take concrete steps towards measuring, understanding, and defending against attacks and abuse in online communities. We begin with a series of empirical measurements to understand user behaviors in different online services and the unique security and privacy challenges that users are facing. This effort covers a broad set of popular online services including social networks for question answering (Quora), anonymous social networks (Whisper), and crowdsourced mobile communities (Waze). Despite the differences between specific online communities, our study provides a first look at their user activity patterns based on empirical data, and reveals the need for reliable mechanisms to curate user content, protect privacy, and defend against emerging attacks. Next, we turn our attention to attacks targeting online communities, with a focus on spam campaigns.
While traditional spam is mostly generated by automated software, attackers today have started to introduce "human intelligence" to implement attacks. This is malicious crowdsourcing (or crowdturfing), where a large group of real users is organized to carry out malicious campaigns, such as writing fake reviews or spreading rumors on social media. Using collective human effort, attackers can easily bypass many existing defenses (e.g., CAPTCHA). To understand the ecosystem of crowdturfing, we first use measurements to examine its detailed campaign organization, workers and revenue. Based on insights from empirical data, we develop effective machine learning classifiers to detect crowdturfing activities. In the meantime, considering the adversarial nature of crowdturfing, we also build practical adversarial models to simulate how attackers can evade or disrupt machine-learning-based defenses. To aid in this effort, we next explore using user behavior models to detect a wider range of attacks. Instead of making assumptions about attacker behavior, our idea is to model normal user behaviors and capture (malicious) behaviors that deviate from the norm. In this way, we can detect previously unknown attacks. Our behavior model is based on detailed clickstream data, which are sequences of click events generated by users when using the service. We build a similarity graph where each user is a node and the edges are weighted by clickstream similarity. By partitioning this graph, we obtain "clusters" of users with similar behaviors. We then use a small set of known good users to "color" these clusters to differentiate the malicious ones. This technique has been adopted by real-world social networks (Renren and LinkedIn), and has already detected unexpected attacks. Finally, we extend the clickstream model to understand finer-grained behaviors of attackers (and real users), and to track how user behavior changes over time.
In summary, this thesis illustrates a data-driven approach to understanding and defending against attacks and abuse in online communities. Our measurements have revealed new insights about how attackers are evolving to bypass existing security defenses today. In addition, our data-driven systems provide new solutions for online services to gain a deep understanding of their users, and defend them from emerging attacks and abuse.
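The clickstream similarity graph and cluster "coloring" described in this abstract can be sketched roughly as follows. The sketch makes several assumptions that are not taken from the thesis: similarity is the Jaccard index over click bigrams, edges are kept only above a threshold, and connected components stand in for the graph partitioning step. All function names, user names and click events are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def bigrams(clicks: List[str]) -> Set[Tuple[str, str]]:
    """Set of consecutive click-event pairs in one user's clickstream."""
    return set(zip(clicks, clicks[1:]))


def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_users(streams: Dict[str, List[str]], threshold: float = 0.5) -> List[Set[str]]:
    """Group users whose clickstreams are similar (assumed: thresholded graph)."""
    users = list(streams)
    feats = {u: bigrams(streams[u]) for u in users}
    adj: Dict[str, Set[str]] = {u: set() for u in users}
    for u, v in combinations(users, 2):
        if jaccard(feats[u], feats[v]) >= threshold:
            adj[u].add(v)
            adj[v].add(u)
    # Connected components replace a real graph-partitioning algorithm here.
    seen: Set[str] = set()
    clusters: List[Set[str]] = []
    for u in users:
        if u in seen:
            continue
        stack, comp = [u], set()
        while stack:
            x = stack.pop()
            if x not in comp:
                comp.add(x)
                stack.extend(adj[x] - comp)
        seen |= comp
        clusters.append(comp)
    return clusters


def uncolored(clusters: List[Set[str]], known_good: Set[str]) -> List[Set[str]]:
    """Clusters containing no known-good user are flagged for review."""
    return [c for c in clusters if not (c & known_good)]


streams = {
    "alice": ["home", "feed", "profile"],
    "bob": ["home", "feed", "profile", "msg"],
    "bot1": ["login", "invite", "invite", "invite"],
    "bot2": ["login", "invite", "invite"],
}
clusters = cluster_users(streams)
suspicious = uncolored(clusters, known_good={"alice"})
```

A single known-good seed is enough to clear its whole cluster in this toy setting; any real deployment would need a more careful partitioning method and seeding policy than this sketch provides.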