5,109 research outputs found
A personalized web page content filtering model based on segmentation
In the view of massive content explosion in World Wide Web through diverse
sources, it has become mandatory to have content filtering tools. The filtering
of contents of the web pages holds greater significance in cases of access by
minor-age people. The traditional web page blocking systems goes by the Boolean
methodology of either displaying the full page or blocking it completely. With
the increased dynamism in the web pages, it has become a common phenomenon that
different portions of the web page holds different types of content at
different time instances. This paper proposes a model to block the contents at
a fine-grained level i.e. instead of completely blocking the page it would be
efficient to block only those segments which holds the contents to be blocked.
The advantages of this method over the traditional methods are fine-graining
level of blocking and automatic identification of portions of the page to be
blocked. The experiments conducted on the proposed model indicate 88% of
accuracy in filtering out the segments.Comment: 11 Pages, 6 Figure
Customer purchase behavior prediction in E-commerce: a conceptual framework and research agenda
Digital retailers are experiencing an increasing number of transactions coming from their consumers online, a consequence of the convenience in buying goods via E-commerce platforms. Such interactions compose complex behavioral patterns which can be analyzed through predictive analytics to enable businesses to understand consumer needs. In this abundance of big data and possible tools to analyze them, a systematic review of the literature is missing. Therefore, this paper presents a systematic literature review of recent research dealing with customer purchase prediction in the E-commerce context. The main contributions are a novel analytical framework and a research agenda in the field. The framework reveals three main tasks in this review, namely, the prediction of customer intents, buying sessions, and purchase decisions. Those are followed by their employed predictive methodologies and are analyzed from three perspectives. Finally, the research agenda provides major existing issues for further research in the field of purchase behavior prediction online
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective.
The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines.
From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research
Competitive Intelligence and Internet Sources
In the Knowledge Age to maintain profitability and in some cases to remain in the market, companies must focus their actions in activities such as collecting, filtering, and disseminating information about market, about competitors and their actions. Those are part of Competitive Intelligence (CI) concept. In digital age, most of the information needed for CI projects is available on the web. This paper focuses on this field and presents a mix of directions that companies need to take into consideration in their CI projects in order to achieve the goals.competitive intelligence, web mining, information
Profiling user activities with minimal traffic traces
Understanding user behavior is essential to personalize and enrich a user's
online experience. While there are significant benefits to be accrued from the
pursuit of personalized services based on a fine-grained behavioral analysis,
care must be taken to address user privacy concerns. In this paper, we consider
the use of web traces with truncated URLs - each URL is trimmed to only contain
the web domain - for this purpose. While such truncation removes the
fine-grained sensitive information, it also strips the data of many features
that are crucial to the profiling of user activity. We show how to overcome the
severe handicap of lack of crucial features for the purpose of filtering out
the URLs representing a user activity from the noisy network traffic trace
(including advertisement, spam, analytics, webscripts) with high accuracy. This
activity profiling with truncated URLs enables the network operators to provide
personalized services while mitigating privacy concerns by storing and sharing
only truncated traffic traces.
In order to offset the accuracy loss due to truncation, our statistical
methodology leverages specialized features extracted from a group of
consecutive URLs that represent a micro user action like web click, chat reply,
etc., which we call bursts. These bursts, in turn, are detected by a novel
algorithm which is based on our observed characteristics of the inter-arrival
time of HTTP records. We present an extensive experimental evaluation on a real
dataset of mobile web traces, consisting of more than 130 million records,
representing the browsing activities of 10,000 users over a period of 30 days.
Our results show that the proposed methodology achieves around 90% accuracy in
segregating URLs representing user activities from non-representative URLs
A Contextual-Bandit Approach to Personalized News Article Recommendation
Personalized web services strive to adapt their services (advertisements,
news articles, etc) to individual users by making use of both content and user
information. Despite a few recent advances, this problem remains challenging
for at least two reasons. First, web service is featured with dynamically
changing pools of content, rendering traditional collaborative filtering
methods inapplicable. Second, the scale of most web services of practical
interest calls for solutions that are both fast in learning and computation.
In this work, we model personalized recommendation of news articles as a
contextual bandit problem, a principled approach in which a learning algorithm
sequentially selects articles to serve users based on contextual information
about the users and articles, while simultaneously adapting its
article-selection strategy based on user-click feedback to maximize total user
clicks.
The contributions of this work are three-fold. First, we propose a new,
general contextual bandit algorithm that is computationally efficient and well
motivated from learning theory. Second, we argue that any bandit algorithm can
be reliably evaluated offline using previously recorded random traffic.
Finally, using this offline evaluation method, we successfully applied our new
algorithm to a Yahoo! Front Page Today Module dataset containing over 33
million events. Results showed a 12.5% click lift compared to a standard
context-free bandit algorithm, and the advantage becomes even greater when data
gets more scarce.Comment: 10 pages, 5 figure
- âŠ