3,691 research outputs found
Integrating E-Commerce and Data Mining: Architecture and Challenges
We show that the e-commerce domain can provide all the right ingredients for
successful data mining and claim that it is a killer domain for data mining. We
describe an integrated architecture, based on our expe-rience at Blue Martini
Software, for supporting this integration. The architecture can dramatically
reduce the pre-processing, cleaning, and data understanding effort often
documented to take 80% of the time in knowledge discovery projects. We
emphasize the need for data collection at the application server layer (not the
web server) in order to support logging of data and metadata that is essential
to the discovery process. We describe the data transformation bridges required
from the transaction processing systems and customer event streams (e.g.,
clickstreams) to the data warehouse. We detail the mining workbench, which
needs to provide multiple views of the data through reporting, data mining
algorithms, visualization, and OLAP. We con-clude with a set of challenges.Comment: KDD workshop: WebKDD 200
Toward a model of computational attention based on expressive behavior: applications to cultural heritage scenarios
Our project goals consisted in the development of attention-based analysis of human expressive behavior and the implementation of real-time algorithm in EyesWeb XMI in order to improve naturalness of human-computer interaction and context-based monitoring of human behavior. To this aim, perceptual-model that mimic human attentional processes was developed for expressivity analysis and modeled by entropy. Museum scenarios were selected as an ecological test-bed to elaborate three experiments that focus on visitor profiling and visitors flow regulation
Denial of Service in Web-Domains: Building Defenses Against Next-Generation Attack Behavior
The existing state-of-the-art in the field of application layer Distributed Denial of Service (DDoS) protection is generally designed, and thus effective, only for static web domains. To the best of our knowledge, our work is the first that studies the problem of application layer DDoS defense in web domains of dynamic content and organization, and for next-generation bot behaviour. In the first part of this thesis, we focus on the following research tasks: 1) we identify the main weaknesses of the existing application-layer anti-DDoS solutions as proposed in research literature and in the industry, 2) we obtain a comprehensive picture of the current-day as well as the next-generation application-layer attack behaviour and 3) we propose novel techniques, based on a multidisciplinary approach that combines offline machine learning algorithms and statistical analysis, for detection of suspicious web visitors in static web domains. Then, in the second part of the thesis, we propose and evaluate a novel anti-DDoS system that detects a broad range of application-layer DDoS attacks, both in static and dynamic web domains, through the use of advanced techniques of data mining. The key advantage of our system relative to other systems that resort to the use of challenge-response tests (such as CAPTCHAs) in combating malicious bots is that our system minimizes the number of these tests that are presented to valid human visitors while succeeding in preventing most malicious attackers from accessing the web site. The results of the experimental evaluation of the proposed system demonstrate effective detection of current and future variants of application layer DDoS attacks
Recreation, tourism and nature in a changing world : proceedings of the fifth international conference on monitoring and management of visitor flows in recreational and protected areas : Wageningen, the Netherlands, May 30-June 3, 2010
Proceedings of the fifth international conference on monitoring and management of visitor flows in recreational and protected areas : Wageningen, the Netherlands, May 30-June 3, 201
adPerf: Characterizing the Performance of Third-party Ads
Monetizing websites and web apps through online advertising is widespread in
the web ecosystem. The online advertising ecosystem nowadays forces publishers
to integrate ads from these third-party domains. On the one hand, this raises
several privacy and security concerns that are actively studied in recent
years. On the other hand, given the ability of today's browsers to load dynamic
web pages with complex animations and Javascript, online advertising has also
transformed and can have a significant impact on webpage performance. The
performance cost of online ads is critical since it eventually impacts user
satisfaction as well as their Internet bill and device energy consumption.
In this paper, we apply an in-depth and first-of-a-kind performance
evaluation of web ads. Unlike prior efforts that rely primarily on adblockers,
we perform a fine-grained analysis on the web browser's page loading process to
demystify the performance cost of web ads. We aim to characterize the cost by
every component of an ad, so the publisher, ad syndicate, and advertiser can
improve the ad's performance with detailed guidance. For this purpose, we
develop an infrastructure, adPerf, for the Chrome browser that classifies page
loading workloads into ad-related and main-content at the granularity of
browser activities (such as Javascript and Layout). Our evaluations show that
online advertising entails more than 15% of browser page loading workload and
approximately 88% of that is spent on JavaScript. We also track the sources and
delivery chain of web ads and analyze performance considering the origin of the
ad contents. We observe that 2 of the well-known third-party ad domains
contribute to 35% of the ads performance cost and surprisingly, top news
websites implicitly include unknown third-party ads which in some cases build
up to more than 37% of the ads performance cost
- …