56,048 research outputs found
Binary Particle Swarm Optimization based Biclustering of Web usage Data
Web mining is the nontrivial process to discover valid, novel, potentially
useful knowledge from web data using the data mining techniques or methods. It
may give information that is useful for improving the services offered by web
portals and information access and retrieval tools. With the rapid development
of biclustering, more researchers have applied the biclustering technique to
different fields in recent years. When biclustering approach is applied to the
web usage data it automatically captures the hidden browsing patterns from it
in the form of biclusters. In this work, swarm intelligent technique is
combined with biclustering approach to propose an algorithm called Binary
Particle Swarm Optimization (BPSO) based Biclustering for Web Usage Data. The
main objective of this algorithm is to retrieve the global optimal bicluster
from the web usage data. These biclusters contain relationships between web
users and web pages which are useful for the E-Commerce applications like web
advertising and marketing. Experiments are conducted on real dataset to prove
the efficiency of the proposed algorithms
Measuring the Use of the Active and Assisted Living Prototype CARIMO for Home Care Service Users: Evaluation Framework and Results
To address the challenges of aging societies, various information and communication technology (ICT)-based systems for older people have been developed in recent years. Currently, the evaluation of these so-called active and assisted living (AAL) systems usually focuses on the analyses of usability and acceptance, while some also assess their impact. Little is known about
the actual take-up of these assistive technologies. This paper presents a framework for measuring the take-up by analyzing the actual usage of AAL systems. This evaluation framework covers detailed information regarding the entire process including usage data logging, data preparation, and usage data analysis. We applied the framework on the AAL prototype CARIMO for measuring
its take-up during an eight-month field trial in Austria and Italy. The framework was designed to guide systematic, comparable, and reproducible usage data evaluation in the AAL field; however, the general applicability of the framework has yet to be validated
BlogForever D2.6: Data Extraction Methodology
This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform
i-JEN: Visual interactive Malaysia crime news retrieval system
Supporting crime news investigation involves a mechanism to help monitor the current and past status of criminal events. We believe this could be well facilitated by focusing on the user interfaces and the event crime model aspects. In this paper we discuss on a development of Visual Interactive Malaysia Crime News Retrieval System (i-JEN) and describe the approach, user studies and planned, the system architecture and future plan. Our main objectives are to construct crime-based event; investigate the use of crime-based event in improving the classification and clustering; develop an interactive crime news retrieval system; visualize crime news in an effective and interactive way; integrate them into a usable and robust system and evaluate the usability and system performance. The system will serve as a news monitoring system which aims to automatically organize, retrieve and present the crime news in such a way as to support an effective monitoring, searching, and browsing for the target users groups of general public, news analysts and policemen or crime investigators. The study will contribute to the better understanding of the crime data consumption in the Malaysian context as well as the developed system with the visualisation features to address crime data and the eventual goal of combating the crimes
Profiling user activities with minimal traffic traces
Understanding user behavior is essential to personalize and enrich a user's
online experience. While there are significant benefits to be accrued from the
pursuit of personalized services based on a fine-grained behavioral analysis,
care must be taken to address user privacy concerns. In this paper, we consider
the use of web traces with truncated URLs - each URL is trimmed to only contain
the web domain - for this purpose. While such truncation removes the
fine-grained sensitive information, it also strips the data of many features
that are crucial to the profiling of user activity. We show how to overcome the
severe handicap of lack of crucial features for the purpose of filtering out
the URLs representing a user activity from the noisy network traffic trace
(including advertisement, spam, analytics, webscripts) with high accuracy. This
activity profiling with truncated URLs enables the network operators to provide
personalized services while mitigating privacy concerns by storing and sharing
only truncated traffic traces.
In order to offset the accuracy loss due to truncation, our statistical
methodology leverages specialized features extracted from a group of
consecutive URLs that represent a micro user action like web click, chat reply,
etc., which we call bursts. These bursts, in turn, are detected by a novel
algorithm which is based on our observed characteristics of the inter-arrival
time of HTTP records. We present an extensive experimental evaluation on a real
dataset of mobile web traces, consisting of more than 130 million records,
representing the browsing activities of 10,000 users over a period of 30 days.
Our results show that the proposed methodology achieves around 90% accuracy in
segregating URLs representing user activities from non-representative URLs
Using webcrawling of publicly available websites to assess E-commerce relationships
We investigate e-commerce success factors concerning their impact on the success of commerce transactions between businesses companies. In scientific literature, many e-commerce success factors are introduced. Most of them are focused on companies' website quality. They are evaluated concerning companies' success in the business-to- consumer (B2C) environment where consumers choose their preferred e-commerce websites based on these success factors e.g. website content quality, website interaction, and website customization. In contrast to previous work, this research focuses on the usage of existing e-commerce success factors for predicting successfulness of business-to-business (B2B) ecommerce. The introduced methodology is based on the identification of semantic textual patterns representing success factors from the websites of B2B companies. The successfulness of the identified success factors in B2B ecommerce is evaluated by regression modeling. As a result, it is shown that some B2C e-commerce success factors also enable the predicting of B2B e-commerce success while others do not. This contributes to the existing literature concerning ecommerce success factors. Further, these findings are valuable for B2B e-commerce websites creation
CASP-DM: Context Aware Standard Process for Data Mining
We propose an extension of the Cross Industry Standard Process for Data
Mining (CRISPDM) which addresses specific challenges of machine learning and
data mining for context and model reuse handling. This new general
context-aware process model is mapped with CRISP-DM reference model proposing
some new or enhanced outputs
- âŠ