56,048 research outputs found

    Binary Particle Swarm Optimization based Biclustering of Web usage Data

    Web mining is the nontrivial process of discovering valid, novel, and potentially useful knowledge from web data using data mining techniques. It can yield information that is useful for improving the services offered by web portals and by information access and retrieval tools. With the rapid development of biclustering, more researchers have applied the technique to different fields in recent years. When a biclustering approach is applied to web usage data, it automatically captures hidden browsing patterns in the form of biclusters. In this work, a swarm intelligence technique is combined with a biclustering approach to propose an algorithm called Binary Particle Swarm Optimization (BPSO) based Biclustering for Web Usage Data. The main objective of this algorithm is to retrieve the globally optimal bicluster from the web usage data. These biclusters contain relationships between web users and web pages that are useful for e-commerce applications such as web advertising and marketing. Experiments are conducted on a real dataset to demonstrate the efficiency of the proposed algorithm.
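
The abstract above names binary particle swarm optimization but does not give its update rule or fitness function. As a rough illustration only, the following is a minimal sketch of a generic binary PSO loop with a sigmoid transfer function. The function name `bpso`, its parameters, and the idea that each bit marks whether a user (row) or page (column) belongs to the bicluster are assumptions made for this example, not details taken from the paper.

```python
import math
import random


def bpso(fitness, n_bits, n_particles=20, n_iters=50, seed=0,
         inertia=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """Maximize `fitness` over bit lists with a generic binary PSO.

    Hypothetical helper: in a biclustering setting, each bit could mark
    whether one row (user) or column (page) is part of the bicluster.
    """
    rng = random.Random(seed)
    positions = [[rng.randint(0, 1) for _ in range(n_bits)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * n_bits for _ in range(n_particles)]

    # Personal bests start at the initial positions.
    pbest = [p[:] for p in positions]
    pbest_fit = [fitness(p) for p in positions]

    # Global best is the best of the personal bests.
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            x, v = positions[i], velocities[i]
            for d in range(n_bits):
                r1, r2 = rng.random(), rng.random()
                # Standard velocity update, clamped to [-v_max, v_max].
                v[d] = (inertia * v[d]
                        + c1 * r1 * (pbest[i][d] - x[d])
                        + c2 * r2 * (gbest[d] - x[d]))
                v[d] = max(-v_max, min(v_max, v[d]))
                # The sigmoid turns the velocity into P(bit = 1).
                prob_one = 1.0 / (1.0 + math.exp(-v[d]))
                x[d] = 1 if rng.random() < prob_one else 0
            f = fitness(x)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = x[:], f
                if f > gbest_fit:
                    gbest, gbest_fit = x[:], f
    return gbest, gbest_fit
```

A call such as `bpso(lambda bits: sum(bits), n_bits=12)` runs the loop on a toy objective that counts the ones. A real biclustering fitness would instead score the coherence and size of the selected user and page submatrix, which the abstract does not specify.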

    Measuring the Use of the Active and Assisted Living Prototype CARIMO for Home Care Service Users: Evaluation Framework and Results

    To address the challenges of aging societies, various information and communication technology (ICT)-based systems for older people have been developed in recent years. Currently, the evaluation of these so-called active and assisted living (AAL) systems usually focuses on analyses of usability and acceptance, while some evaluations also assess their impact. Little is known about the actual take-up of these assistive technologies. This paper presents a framework for measuring take-up by analyzing the actual usage of AAL systems. The evaluation framework covers the entire process in detail, including usage data logging, data preparation, and usage data analysis. We applied the framework to the AAL prototype CARIMO to measure its take-up during an eight-month field trial in Austria and Italy. The framework was designed to guide systematic, comparable, and reproducible usage data evaluation in the AAL field; however, its general applicability has yet to be validated.

    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report then proposes a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

    i-JEN: Visual interactive Malaysia crime news retrieval system

    Supporting crime news investigation involves a mechanism to help monitor the current and past status of criminal events. We believe this could be well facilitated by focusing on the user interface and the crime event model. In this paper we discuss the development of the Visual Interactive Malaysia Crime News Retrieval System (i-JEN) and describe the approach, the user studies conducted and planned, the system architecture, and future plans. Our main objectives are to construct crime-based events; investigate the use of crime-based events in improving classification and clustering; develop an interactive crime news retrieval system; visualize crime news in an effective and interactive way; integrate these components into a usable and robust system; and evaluate the usability and system performance. The system will serve as a news monitoring system that aims to automatically organize, retrieve, and present crime news in a way that supports effective monitoring, searching, and browsing for the target user groups of the general public, news analysts, and policemen or crime investigators. The study will contribute to a better understanding of crime data consumption in the Malaysian context, to a system whose visualisation features address crime data, and to the eventual goal of combating crime.

    Profiling user activities with minimal traffic traces

    Understanding user behavior is essential to personalize and enrich a user's online experience. While there are significant benefits to be accrued from the pursuit of personalized services based on a fine-grained behavioral analysis, care must be taken to address user privacy concerns. In this paper, we consider the use of web traces with truncated URLs, in which each URL is trimmed to contain only the web domain, for this purpose. While such truncation removes the fine-grained sensitive information, it also strips the data of many features that are crucial to the profiling of user activity. We show how to overcome the severe handicap of lacking these crucial features in order to filter out, with high accuracy, the URLs representing a user activity from the noisy network traffic trace (including advertisements, spam, analytics, and web scripts). This activity profiling with truncated URLs enables network operators to provide personalized services while mitigating privacy concerns by storing and sharing only truncated traffic traces. To offset the accuracy loss due to truncation, our statistical methodology leverages specialized features extracted from a group of consecutive URLs that represent a micro user action such as a web click or a chat reply, which we call bursts. These bursts, in turn, are detected by a novel algorithm based on our observed characteristics of the inter-arrival time of HTTP records. We present an extensive experimental evaluation on a real dataset of mobile web traces, consisting of more than 130 million records, representing the browsing activities of 10,000 users over a period of 30 days. Our results show that the proposed methodology achieves around 90% accuracy in segregating URLs representing user activities from non-representative URLs.
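
The two ideas in the abstract above, trimming each URL to its domain and grouping consecutive records into bursts by inter-arrival time, can be illustrated with a small sketch. The paper's own burst detection algorithm is not described in the abstract, so the fixed gap threshold used here is a simple stand-in, and the names `truncate_url`, `split_into_bursts`, and `max_gap` are assumptions made for this example. The host name returned by the standard library is used as an approximation of the "web domain".

```python
from urllib.parse import urlsplit


def truncate_url(url):
    """Keep only the host part of a URL, dropping path, query, and fragment.

    Returns an empty string when the URL has no recognizable host
    (for example, when the scheme is missing).
    """
    return urlsplit(url).hostname or ""


def split_into_bursts(records, max_gap=1.0):
    """Group (timestamp_seconds, url) records into bursts of truncated URLs.

    Assumption for illustration: a new burst starts whenever the gap to
    the previous record exceeds `max_gap` seconds. The paper's actual
    detection rule is not given in the abstract.
    """
    bursts = []
    previous_time = None
    for timestamp, url in sorted(records):
        if previous_time is None or timestamp - previous_time > max_gap:
            bursts.append([])  # gap too large: open a new burst
        bursts[-1].append(truncate_url(url))
        previous_time = timestamp
    return bursts
```

Given records at 0.0 s and 0.2 s followed by one at 5.0 s, the first two fall into one burst and the third starts another. Features for classifying user activity would then be computed per burst rather than per URL.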

    Using webcrawling of publicly available websites to assess E-commerce relationships

    We investigate e-commerce success factors with respect to their impact on the success of commerce transactions between businesses. The scientific literature introduces many e-commerce success factors, most of which focus on the quality of companies' websites. They are evaluated with respect to companies' success in the business-to-consumer (B2C) environment, where consumers choose their preferred e-commerce websites based on success factors such as website content quality, website interaction, and website customization. In contrast to previous work, this research focuses on using existing e-commerce success factors to predict the success of business-to-business (B2B) e-commerce. The introduced methodology is based on identifying semantic textual patterns that represent success factors on the websites of B2B companies. The contribution of the identified success factors to B2B e-commerce success is evaluated by regression modeling. The results show that some B2C e-commerce success factors also enable the prediction of B2B e-commerce success while others do not. This contributes to the existing literature on e-commerce success factors. Further, these findings are valuable for creating B2B e-commerce websites.

    CASP-DM: Context Aware Standard Process for Data Mining

    We propose an extension of the Cross Industry Standard Process for Data Mining (CRISP-DM) that addresses specific challenges of machine learning and data mining in handling context and model reuse. This new general context-aware process model is mapped to the CRISP-DM reference model, proposing some new or enhanced outputs.

    Data analytics 2016: proceedings of the fifth international conference on data analytics
