
    Recommendation Based On Comparative Analysis of Apriori and BW-Mine Algorithm

    With the growth of the WWW, recommending appropriate and relevant pages to users is a challenging task. In many web applications, users would like to receive recommendations based on their surfing interests. Web mining extracts relevant information for the user from logs, web content, hyperlinks, etc. In this paper we use server logs to recommend frequent access patterns to users. The paper covers collecting user logs, cleaning them, identifying users, identifying sessions, completing sessions from the website structure, and then applying and comparing recommendation algorithms such as Apriori and BW-Mine to recommend frequent items to the user. We also compare the recommendation algorithms with the help of an example. The principle behind finding access patterns with Apriori is that every subset of a frequent itemset must itself be frequent; BW-Mine instead constructs the WB-table, VI-List, and HI-Counter to find frequent patterns.
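    The Apriori principle mentioned in the abstract (every subset of a frequent itemset must itself be frequent, so candidates with an infrequent subset can be pruned) can be sketched as follows. This is a minimal illustration, not the paper's implementation; all function and variable names are our own:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Mine frequent itemsets using the Apriori principle: every
    subset of a frequent itemset must itself be frequent."""
    # Count single items first.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s for s, c in counts.items() if c >= min_support}
    result = {s: counts[s] for s in frequent}
    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-itemsets, then prune
        # any candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a in frequent:
            for b in frequent:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in frequent
                    for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)
        # Support counting pass over the transactions.
        counts = {c: sum(1 for t in transactions if c <= set(t))
                  for c in candidates}
        frequent = {c for c, n in counts.items() if n >= min_support}
        result.update((c, counts[c]) for c in frequent)
        k += 1
    return result
```

    In a web-usage setting, each transaction would be the set of pages visited in one user session, and the frequent itemsets become candidate page recommendations.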

    Accuracy Testing of the Bipolar Slope One and BW-Mine Algorithms in a Recommendation System

    Recommendation systems are widely applied in e-commerce, but several problems can cause them to fail, most notably the large amount of missing rating data (sparsity) and the cold-start problem. An appropriate recommendation method is therefore needed to improve accuracy so that users can find the items they desire. To achieve this goal, bipolar Slope One is used to predict item ratings. Predicting an item's rating requires an item pattern, which can be represented by the association rules found by the BW-Mine algorithm. Testing was carried out with MAE, involving 50 users and 200 items. The MAE results show that sparsity influences the accuracy of the rating predictions produced by the recommendation system.
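    As an illustration of the rating-prediction step, here is a minimal sketch of the basic Slope One scheme; the bipolar variant used in the paper additionally splits the deviations by items the user rated above versus below their own average, which is omitted here. All names are illustrative:

```python
def slope_one_predict(ratings, user, target):
    """Predict `user`'s rating for `target` with basic Slope One:
    for each item j the user has rated, compute the average deviation
    dev(target, j) over users who rated both, then combine the
    per-item predictions weighted by the number of co-raters."""
    num, den = 0.0, 0
    for j, r_uj in ratings[user].items():
        # Deviation of target relative to item j over co-rating users.
        diffs = [r[target] - r[j] for r in ratings.values()
                 if target in r and j in r]
        if diffs:
            dev = sum(diffs) / len(diffs)
            num += (dev + r_uj) * len(diffs)
            den += len(diffs)
    return num / den if den else None
```

    For example, if one user rated items i1 and i2 as 1.0 and 1.5, a second user who rated i1 as 2.0 gets the prediction 2.0 + 0.5 = 2.5 for i2.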

    Discrete social recommendation

    National Research Foundation (NRF) Singapore under its AI Singapore Programme

    Frequent itemset mining on multiprocessor systems

    Frequent itemset mining is an important building block in many data mining applications such as market basket analysis, recommendation, web mining, fraud detection, and gene expression analysis. In many of them, the datasets being mined can easily grow to hundreds of gigabytes or even terabytes of data, so efficient algorithms are required to process such large amounts of data. In recent years, many frequent-itemset mining algorithms have been proposed, which however (1) often have high memory requirements and (2) do not exploit the large degrees of parallelism provided by modern multiprocessor systems. The high memory requirements arise mainly from inefficient data structures that have only been shown to be sufficient for small datasets. For large datasets, however, the use of these data structures forces the algorithms to go out-of-core, i.e., to access secondary memory, which leads to serious performance degradation. Exploiting available parallelism is further required to mine large datasets because the serial performance of processors has almost stopped increasing. Algorithms should therefore exploit the large number of available threads as well as other kinds of parallelism (e.g., vector instruction sets) besides thread-level parallelism.

    In this work, we tackle the high memory requirements of frequent itemset mining in two ways: we (1) compress the datasets being mined, because they must be kept in main memory during several mining invocations, and (2) improve existing mining algorithms with memory-efficient data structures. For compressing the datasets, we employ efficient encodings that show good compression performance on a wide variety of realistic datasets, reducing the size of the datasets by up to 6.4x. The encodings can further be applied directly while loading the dataset from disk or network. Since encoding and decoding are required repeatedly for loading and mining the datasets, we reduce their costs by providing parallel encodings that achieve high throughput for both tasks. For a memory-efficient representation of the mining algorithms' intermediate data, we propose compact data structures and even employ explicit compression. Together, both methods reduce the size of the intermediate data by up to 25x. The smaller memory requirements avoid or delay expensive out-of-core computation when large datasets are mined.

    To cope with the high parallelism provided by current multiprocessor systems, we identify the performance hot spots and scalability issues of existing frequent-itemset mining algorithms. The hot spots, which form basic building blocks of these algorithms, cover (1) counting the frequency of fixed-length strings, (2) building prefix trees, (3) compressing integer values, and (4) intersecting lists of sorted integer values or bitmaps. For all of them, we discuss how to exploit available parallelism and provide scalable solutions. Furthermore, almost all components of the mining algorithms must be parallelized to keep the sequential fraction of the algorithms as small as possible. We integrate the parallelized building blocks and components into three well-known mining algorithms and further analyze the impact of certain existing optimizations. Even single-threaded, our algorithms are often up to an order of magnitude faster than existing highly optimized algorithms, and they scale almost linearly on a large 32-core multiprocessor system. Although our optimizations are intended for frequent-itemset mining algorithms, they can be applied with only minor changes to algorithms used for mining other types of itemsets.
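    As an illustration of building block (4), intersecting two sorted lists of integer values (e.g. the transaction-id lists used in Eclat-style itemset mining) can be done with a two-pointer merge in O(|a| + |b|) time. This serial sketch is only illustrative and is not the thesis's parallel implementation; the names are our own:

```python
def intersect_sorted(a, b):
    """Intersect two sorted lists of integers with a two-pointer merge.
    The result contains each value present in both lists, in order."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # advance the pointer behind the smaller value
        else:
            j += 1
    return out
```

    In itemset mining, the length of such an intersection is exactly the support of the combined itemset, which is why this operation sits on the hot path.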

    Item-centric mining of frequent patterns from big uncertain data


    Compositional coding for collaborative filtering

    National Research Foundation (NRF) Singapore under its AI Singapore Programme

    Designing algorithms for big graph datasets: a study of computing bisimulation and joins


    Applying Secure Multi-party Computation in Practice

    In this work, we present solutions for technical difficulties in deploying secure multi-party computation in real-world applications. We first give a brief overview of the current state of the art, point out several shortcomings, and address them. The main contribution of this work is an end-to-end process description of deploying secure multi-party computation for the first large-scale registry-based statistical study on linked databases. Involving large stakeholders such as government institutions also introduces non-technical requirements, like signing contracts and negotiating with the Data Protection Agency.

    Discrete deep learning for fast content-aware recommendation

    The cold-start problem and recommendation efficiency are regarded as two crucial challenges in recommender systems. In this paper, we propose a hashing-based deep learning framework called Discrete Deep Learning (DDL) that maps users and items into Hamming space, where a user's preference for an item can be efficiently calculated by Hamming distance; this computation scheme significantly improves the efficiency of online recommendation. In addition, DDL unifies the user-item interaction information and the item content information to overcome data sparsity and cold-start. More specifically, to integrate content information into the DDL framework, a deep learning model, the Deep Belief Network (DBN), is applied to extract effective item representations from the item content. The framework further imposes balance and irrelevance constraints on the binary codes to derive compact but informative codes. Due to the discrete constraints in DDL, we propose an efficient alternating optimization method that iteratively solves a series of mixed-integer programming subproblems. Extensive experiments on two different Amazon datasets demonstrate the superiority of DDL over state-of-the-art methods in both online recommendation efficiency and cold-start recommendation accuracy.
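    The Hamming-distance scoring that makes online recommendation efficient can be sketched as follows, assuming user and item binary codes are packed into machine integers; XOR plus a popcount replaces a floating-point inner product. This is an illustrative sketch, not the DDL implementation, and the names are our own:

```python
def hamming_distance(user_code, item_code):
    """Hamming distance between two binary codes packed into ints:
    XOR the codes, then count the set bits."""
    return bin(user_code ^ item_code).count("1")

def rank_items(user_code, item_codes, top_k):
    """Rank candidate items for a user by ascending Hamming distance,
    i.e. closer codes in Hamming space rank higher."""
    return sorted(item_codes,
                  key=lambda c: hamming_distance(user_code, c))[:top_k]
```

    With b-bit codes, each score costs a single XOR and popcount, which is what enables fast large-scale online ranking.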