2 research outputs found

    CSHURI - Modified HURI algorithm for Customer Segmentation and Transaction Profitability

    Association rule mining (ARM) is the process of generating rules based on correlations between the sets of items that customers purchase. Of late, data mining researchers have improved the quality of association rule mining for business development by incorporating factors such as value (utility), quantity of items sold (weight), and profit. Rules mined without considering utility values (profit margins) may miss profitable rules. The resulting wealth of information about customers' needs, together with the mined rules, helps the retailer design the store layout [9]. The proposed algorithm CSHURI (Customer Segmentation using HURI), a modified version of HURI [6], finds customers who purchase highly profitable rare items and classifies them according to given criteria; for example, a retail business may need to identify the valuable customers who are major contributors to its overall profit. For a potential customer arriving in the store, determining which customer group the customer belongs to according to their needs, which functional features or products they focus on, and what kind of offers will satisfy them is key to targeting customers and improving sales [9], and this forms the basis for customer utility mining. Comment: 11 pages, 5 tables, 1 figure, IJCSEIT-201
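    The abstract describes the core idea of HURI-style segmentation (find items that are rare but highly profitable, then group customers by whether they buy them) without giving code. The following is a minimal sketch of that idea under assumed thresholds, field names, and a two-way grouping rule; it is illustrative, not the authors' implementation.

    ```python
    # Sketch of HURI/CSHURI-style segmentation (assumptions, not the authors' code):
    # an item is "high-utility rare" if its support is low but its total profit is high;
    # customers are then labeled by whether they ever bought such an item.
    from collections import defaultdict

    def high_utility_rare_items(transactions, profit, max_support, min_utility):
        """Return items that are rare (low support) yet highly profitable."""
        support = defaultdict(int)
        utility = defaultdict(float)
        for _, items in transactions:
            for item, qty in items.items():
                support[item] += 1
                utility[item] += qty * profit[item]
        n = len(transactions)
        return {i for i in support
                if support[i] / n <= max_support and utility[i] >= min_utility}

    def segment_customers(transactions, huri_items):
        """Label a customer 'valuable' if any transaction contains a high-utility rare item."""
        segments = {}
        for customer, items in transactions:
            segments[customer] = "valuable" if set(items) & huri_items else "regular"
        return segments

    # Toy data: (customer, {item: quantity}) pairs with assumed per-unit profits.
    transactions = [
        ("c1", {"bread": 2, "gold_pen": 1}),
        ("c2", {"bread": 1, "milk": 2}),
        ("c3", {"milk": 1}),
    ]
    profit = {"bread": 0.5, "milk": 0.4, "gold_pen": 30.0}
    rare_profitable = high_utility_rare_items(transactions, profit,
                                              max_support=0.34, min_utility=20.0)
    print(segment_customers(transactions, rare_profitable))
    # -> {'c1': 'valuable', 'c2': 'regular', 'c3': 'regular'}
    ```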

    Data Masking with Privacy Guarantees

    We study the problem of data release with privacy, where data is made available with privacy guarantees while keeping its usability as high as possible; this is important in health care and other domains with sensitive data. In particular, we propose a method of masking the private data with a privacy guarantee while ensuring that a classifier trained on the masked data is similar to the classifier trained on the original data, so as to maintain usability. We analyze the theoretical risks of the proposed method and of the traditional input perturbation method. The results show that the proposed method achieves lower risk than input perturbation, especially when the number of training samples grows large. We illustrate the effectiveness of the proposed data masking method for privacy-sensitive learning on 12 benchmark datasets.
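    For context, the input-perturbation baseline mentioned in the abstract (release noise-masked features and measure how close a classifier trained on them is to one trained on the original data) can be sketched as below. The dataset, noise scale, and model choice are assumptions for illustration; this is not the paper's proposed method.

    ```python
    # Illustration of the input-perturbation baseline (assumed setup, not the paper's method):
    # mask features with Gaussian noise, train on the masked release, and compare the
    # resulting classifier with one trained on the original data.
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    rng = np.random.default_rng(0)
    sigma = 0.5 * X_train.std(axis=0)            # per-feature noise scale (assumed)
    X_masked = X_train + rng.normal(0.0, sigma, size=X_train.shape)

    clf_orig = LogisticRegression(max_iter=5000).fit(X_train, y_train)
    clf_mask = LogisticRegression(max_iter=5000).fit(X_masked, y_train)

    # Usability proxy: agreement between the two classifiers on held-out points,
    # plus the accuracy lost by training on the masked release.
    agreement = np.mean(clf_orig.predict(X_test) == clf_mask.predict(X_test))
    print(f"agreement={agreement:.3f}, "
          f"orig acc={clf_orig.score(X_test, y_test):.3f}, "
          f"masked acc={clf_mask.score(X_test, y_test):.3f}")
    ```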