201 research outputs found

    Comprehensive survey on big data privacy protection

    In recent years, the ever-mounting problem of Internet phishing has been threatening the secure propagation of sensitive data over the web, resulting in either an outright decline in data distribution or inaccurate data distribution from several data providers. User privacy has therefore become a critical issue in various data mining operations and a foremost criterion for allowing the transfer of confidential information. The intense surge in storing the personal data of customers (i.e., big data) has given rise to a new research area, referred to as privacy-preserving data mining (PPDM). A key issue in PPDM is how to modify data using a specific approach so that a good data mining model can be developed on the modified data, thereby meeting a specified privacy need with minimum loss of information for the intended data analysis task. This review study examines how data mining tasks can be carried out without risking the security of individuals' sensitive information, particularly at the record level. To this end, PPDM techniques are reviewed and classified according to their approaches to data modification. Furthermore, a critical comparative analysis of the advantages and drawbacks of PPDM techniques is performed. The review also elaborates on the existing challenges and unresolved issues in PPDM.

    The Fifth International VLDB Workshop on Management of Uncertain Data


    Letter from the Special Issue Editor

    Editorial work for DEBULL on a special issue on data management on Storage Class Memory (SCM) technologies

    Global Consistency Management Methods Based on Escrow Approaches in Mobile ad Hoc Networks


    k


    Optimization-based User Group Management : Discovery, Analysis, Recommendation

    User data is becoming increasingly available in multiple domains, ranging from phone usage traces to data on the social Web. User data is a special type of data that is described by user demographics (e.g., age, gender, occupation) and user activities (e.g., rating, voting, watching a movie). The analysis of user data is appealing to scientists who work on population studies, online marketing, recommendations, and large-scale data analytics. However, analysis tools for user data are still lacking. In this thesis, we believe there exists a unique opportunity to analyze user data in the form of user groups. This is in contrast with individual user analysis and also with statistical analysis on the whole population. A group is defined as a set of users whose members have either common demographics or common activities. Group-level analysis reduces the amount of sparsity and noise in data and leads to new insights. In this thesis, we propose a user group management framework consisting of the following components: user group discovery, analysis, and recommendation. The very first step in our framework is group discovery, i.e., given raw user data, obtain user groups by optimizing one or more quality dimensions. The second component (i.e., analysis) is necessary to tackle the problem of information overload: the output of a user group discovery step often contains millions of user groups. It is a tedious task for an analyst to skim over all produced groups. Thus we need analysis tools to provide valuable insights in this huge space of user groups. The final question in the framework is how to use the found groups. In this thesis, we investigate one of these applications, i.e., user group recommendation, by considering affinities between group members. All contributions of the proposed framework are evaluated using an extensive set of experiments, both for quality and performance.
    User data have become increasingly available in several domains, such as smartphone usage traces and the social Web. User data are a particular type of data described by socio-demographic information (e.g., age, sex, occupation) and by users' activities (e.g., reviewing a restaurant, voting, critiquing a film). The analysis of user data is of great interest to scientists working on population studies, online marketing, recommendations, and large-scale data analysis. However, tools for analyzing user data are still very limited. In this thesis, we exploit this opportunity and propose to analyze user data by forming user groups. This differs from the analysis of individual users and also from statistical analyses over an entire population. A user group is defined as a set of users whose members share socio-demographic data and have activities in common. Group-level analysis aims to better handle sparse data and noise in the data. In this thesis, we propose a user group management framework containing the following components: group discovery, group analysis, and group recommendation. The first component concerns the discovery of user groups, i.e., given raw user data, obtaining user groups by optimizing one or more quality dimensions. The second component (i.e., analysis) is needed to address the problem of information overload: the result of a user group discovery step may contain millions of groups. It is a tedious task for an analyst to skim through all the groups found. We propose an interactive approach to facilitate this analysis. The final question is how to use the groups found. In this thesis, we study one particular application, namely recommendation to user groups, considering the affinities between group members and its evolution over time. All our contributions are evaluated through a large number of experiments testing both quality and performance (response time).