    Optimization-based User Group Management : Discovery, Analysis, Recommendation

    User data is becoming increasingly available in multiple domains, ranging from phone usage traces to data on the social Web. User data is a special type of data described by user demographics (e.g., age, gender, occupation) and user activities (e.g., rating, voting, watching a movie). The analysis of user data appeals to scientists who work on population studies, online marketing, recommendations, and large-scale data analytics. However, analysis tools for user data are still lacking. In this thesis, we argue that there is a unique opportunity to analyze user data in the form of user groups. This contrasts with both individual user analysis and statistical analysis of the whole population. A group is defined as a set of users whose members have either common demographics or common activities. Group-level analysis reduces sparsity and noise in the data and leads to new insights. We propose a user group management framework consisting of the following components: user group discovery, analysis, and recommendation. The first step in our framework is group discovery, i.e., given raw user data, obtain user groups by optimizing one or more quality dimensions. The second component, analysis, is necessary to tackle the problem of information overload: the output of a user group discovery step often contains millions of user groups, and it is tedious for an analyst to skim over all of them. We therefore need analysis tools that provide valuable insights into this huge space of user groups, and we propose an interactive approach to facilitate this analysis. The final question in the framework is how to use the discovered groups. We investigate one such application, user group recommendation, by considering affinities between group members and their evolution over time. All contributions of the proposed framework are evaluated through an extensive set of experiments, for both quality and performance (response time).
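    The notion of a group as a set of users with common demographics can be illustrated with a minimal sketch. The user records, the `discover_groups` helper, and its size thresholds are assumptions made for illustration only; the thesis optimizes quality dimensions rather than enumerating groups exhaustively as done here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical user records (demographics only), invented for this sketch.
users = {
    "u1": {"age": "young", "gender": "F", "occupation": "student"},
    "u2": {"age": "young", "gender": "M", "occupation": "student"},
    "u3": {"age": "old", "gender": "F", "occupation": "teacher"},
    "u4": {"age": "young", "gender": "F", "occupation": "student"},
}

def discover_groups(users, max_label_size=2, min_size=2):
    """Map each label (a tuple of attribute/value pairs) to the users sharing it.

    This is a naive enumeration, not an optimization-based discovery method.
    """
    groups = defaultdict(set)
    for uid, attrs in users.items():
        pairs = sorted(attrs.items())
        for k in range(1, max_label_size + 1):
            for label in combinations(pairs, k):
                groups[label].add(uid)
    # Keep only labels shared by at least min_size users.
    return {label: members for label, members in groups.items()
            if len(members) >= min_size}

groups = discover_groups(users)
for label, members in sorted(groups.items()):
    print(label, sorted(members))
```

    Even this toy input yields several overlapping groups, which hints at why the number of candidate groups grows quickly on real data.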

    Characterizing Driving Context from Driver Behavior

    The increasing availability of spatiotemporal data has made a variety of data-analytic applications possible. Characterizing driving context, where context may be thought of as a combination of location and time, is a new and challenging application. An example of such a characterization is finding the correlation between driving behavior and traffic conditions. This contextual information enables analysts to validate observation-based hypotheses about the driving of an individual. In this paper, we present DriveContext, a novel framework that finds the characteristics of a context by extracting significant driving patterns (e.g., a slow-down) and then identifying the set of potential causes behind those patterns (e.g., traffic congestion). Our experimental results confirm the feasibility of the framework in identifying meaningful driving patterns, with improvements over the state of the art. We also demonstrate, through real-world examples, how the framework derives interesting characteristics for different contexts. Comment: Accepted to be published at The 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2017).
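    To make the idea of a driving pattern such as a slow-down concrete, here is a minimal sketch. It uses a simple threshold rule over a speed trace, which is an assumption chosen for illustration and not the pattern extraction used by DriveContext; the function name, the `window` and `min_drop` parameters, and the sample speeds are all hypothetical.

```python
def find_slowdowns(speeds, window=3, min_drop=20.0):
    """Return (start, end) index pairs where speed falls by at least
    min_drop (km/h) over `window` consecutive samples.

    Assumption: speeds are sampled at a fixed rate.
    """
    events = []
    i = 0
    while i + window < len(speeds):
        if speeds[i] - speeds[i + window] >= min_drop:
            events.append((i, i + window))
            i += window  # skip past this event to avoid overlapping reports
        else:
            i += 1
    return events

# Hypothetical speed trace in km/h.
trace = [60, 62, 61, 50, 38, 30, 31, 45, 58, 60]
slowdowns = find_slowdowns(trace)
print(slowdowns)
```

    Each detected interval could then be joined with contextual data for the same location and time (for example, congestion reports) to look for potential causes.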

    Exploration of User Groups in VEXUS

    We introduce VEXUS, an interactive visualization framework for exploring user data to fulfill tasks such as finding a set of experts, forming discussion groups, and analyzing collective behaviors. User data is characterized by a combination of demographics, such as age and occupation, and actions, such as rating a movie, writing a paper, following a medical treatment, or buying groceries. The ubiquity of user data requires tools that help explorers, be they specialists or novice users, acquire new insights. VEXUS lets explorers interact with user data via visual primitives and builds an exploration profile to recommend the next exploration steps. VEXUS combines state-of-the-art visualization techniques with appropriate indexing of user data to provide fast and relevant exploration.

    Group Recommendation with Temporal Affinities

    We examine the problem of recommending items to ad-hoc user groups. Group recommendation in collaborative rating datasets has received increased attention recently and has raised novel challenges. Different consensus functions, which aggregate the ratings of group members under semantics ranging from least misery to pairwise disagreement, have been studied. In this paper, we explore a new dimension when computing group recommendations, namely the affinity between group members and its evolution over time. We extend existing group recommendation semantics to include temporal affinity in recommendations and design GRECA, an efficient algorithm that produces temporal affinity-aware recommendations for ad-hoc groups. We run extensive experiments that show substantial improvements in group recommendation quality when accounting for affinity, while maintaining very good performance.
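    The consensus functions mentioned above can be sketched in a few lines. The first three functions follow their common textbook definitions; `affinity_weighted_mean` is a hypothetical illustration of how affinity could influence aggregation and is not the semantics defined in the paper, nor the GRECA algorithm. All ratings and affinity values are invented.

```python
from itertools import combinations

def average(ratings):
    """Mean rating of the group for one item."""
    return sum(ratings) / len(ratings)

def least_misery(ratings):
    """The group is only as satisfied as its least satisfied member."""
    return min(ratings)

def pairwise_disagreement(ratings):
    """Mean absolute difference between every pair of member ratings."""
    pairs = list(combinations(ratings, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs) if pairs else 0.0

def affinity_weighted_mean(ratings, affinity):
    """Hypothetical: weight each member by total affinity with the others.

    ratings: {user: rating}; affinity: {frozenset({u, v}): weight}.
    """
    weights = {
        u: sum(affinity[frozenset({u, v})] for v in ratings if v != u)
        for u in ratings
    }
    total = sum(weights.values())
    return sum(ratings[u] * weights[u] for u in ratings) / total

# Invented ratings of one item by a three-member ad-hoc group.
ratings = {"a": 5, "b": 4, "c": 1}
affinity = {
    frozenset({"a", "b"}): 0.9,
    frozenset({"a", "c"}): 0.1,
    frozenset({"b", "c"}): 0.2,
}
values = list(ratings.values())
print(average(values), least_misery(values), pairwise_disagreement(values))
print(affinity_weighted_mean(ratings, affinity))
```

    In this toy example the weighted mean sits above the plain average because the two members with high mutual affinity both rate the item highly. A temporal variant would let the affinity values depend on the time period considered.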

    Multi-Objective Group Discovery on the Social Web (Technical Report)

    Les rapports de recherche du LIG (ISSN: 2105-0422). We are interested in discovering user groups from collaborative rating datasets of the form ⟨i, u, s⟩, where i ∈ I, u ∈ U, and s is the integer rating that user u has assigned to item i. Each user has a set of attributes that help find labeled groups such as young computer scientists in France and American female designers. We formalize the problem of finding user groups whose quality is optimized in multiple dimensions and show that it is NP-Complete. We develop α-MOMRI, an α-approximation algorithm, and h-MOMRI, a heuristic-based algorithm, for multi-objective optimization to find high-quality groups. Our extensive experiments on real datasets from the social Web examine the performance of our algorithms and report cases where α-MOMRI and h-MOMRI are useful.
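    Optimizing group quality in multiple dimensions is commonly framed through Pareto dominance, which the sketch below illustrates. It is a brute-force filter over a handful of candidates, not α-MOMRI or h-MOMRI; the two score dimensions and all score values are hypothetical, since the abstract does not name the quality dimensions.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b in every dimension
    and strictly better in at least one (higher is better)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return {
        name: score
        for name, score in candidates.items()
        if not any(dominates(other, score)
                   for other_name, other in candidates.items()
                   if other_name != name)
    }

# Hypothetical groups with invented scores on two unnamed quality dimensions.
candidates = {
    "young computer scientists in France": (0.8, 0.4),
    "American female designers": (0.5, 0.7),
    "students": (0.4, 0.3),
}
front = pareto_front(candidates)
print(sorted(front))
```

    The first two groups trade off one dimension against the other, so neither dominates; the third is worse on both and is dropped.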

    Towards a Framework for Semantic Exploration of Frequent Patterns

    http://ceur-ws.org/Vol-1075/ (ISSN: 1613-0073). Mining frequent patterns is an essential task in discovering hidden correlations in datasets. Although frequent patterns unveil valuable information, several challenges limit their usability. First, the number of possible patterns is often very large, which hinders their effective exploration. Second, patterns with many items are hard to read, and the analyst may be unable to understand their meaning. In addition, the only available information about patterns is their support, a very coarse piece of information. In this paper, we are particularly interested in mining datasets that reflect usage patterns of users moving in space and time and for whom demographic attributes are available (age, occupation, etc.). Such characteristics are typical of data collected from smartphones, whose analysis has critical business applications nowadays. We propose pattern exploration primitives, abstraction and refinement, that use hand-crafted taxonomies on time, space, and user demographics. We show on two real datasets, Nokia and MovieLens, how the use of such taxonomies reduces the size of the pattern space and how demographics enable their semantic exploration. This work opens new perspectives in the semantic exploration of frequent patterns that reflect the behavior of different user communities.
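    The effect of abstraction on the pattern space can be seen in a small sketch. The transactions, the taxonomy, and the helper names below are assumptions made for illustration; they are not taken from the Nokia or MovieLens datasets, and the paper's primitives may be defined differently.

```python
from itertools import combinations

def all_patterns(transactions):
    """Every non-empty itemset that occurs in at least one transaction."""
    patterns = set()
    for t in transactions:
        items = sorted(t)
        for k in range(1, len(items) + 1):
            patterns.update(frozenset(c) for c in combinations(items, k))
    return patterns

def support(pattern, transactions):
    """Fraction of transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if pattern <= t) / len(transactions)

def abstract(transactions, taxonomy):
    """Replace each item by its parent in a hand-crafted taxonomy."""
    return [frozenset(taxonomy.get(item, item) for item in t)
            for t in transactions]

# Hypothetical (time, place) observations.
raw = [
    frozenset({"08:00", "Lausanne"}),
    frozenset({"09:00", "Geneva"}),
    frozenset({"10:00", "Lausanne"}),
    frozenset({"20:00", "Zurich"}),
]
# Hypothetical taxonomy on time and space.
taxonomy = {
    "08:00": "morning", "09:00": "morning", "10:00": "morning",
    "20:00": "evening",
    "Lausanne": "Romandy", "Geneva": "Romandy",
    "Zurich": "German-speaking",
}
abstracted = abstract(raw, taxonomy)
print(len(all_patterns(raw)), len(all_patterns(abstracted)))
print(support({"morning", "Romandy"}, abstracted))
```

    Abstraction merges fine-grained items, so fewer distinct patterns remain and the surviving ones have higher support. Refinement would go the other way, replacing an abstract item by its children.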

    OntoSIDES: Ontology-based student progress monitoring on the national evaluation system of French Medical Schools

    We introduce OntoSIDES, the core of an ontology-based learning management system in Medicine, in which the educational content, the traces of students' activities, and the correction of exams are linked and related to items of an official reference program in a unified RDF data model. OntoSIDES is an RDF knowledge base composed of a lightweight domain ontology that serves as a pivot high-level vocabulary of the query interface with users, and of a dataset made of factual statements relating individual entities to classes and properties of the ontology. Thanks to automatic mapping-based data materialization and rule-based data saturation, OntoSIDES contains around 8 million triples to date and provides integrated access to useful information for student progress monitoring, using a powerful query language (namely SPARQL) that allows users to express their specific needs for data exploration and analysis. Since we do not expect end-users to master the raw syntax of SPARQL or to express complex queries directly in SPARQL, we have designed a set of parametrized queries that users can instantiate through a user-friendly interface.
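    A parametrized query of the kind described above might be instantiated as in the following sketch. The `ex:` prefix, every property name, and the `instantiate` helper are hypothetical placeholders; they are not the actual OntoSIDES vocabulary or interface.

```python
import re
from string import Template

# Hypothetical vocabulary: the prefix and property names are placeholders,
# not the real OntoSIDES ontology.
QUERY_TEMPLATE = Template("""\
PREFIX ex: <http://example.org/sides#>
SELECT ?question ?result
WHERE {
  ex:$student ex:answered ?answer .
  ?answer ex:aboutQuestion ?question ;
          ex:hasResult ?result .
  ?question ex:linkedToItem ex:$item .
}
LIMIT $limit
""")

# Only plain identifiers are accepted, so user input cannot alter the query shape.
_SAFE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def instantiate(student, item, limit=100):
    """Fill the template with validated parameters and return SPARQL text."""
    for value in (student, item):
        if not _SAFE_NAME.match(value):
            raise ValueError(f"unsafe identifier: {value!r}")
    return QUERY_TEMPLATE.substitute(student=student, item=item, limit=int(limit))

query = instantiate("student42", "item7", limit=10)
print(query)
```

    The end-user only supplies the parameter values, typically through form fields, and never has to read or write the SPARQL text itself.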

    Data Pipelines for User Group Analytics

