Adaptation to Drifting User's Interests
In recent years, many systems have been developed that aim to help users find pieces of information or other objects that match their personal interests. In these systems, machine learning methods are often used to acquire the user interest profile. User interests frequently drift with time, so the ability to adapt quickly to the user's current interests is an important feature for recommender systems. This paper presents a method for dealing with drifting interests by introducing the notion of gradual forgetting. Under this approach, the most recent observations are more "important" for the learning algorithm than older ones, and the importance of an observation decreases with time. Experiments with a recommender system show that gradual forgetting improves the ability to adapt to drifting user interests. Experiments with the STAGGER problem provide additional evidence that gradual forgetting can improve prediction accuracy on drifting concepts (including drifting user interests).
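The idea of gradual forgetting can be illustrated with a minimal sketch. The abstract does not state which weighting function the paper uses, so the exponential decay below, along with the names `forgetting_weights`, `weighted_mean` and the `decay` parameter, are assumptions chosen only for illustration.

```python
def forgetting_weights(n, decay=0.9):
    """Return one weight per observation, ordered oldest to newest.

    Assumption: exponential decay. The newest observation gets weight 1.0
    and each step back in time multiplies the weight by `decay`.
    """
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    return [decay ** (n - 1 - i) for i in range(n)]


def weighted_mean(values, weights):
    """Average the values so that higher-weighted observations count more."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# A user who disliked an item type at first (0) and likes it now (1).
ratings = [0, 0, 0, 1, 1, 1]
weights = forgetting_weights(len(ratings), decay=0.5)
estimate = weighted_mean(ratings, weights)  # closer to 1 than the plain mean of 0.5
```

With `decay=1.0` every observation counts equally, which recovers the ordinary unweighted estimate; smaller values make the estimate track recent behavior more closely at the cost of using less history.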
Time-aware Egocentric network-based User Profiling
Taking the dynamic characteristics of social networks into account when building a user profile from the user's egocentric network can be relevant in many applications. To this end, we propose applying a time-aware method to an existing egocentric network-based user profiling process, building on previous contributions of our team. The aim of this strategy is to weight the user's interests according to their relevance and freshness. The time-aware weight of an interest is computed by combining the relevance of the individuals in the user's egocentric network (computed by taking into account the freshness of their ties) with the relevance of the information (computed by taking into account its freshness). Experiments on scientific publication networks (DBLP/Mendeley) demonstrate the effectiveness of our proposal compared to the existing time-agnostic egocentric network-based user profiling process.
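One possible way to combine the two freshness terms is sketched below. The abstract does not give the actual formula, so the half-life decay, the product of the two terms, and the names `freshness`, `interest_weights`, `shares` and `tie_age_days` are all hypothetical.

```python
def freshness(age_days, half_life_days=365.0):
    """Assumed freshness score: 1.0 for age 0, halving every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


def interest_weights(shares, tie_age_days, half_life_days=365.0):
    """Accumulate a time-aware weight per interest.

    shares: list of (individual, interest, info_age_days) tuples.
    tie_age_days: dict mapping each individual to the age of their tie
    with the user. Both structures are illustrative assumptions.
    """
    weights = {}
    for person, interest, info_age in shares:
        tie_score = freshness(tie_age_days[person], half_life_days)
        info_score = freshness(info_age, half_life_days)
        # Assumption: the two relevance terms are combined by multiplication.
        weights[interest] = weights.get(interest, 0.0) + tie_score * info_score
    return weights


ties = {"alice": 0, "bob": 365}
shared = [("alice", "concept drift", 0), ("bob", "case-based reasoning", 365)]
profile = interest_weights(shared, ties)
```

In this sketch an interest shared recently by a recently connected individual outweighs one shared long ago through an old tie, which is the qualitative behavior the abstract describes.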
Changing User Interests through Prior-Learning of Context
The paper presents an algorithm for learning drifting and recurring user interests. The algorithm uses a prior-learning level to identify the current context. It then searches past observations for episodes that are relevant to the current context, "remembers" them, and "forgets" the irrelevant ones. Finally, the algorithm learns only from the selected relevant examples. Experiments conducted with a data set of calendar scheduling recommendations show that the presented algorithm significantly improves predictive accuracy.
Dynamic learning of cases from data streams
This paper presents a dynamic adaptive framework for building a case library that can cope with a data stream in the field of Case-Based Reasoning. The framework provides a three-layer architecture formed by a set of dynamically built case libraries. This Dynamic and Adaptive Case Library (DACL) can process a data stream incrementally and can be used as a classification model or a regression model, depending on the predicted variable. In this paper, the work focuses on classification tasks. Each case library has a first layer formed by the dynamic clusters of cases, a second layer formed by the meta-cases or prototypes of the clusters, and a third layer formed by an incremental indexing structure. In our approach, a variant of k-d trees has been used, together with an exploration technique, to obtain a more efficient retrieval time. This three-layer framework can be constructed incrementally. Several meta-case learning approaches are proposed, as well as some case learning strategies. The framework has been tested with several datasets. The experimental results show very good performance in comparison with a batch learning scheme over the same data. Peer reviewed. Postprint (author's final draft).
Enriching the User Profile from the User's Social Network in a Dynamic Context: Application of a Temporal Weighting Method
The user profile is a central element of information adaptation systems. Online social networks are a very rich source of information about the user. We are interested in the process of enriching the user profile from the user's social network. This process extracts the user's interests from the individuals in the user's egocentric network in order to build the social dimension of the user profile. To take the dynamic nature of social networks into account, we propose in this work to build this social dimension by integrating a temporal criterion that weights the user's interests. This "temporal" weight, which reflects the relevance of an interest, is computed, on the one hand, from the relevance of the individuals in the user's egocentric network, taking into account the freshness of their ties with the user, and, on the other hand, from the relevance of the information they share, taking into account the freshness of that information. Experiments on the DBLP and Mendeley scientific publication networks showed that our proposal yields more satisfactory results than the existing process.
Environmental data stream mining through a case-based stochastic learning approach
This manuscript version is made available under the CC-BY-NC-ND 4.0 license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Environmental data stream mining is an open challenge for Data Science. Commonly used methods are static because they analyze a static set of data and provide static data-driven models. Environmental systems are dynamic and generate a continuous data stream. Data Science must therefore provide dynamic methods that cope with the temporal nature of the data. Our proposal is to model each environmental information unit, as it is generated over time, as a new case/experience in a Case-Based Reasoning (CBR) system. This contribution aims to incrementally build and manage a Dynamic Adaptive Case Library (DACL). In this paper, a stochastic method for learning new cases and managing prototypes, used to create and manage the DACL incrementally, is introduced. This stochastic method works with two main moments. The method has been evaluated using an air quality data stream from the city of Obregon, Sonora, México, with good results. In addition, other datasets have been mined to ensure the generality of the approach. Peer reviewed. Postprint (author's final draft).
Research and Application of Personalized Modeling Based on Individual Interest in Mining
The Weibo services offered by service providers are simple and unchanging. Research based on microblog content reflects the user's personalized features. The method is important for improving user satisfaction and expanding the user base. First, a multiclass classification algorithm for the interest classification problem is proposed, based on an improved binary-tree support vector machine. Second, an improved mixed interest model based on implicit feedback is proposed. This method addresses the shortcomings of existing approaches in establishing the user interest model and in the drift strategy used during the update phase. The improved model is applied to personalized user modeling, improving the authenticity and accuracy of the personalized model.
Learning Concept Drift Using Adaptive Training Set Formation Strategy
We live in a dynamic world, where change is a part of everyday life. When there is a shift in the data, classification or prediction models need to adapt to the change. In data mining, the phenomenon of change in data distribution over time is known as concept drift. In this research, we propose an adaptive supervised learning methodology with delayed labeling. As part of this methodology, we introduce an adaptive training set formation algorithm called SFDL, which is based on selective training set formation. Our proposed solution is considered the first systematic training set formation approach that takes the delayed labeling problem into account. It can be used with any base classifier without changing the implementation or settings of that classifier. We test our algorithm implementation using synthetic and real datasets from various domains, which may exhibit different drift types (sudden, gradual, incremental, recurring) with different speeds of change. The experimental results confirm an improvement in classification accuracy compared to an ordinary classifier for all drift types. Our approach increases classification accuracy by 20% on average and by 56% in the best cases of our experiments, and it was never worse than the ordinary classifiers. Finally, a comparison study is performed with four other related methods that deal with changes in user interest over time and handle recurring drift. Results indicate the effectiveness of the proposed method over the other methods in terms of classification accuracy.
Real-time algorithm for changes detection in depth of anesthesia signals
This paper presents a real-time algorithm for detecting changes in depth of anesthesia signals. A Page-Hinkley test (PHT) with a forgetting mechanism (PHT-FM) was developed. The samples are weighted according to their "age" so that more importance is given to recent samples. This enables changes to be detected with less delay than if no forgetting factor were used. The performance of the PHT-FM was evaluated in a two-fold approach. First, the algorithm was run offline on depth of anesthesia (DoA) signals previously collected during general anesthesia, allowing the forgetting mechanism to be adjusted. Second, the PHT-FM was embedded in real-time software and its performance was validated online in the surgery room. This was done by asking the clinician to classify the changes in real time as true positives, false positives or false negatives. The results show that 69 % of the changes were classified as true positives, 26 % as false positives, and 5 % as false negatives. The true positives were also synchronized with changes in the hypnotic or analgesic rates made by the clinician. This work is highly relevant to clinical practice because the PHT-FM alerts the clinician to changes in the anesthetic state of the patient, allowing prompter action. The results encourage the inclusion of the proposed PHT-FM in a real-time decision support system for routine use in clinical practice. © 2012 Springer-Verlag
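A Page-Hinkley test with a forgetting factor can be sketched as follows. This is a generic textbook-style variant, not the paper's PHT-FM: the abstract does not specify how the forgetting mechanism is defined, so the multiplicative fading of the cumulative sum and the parameter names `delta`, `threshold` and `alpha` are assumptions.

```python
class PageHinkleyForgetting:
    """Detects an upward shift in the mean of a stream (illustrative sketch)."""

    def __init__(self, delta=0.005, threshold=5.0, alpha=0.99):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level for the test statistic
        self.alpha = alpha          # assumed forgetting factor in (0, 1]
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0              # faded cumulative deviation
        self.min_cum = 0.0          # smallest cumulative deviation seen so far

    def update(self, x):
        """Consume one sample and return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        # Older deviations are down-weighted by alpha at every step,
        # so recent samples dominate the statistic.
        self.cum = self.alpha * self.cum + (x - self.mean - self.delta)
        self.min_cum = min(self.min_cum, self.cum)
        return self.cum - self.min_cum > self.threshold


detector = PageHinkleyForgetting()
stream = [0.0] * 50 + [10.0] * 50  # synthetic signal with a jump at index 50
alarms = [i for i, x in enumerate(stream) if detector.update(x)]
```

Setting `alpha=1.0` removes the forgetting and gives the classical one-sided Page-Hinkley statistic. A practical detector would also reset its state after an alarm and handle downward shifts, both of which are omitted here for brevity.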
A Survey on Concept Drift Adaptation
Concept drift primarily refers to an online supervised learning scenario in which the relation between the input data and the target variable changes over time. Assuming a general knowledge of supervised learning, in this paper we characterize the adaptive learning process, categorize existing strategies for handling concept drift, discuss the most representative, distinct and popular techniques and algorithms, discuss the evaluation methodology of adaptive algorithms, and present a set of illustrative applications. This introduction to concept drift adaptation presents state-of-the-art techniques and a collection of benchmarks for researchers, industry analysts and practitioners. The survey aims to cover the different facets of concept drift in an integrated way, to reflect on the existing scattered state of the art.