168 research outputs found

    Etude de la Maximisation de l'Influence dans les Réseaux Sociaux

    No full text
    National audience. Influence maximization is an NP-hard problem that depends on how information diffuses in social networks. The greedy hill climbing algorithm has been proved to give a good approximation when the influence function to be optimized is submodular, which is the case for standard diffusion models. We present a diffusion model, not equivalent to the standard models, for which the influence function is not submodular. Using toy graphs and a real social network, we then study several influence maximization algorithms on this model and on the standard Independent Cascade (IC) model: some basic heuristics, the greedy hill climbing method, a generalization of the greedy method, and an optimization method for submodular functions. We show that even when the influence function is not submodular, the greedy algorithm obtains good results while scaling efficiently.
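    The greedy hill climbing method mentioned in this abstract can be illustrated with a minimal sketch under the standard IC model. This is not the authors' implementation: the function names, the uniform activation probability `p`, the adjacency-dict graph format, and the Monte Carlo spread estimate are all assumptions made for illustration.

```python
import random

def simulate_ic(graph, seeds, p, rng):
    """Run one Independent Cascade simulation; return the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # Each newly active node gets a single chance to activate each inactive neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)

def estimate_spread(graph, seeds, p, runs, rng):
    """Monte Carlo estimate of the expected number of activated nodes."""
    return sum(simulate_ic(graph, seeds, p, rng) for _ in range(runs)) / runs

def greedy_seeds(graph, k, p=0.1, runs=200, seed=0):
    """Pick k seeds, each time adding the node with the largest estimated marginal gain."""
    rng = random.Random(seed)
    nodes = set(graph) | {v for nbrs in graph.values() for v in nbrs}
    chosen = []
    for _ in range(min(k, len(nodes))):
        base = estimate_spread(graph, chosen, p, runs, rng) if chosen else 0.0
        best_node, best_gain = None, float("-inf")
        for v in sorted(nodes - set(chosen)):
            gain = estimate_spread(graph, chosen + [v], p, runs, rng) - base
            if gain > best_gain:
                best_node, best_gain = v, gain
        chosen.append(best_node)
    return chosen

# Hypothetical toy graph: a hub (0) with five followers, plus a separate pair (6 -> 7).
toy_graph = {0: [1, 2, 3, 4, 5], 6: [7]}
```

    Calling `greedy_seeds(toy_graph, k=2, p=1.0, runs=5)` on this toy graph picks the hub first and then node 6, since with `p=1.0` every simulation is deterministic. The greedy loop itself does not rely on submodularity; only the approximation guarantee does.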

    Un modèle de RI basé sur des critères d'obligation et de certitude

    No full text
    International audience. There are many information retrieval models, each aiming to best meet users' expectations. The model we propose relies on a precise formulation of the query that reflects the user's need: each query term is augmented with two criteria, one expressing whether or not the term must appear in the documents, and the other expressing the user's certainty about the term used. Experiments allowed us to verify that such a formulation improves precision.

    Learning Multiple Temporal Matching for Time Series Classification

    No full text
    International audience. In real applications, time series generally have a complex structure and exhibit different global behaviors within classes. To discriminate such challenging time series, we propose a multiple temporal matching approach that reveals the features commonly shared within classes and the most discriminative ones across classes. For this, we rely on a new framework based on the variance/covariance criterion to strengthen or weaken matched observations according to the induced variability within and between classes. Experiments performed on real and synthetic datasets demonstrate the ability of the multiple temporal matching approach to capture fine-grained distinctions between time series.

    Uncertain Trees: Dealing with Uncertain Inputs in Regression Trees

    Full text link
    Tree-based ensemble methods, such as Random Forests and Gradient Boosted Trees, have been successfully used for regression in many applications and research studies. Furthermore, these methods have been extended to deal with uncertainty in the output variable, for example by using a quantile loss in Random Forests (Meinshausen, 2006). To the best of our knowledge, no extension has yet been provided for dealing with uncertainties in the input variables, even though such uncertainties are common in practical situations. We propose such an extension here by showing how standard regression trees optimizing a quadratic loss can be adapted and learned while taking the uncertainties in the inputs into account. By doing so, one no longer assumes that an observation lies in a single region of the regression tree, but rather that it belongs to each region with a certain probability. Experiments conducted on several data sets illustrate the good behavior of the proposed extension. Comment: 9 pages
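    The idea that an observation belongs to each region with a certain probability can be sketched as follows. This is only an illustration, not the paper's method: it assumes independent Gaussian noise on each input feature, axis-aligned leaf regions given as hypothetical `(lower, upper, value)` triples, and a prediction formed as the probability-weighted average of leaf values.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def region_probability(x, sigma, lower, upper):
    """Probability that the true input lies in the box [lower, upper],
    assuming independent Gaussian noise N(x_i, sigma_i^2) on each feature."""
    prob = 1.0
    for xi, si, lo, hi in zip(x, sigma, lower, upper):
        prob *= normal_cdf((hi - xi) / si) - normal_cdf((lo - xi) / si)
    return prob

def soft_predict(x, sigma, leaves):
    """Weight each leaf value by the probability that the input falls in that leaf."""
    return sum(region_probability(x, sigma, lo, hi) * value
               for lo, hi, value in leaves)

# Hypothetical one-feature tree with a single split at 0.
INF = float("inf")
toy_leaves = [
    ([-INF], [0.0], 0.0),   # left leaf predicts 0
    ([0.0], [INF], 10.0),   # right leaf predicts 10
]
```

    An input observed exactly at the split with unit noise, `soft_predict([0.0], [1.0], toy_leaves)`, is assigned to each leaf with probability one half, so the prediction falls midway between the two leaf values instead of jumping from one to the other.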

    Terminology-based Text Embedding for Computing Document Similarities on Technical Content

    Full text link
    In this paper we propose a new, hybrid document embedding approach to address the problem of computing document similarities with respect to technical content. To do so, we employ state-of-the-art graph techniques to first extract the keyphrases (composite keywords) of documents and then use them to score the sentences. Using the ranked sentences, we propose two approaches to embed documents and compare their performance against two baselines. With domain expert annotations, we illustrate that the proposed methods can find more relevant documents and outperform the baselines by up to 27% in terms of NDCG.
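    For readers unfamiliar with the metric, NDCG (normalized discounted cumulative gain) can be computed as in the sketch below. The abstract does not say which variant the authors use; this version assumes the common linear-gain form with a log2 rank discount, and the function names are illustrative.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: each relevance is divided by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking.

    `relevances` lists the graded relevance of each retrieved document in ranked order;
    `k` optionally truncates the evaluation to the top k positions.
    """
    ranked = list(relevances)[:k]
    ideal = sorted(relevances, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0
```

    A ranking that already lists documents in decreasing order of relevance, such as `ndcg([3, 2, 1])`, scores 1.0, while placing the only relevant document second, as in `ndcg([0, 1])`, lowers the score.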

    Predicting Information Diffusion in Social Networks using Content and User's Profiles

    No full text
    International audience. Predicting the diffusion of information on social networks is a key problem for applications such as Opinion Leader Detection, Buzz Detection, and Viral Marketing. Many recent diffusion models are direct extensions of the Cascade and Threshold models, initially proposed for epidemiology and social studies. In such models, the diffusion process is based on the dynamics of interactions between neighboring nodes in the network (the social pressure) and largely ignores important dimensions such as the content of the piece of information diffused. We propose here a new family of probabilistic models that aims at predicting how a piece of content diffuses in a network by making use of additional dimensions: the content of the piece of information diffused, the user's profile, and the user's willingness to diffuse. These models are illustrated and compared with other approaches on two blog datasets. The experimental results obtained on these datasets show that taking the content of the diffused information into account is important for accurately modeling the diffusion process.