37 research outputs found

    TSCMF: Temporal and social collective matrix factorization model for recommender systems

    In real-world recommender systems, user preferences are dynamic and typically change over time. Capturing the temporal dynamics of user preferences is essential to designing an efficient personalized recommender system and has recently attracted significant attention. In this paper, we consider that user preferences change individually over time. Moreover, based on the intuition that social influence can affect users’ preferences in a recommender system, we propose a Temporal and Social Collective Matrix Factorization model, called TSCMF, for recommendation. We jointly factorize the users’ rating information and social trust information in a collective matrix factorization framework by introducing a joint objective function. We model user dynamics in this framework by learning, for each individual user, a transition matrix of user preferences between two successive time periods. We present an efficient optimization algorithm based on stochastic gradient descent for solving the objective function. Experiments on a real-world dataset show that the proposed model outperforms competitive methods. Moreover, the complexity analysis demonstrates that the proposed model can be scaled up to large datasets.
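    The joint factorization described in this abstract can be illustrated with a minimal sketch. It is not the TSCMF implementation: it assumes a squared-error objective with L2 regularization, shares one user-factor matrix between the rating and trust matrices, and leaves out the per-user transition matrix entirely. The names train_cmf and joint_loss, the trust weight alpha, and all hyperparameter values are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def joint_loss(ratings, trust, U, V, Z, alpha, lam):
    # Assumed objective: rating error + alpha * trust error + L2 penalty.
    err = sum((r - dot(U[i], V[j])) ** 2 for i, j, r in ratings)
    err += alpha * sum((t - dot(U[i], Z[k])) ** 2 for i, k, t in trust)
    reg = sum(x * x for M in (U, V, Z) for row in M for x in row)
    return err + lam * reg

def train_cmf(ratings, trust, n_users, n_items, rank=2, alpha=0.5,
              lam=0.01, lr=0.05, epochs=200, seed=0):
    """ratings: (user, item, value) triples; trust: (truster, trustee, value)."""
    rng = random.Random(seed)

    def init(n):
        return [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n)]

    # U is shared by both factorizations; V holds item factors, Z trustee factors.
    U, V, Z = init(n_users), init(n_items), init(n_users)
    samples = [("r", i, j, r) for i, j, r in ratings]
    samples += [("t", i, k, t) for i, k, t in trust]
    for _ in range(epochs):
        rng.shuffle(samples)
        for kind, i, j, y in samples:
            other, w = (V, 1.0) if kind == "r" else (Z, alpha)
            e = y - dot(U[i], other[j])
            for f in range(rank):
                u, o = U[i][f], other[j][f]
                # Stochastic gradient step (constant factors folded into lr).
                U[i][f] += lr * (w * e * o - lam * u)
                other[j][f] += lr * (w * e * u - lam * o)
    return U, V, Z
```

    Because the same row U[i] is updated by both rating and trust samples, the trust information shapes the user factors that are later used to predict ratings, which is the point of factorizing the two matrices collectively.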


    Social media such as social networking sites, blogs, micro-blogs, and wikis are increasingly and widely used in our daily lives. In the information systems (IS) discipline, social media have become a hot research area and have drawn the attention of many scholars. This paper systematically reviews social media studies published from 2009 to 2013 in the top 20 journals listed by the Association for Information Systems (AIS). The publication time, journal preferences, research objects, and research topics are discussed. In general, current social media studies fall into four areas: user, management, technology, and information. Each area has distinct focuses and topics. Based on a thorough analysis of the research topics, the authors formulate projections and recommendations for future social media studies.

    Predicting Consumers’ Brand Sentiment Using Text Analysis on Reddit

    With the emergence of data privacy regulations around the world (e.g. GDPR, CCPA), practitioners of Internet marketing, the largest digital marketing channel, face a trade-off between user data protection and advertisement targeting accuracy because of their current reliance on PII-related social media analytics. To address this challenge, this research proposes a predictive model for consumers’ brand sentiment based entirely on textual data from Reddit, i.e. one that is fully compliant with current data privacy regulations. The author uses natural language processing techniques to process all post and comment data from the r/gadgets subreddit community in 2018, extracting frequently discussed brands and products through named entity recognition and generating brand sentiment labels for active users in r/gadgets through sentiment analysis. The research then uses four supervised learning classifiers to predict brand sentiment for four brand clusters (Apple, Samsung, Microsoft and Google) based on the self-identified characteristics of Reddit users. Across all four brand clusters, the proposed predictive model achieved an ROC AUC score above 0.7 (above 0.8 for three of the four). The research thus shows the predictive power of self-identified user characteristics for brand sentiment and offers digital marketing practitioners a consumer targeting model that does not require PII.

    Modeling user preference dynamics with coupled tensor factorization for social media recommendation

    An essential problem in real-world recommender systems is that user preferences are not static and users are likely to change their preferences over time. Recent studies have shown that modelling and capturing the dynamics of user preferences leads to significant improvements in recommendation accuracy and, consequently, user satisfaction. In this paper, we develop a framework to capture user preference dynamics in a personalized manner, based on the fact that changes in user preferences can vary individually. We also consider the plausible assumption that older user activities should have less influence on a user’s current preferences. We introduce an individual time decay factor for each user, set according to the rate of that user’s preference dynamics, to weigh past user preferences and decrease their importance gradually. In addition to the weighted past user preferences, we exploit users’ demographics and the similarities extracted among users over time to enhance the prior knowledge about user preference dynamics in a coupled tensor factorization technique that provides top-K recommendations. The experimental results on two real social media datasets (Last.fm and MovieLens) indicate that our proposed model is better and more robust than other competitive methods in terms of recommendation accuracy and is more capable of coping with problems such as cold-start and data sparsity.
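    The per-user time decay idea in this abstract can be sketched as follows. The abstract does not state the functional form of the decay, so the exponential weight used here is an assumption, and decay_weight, weighted_profile, and the rate parameter are hypothetical names rather than the paper's notation.

```python
import math

def decay_weight(age, rate):
    # Assumed exponential decay: older activity (larger age) gets a smaller weight.
    return math.exp(-rate * age)

def weighted_profile(events, now, rate):
    """Aggregate one user's past activity into decayed item scores.

    events: (item, period, score) triples for a single user
    now:    index of the current time period
    rate:   that user's individual decay rate (larger means faster forgetting)
    """
    profile = {}
    for item, period, score in events:
        w = decay_weight(now - period, rate)
        profile[item] = profile.get(item, 0.0) + w * score
    return profile
```

    With rate set to math.log(2), an interaction loses half of its weight for each elapsed period, so a user whose tastes shift quickly would be given a larger rate than a user with stable tastes.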

    Sentiment Analysis and Opinion Mining within Social Networks using Konstanz Information Miner

    Evaluations, opinions, and sentiments have become increasingly visible with the rapid growth of interest in e-commerce, which is itself a significant source of expressed opinions and of material for sentiment analysis. This study gives a general introduction to sentiment analysis and discusses in detail its steps, applications, research challenges, and techniques. With these details given, it is hoped that researchers will engage in opinion mining and sentiment analysis research and make further progress on these issues. The research is based on data input from web services and social networks, including an application that performs such actions. The main aim of the study is to statistically test and evaluate a major social network website, in this case Twitter, because it is a rich data source and is easy to work with using social network tools. First, a good understanding of sentiment analysis and opinion mining research, based on recent trends in the field, is provided. Secondly, various aspects of sentiment analysis are explained. Thirdly, the steps of sentiment analysis are introduced. Fourthly, research challenges in sentiment analysis are discussed. Finally, various techniques used for sentiment analysis are explained, and the Konstanz Information Miner (KNIME), which can be used as a sentiment analysis tool, is introduced. For future work, recent machine learning techniques, including big data platforms, may be proposed as efficient solutions for opinion mining and sentiment analysis.

    A Unified Statistical Framework for Evaluating Predictive Methods

    Predictive analytics is an important part of the business intelligence and decision support systems literature and is likely to grow in importance with the emergence of big data as a discipline. Despite their importance, predictive methods often do not have their accuracy assessed with statistical hypothesis tests. Furthermore, there is no commonly agreed-upon standard as to which questions should be examined when evaluating predictive methods. We fill this gap by defining three questions that involve the overall and comparative predictive accuracy of a new method. We then present a unified statistical framework for evaluating predictive methods that can be used to address all three of these questions. The framework is particularly versatile and can be applied to most problems and datasets. In addition to these practical advantages over the hypothesis tests used in previous literature, the framework has the theoretical advantage that it is not necessary to assume a normal distribution.

    Probabilistic Personalized Recommendation Models For Heterogeneous Social Data

    Content recommendation has risen to a new dimension with the advent of platforms like Twitter, Facebook, FriendFeed, Dailybooth, and Instagram. Although this surge of data has provided us with a goldmine of real-world information, the problem of information overload has become a major barrier to developing predictive models. Therefore, the objective of this thesis is to propose various recommendation, prediction, and information retrieval models that are capable of leveraging such vast heterogeneous content. More specifically, this thesis focuses on proposing models based on probabilistic generative frameworks for the following tasks: (a) recommending backers and projects in the Kickstarter crowdfunding domain and (b) point-of-interest recommendation in Foursquare. Through a comprehensive set of experiments over a variety of datasets, we show that our models are capable of providing practically useful results for recommendation and information retrieval tasks.

    Exploring the topical structure of short text through probability models : from tasks to fundamentals

    Recent technological advances have radically changed the way we communicate. Today’s communication has become ubiquitous, and it has fostered the need for information that is easier to create, spread, and consume. As a consequence, we have experienced the shortening of text messages in media ranging from electronic mail and instant messaging to microblogging. Moreover, the ubiquity and fast-paced nature of these media have promoted their use for previously unthinkable tasks. For instance, reporting real-world events was classically carried out by news reporters, but nowadays the most interesting events are first disclosed on social networks like Twitter by eyewitnesses through short text messages. As a result, the exploitation of the thematic content in short text has captured the interest of both research and industry. Topic models are a type of probability model that has traditionally been used to explore this thematic content, a.k.a. topics, in regular text. The most popular topic models fall into the sub-class of LVMs (Latent Variable Models), which include several latent variables at the corpus, document, and word levels to summarise the topics at each level. However, classical LVM-based topic models struggle to learn semantically meaningful topics in short text because the lack of co-occurring words within a document hampers the estimation of the local latent variables at the document level. To overcome this limitation, pooling and hierarchical Bayesian strategies that leverage contextual information have been essential to improving the quality of topics in short text. In this thesis, we study the problem of learning semantically meaningful and predictive representations of text in two distinct phases. In the first phase, Part I, we investigate the use of LVM-based topic models for the specific task of event detection in Twitter. In this situation, the use of contextual information to pool tweets together comes naturally. Thus, we first extend an existing clustering algorithm for event detection to use the topics learned from pooled tweets. Then, we propose a probability model that integrates topic modelling and clustering to enable the flow of information between both components. In the second phase, Part II and Part III, we challenge the use of local latent variables in LVMs, especially when the context of short messages is not available. First, we study how to evaluate the generalization capabilities of LVMs like PFA (Poisson Factor Analysis) through the likelihood; since computing it is intractable, we propose unbiased estimation methods to approximate it. With the most accurate method, we compare the generalization of chordal models without latent variables to that of PFA topic models in short and regular text collections. In summary, we demonstrate that by integrating clustering and topic modelling, the performance of event detection techniques in Twitter is improved due to the interaction between both components. Moreover, we develop several unbiased likelihood estimation methods for assessing the generalization of PFA, and we empirically validate their accuracy in different document collections. Finally, we show that we can learn chordal models without latent variables in text through Chordalysis, and that they can be a competitive alternative to classical topic models, especially in short text.

    Mobile App Recommendation


    Social informatics

    5th International Conference, SocInfo 2013, Kyoto, Japan, November 25-27, 2013, Proceedings