34 research outputs found

    Dynamics of Information Diffusion and Social Sensing

    Statistical inference using social sensors is an area that has witnessed remarkable progress and is relevant in applications including localizing events for targeted advertising, marketing, localization of natural disasters and predicting sentiment of investors in financial markets. This chapter presents a tutorial description of four important aspects of sensing-based information diffusion in social networks from a communications/signal processing perspective. First, diffusion models for information exchange in large scale social networks, together with social sensing via social media networks such as Twitter, are considered. Second, Bayesian social learning models and risk averse social learning are considered with applications in finance and online reputation systems. Third, the principle of revealed preferences arising in micro-economics theory is used to parse datasets to determine if social sensors are utility maximizers and then determine their utility functions. Finally, the interaction of social sensors with YouTube channel owners is studied using time series analysis methods. All four topics are explained in the context of actual experimental datasets from health networks, social media and psychological experiments. Also, algorithms are given that exploit the above models to infer underlying events based on social sensing. The overview, insights, models and algorithms presented in this chapter stem from recent developments in network science, economics and signal processing. At a deeper level, this chapter considers mean field dynamics of networks, risk averse Bayesian social learning filtering and quickest change detection, data incest in decision making over a directed acyclic graph of social sensors, inverse optimization problems for utility function estimation (revealed preferences) and statistical modeling of interacting social sensors in YouTube social networks. Comment: arXiv admin note: text overlap with arXiv:1405.112

    Monitoring and Modelling of Social Networks

    In this thesis we contribute to the understanding of online social networks, temporal networks, and non-equilibrium dynamics. As the title of this work suggests, this thesis is split into two parts, monitoring and modelling social networks. In the first half we look at current methods for understanding the behaviour and influence of individual users within a social network, and assess their robustness and effectiveness. In particular, we look at the role that the temporal dimension plays on these methods and the various representations that temporal networks can take. We introduce a new temporal network representation which describes a temporal network in terms of node behaviour which we use to characterise individuals and collectives. The new representation is illustrated with examples from the online social network Twitter. We model two particular aspects of social networks in the second half of this thesis. The first model, a generalisation of the popular Voter model, considers the dynamics of two opposite opinions in a heterogeneous society which differ by the resolve of their opinion. The second model investigates how the presence of 'anti-bandwagon' agents can prevent the spread of ideas and innovations on a social network, particularly on networks with restrictive topologies. This contribution offers new ways to analyse temporal networks and online social media, and also provokes new and interesting questions for future research in the field.

    Advances in knowledge discovery and data mining Part II

    19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II

    Neural recommender models for sparse and skewed behavioral data

    Modern online platforms offer recommendations and personalized search and services to a large and diverse user base while still aiming to acquaint users with the broader community on the platform. Prior work backed by large volumes of user data has shown that user retention is reliant on catering to their specific eccentric tastes, in addition to providing them popular services or content on the platform. Long-tailed distributions are a fundamental characteristic of human activity, owing to the bursty nature of human attention. As a result, we often observe skew in data facets that involve human interaction. While there are superficial similarities to Zipf's law in textual data and other domains, the challenges with user data extend further. Individual words may have skewed frequencies in the corpus, but the long-tail words by themselves do not significantly impact downstream text-mining tasks. On the contrary, while sparse users (a majority on most online platforms) contribute little to the training data, they are equally crucial at inference time. Perhaps more so, since they are likely to churn. In this thesis, we study platforms and applications that elicit user participation in rich social settings incorporating user-generated content, user-user interaction, and other modalities of user participation and data generation. For instance, users on the Yelp review platform participate in a follower-followee network and also create and interact with review text (two modalities of user data). Similarly, community question-answer (CQA) platforms incorporate user interaction and collaboratively authored content over diverse domains and discussion threads. Since user participation is multimodal, we develop generalizable abstractions beyond any single data modality. 
Specifically, we aim to address the distributional mismatch that occurs with user data independent of dataset specifics. While a minority of the users generates most training samples, it is insufficient only to learn the preferences of this subset of users. As a result, the data's overall skew and individual users' sparsity are closely interlinked: sparse users with uncommon preferences are under-represented. Thus, we propose to treat these problems jointly with a skew-aware grouping mechanism that iteratively sharpens the identification of preference groups within the user population. As a result, we improve user characterization, content recommendation, and activity prediction (+6-22% AUC, +6-43% AUC, +12-25% RMSE over state-of-the-art baselines), primarily for users with sparse activity. The size of the item or content inventories compounds the skew problem. Recommendation models can achieve very high aggregate performance while recommending only a tiny proportion of the inventory (as little as 5%) to users. We propose a data-driven solution guided by the aggregate co-occurrence information across items in the dataset. We specifically note that different co-occurrences are not equally significant. For example, some co-occurring items are easily substituted while others are not. We develop a self-supervised learning framework where the aggregate co-occurrences guide the recommendation problem while providing room to learn these variations among the item associations. As a result, we improve coverage to ~100% (up from 5%) of the inventory and increase long-tail item recall up to 25%. We also note that the skew and sparsity problems repeat across data modalities. For instance, social interactions and review content both exhibit aggregate skew, although individual users who actively generate reviews may not participate socially and vice-versa.
It is necessary to differentially weight and merge different data sources for each user towards inference tasks in such cases. We show that the problem is inherently adversarial since the user participation modalities compete to describe a user accurately. We develop a framework to unify these representations while algorithmically tackling mode collapse, a well-known pitfall with adversarial models. A more challenging but important instantiation of sparsity is the few-shot setting or cross-domain setting. We may only have a single or a few interactions for users or items in the sparse domains or partitions. We show that contextualizing user-item interactions helps us infer behavioral invariants in the dense domain, allowing us to correlate sparse participants to their active counterparts (resulting in 3x faster training, ~19% recall gains in multi-domain settings). Finally, we consider the multi-task setting, where the platform incorporates multiple distinct recommendations and prediction tasks for each user. A single-user representation is insufficient for users who exhibit different preferences along each dimension. At the same time, it is counter-productive to handle correlated prediction or inference tasks in isolation. We develop a multi-faceted representation approach grounded in residual learning with heterogeneous knowledge graph representations, which provides an expressive data representation for specialized domains and applications with multimodal user data. We achieve knowledge sharing by unifying task-independent and task-specific representations of each entity with a unified knowledge graph framework. In each chapter, we also discuss and demonstrate how the proposed frameworks directly incorporate a wide range of gradient-optimizable recommendation and behavior models, maximizing their applicability and pertinence to user-centered inference tasks and platforms.

    Nonparametric Bayesian Topic Modelling with Auxiliary Data

    The intent of this dissertation in computer science is to study topic models for text analytics. The first objective of this dissertation is to incorporate auxiliary information present in text corpora to improve topic modelling for natural language processing (NLP) applications. The second objective of this dissertation is to extend existing topic models to employ state-of-the-art nonparametric Bayesian techniques for better modelling of text data. In particular, this dissertation focusses on: (1) incorporating hashtags, mentions, emoticons, and target-opinion dependency present in tweets, together with an external sentiment lexicon, to perform opinion mining or sentiment analysis on products and services; (2) leveraging abstracts, titles, authors, keywords, categorical labels, and the citation network to perform bibliographic analysis on research publications, using a supervised or semi-supervised topic model; and (3) employing the hierarchical Pitman-Yor process (HPYP) and the Gaussian process (GP) to jointly model text, hashtags, authors, and the follower network in tweets for corpora exploration and summarisation. In addition, we provide a framework for implementing arbitrary HPYP topic models to ease the development of our proposed topic models, made possible by modularising the Pitman-Yor processes. Through extensive experiments and qualitative assessment, we find that topic models fit the data better as we utilise more auxiliary information and employ the Bayesian nonparametric method.

    24th International Conference on Information Modelling and Knowledge Bases

    In the last three decades information modelling and knowledge bases have become essentially important subjects not only in academic communities related to information systems and computer science but also in the business area where information technology is applied. The series of European – Japanese Conference on Information Modelling and Knowledge Bases (EJC) originally started as a co-operation initiative between Japan and Finland in 1982. The practical operations were then organised by professor Ohsuga in Japan and professors Hannu Kangassalo and Hannu Jaakkola in Finland (Nordic countries). Geographical scope has expanded to cover Europe and also other countries. Workshop characteristic - discussion, enough time for presentations and limited number of participants (50) / papers (30) - is typical for the conference. Suggested topics include, but are not limited to: 1. Conceptual modelling: Modelling and specification languages; Domain-specific conceptual modelling; Concepts, concept theories and ontologies; Conceptual modelling of large and heterogeneous systems; Conceptual modelling of spatial, temporal and biological data; Methods for developing, validating and communicating conceptual models. 2. Knowledge and information modelling and discovery: Knowledge discovery, knowledge representation and knowledge management; Advanced data mining and analysis methods; Conceptions of knowledge and information; Modelling information requirements; Intelligent information systems; Information recognition and information modelling. 3. Linguistic modelling: Models of HCI; Information delivery to users; Intelligent informal querying; Linguistic foundation of information and knowledge; Fuzzy linguistic models; Philosophical and linguistic foundations of conceptual models. 4. 
Cross-cultural communication and social computing: Cross-cultural support systems; Integration, evolution and migration of systems; Collaborative societies; Multicultural web-based software systems; Intercultural collaboration and support systems; Social computing, behavioral modeling and prediction. 5. Environmental modelling and engineering: Environmental information systems (architecture); Spatial, temporal and observational information systems; Large-scale environmental systems; Collaborative knowledge base systems; Agent concepts and conceptualisation; Hazard prediction, prevention and steering systems. 6. Multimedia data modelling and systems: Modelling multimedia information and knowledge; Content-based multimedia data management; Content-based multimedia retrieval; Privacy and context enhancing technologies; Semantics and pragmatics of multimedia data; Metadata for multimedia information systems. Overall we received 56 submissions. After careful evaluation, 16 papers have been selected as long papers, 17 papers as short papers, 5 papers as position papers, and 3 papers for presentation of perspective challenges. We thank all colleagues for their support of this issue of the EJC conference, especially the program committee, the organising committee, and the programme coordination team. The long and the short papers presented in the conference are revised after the conference and published in the Series of “Frontiers in Artificial Intelligence” by IOS Press (Amsterdam). The books “Information Modelling and Knowledge Bases” are edited by the Editing Committee of the conference. We believe that the conference will be productive and fruitful in the advance of research and application of information modelling and knowledge bases. Bernhard Thalheim Hannu Jaakkola Yasushi Kiyok

    How You Like Me Now? The Influence of Athlete Behavior on Fan Group Dynamics and Sports Consumption

    Within sports, membership in a fan base often constitutes an attachment to a team and its various personnel. As part of a presumed ingroup, sports fans will go about evaluating their favorite teams and players based on several factors, such as team or athlete performance and off‑the‑field behaviors by such athletes. Although a vast set of literature within sport management has reported that fans exhibit partiality towards their favorite teams, research in social psychology and group dynamics has presented evidence to dispute this occurrence. This body of work has contended that people in a group will operate using subjective group dynamics (SGD), wherein norms and values are actively considered in group appraisal. Complementary research has offered the manifestation of a black sheep effect (BSE), or ingroup extremity, particularly when members deviate from norms or standards of the group. In a similar vein, this dissertation challenges the prevalent notion of fans’ enduring support for their favorite teams and examines numerous correlates of such behavior. Through five main studies, this dissertation investigates the impact of athlete behavior, group membership, player status, rivalry, and regret on evaluative judgments, identity threat, purchase decisions, product choices, and social media behaviors. Study 1 gauged the role of ingroup extremity when a team’s expectations, or norms of performance by an athlete, are violated, providing evidence to support ingroup derogation among fans. Expanding upon these results, Study 2 offered an assessment of the BSE in determining how fans go about supporting and derogating an ingroup or outgroup athlete based on performance, while furthering the application of these concepts to purchase decisions and social media intentions. Our second experiment offers partial support of the BSE, wherein fans exhibit a proclivity to derogate deviant ingroup and outgroup athletes to the same extent. 
Using a multi‑method approach integrating both quantitative and qualitative methods, our third experiment tested how rivalry and membership (i.e., player) saliency operate to amplify specific aspects of fan behavior, social media intentions, and product choices. Study 3 reveals ingroup and performance biases among fans as well as the function of team identification as a guide for team-licensed merchandise selections. Study 4 examined how evaluations of deviant performance- and moral‑related behavior by athletes can be affected by various moral reasoning strategies utilized by fans. Our fourth experiment demonstrates similar biases as established in Study 3 and also illustrates the amplified use of moral rationalization over other moral reasoning strategies. Using the findings from our first four studies as a foundation, we introduce a novel concept to the field (i.e., black sheep regret [BSR]) and complete this dissertation with a field study (Study 5A) and an experimental investigation (Study 5B). Although Study 5A did not support BSR in a naturalistic context (i.e., on social media), Study 5B provides data to verify its occurrence in fans. Ultimately, Study 5B produces rationale for the inconclusive results within social media settings, explained by a potential effect of black sheep perpetuance (BSP). Taken together, this dissertation discusses its theoretical contributions and offers pragmatic implications and future directions for sport managers and practitioners within the sport industry. Ultimately, the current composition highlights the importance of multidisciplinary approaches in exploring various components of specific group behavior in fans, as well as in the larger milieu of human behavior itself. PhD, Kinesiology, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/136938/1/seanprad_1.pd