
    Identification of Online Users' Social Status via Mining User-Generated Data

    With the burst of available online user-generated data, identifying online users’ social status by mining that data can play a significant role in commercial applications, research and policy-making across many domains. Social status, an abstract concept, refers to the position of a person in relation to others within a society. Its concrete definition depends on the indicator used to measure it. For example, opinion leadership measures individual social status in terms of influence and expertise in an online society, while socioeconomic status characterizes personal real-life social status based on social and economic factors. Because traditional survey methods are time-consuming, expensive and sometimes difficult to carry out, some efforts have been made to identify a specific social status of users from specific user-generated data using classic machine learning methods. However, each combination of social status and user-generated data brings its own challenges, and the classic machine learning methods in existing work fail to address them, which leads to low identification accuracy. Given the importance of improving identification accuracy, this thesis studies three specific cases of identifying online and offline social status. For each case, this thesis proposes a novel and effective identification method that addresses the specific challenges in order to improve accuracy. The first work aims at identifying users’ online social status in terms of topic-sensitive influence and knowledge authority in social community question answering sites, namely identifying topical opinion leaders who are both influential and expert. 
A social community question answering (SCQA) site, an innovative community question answering platform, not only offers traditional question answering (QA) services but also integrates an online social network where users can follow each other. Identifying topical opinion leaders in SCQA has become an important research area due to the significant role these leaders play. However, most previous related work either focuses on using knowledge expertise to find experts for improving the quality of answers, or aims at measuring user influence to identify influential users. In order to identify the true topical opinion leaders, we propose a topical opinion leader identification framework called QALeaderRank, which takes account of both topic-sensitive influence and topical knowledge expertise. In the proposed framework, to measure the topic-sensitive influence of each user, we design a novel influence measure algorithm that exploits both the social and QA features of SCQA, taking into account social network structure, topical similarity and knowledge authority. In addition, we propose three topic-relevant metrics to infer the topical expertise of each user. Extensive experiments along with an online user study show that the proposed QALeaderRank achieves significant improvement compared with state-of-the-art methods. Furthermore, we analyze how users' topic interests change over time and examine the predictability of user topic interest through experiments. The second work focuses on predicting individual socioeconomic status from mobile phone data. Socioeconomic status (SES) is an important social and economic attribute of wide concern. Assessing individual SES can assist related organizations in making a variety of policy decisions. The traditional approach suffers from the extremely high cost of collecting large-scale SES-related survey data. 
With the ubiquity of smartphones, mobile phone data has become a novel, low-cost data source for predicting individual SES. However, predicting individual SES from mobile phone data also poses some new challenges, including sparse individual records, scarce explicit relationships and limited labeled samples, which were not addressed in prior work restricted to regional or household-oriented SES prediction. To address these issues, we propose a semi-supervised Hypergraph based Factor Graph Model (HyperFGM) for individual SES prediction. HyperFGM is able to efficiently capture the associations between SES and individual mobile phone records to handle the sparsity of individual records. To compensate for the scarce explicit relationships, HyperFGM models implicit high-order relationships among users on the hypergraph structure. In addition, HyperFGM exploits both the limited labeled data and the unlabeled data in a semi-supervised way. Experimental results on a set of anonymized real mobile phone data show that HyperFGM greatly outperforms the baseline methods on individual SES prediction. The third work predicts social media users’ socioeconomic status based on their social media content, which is useful for related organizations and companies in a range of applications, such as economic and social policy-making. Previous work leverages manually defined textual features and platform-based user-level attributes from social media content and feeds them into a machine learning based classifier for SES prediction. However, it ignores some important information in social media content, including the order and the hierarchical structure of social media text as well as the relationships among user-level attributes. 
To this end, we propose a novel coupled social media content representation model for individual SES prediction, which not only utilizes a hierarchical neural network to incorporate the order and the hierarchical structure of social media text but also employs a coupled attribute representation method to take into account intra-coupled and inter-coupled interaction relationships among user-level attributes. The experimental results show that the proposed model significantly outperforms other state-of-the-art models on a real dataset, which validates the efficiency and robustness of the proposed model.

    Multidimensional opinion mining from social data

    The popularity and importance of social media are on the increase as people use it for various types of social interaction across multiple channels. This thesis focuses on the evolving research area of Social Opinion Mining, tasked with the identification of multiple opinion dimensions, such as subjectivity, sentiment polarity, emotion, affect, sarcasm, and irony, from user-generated content represented across multiple social media platforms and in various media formats, such as textual, visual, and audio. Mining people’s social opinions from social sources, such as social media platforms and newswire comment sections, is a valuable business asset that can be utilised in many ways and in multiple domains, such as Politics, Finance, and Government. The main objective of this research is to investigate how a multidimensional approach to Social Opinion Mining affects fine-grained opinion search and summarisation at an aspect-based level, and whether such a multidimensional approach outperforms single-dimension approaches in an extrinsic human evaluation conducted in a real-world context: the Malta Government Budget, where five social opinion dimensions are taken into consideration, namely subjectivity, sentiment polarity, emotion, irony, and sarcasm. This human evaluation determines whether the multidimensional opinion summarisation results provide added value to potential end-users, such as policy-makers and decision-takers, thereby providing a nuanced voice to the general public on their social opinions on topics of national importance. The results obtained indicate that a more fine-grained aspect-based opinion summary based on the combined dimensions of subjectivity, sentiment polarity, emotion, and sarcasm or irony is more informative and more useful than one based on sentiment polarity only. 
This research contributes towards the advancement of intelligent search and information retrieval from social data and impacts entities utilising Social Opinion Mining results towards effective policy formulation, policy-making, decision-making, and decision-taking at a strategic level.

    “Comments Matter and The More The Better!”: Improving Rumor Detection with User Comments

    While many online platforms bring great benefits to their users by allowing user-generated content, they have also facilitated the generation and spreading of harmful content such as rumors. Researchers have proposed different rumor detection methods based on features extracted from the original post and/or associated comments, but how comments affect the performance of such methods remains poorly understood. In this paper, we first propose a new BERT-based rumor detection method that can outperform other state-of-the-art methods, and then use it to study the role of comments in rumor detection. Our proposed method concatenates the original post and associated comments to form a single long text, which is then segmented into shorter chunks more suitable for BERT-based vectorization. Features extracted from all chunks are fed into a classifier based on an LSTM network or a transformer layer for the classification task. The experimental results on the PHEME and Ma-Weibo datasets demonstrated the superior performance of our method. We conducted additional experiments on different settings of our proposed method to study different aspects of the role comments play in the rumor detection task. These additional experiments led to some very interesting findings, including the surprising result that fixed-length segmentation is better than natural segmentation, and the observation that including more comments can help improve the rumor detector's performance. Some of these findings have profound operational implications for online platforms, e.g., commentators can contribute positively to rumor detection, so online platforms can leverage crowd intelligence to detect online rumors more effectively without applying over-strict content consensus policies.
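The segmentation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `concatenate_thread` and `fixed_length_chunks`, the whitespace tokenization (standing in for a BERT tokenizer), and the default chunk size of 128 are all assumptions made for illustration. It shows only how a post and its comments might be joined into one long text and cut into fixed-length chunks before each chunk is vectorized.

```python
from typing import List


def concatenate_thread(post: str, comments: List[str]) -> str:
    """Join the original post and its comments into one long text.

    Assumption: a single space is used as the separator; the paper
    does not state which separator is used.
    """
    return " ".join([post, *comments])


def fixed_length_chunks(text: str, chunk_size: int = 128) -> List[List[str]]:
    """Split text into consecutive chunks of at most chunk_size tokens.

    Assumption: tokens are whitespace-separated words. A real pipeline
    would use the subword tokenizer of the chosen BERT model instead.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    tokens = text.split()
    # Step through the token list in strides of chunk_size; the last
    # chunk may be shorter than the others.
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


if __name__ == "__main__":
    thread = concatenate_thread("a b c", ["d e", "f"])
    for chunk in fixed_length_chunks(thread, chunk_size=4):
        print(chunk)
```

Each chunk would then be encoded separately, and the resulting sequence of chunk features passed to the LSTM or transformer classifier mentioned in the abstract. Natural segmentation, by contrast, would presumably cut at post and comment boundaries rather than at a fixed token count.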

    Cybernationalism and cyberactivism in China

    Nationalism in the Internet age is increasingly becoming an essential factor influencing agenda-setting within Chinese society, as well as China’s relations with foreign countries, especially the West. For China, a better understanding of the universal theoretical structure and behavioral patterns of nationalism would facilitate the overall social articulation of this trend and enhance its positive role in social agenda-setting. On the other hand, a study of Chinese cybernationalism based on a Chinese perspective in Western academia is an attempt at transculturation. 
From the viewpoint of current international relations and geopolitics, which are rather urgent, such an attempt would help to enhance China’s compatibility with the current Western-dominated world order, reduce misinformation between China and other countries, and lay the cultural and ideological groundwork for various other international collaborations. Considering the current state of Chinese nationalism research and the mass participatory nature of cybernationalism, this dissertation examines cybernationalism in three parts. The first is a study of the historical origins of Chinese cybernationalism. This section includes both an exploration of the social consensus in ancient China and a survey of the influence of nationalism in modern Chinese history. The study of historical origins not only shows the chronological development and evolution of both proto-nationalism and nationalism in China, but also reveals a decisive impetus for the current claims and behaviors of cybernationalism. The second part deals with the formation and rise of cybernationalism in the 21st century. The important background for the move from nationalism to cybernationalism is the informatization of Chinese society. After completing the study of the basic situation of Chinese Internet society, especially the study of social media as a public space, we can link the Internet with nationalism and examine the new development of nationalism in the era of mass participation. The ultimate goal is to connect proto-nationalism, nationalism and cybernationalism, and to further construct an understanding of cybernationalism that is consistent with both the universal principles of nationalism and the Chinese context. Finally, we validate the results derived from the previous study against social reality, i.e., by studying the cyberactivism practices of cybernationalism to judge their general sufficiency as well as validity. 
We will conduct several natural language processing case studies based on big data to reproduce the behavioral logic and actual impact of cyberactivism as closely as possible to the reality of the Internet, while avoiding the one-sided argumentation and under-representation flaws of traditional case studies.

    The Web of False Information: Rumors, Fake News, Hoaxes, Clickbait, and Various Other Shenanigans

    A new era of Information Warfare has arrived. Various actors, including state-sponsored ones, are weaponizing information on Online Social Networks to run false information campaigns with targeted manipulation of public opinion on specific topics. These false information campaigns can have dire consequences for the public: altering their opinions and actions, especially with respect to critical world events like major elections. Evidently, the problem of false information on the Web is a crucial one, and needs increased public awareness, as well as immediate attention from law enforcement agencies, public institutions, and in particular, the research community. In this paper, we take a step in this direction by providing a typology of the Web's false information ecosystem, comprising various types of false information, actors, and their motives. We report a comprehensive overview of existing research on the false information ecosystem by identifying several lines of work: 1) how the public perceives false information; 2) understanding the propagation of false information; 3) detecting and containing false information on the Web; and 4) false information on the political stage. In this work, we pay particular attention to political false information as: 1) it can have dire consequences for the community (e.g., when election results are altered) and 2) previous work shows that this type of false information propagates faster and further than other types of false information. Finally, for each of these lines of work, we report several future research directions that can help us better understand and mitigate the emerging problem of false information dissemination on the Web.