229 research outputs found

    The Shapes of Cultures: A Case Study of Social Network Sites/Services Design in the U.S. and China

    With the growing popularity of social network sites/services (SNSs) throughout the world, the global dominance of SNSs designed in Western industrialized countries, especially in the United States, seems to have become an inevitable trend. As internationalization has become a common practice in designing SNSs in the United States, is localization still a viable practice? Does culture still matter in designing SNSs? This dissertation aims to answer these questions by comparing the user interface (UI) designs of a U.S.-based SNS, Twitter, and a China-based SNS, Sina Weibo, both of which have assumed the identity of a “microblogging” service, a subcategory of SNSs. This study employs the theoretical lenses of the theory of technical identity, user-centered website cultural usability studies, and communication and media studies. By comparing the UI designs, or the “form,” of the two microblogging sites/services, I illustrate how the social functions of a technological object, as embedded and expressed in its interface designs, are preserved or changed when an object that has developed a relatively stable identity (as a microblogging site/service) in one culture is transferred between the “home” culture and another. The analysis in this study focuses on design elements relevant to users as members of networks, members of audiences, and publishers/broadcasters. The results suggest that the designs carry disparate biases towards modes of communication and social affordances, which indicates a shift in the identity of the microblogging service/site across cultures

    Twibo: comparing very large communities via massive social media datasets

    Online social media are becoming the standard infrastructure for social communication and dissemination of information. As social media platforms not only passively provide infrastructure but also actively perform algorithmic curation for their profit and user experience, an important concern, often called the “filter bubble,” arises: people are trapped in their own personalized bubble, being exposed only to opinions that conform to their beliefs and political positions, thus potentially creating social polarization and information “islands.” Although the adoption of social media is an international phenomenon, language differences and policy barriers also create information islands. The goal of this paper is to develop methods and a system to cross-link concepts and communities in different social media, and to leverage them to study the extent and impact of filter bubbles. To accomplish this goal, the main objectives of this paper are to develop text and graph mining methods that connect concepts and entities in Twitter and Weibo through Wikipedia, and to compare the two social media along the dimensions of topics and networks in order to quantify the significance of the language bubble

    Predicting Influencer Virality on Twitter

    The ability to successfully predict virality on Twitter holds great potential as a resource for Twitter influencers, enabling the development of more sophisticated strategies for audience engagement, audience monetization, and information sharing. To our knowledge, focusing exclusively on tweets posted by influencers is a novel context for studying Twitter virality. We find that, among feature categories traditionally considered in the literature, models combining categories that cover a range of information perform better than models incorporating only individual feature categories. Moreover, our general predictive model, encompassing a range of feature categories, achieves a prediction accuracy of 68% for influencer virality. We also investigate the role of influencer audiences in predicting virality, a topic we believe to be understudied in the literature. We hypothesize that incorporating audience information will allow us to better discriminate between virality classes, thus leading to better predictions. We pursue two different approaches, resulting in 10 different predictive models that leverage influencer audience information in addition to traditional feature categories. Both of our attempts to incorporate audience information plateau at an accuracy of approximately 61%, roughly 7 percentage points below our general predictive model. We conclude that we are unable to find experimental evidence to support our hypothesis that incorporating influencer audience information will improve virality predictions. Nonetheless, the performance of our general model holds promise for the deployment of a tool that allows influencers to reap the benefits of virality prediction. As stronger performance from the underlying model would make this tool more useful in practice to influencers, improving the predictive performance of our general model is a cornerstone of future work

    Why are some websites researched more than others? A review of research into the global top twenty

    The web is central to the work and social lives of a substantial fraction of the world’s population, but the role of popular websites may not always be subject to academic scrutiny. This is a concern if social scientists are unable to understand an aspect of users’ daily lives because one or more major websites have been ignored. To test whether popular websites may be ignored in academia, this article assesses the volume and citation impact of research mentioning any of twenty major websites. The results are consistent with the user geographic base affecting research interest and citation impact. In addition, site affordances that are useful for research also influence academic interest. Because of the latter factor, however, it is not possible to estimate the extent of academic knowledge about a site from the number of publications that mention it. Nevertheless, the virtual absence of international research about some globally important Chinese and Russian websites is a serious limitation for those seeking to understand reasons for their web success, the markets they serve or the users that spend time on them. The sites investigated were Google, YouTube, Facebook, Baidu, Wikipedia, QQ, Tmall, Taobao, Yahoo, Amazon, Twitter, Sohu, Live, VK, JD, Instagram, Sina, Weibo, Yandex, and 360

    You process content and I process context: Cross-platform divergence of retweetability between Twitter and Weibo

    This article examines the cross-platform divergence between Twitter and Sina Weibo in users’ retweeting behaviors. With a heuristic-analytic information-processing model of retweeting proposed on the basis of dual-process theory, this study rationalizes the cross-platform comparison by introducing user-centered cross-cultural cognitive differences. Results show that when making decisions about whether to retweet a post, users on Twitter are more likely to use an analytic strategy based on information processing of content factors compared to users on Weibo, who are more likely to adopt a heuristic strategy based on information processing of contextual factors

    A Pointillism Approach for Natural Language Processing of Social Media

    Natural language processing tasks typically start with the basic unit of words, and then from words and their meanings a big picture is constructed about what the meanings of documents or other larger constructs are in terms of the topics discussed. Social media is very challenging for natural language processing because it challenges the notion of a word. Social media users regularly use words that are not in even the most comprehensive lexicons. These new words can be unknown named entities that have suddenly risen in prominence because of a current event, or they might be neologisms newly created to emphasize meaning or evade keyword filtering. Chinese social media is particularly challenging. The Chinese language poses challenges for natural language processing based on the unit of a word even in formal uses of the language; social media only makes word segmentation in Chinese even more difficult. Thus, even knowing what the boundaries of words are in a social media corpus is a difficult proposition. For these reasons, in this document I propose the Pointillism approach to natural language processing. In the pointillism approach, language is viewed as a time series, or sequence of points that represent the grams' usage over time. Time is an important aspect of the Pointillism approach. Detailed timing information, such as timestamps of when posts were posted, contains correlations based on human patterns and current events. This timing information provides the necessary context to build words and phrases out of trigrams and then group those words and phrases into topical clusters. Rather than words that have individual meanings, the basic unit of the pointillism approach is trigrams of characters. These grams take on meaning in aggregate when they appear together in a way that is correlated over time.
I anticipate that the pointillism approach can perform well in a variety of natural language processing tasks for many different languages, but in this document my focus is on trend analysis for Chinese microblogging. Microblog posts carry a timestamp that is accurate to the minute or second (though, in this dissertation, I bin posts by the hour). To show that trigrams supplemented with frequency information do collect scattered information into meaningful pieces, I first use the pointillism approach to extract phrases. I conducted experiments on 4-character idioms, on a set of 500 phrases longer than 3 characters taken from the Chinese-language version of Wiktionary, and on Weibo's hot keywords. My results show that when words and topics do have a meme-like trend, they can be reconstructed from only trigrams. For example, for 4-character idioms that appear at least 99 times in one day in my data, the unconstrained precision (that is, precision that allows for deviation from a lexicon when the result is just as correct as the lexicon version of the word or phrase) is 0.93. For longer words and phrases collected from Wiktionary, including neologisms, the unconstrained precision is 0.87. I consider these results to be very promising, because they suggest that it is feasible for a machine to reconstruct complex idioms, phrases, and neologisms with good precision without any notion of words. Next, I examine the potential of the pointillism approach for extracting topical trends from microblog posts that are related to environmental issues. Independent Component Analysis (ICA) is utilized to find the trigrams which have the same independent signal source, i.e., topics. Contrast this with probabilistic topic models, which leverage co-occurrence to classify documents into the topics they have learned, so it is hard for them to extract topics in real-time.
However, the pointillism approach can extract trends in real-time, whether those trends have been discussed before or not. This is more challenging because in phrase extraction, order information is used to narrow down the candidates, whereas for trend extraction only the frequency of the trigrams is considered. The proposed approach is compared against a state-of-the-art topic extraction technique, Latent Dirichlet Allocation (LDA), on 9,147 labelled posts with timestamps. The experimental results show that the highest F1 score of the pointillism approach with ICA is 4% better than that of LDA. Thus, using the pointillism approach, the colorful and baroque uses of language that typify social media in challenging languages such as Chinese may in fact be accessible to machines. The thesis that my dissertation tests is this: For topic extraction in scenarios where no adequate lexicon is available, such as social media, the Pointillism approach uses timing information to out-perform traditional techniques that are based on co-occurrence
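The abstract's starting representation, character trigrams tracked as hourly time series, can be illustrated with a minimal Python sketch. This is not the dissertation's code: the function names `char_trigrams` and `hourly_trigram_counts` and the toy posts are hypothetical, and the sketch covers only the counting step, not phrase reconstruction or ICA.

```python
from collections import Counter, defaultdict
from datetime import datetime


def char_trigrams(text):
    """Yield every overlapping 3-character window of the text."""
    for i in range(len(text) - 2):
        yield text[i:i + 3]


def hourly_trigram_counts(posts):
    """Map each trigram to a Counter of {hour bin: occurrences}.

    `posts` is an iterable of (datetime, text) pairs; this input
    format is an assumption made for the illustration.
    """
    series = defaultdict(Counter)
    for timestamp, text in posts:
        # Bin by the hour, as the abstract describes.
        hour_bin = timestamp.replace(minute=0, second=0, microsecond=0)
        for gram in char_trigrams(text):
            series[gram][hour_bin] += 1
    return series


# Toy data, invented for the example.
posts = [
    (datetime(2013, 1, 1, 9, 5), "abcd"),
    (datetime(2013, 1, 1, 9, 40), "bcde"),
    (datetime(2013, 1, 1, 10, 2), "abc"),
]
series = hourly_trigram_counts(posts)
```

Each entry of `series` is one "point sequence" in the sense of the abstract; trigrams whose hourly counts rise and fall together would be candidates for belonging to the same phrase or topic.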

    Identification of Online Users' Social Status via Mining User-Generated Data

    With the burst of available online user-generated data, identifying online users’ social status by mining that data can play a significant role in commercial applications, research and policy-making in many domains. Social status refers to the position of a person in relation to others within a society, which is an abstract concept. Its concrete definition depends on the indicator used to measure it. For example, opinion leadership measures individual social status in terms of influence and expertise in an online society, while socioeconomic status characterizes personal real-life social status based on social and economic factors. As an alternative to traditional survey methods, which are time-consuming, expensive and sometimes difficult, some efforts have been made to identify specific kinds of social status from specific user-generated data using classic machine learning methods. However, each such identification task has its own challenges, which the classic machine learning methods in existing work fail to address, leading to low identification accuracy. Given the importance of improving identification accuracy, this thesis studies three specific cases of identifying online and offline social status. For each, the thesis proposes a novel identification method that addresses the case-specific challenges in order to improve accuracy. The first work aims at identifying users’ online social status in terms of topic-sensitive influence and knowledge authority in social community question answering sites, namely identifying topical opinion leaders who are both influential and expert.
The social community question answering (SCQA) site, an innovative community question answering platform, not only offers traditional question answering (QA) services but also integrates an online social network where users can follow each other. Identifying topical opinion leaders in SCQA has become an important research area due to the significant role of topical opinion leaders. However, most previous related work either focuses on using knowledge expertise to find experts for improving the quality of answers, or aims at measuring user influence to identify influential users. In order to identify the true topical opinion leaders, we propose a topical opinion leader identification framework called QALeaderRank which takes account of both topic-sensitive influence and topical knowledge expertise. In the proposed framework, to measure the topic-sensitive influence of each user, we design a novel influence measure algorithm that exploits both the social and QA features of SCQA, taking into account social network structure, topical similarity and knowledge authority. In addition, we propose three topic-relevant metrics to infer the topical expertise of each user. Extensive experiments along with an online user study show that the proposed QALeaderRank achieves significant improvement compared with state-of-the-art methods. Furthermore, we analyze how users' topic interests change over time and examine the predictability of user topic interest through experiments. The second work focuses on predicting individual socioeconomic status from mobile phone data. Socioeconomic status (SES) is an important social and economic attribute of wide concern. Assessing individual SES can assist related organizations in making a variety of policy decisions. The traditional approach suffers from the extremely high cost of collecting large-scale SES-related survey data.
With the ubiquity of smart phones, mobile phone data has become a novel data source for predicting individual SES at low cost. However, the task of predicting individual SES from mobile phone data also poses some new challenges, including sparse individual records, scarce explicit relationships and limited labeled samples, which were not addressed in prior work restricted to regional or household-oriented SES prediction. To address these issues, we propose a semi-supervised Hypergraph based Factor Graph Model (HyperFGM) for individual SES prediction. HyperFGM is able to efficiently capture the associations between SES and individual mobile phone records to handle the individual record sparsity. For the scarce explicit relationships, HyperFGM models implicit high-order relationships among users on the hypergraph structure. In addition, HyperFGM exploits the limited labeled data together with unlabeled data in a semi-supervised way. Experimental results on a set of anonymized real mobile phone data show that HyperFGM greatly outperforms the baseline methods on individual SES prediction. The third work is to predict social media users’ socioeconomic status based on their social media content, which is useful for related organizations and companies in a range of applications, such as economic and social policy-making. Previous work leverages manually defined textual features and platform-based user level attributes from social media content and feeds them into a machine learning based classifier for SES prediction. However, it ignores some important information in social media content, including the order and the hierarchical structure of social media text as well as the relationships among user level attributes.
To this end, we propose a novel coupled social media content representation model for individual SES prediction, which not only utilizes a hierarchical neural network to incorporate the order and the hierarchical structure of social media text but also employs a coupled attribute representation method to take into account intra-coupled and inter-coupled interaction relationships among user level attributes. The experimental results show that the proposed model significantly outperforms other state-of-the-art models on a real dataset, which validates the efficiency and robustness of the proposed model