
    SWSR: A Chinese dataset and lexicon for online sexism detection

    Online sexism has become a growing concern on social media platforms, as it affects the healthy development of the Internet and can have negative effects on society. While research on sexism detection is growing, most of it focuses on English as the language and on Twitter as the platform. Our objective here is to broaden the scope of this research by considering the Chinese language on Sina Weibo. We propose the first Chinese sexism dataset, the Sina Weibo Sexism Review (SWSR) dataset, as well as SexHateLex, a large Chinese lexicon of abusive and gender-related terms. We introduce our data collection and annotation process, and provide an exploratory analysis of the dataset characteristics to validate its quality and to show how sexism is manifested in Chinese. The SWSR dataset provides labels at different levels of granularity, including (i) sexism or non-sexism, (ii) sexism category and (iii) target type, which can be exploited, among other uses, for building computational methods to identify and investigate finer-grained gender-related abusive language. We conduct experiments for the three sexism classification tasks using state-of-the-art machine learning models. Our results show competitive performance, providing a benchmark for sexism detection in the Chinese language, as well as an error analysis highlighting open challenges needing more research in Chinese NLP. The SWSR dataset and SexHateLex lexicon are publicly available.

    An Empirical Study on Android for Saving Non-shared Data on Public Storage

    With millions of apps that can be downloaded from official or third-party markets, Android has become one of the most popular mobile platforms today. These apps help people in all kinds of ways and thus have access to a great deal of user data, which in general falls into three categories: sensitive data, data to be shared with other apps, and non-sensitive data not to be shared with others. For the first and second types of data, Android provides very good storage models: an app's private sensitive data are saved to its private folder, which can only be accessed by the app itself, and the data to be shared are saved to public storage (either the external SD card or the emulated SD card area on internal flash memory). But for the last type, i.e., an app's non-sensitive and non-shared data, there is a serious problem in Android's current storage model, which essentially encourages an app to save its non-sensitive data to shared public storage that can be accessed by other apps. At first glance this seems unproblematic, as those data are non-sensitive after all, but it implicitly assumes that app developers can correctly identify all sensitive data and prevent all possible information leakage from private-but-non-sensitive data. In this paper, we demonstrate that this assumption is invalid, with a thorough survey of information leaks in apps that followed Android's recommended storage model for non-sensitive data. Our studies show that highly sensitive information from billions of users can easily be compromised by exploiting this problematic storage model. Although our empirical studies are based on a limited set of apps, the identified problems are not isolated or accidental bugs in the apps investigated. On the contrary, the problem is rooted in the vulnerable storage model recommended by Android. To mitigate the threat, we also propose a defense framework.

    Baidu, Weibo and Renren: The Global Political Economy of Social Media in China

    The task of this work is to conduct a global political-economic analysis of China's major social media platforms in the context of transformations of the Chinese economy. It analyses the commodity and capital form of Chinese social media. It compares the political economy of Baidu (search engine), Weibo (microblog) and Renren (social networking site) to the political economy of the US platforms Google (search engine), Twitter (microblog) and Facebook (social networking site) in order to analyse differences and commonalities. The comparative analysis focuses on aspects such as profits, the role of advertising, the boards of directors, shareholders, financial market values, terms of use and usage policies. The analysis is framed by the question of the extent to which China has a capitalist or socialist economy.

    Loyal Dissent in the Chinese Blogosphere: Sina Weibo Discourse on the Chinese Communist Party

    The impact of the Internet on Chinese politics is a hot topic in contemporary academic debate. Some scholars believe that political discussions in cyberspace will lead to a more pluralistic and democratic China. Others argue that the ruling Communist Party will strengthen its position by using the Internet as a tool for censorship and active propaganda. The purpose of this article is to contribute to the debate on the political impact of increasing Internet use by studying uncensored online opinions about the Communist Party and its policies. More specifically, we investigate and analyze some of the most popular uncensored microblog tweets (Sina Weibo) that discussed political scandals in China during the spring of 2012. The findings show that a majority of the tweets contain criticism of certain activities of the Party but do not challenge its hold on power. The study indicates that the phenomenon of loyal dissent is a distinguishing feature of online political discourse in contemporary China. Consequently, the blogosphere has the potential to foster a generation of more critical Chinese citizens. However, in the current phase of overall information repression and censorship, and as a particular form of online expression, microblogging cannot yet be considered a catalyst for democratization.

    The Shapes of Cultures: A Case Study of Social Network Sites/Services Design in the U.S. and China

    With the growing popularity of social network sites/services (SNSs) throughout the world, the global dominance of SNSs designed in the western industrialized countries, especially in the United States, seems to have become an inevitable trend. As internationalization has become a common practice in designing SNSs in the United States, is localization still a viable practice? Does culture still matter in designing SNSs? This dissertation aims to answer these questions by comparing the user interface (UI) designs of a U.S.-based SNS, Twitter, and a China-based SNS, Sina Weibo, both of which have assumed the identity of a “microblogging” service, a subcategory of SNSs. This study employs the theoretical lenses of the theory of technical identity, user-centered website cultural usability studies, and communication and media studies. By comparing the UI designs, or the “form,” of the two microblogging sites/services, I illustrate how the social functions of a technological object, as embedded and expressed in the interface designs, are preserved or changed when an object that has developed a relatively stable identity (as a microblogging site/service) in one culture is transferred between its “home” culture and another. The analysis in this study focuses on design elements relevant to users as members of networks, members of an audience, and publishers/broadcasters. The results suggest that the designs carry disparate biases towards modes of communication and social affordances, which indicate a shift in the identity of the microblogging service/site across cultures.

    How do libraries use social networking sites to interact with users?

    Conference Theme: Information, Interaction, Innovation: Celebrating the Past, Constructing the Present and Creating the Future. ASIST 2012 Proceedings' web site is located at http://www.asis.org/asist2012/proceedings/frontmatter/titlepage12.html
    Social networking sites (SNSs) are helpful for stirring up interactions among users. The number of libraries that adopt SNSs is increasing. However, user engagement is low on many libraries’ SNSs. Existing research mainly focuses on the ways SNSs are used in libraries and on librarians’ or users’ attitudes towards libraries using SNSs. Little research has been done on how to use SNSs to interact with library users effectively. This study focuses on the interactions between libraries and users on libraries’ Facebook, Twitter and Weibo. Four types of interactions are examined: knowledge sharing, information dissemination, communication and knowledge gathering. A mixed method is applied in this study: quantitative results, generated from the analysis of around 1700 posts sampled from 40 libraries’ SNSs, are combined with qualitative results drawn from interviews with 10 librarians. The study finds that among the four types of interactions, knowledge sharing attracts the largest volume of user responses on libraries’ SNSs. The study’s investigation of the differences between Facebook-like and Twitter-like SNSs, and between academic and public libraries in their use of SNSs, suggests that to interact with users on SNSs more efficiently, libraries need to coordinate different types of SNSs and take the properties of their communities into consideration.

    Tackling Sexist Hate Speech: Cross-Lingual Detection and Multilingual Insights from Social Media

    With the widespread use of social media, the proliferation of online communication presents both opportunities and challenges for fostering a respectful and inclusive digital environment. Due to the anonymity and weak regulations of social media platforms, the rise of hate speech has become a significant concern, particularly against specific individuals or groups based on race, religion, ethnicity, or gender, posing a severe threat to human rights. Sexist hate speech is a prevalent form of online hate that often manifests itself through gender-based violence and discrimination, challenging societal norms and legal systems. Despite the advances in natural language processing techniques for detecting offensive and sexist content, most research still focuses on monolingual (primarily English) contexts, neglecting the multilingual nature of online platforms. This gap highlights the need for effective and scalable strategies to address the linguistic diversity and cultural variations in hate speech. Cross-language transfer learning and state-of-the-art multilingual pre-trained language models provide potential solutions to improve detection for low-resource languages by leveraging data from high-resource languages. Additional knowledge is crucial to improve the models’ performance in detecting culturally varying expressions of sexist hate speech in different languages. In this thesis, we delve into the complex area of identifying sexist hate speech in social media across diverse languages belonging to different language families, with a focus on sexism and a broad exploration of datasets, methodologies, and barriers inherent in mitigating online hate speech in cross-lingual and multilingual scenarios. We primarily apply cross-lingual transfer learning techniques to detect sexist hate speech, aiming to leverage knowledge acquired from related linguistic data in order to improve performance in a target language. We also investigate the integration of external knowledge to deepen the understanding of sexism in multilingual social media contexts, addressing both the challenges of linguistic diversity and the need for comprehensive, culturally sensitive hate speech detection models. Specifically, the thesis begins with a comprehensive survey of approaches to tackling cross-lingual hate speech online, summarising existing datasets and cross-lingual approaches, as well as highlighting challenges and frontiers in this field. It then presents a first contribution to the field, the creation of the Sina Weibo Sexism Review (SWSR) dataset in Chinese, a pioneering resource that not only fills a crucial gap in limited resources but also lays the foundation for relevant cross-lingual investigations. Additionally, it examines how cross-lingual techniques can be utilised to generate domain-aware word embeddings, and explores the application of these embeddings in a cross-lingual hate speech framework, thereby enhancing the capacity to capture the subtleties of sexist hate speech across diverse languages. Recognising the significance of linguistic nuances in multilingual and cross-lingual settings, another contribution consists in proposing and evaluating a series of multilingual and cross-lingual models tailored for detecting sexist hate speech. By leveraging shared knowledge and features across languages, these models significantly advance the state of the art in identifying online sexist hate speech. As societies continue to deal with the complexities of social media, the findings and methodologies presented in this thesis could help foster more inclusive and respectful online content across languages.