13 research outputs found

    A sentiment-based filteration and data analysis framework for social media

    This paper describes a framework that explains the processes involved in the filtering and analysis of user-generated content in social media. Previous research has focused on leveraging high-quality data from social media data streams, but many opportunities remain to be explored. This paper proposes a sentiment-based filtering and data analysis framework for identifying relevant information from data generated by users in social media. Based on the textual content generated and spread through social media, it is assumed that each set of text streams/corpora might carry a sentiment associated with it regardless of its polarity bias. Accordingly, the proposed framework introduces the idea of data filtering that exploits the information and sentiment captured in text while adapting text analysis methods to overcome the noisy and unstructured nature of social media textual content.

    TotalDefMeme: A Multi-Attribute Meme dataset on Total Defence in Singapore

    Total Defence is a defence policy combining and extending the concept of military defence and civil defence. While several countries have adopted total defence as their defence policy, very few studies have investigated its effectiveness. With the rapid proliferation of social media and digitalisation, many social studies have focused on investigating policy effectiveness through specially curated surveys and questionnaires, either through digital media or traditional forms. However, such references may not truly reflect the underlying sentiments about the target policies or initiatives of interest. People are more likely to express their sentiment using communication mediums such as starting topic threads on forums or sharing memes on social media. Using Singapore as a case reference, this study aims to address this research gap by proposing TotalDefMeme, a large-scale multi-modal and multi-attribute meme dataset that captures public sentiments toward Singapore's Total Defence policy. Besides supporting social informatics and public policy analysis of the Total Defence policy, TotalDefMeme can also support many downstream multi-modal machine learning tasks, such as aspect-based stance classification and multi-modal meme clustering. We perform baseline machine learning experiments on TotalDefMeme and evaluate its technical validity, and present possible future interdisciplinary research directions and application scenarios using the dataset as a baseline. Comment: 6 pages. Accepted at ACM MMSys 202

    Parallel clustering of high-dimensional social media data streams

    We introduce Cloud DIKW as an analysis environment supporting scientific discovery through integrated parallel batch and streaming processing, and apply it to one representative domain application: social media data stream clustering. Recent work demonstrated that high-quality clusters can be generated by representing the data points using high-dimensional vectors that reflect textual content and social network information. Due to the high cost of similarity computation, sequential implementations of even single-pass algorithms cannot keep up with the speed of real-world streams. This paper presents our efforts to meet the constraints of real-time social stream clustering through parallelization. We focus on two system-level issues. Most stream processing engines like Apache Storm organize distributed workers in the form of a directed acyclic graph, making it difficult to dynamically synchronize the state of parallel workers. We tackle this challenge by creating a separate synchronization channel using a pub-sub messaging system. Due to the sparsity of the high-dimensional vectors, the size of centroids grows quickly as new data points are assigned to the clusters. Traditional synchronization that directly broadcasts cluster centroids becomes too expensive and limits the scalability of the parallel algorithm. We address this problem by communicating only dynamic changes of the clusters rather than the whole centroid vectors. Our algorithm under Cloud DIKW can process the Twitter 10% data stream in real-time with 96-way parallelism. By natural improvements to Cloud DIKW, including advanced collective communication techniques developed in our Harp project, we will be able to process the full Twitter stream in real-time with 1000-way parallelism. 
    Our use of powerful general software subsystems will enable many other applications that need integration of streaming and batch data analytics. Comment: IEEE/ACM CCGrid 2015: 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 201

    Memetic moments : the speed of twitter memes

    This paper examines how speed shapes internet culture. To do so, it analyses ‘memetic moments’ on Twitter, short-lived and rapidly circulated memes that quickly reach saturation. The paper examines two ‘memetic moments’ on Twitter in 2018 and 2019 to assess how they develop over time. Each case study comprises a week’s worth of relevant tweets that were analysed for temporal patterns. We analyse these ‘memetic moments’ through Lefebvre’s (2004) work on rhythmanalysis, arguing that the temporal patterns of memes on Twitter can be understood through his concepts of repetition, presence and dialogue. While seemingly trivial, memetic moments underscore the didactic relationship between social media and news media while also providing a way to approach complex social issues.

    Conceptualisation of rights and meta-rule of law for the web of data

    This article deals with some regulatory and legal problems of the Web of Data. Data and metadata are defined. Digital Rights Management (DRM) and Rights Expression Languages (REL) are introduced. Open Digital Rights Language (ODRL), Licensed Linked Data Resources (LLDR) and Creative Commons Licenses are referred to. The development of REL by means of Ontology Design Patterns such as LLDR, or Open Licenses sustained by Policy Models such as ODRL, situates the discussion on metadata at the regulatory level. With the development of the Web of Data, the Rule of Law needs to evolve to a Meta-Rule of Law, incorporating tools to regulate and monitor the semantic layer of the Web. This means reflecting on the construction of a new public dimension space for the exercise of rights.

    Conceptualisation of rights and meta-rule of law for the web of data

    This paper was previously published by Democracia Digital e Governo Eletrônico (Brazil). This article deals with some regulatory and legal problems of the Web of Data. Data and metadata are defined. Digital Rights Management (DRM) and Rights Expression Languages (REL) are introduced. Open Digital Rights Language (ODRL), Licensed Linked Data Resources (LLDR) and Creative Commons Licenses are referred to. The development of REL by means of Ontology Design Patterns such as LLDR, or Open Licenses sustained by Policy Models such as ODRL, situates the discussion on metadata at the regulatory level. With the development of the Web of Data, the Rule of Law needs to evolve to a Meta-Rule of Law, incorporating tools to regulate and monitor the semantic layer of the Web. This means reflecting on the construction of a new public dimension space for the exercise of rights.