A sentiment-based filteration and data analysis framework for social media
This paper describes a framework for filtering and analysing user-generated content in social media. Previous research has focused on extracting high-quality data from social media data streams, but many opportunities remain to be explored. This paper proposes a sentiment-based filtering and data analysis framework for identifying relevant information in data generated by social media users. Based on the textual content generated and spread through social media, it is assumed that each set of text streams or corpora may carry an associated sentiment, regardless of its polarity bias. The proposed framework therefore introduces a form of data filtering that exploits the information and sentiment captured in text, while adapting text analysis methods to cope with the noisy and unstructured nature of social media textual content.
TotalDefMeme: A Multi-Attribute Meme dataset on Total Defence in Singapore
Total Defence is a defence policy combining and extending the concept of
military defence and civil defence. While several countries have adopted total
defence as their defence policy, very few studies have investigated its
effectiveness. With the rapid proliferation of social media and digitalisation,
many social studies have been focused on investigating policy effectiveness
through specially curated surveys and questionnaires either through digital
media or traditional forms. However, such references may not truly reflect the
underlying sentiments about the target policies or initiatives of interest.
People are more likely to express their sentiment using communication mediums
such as starting topic threads on forums or sharing memes on social media. Using
Singapore as a case reference, this study aims to address this research gap by
proposing TotalDefMeme, a large-scale multi-modal and multi-attribute meme
dataset that captures public sentiments toward Singapore's Total Defence
policy. Besides supporting social informatics and public policy analysis of the
Total Defence policy, TotalDefMeme can also support many downstream multi-modal
machine learning tasks, such as aspect-based stance classification and
multi-modal meme clustering. We perform baseline machine learning experiments
on TotalDefMeme and evaluate its technical validity, and present possible
future interdisciplinary research directions and application scenarios using
the dataset as a baseline.
Comment: 6 pages. Accepted at ACM MMSys 202
Parallel clustering of high-dimensional social media data streams
We introduce Cloud DIKW as an analysis environment supporting scientific
discovery through integrated parallel batch and streaming processing, and apply
it to one representative domain application: social media data stream
clustering. Recent work demonstrated that high-quality clusters can be
generated by representing the data points using high-dimensional vectors that
reflect textual content and social network information. Due to the high cost of
similarity computation, sequential implementations of even single-pass
algorithms cannot keep up with the speed of real-world streams. This paper
presents our efforts to meet the constraints of real-time social stream
clustering through parallelization. We focus on two system-level issues. Most
stream processing engines like Apache Storm organize distributed workers in the
form of a directed acyclic graph, making it difficult to dynamically
synchronize the state of parallel workers. We tackle this challenge by creating
a separate synchronization channel using a pub-sub messaging system. Due to the
sparsity of the high-dimensional vectors, the size of centroids grows quickly
as new data points are assigned to the clusters. Traditional synchronization
that directly broadcasts cluster centroids becomes too expensive and limits the
scalability of the parallel algorithm. We address this problem by communicating
only dynamic changes of the clusters rather than the whole centroid vectors.
Our algorithm under Cloud DIKW can process the Twitter 10% data stream in
real-time with 96-way parallelism. By natural improvements to Cloud DIKW,
including advanced collective communication techniques developed in our Harp
project, we will be able to process the full Twitter stream in real-time with
1000-way parallelism. Our use of powerful general software subsystems will
enable many other applications that need integration of streaming and batch
data analytics.
Comment: IEEE/ACM CCGrid 2015: 15th IEEE/ACM International Symposium on
Cluster, Cloud and Grid Computing, 201
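The delta-based synchronization mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not describe the actual message format or data structures, so everything here is an assumption for illustration: sparse vectors are plain Python dicts, each cluster keeps a running (sum, count) state, and the function names compute_delta, apply_delta and centroid are hypothetical. It is not the authors' implementation and involves no Apache Storm or pub-sub code.

```python
from collections import defaultdict


def compute_delta(assigned_points):
    """Sum the sparse vectors newly assigned to one cluster in this batch.

    Each point is a dict mapping dimension -> value (an assumed representation).
    Only the dimensions that actually changed appear in the result.
    """
    delta = defaultdict(float)
    for point in assigned_points:
        for dim, value in point.items():
            delta[dim] += value
    return dict(delta), len(assigned_points)


def apply_delta(centroid_sum, count, delta, delta_count):
    """Merge a received delta into a cluster's running (sum, count) state."""
    merged = dict(centroid_sum)
    for dim, value in delta.items():
        merged[dim] = merged.get(dim, 0.0) + value
    return merged, count + delta_count


def centroid(centroid_sum, count):
    """Derive the centroid vector from the running sum and point count."""
    return {dim: value / count for dim, value in centroid_sum.items()}


# A worker sends only the small delta, not the full (and growing) centroid.
state_sum, state_count = {"a": 2.0, "b": 1.0}, 2
delta, n = compute_delta([{"a": 1.0, "c": 3.0}])
state_sum, state_count = apply_delta(state_sum, state_count, delta, n)
```

In this sketch the message size depends on the number of dimensions touched by the new points rather than on the full centroid size, which is the property the abstract attributes to communicating only dynamic changes.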
Memetic moments: the speed of Twitter memes
This paper examines how speed shapes internet culture. To do so, it analyses ‘memetic moments’ on Twitter, short-lived and rapidly circulated memes that quickly reach saturation. The paper examines two ‘memetic moments’ on Twitter in 2018 and 2019 to assess how they develop over time. Each case study comprises a week’s worth of relevant tweets that were analysed for temporal patterns. We analyse these ‘memetic moments’ through Lefebvre’s (2004) work on rhythmanalysis, arguing that the temporal patterns of memes on Twitter can be understood through his concepts of repetition, presence and dialogue. While seemingly trivial, memetic moments underscore the didactic relationship between social media and news media while also providing a way to approach complex social issues
Conceptualisation of rights and meta-rule of law for the web of data
This article deals with some regulatory and legal problems of the Web of Data. Data and metadata are defined. Digital Rights Management (DRM) and Rights Expression Languages (REL) are introduced. Open Digital Rights Language (ODRL), Licensed Linked Data Resources (LLDR) and Creative Commons Licenses are referred. The development of REL by means of Ontology Design Patterns such as LLDR, or Open Licenses sustained by Policy Models such as ODRL, situates the discussion on metadata at the regulatory level. With the development of the Web of Data the Rule of Law needs to evolve to a Meta-Rule of Law, incorporating tools to regulate and monitor the semantic layer of the Web. This means reflecting on the construction of a new public dimension space for the exercise of rights
Conceptualisation of rights and meta-rule of law for the web of data
This paper was previously published by Democracia Digital e Governo Eletrônico (Brazil). This article deals with some regulatory and legal problems of the Web of Data. Data and metadata are defined. Digital Rights Management (DRM) and Rights Expression Languages (REL) are introduced. Open Digital Rights Language (ODRL), Licensed Linked Data Resources (LLDR) and Creative Commons Licenses are referred. The development of REL by means of Ontology Design Patterns such as LLDR, or Open Licenses sustained by Policy Models such as ODRL, situates the discussion on metadata at the regulatory level. With the development of the Web of Data the Rule of Law needs to evolve to a Meta-Rule of Law, incorporating tools to regulate and monitor the semantic layer of the Web. This means reflecting on the construction of a new public dimension space for the exercise of rights