Topic Modelling of Swedish Newspaper Articles about Coronavirus: a Case Study using Latent Dirichlet Allocation Method
Topic Modelling (TM) is a research branch of natural language
understanding (NLU) and natural language processing (NLP) that facilitates
insightful analysis of large documents and datasets, such as a summarisation
of the main topics and of topic changes. This kind of discovery is becoming more
popular in real-life applications due to its impact on big data analytics. In
this study, from the social-media and healthcare domain, we apply popular
Latent Dirichlet Allocation (LDA) methods to model the topic changes in Swedish
newspaper articles about Coronavirus. We describe the corpus we created,
which includes 6515 articles, the methods applied, and statistics on topic
changes over a period of approximately one year and two months, from 17th January 2020 to
13th March 2021. We hope this work can be an asset for grounding applications
of topic modelling and can be inspiring for similar case studies in an era with
pandemics, to support socio-economic impact research as well as clinical and
healthcare analytics. Our data and source code are openly available at
https://github.com/poethan/Swed_Covid_TM
Keywords: Latent Dirichlet Allocation (LDA); Topic Modelling; Coronavirus;
Pandemics; Natural Language Understanding; BERT-topic
Comment: 14 pages, 14 figures
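To illustrate the LDA technique named in the abstract above, here is a minimal sketch of collapsed Gibbs sampling for LDA in plain Python. It is not the authors' code (the abstract points to their repository for that). The function names `lda_gibbs` and `top_words`, the hyperparameter values, and the toy English documents are assumptions made purely for illustration.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenised documents.

    Returns (doc_topic counts, topic_word counts, vocabulary).
    alpha, beta and iterations are illustrative defaults, not tuned values.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                 # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                               # tokens per topic
    assignments = []                                             # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove the token from the counts before resampling it.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    return doc_topic, topic_word, vocab


def top_words(topic_word, vocab, topic, n=3):
    """Return the n words most often assigned to the given topic."""
    ranked = sorted(range(len(vocab)), key=lambda i: topic_word[topic][i], reverse=True)
    return [vocab[i] for i in ranked[:n]]


if __name__ == "__main__":
    # Toy documents invented for this sketch; not taken from the corpus.
    toy_docs = [
        ["virus", "vaccine", "hospital", "virus"],
        ["vaccine", "hospital", "patients"],
        ["economy", "jobs", "market", "economy"],
        ["market", "jobs", "economy"],
    ]
    _, tw, vocabulary = lda_gibbs(toy_docs, num_topics=2, iterations=100)
    for topic in range(2):
        print(topic, top_words(tw, vocabulary, topic))
```

Tracking how topics change over time, as the abstract describes, could then be approximated by grouping documents by publication date and averaging their rows of `doc_topic`; that step is left out of this sketch.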
Is Stack Overflow Overflowing With Questions and Tags
Programming question and answer (Q&A) websites, such as Quora, Stack
Overflow, and Yahoo! Answers, help us to understand programming
concepts easily and quickly in a way that has been tested and applied by many
software developers. Stack Overflow is one of the most frequently used
programming Q&A websites, where the questions and answers posted are presently
analyzed manually, which requires a huge amount of time and resources. To save
the effort, we present a topic modeling based technique to analyze the words of
the original texts to discover the themes that run through them. We also
propose a method to automate the process of reviewing the quality of questions
on the Stack Overflow dataset in order to avoid ballooning Stack Overflow with
insignificant questions. The proposed method also recommends the appropriate
tags for the new post, which averts the creation of unnecessary tags on Stack
Overflow.
Comment: 11 pages, 7 figures, 3 tables. Presented at Third International
Symposium on Women in Computing and Informatics (WCI-2015)
Sequences of purchases in credit card data reveal life styles in urban populations
Zipf-like distributions characterize a wide set of phenomena in physics,
biology, economics and social sciences. In human activities, Zipf laws describe,
for example, the frequency of word appearances in a text or the purchase types
in shopping patterns. In the latter, the uneven distribution of transaction
types is bound up with the temporal sequences of individuals' purchase choices.
In this work, we define a framework using a text compression technique on the
sequences of credit card purchases to detect ubiquitous patterns of collective
behavior. Clustering the consumers by the similarity of their purchase sequences,
we detect five consumer groups. Remarkably, on subsequent checking, individuals in each
group are also similar in their age, total expenditure, gender, and the
diversity of their social and mobility networks extracted from their mobile phone
records. By properly deconstructing transaction data with Zipf-like
distributions, this method uncovers sets of significant sequences that reveal
insights on collective human behavior.
Comment: 30 pages, 26 figures
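The abstract above does not say which text compression technique is used, so the following sketch swaps in a generic one: normalized compression distance (NCD) computed with `zlib`, as one way to score how similar two purchase sequences are. It is not the authors' method. The function names and the purchase category labels are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: lower means more shared structure."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Hypothetical purchase sequences, encoded as comma-separated category labels.
    commuter = b"grocery,fuel,restaurant," * 40
    commuter_like = b"grocery,fuel,restaurant,fuel," * 40
    traveller = b"toll,taxi,computer,telecom," * 40

    print("commuter vs commuter_like:", round(ncd(commuter, commuter_like), 3))
    print("commuter vs traveller:", round(ncd(commuter, traveller), 3))
```

A pairwise matrix of such distances could be passed to any standard clustering routine to group consumers; the choice of clustering algorithm is likewise not specified in the abstract.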
A comparative study on face recognition techniques and neural network
In modern times, face recognition has become one of the key aspects of
computer vision. There are at least two reasons for this trend; the first is
the commercial and law enforcement applications, and the second is the
availability of feasible technologies after years of research. Due to the very
nature of the problem, computer scientists, neuro-scientists and psychologists
all share a keen interest in this field. In plain words, it is a computer
application for automatically identifying a person from a still image or video
frame. One of the ways to accomplish this is by comparing selected features
from the image and a facial database. There are hundreds, if not thousands, of
factors associated with this. In this paper, some of the most common techniques
available, including applications of neural networks in facial recognition, are
studied and compared with respect to their performance.
Comment: 8 pages
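The idea of comparing selected features from an image against a facial database, mentioned in the abstract above, can be sketched as a nearest-neighbour lookup over feature vectors. This is an illustrative assumption rather than any of the techniques the paper compares: the function names, the three-dimensional feature vectors, and the threshold value are all hypothetical, and feature extraction itself is out of scope.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(probe, database, threshold):
    """Return the closest enrolled identity, or None if nothing is within threshold."""
    best_name, best_dist = None, float("inf")
    for name, features in database.items():
        dist = euclidean(probe, features)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None


if __name__ == "__main__":
    # Hypothetical enrolled feature vectors; real systems use far more dimensions.
    enrolled = {
        "alice": [0.10, 0.90, 0.30],
        "bob": [0.80, 0.20, 0.50],
    }
    print(identify([0.12, 0.88, 0.31], enrolled, threshold=0.2))
    print(identify([5.0, 5.0, 5.0], enrolled, threshold=0.2))
```

The threshold trades false accepts against false rejects, which is one of the performance dimensions a comparison of recognition techniques would typically report.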