19 research outputs found
Mutual-Excitation of Cryptocurrency Market Returns and Social Media Topics
Cryptocurrencies have recently experienced a new wave of price volatility and
interest; activity within social media communities relating to cryptocurrencies
has increased significantly. There is currently limited documented knowledge of
factors which could indicate future price movements. This paper aims to
decipher relationships between cryptocurrency price changes and topic
discussion on social media to provide, among other things, an understanding of
which topics are indicative of future price movements. To achieve this, a
well-known dynamic topic modelling approach is applied to social media
communication to retrieve information about the temporal occurrence of various
topics. A Hawkes model is then applied to find interactions between topics and
cryptocurrency prices. The results show particular topics tend to precede
certain types of price movements, for example the discussion of 'risk and
investment vs trading' being indicative of price falls, the discussion of
'substantial price movements' being indicative of volatility, and the
discussion of 'fundamental cryptocurrency value' by technical communities being
indicative of price rises. The knowledge of topic relationships gained here
could be built into a real-time system, providing trading or alerting signals.
Comment: 3rd International Conference on Knowledge Engineering and
Applications (ICKEA 2018) - Moscow, Russia (June 25-27, 2018)
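To make the Hawkes step concrete, the following is a minimal sketch of the conditional intensity of a multivariate Hawkes process. It assumes an exponential decay kernel, which the abstract does not specify; the function name, the parameter values, and the two toy event streams are hypothetical and are not taken from the paper.

```python
import math

def hawkes_intensity(t, events, mu, alpha, beta):
    """Intensity of each stream at time t under an exponential kernel (assumed).

    events: one list of past event times per stream
    mu:     baseline rate of each stream
    alpha:  alpha[i][j] is how strongly an event in stream j excites stream i
    beta:   shared decay rate of the excitation
    """
    intensities = []
    for i in range(len(mu)):
        lam = mu[i]
        for j, times in enumerate(events):
            for s in times:
                if s < t:  # only events strictly before t contribute
                    lam += alpha[i][j] * math.exp(-beta * (t - s))
        intensities.append(lam)
    return intensities

# Hypothetical toy data: stream 0 = mentions of a topic, stream 1 = price falls.
events = [[1.0, 2.0], [2.5]]
mu = [0.1, 0.05]
alpha = [[0.2, 0.0], [0.5, 0.1]]
beta = 1.0

lam = hawkes_intensity(3.0, events, mu, alpha, beta)
```

In this sketch a large off-diagonal entry such as `alpha[1][0]` is what would correspond to a topic that tends to precede a type of price movement; estimating those entries from data is a separate fitting step that is not shown here.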
IDENTIFICATION OF NEWS ARTICLE TOPICS USING TOPIC MODELLING WITH LATENT DIRICHLET ALLOCATION
News portals provide highly varied information, but a headline cannot serve as the main reference for determining the overall topic of a news item, because headlines are hyperbolic in order to attract readers. This study therefore proposes a system for identifying news article topics using topic modelling with the Latent Dirichlet Allocation (LDA) algorithm. The research begins with automatic data collection from the websites detik.com and tempo.co by web scraping, followed by preprocessing of the data. Preprocessing has four stages: tokenization, case folding, stopword removal, and stemming. The final stage is topic modelling with the LDA algorithm. Topic modelling is a statistical model for determining the gist or topics of a collection of documents. Topic identification with the LDA algorithm is based on the probability of word occurrence in the collection of documents. The study finds that the most frequently occurring topic in crime news portals is murder
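The four preprocessing stages named in this abstract can be sketched as follows. This is an illustration only: the stopword list and the suffix-stripping stemmer are hypothetical English stand-ins (the study works on Indonesian text and does not name its tools), and the LDA fitting step that would consume the resulting tokens is not shown.

```python
import re

# Hypothetical, deliberately tiny resources for illustration only.
STOPWORDS = {"the", "a", "of", "in", "and", "was", "by"}
SUFFIXES = ("ing", "ed", "s")

def tokenize(text):
    """Stage 1: split raw text into alphabetic tokens."""
    return re.findall(r"[A-Za-z]+", text)

def case_fold(tokens):
    """Stage 2: lowercase every token."""
    return [t.lower() for t in tokens]

def remove_stopwords(tokens):
    """Stage 3: drop tokens that carry little topical meaning."""
    return [t for t in tokens if t not in STOPWORDS]

def stem(token):
    """Stage 4: naive suffix stripping, a stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Run the four stages in the order given in the abstract."""
    tokens = remove_stopwords(case_fold(tokenize(text)))
    return [stem(t) for t in tokens]

tokens = preprocess("The suspect was arrested by police in the robbery cases")
# tokens == ['suspect', 'arrest', 'police', 'robbery', 'case']
```

The token lists produced this way would then be turned into a document-term matrix, which is the usual input to an LDA implementation.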
Social Media Mining in Drug Development Decision Making: Prioritizing Multiple Sclerosis Patients’ Unmet Medical Needs
Pharmaceutical companies increasingly must consider patients’ needs in drug development. Since patients’ needs are often difficult to measure, especially in rare diseases, information in drug development decision-making is limited. In the proposed study, we employ the opportunity algorithm to identify and prioritize unmet medical needs of multiple sclerosis patients shared in social media posts. Using topic modeling and sentiment analysis, features of the opportunity algorithm are generated. The results imply that sensory problems, pain, mental health problems, fatigue and sleep disturbances represent the highest unmet medical needs of the sampled population. The present study suggests a promising potential of this method to provide relevant insights into rare disease populations to promote patient-centered drug development
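As a rough illustration of the prioritization step, the sketch below uses the commonly cited form of the opportunity algorithm, in which the opportunity score is importance plus the unmet gap between importance and satisfaction. How the study derives importance and satisfaction from topic modeling and sentiment analysis is not stated in the abstract, so the 0 to 10 scale and every number below are made-up toy values, not results from the paper.

```python
def opportunity_score(importance, satisfaction):
    """Opportunity = importance + max(importance - satisfaction, 0)."""
    return importance + max(importance - satisfaction, 0.0)

# Hypothetical (importance, satisfaction) pairs on an assumed 0-10 scale.
needs = {
    "pain": (8.0, 3.0),
    "fatigue": (7.0, 6.0),
    "mobility aids": (4.0, 9.0),
}

# Rank needs from highest to lowest opportunity score.
ranked = sorted(needs, key=lambda n: opportunity_score(*needs[n]), reverse=True)
# ranked == ['pain', 'fatigue', 'mobility aids']
```

A need that is important but poorly satisfied rises to the top, while a need that is already well satisfied is scored on its importance alone.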
Knowledge Discovery from CVs: A Topic Modeling Procedure
With a huge number of CVs available online, recruiting via the web has become an integral part of human resource management for companies. Automated text mining methods can be used to analyze large databases containing CVs. We present a topic modeling procedure consisting of five steps with the aim of identifying competences in CVs in an automated manner. Both the procedure and its exemplary application to CVs from IT experts are described in detail. The specific characteristics of CVs are considered in each step for optimal results. The exemplary application suggests that clearly interpretable topics describing fine-grained competences (e.g., Java programming, web design) can be discovered. This information can be used to rapidly assess the contents of a CV, categorize CVs and identify candidates for job offers. Furthermore, a topic-based search technique is evaluated to provide helpful decision support
A Social Citizen Dashboard for Participatory Urban Planning in Berlin: Prototype and Evaluation
Participatory urban planning enables citizens to make their voices heard in the urban planning process. The resulting measures are more likely to be accepted by the community. However, the participation process becomes more effortful and time-consuming. New approaches have been developed using digital technologies to facilitate citizen participation, such as topic modeling based on social media. Using Twitter data for the city of Berlin, we explore how social media and topic modeling can be used to classify and analyze citizen opinions. We develop a Social Citizen Dashboard allowing for a better understanding of changes in citizens’ priorities and incorporating constant cycles of feedback throughout planning phases. Evaluation interviews indicate the dashboard’s potential usefulness and implications, point to limitations in data quality, and suggest directions for further research
Utility of Large-scale Recipe Data in Food Computing
This article looks at recipe data analysis from a critical perspective, offering the authors’ own learning experience from the successes and failures of the research process. Recipe research to date has been limited by the availability of data, which in the case of recipes mostly consists of texts describing a variety of ingredients. This has contributed to a better understanding of flavour formation and the nutritional value of food but has not led further to establishing a corpus of healthy and unhealthy foods. Time-related cooking aspects have remained largely out of the scope of existing research due to the difficulties in obtaining immediately analyzable data. The same goes for recipe-related research on food texture, colour and other aspects. In this research the methodology of topic modelling has been applied to analyze recipes in North American and Mexican cuisines in order to highlight the core culinary themes within these two cuisines. The potential of the result analysis, as well as its limitations, is also discussed. Topic models of agglomerated data can be helpful in further multisensory research, as they provide some insights into the colour, the flavour and, potentially, the texture of certain groups of dishes. They can be combined with social media sentiment analysis and other research methods to better grasp the human relationship with food. © 2021 Baltic Journal of Modern Computing. All rights reserved
Data Science as a Tool to Support Decision-Making: Descending Hierarchical Classification of Access to Information Requests in the Municipality of São Paulo
This study sought to understand how data science and text mining and classification technologies can contribute to decision-making through a better aggregate understanding of access to information requests. The research used data on access to information requests made to the Municipality of São Paulo (Prefeitura Municipal de São Paulo, PMSP) from 2012 to 2019, available on the municipality's Open Data Portal, and proposes the identification and classification of the main issues raised. The 39,369 request texts submitted to the PMSP were gathered into a corpus and analyzed by means of Descending Hierarchical Classification (DHC). By proposing text classification as a methodology for analyzing textual data, the study reinforces the paradigm that textual data do not belong only to the qualitative field. In addition, considering only nouns, excluding verbs and adverbs, and treating the most frequent adjectives as parts of expressions made it possible to optimize the context of the requests, so that the textual data could be classified more objectively, mitigating researcher bias. The article also presents other case studies relevant to the research, with references found in the analysis of access to information requests, contributing to an aggregated understanding of citizens' requests and giving decision-makers a better understanding of society's demands, which may result in more focused public policies. It concludes that analyzing the data through DHC yields information relevant to data- and evidence-based decision-making, and that the approach favors well-founded decisions that are closer to citizens' needs.
Mapping Phenomena Relevant to Adolescent Emotion Regulation: A Text-Mining Systematic Review
Adolescence is a developmentally sensitive period for emotion regulation with potentially lifelong implications for mental health and well-being. Although substantial empirical research has addressed this topic, the literature is fragmented across subdisciplines, and an overarching theoretical framework is lacking. The first step toward constructing a unifying framework is identifying relevant phenomena. This systematic review of 6305 articles used text mining to identify phenomena relevant to adolescents’ emotion regulation. First, a baseline was established of relevant phenomena discussed in theory and recent narrative reviews. Then, article keywords and abstracts were analyzed using text mining, examining term frequency as an indicator of relevance and term co-occurrence as an indicator of association. The results reflected themes commonly featured in theory and narrative reviews, such as socialization and neurocognitive development, but also identified undertheorized themes, such as developmental disorders, physical health, external stressors, structural disadvantage, substance use, identity and moral development, and sexual development. The findings illustrate how text mining systematic reviews, a novel approach, may complement narrative reviews. Future theoretical work might integrate these undertheorized themes into an overarching framework, and empirical research might consider them as promising areas for future research, or as potential confounders in research on adolescents’ emotion regulation
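The two text-mining indicators mentioned in this abstract, term frequency and term co-occurrence, can be sketched with a pair of counters. This is a simplified illustration, not the review's actual pipeline: the keyword lists are hypothetical, and frequency is counted here as the number of articles containing a term, which is one of several possible definitions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per article.
docs = [
    ["emotion regulation", "adolescence", "parenting"],
    ["emotion regulation", "substance use"],
    ["adolescence", "emotion regulation", "substance use"],
]

term_freq = Counter()      # number of articles mentioning each term
cooccurrence = Counter()   # number of articles mentioning both terms of a pair

for terms in docs:
    unique = sorted(set(terms))  # count each term once per article, in a stable order
    term_freq.update(unique)
    cooccurrence.update(combinations(unique, 2))

most_frequent = term_freq.most_common(1)[0]
# most_frequent == ('emotion regulation', 3)
```

Frequent terms are read as candidates for relevant phenomena, and pairs with high co-occurrence counts as candidates for associated phenomena.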