
19 research outputs found

ANTM: An Aligned Neural Topic Model for Exploring Evolving Topics

Author: Amann Bernd
Constantin Camelia
Naacke Hubert
Rahimi Hamed
Publication date: 04/06/2023

This paper presents an algorithmic family of dynamic topic models called Aligned Neural Topic Models (ANTM), which combine novel data mining algorithms to provide a modular framework for discovering evolving topics. ANTM maintains the temporal continuity of evolving topics by extracting time-aware features from documents using advanced pre-trained Large Language Models (LLMs) and employing an overlapping sliding window algorithm for sequential document clustering. This overlapping sliding window algorithm identifies a different number of topics within each time frame and aligns semantically similar document clusters across time periods. This process captures emerging and fading trends across different periods and allows for a more interpretable representation of evolving topics. Experiments on four distinct datasets show that ANTM outperforms probabilistic dynamic topic models in terms of topic coherence and diversity metrics. Moreover, it improves the scalability and flexibility of dynamic topic models by being accessible and adaptable to different types of algorithms. Additionally, a Python package is developed for researchers and scientists who wish to study the trends and evolving patterns of topics in large-scale textual data.
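The overlapping sliding window idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the ANTM implementation or the API of its Python package: the function name `overlapping_windows`, its parameters, and the day-based window sizes are hypothetical, and the sketch covers only the windowing step, not the clustering or alignment of topics.

```python
from datetime import date, timedelta


def overlapping_windows(docs, window_days, overlap_days):
    """Group (date, text) pairs into overlapping time windows.

    Hypothetical helper for illustration only. Each window covers
    [start, start + window_days) and consecutive windows share
    overlap_days days, so a document can fall into several windows.
    """
    if not 0 <= overlap_days < window_days:
        raise ValueError("overlap_days must be in [0, window_days)")
    if not docs:
        return []

    docs = sorted(docs, key=lambda d: d[0])
    width = timedelta(days=window_days)
    step = timedelta(days=window_days - overlap_days)

    start, last = docs[0][0], docs[-1][0]
    windows = []
    while start <= last:
        end = start + width
        # Collect the texts whose date lies inside the half-open window.
        members = [text for day, text in docs if start <= day < end]
        windows.append((start, end, members))
        start += step
    return windows


# Toy corpus: four documents, four days apart (made-up data).
docs = [
    (date(2023, 1, 1), "a"),
    (date(2023, 1, 5), "b"),
    (date(2023, 1, 9), "c"),
    (date(2023, 1, 13), "d"),
]
windows = overlapping_windows(docs, window_days=8, overlap_days=4)
for start, end, members in windows:
    print(start, end, members)
```

Because adjacent windows share documents, clusters found in one window can be compared with clusters in the next through their common members, which is one plausible way to think about aligning topics across time periods.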

arXiv.org e-Print Archive

Unifying Community Detection Across Scales from Genomes to Landscapes

Author: Barber Cristina
Caughlin T. Trevor
Davidson Merry M.
Forbey Jennifer Sorensen
Frankel-Bricker Jonas
Fremgen-Tarantino Marcella
Hayden Eric
Hudon Stephanie F.
Pendleton Britt
Robb Brecken
Rodriguez Olivia K.
Roopsind Anand
Roser Anna
Zaiats Andrii
Publication venue: 'IUScholarWorks'
Publication date: 01/06/2021

Biodiversity science encompasses multiple disciplines and biological scales from molecules to landscapes. Nevertheless, biodiversity data are often analyzed separately with discipline-specific methodologies, constraining resulting inferences to a single scale. To overcome this, we present a topic modeling framework to analyze community composition in cross-disciplinary datasets, including those generated from metagenomics, metabolomics, field ecology and remote sensing. Using topic models, we demonstrate how community detection in different datasets can inform the conservation of interacting plants and herbivores. We show how topic models can identify members of molecular, organismal and landscape-level communities that relate to wildlife health, from gut microbes to forage quality. We conclude with a future vision for how topic modeling can be used to design cross-scale studies that promote a holistic approach to detect, monitor and manage biodiversity.

Boise State University - ScholarWorks

Topic Modeling on Health Journals with Regularized Variational Inference

Author: Banerjee Arindam
Giaquinto Robert
Publication date: 15/01/2018

Topic modeling enables exploration and compact representation of a corpus. The CaringBridge (CB) dataset is a massive collection of journals written by patients and caregivers during a health crisis. Topic modeling on the CB dataset, however, is challenging due to the asynchronous nature of multiple authors writing about their health journeys. To overcome this challenge we introduce the Dynamic Author-Persona topic model (DAP), a probabilistic graphical model designed for temporal corpora with multiple authors. The novelty of the DAP model lies in its representation of authors by a persona, where personas capture the propensity to write about certain topics over time. Further, we present a regularized variational inference algorithm, which we use to encourage the DAP model's personas to be distinct. Our results show significant improvements over competing topic models, particularly after regularization, and highlight the DAP model's unique ability to capture common journeys shared by different authors. Comment: Published in Thirty-Second AAAI Conference on Artificial Intelligence, February 2018, New Orleans, Louisiana, US

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Bitcoin Volatility Forecasting with a Glimpse into Buy and Sell Orders

Author: Antulov-Fantulin Nino
Bifet Albert
Guo Tian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

In this paper, we study short-term prediction of fluctuations in the Bitcoin exchange price against the United States dollar. We use realized volatility data collected from one of the largest Bitcoin digital trading offices in 2016 and 2017, together with order information. Experiments are performed to evaluate a variety of statistical and machine learning approaches. Comment: Full version of the paper published at IEEE International Conference on Data Mining (ICDM), 201
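For readers unfamiliar with realized volatility, the sketch below computes it under one common definition: the square root of the sum of squared log returns over an interval. The abstract does not say which definition or sampling frequency the paper uses, so the formula choice, the function name `realized_volatility`, and the sample prices are all assumptions made for illustration.

```python
import math


def realized_volatility(prices):
    """Square root of the sum of squared log returns.

    Assumed definition for illustration; the paper's exact formula
    and sampling interval are not given in the abstract.
    """
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be positive")

    # Log return between each pair of consecutive observations.
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return math.sqrt(sum(r * r for r in log_returns))


# Made-up intraday prices in USD, not data from the paper.
sample_prices = [1000.0, 1010.0, 995.0, 1005.0]
print(realized_volatility(sample_prices))
```

A flat price series gives a realized volatility of zero under this definition, and larger swings between consecutive observations increase it regardless of their direction.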

arXiv.org e-Print Archive

Repository for Publications and Research Data

Crossref