7 research outputs found

    Comparison of Different Modeling Techniques for Flemish Twitter Sentiment Analysis

    No full text
    Microblogging websites such as Twitter have caused sentiment analysis research to increase in popularity over the past decade. However, most studies focus on the English language, which leaves other languages underrepresented. Therefore, in this paper, we compare several modeling techniques for sentiment analysis using a new dataset containing Flemish tweets. The key contribution of our paper lies in its innovative experimental design: we compared different preprocessing techniques and vector representations to find the best-performing combination for a Flemish dataset. We compared models belonging to four different categories: lexicon-based methods, traditional machine learning models, neural networks, and attention-based models. We found that more preprocessing leads to better results, but the best-performing vector representation approach depends on the model applied. Moreover, an immense gap was observed between the performances of the lexicon-based approaches and those of the other models. The traditional machine learning approaches and the neural networks produced similar results, but the attention-based model was the best-performing technique. Nevertheless, a tradeoff should be made between computational expenses and performance gains.
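    A minimal sketch of the kind of comparison described above, assuming scikit-learn, a placeholder two-tweet corpus, and logistic regression as the traditional machine learning model; the paper's actual dataset, preprocessing steps, and model set are not reproduced here.

```python
# Hypothetical illustration: comparing preprocessing levels and vector
# representations for tweet sentiment classification (not the paper's exact setup).
import re
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

def light_clean(text):
    """Minimal preprocessing: lowercase only."""
    return text.lower()

def heavy_clean(text):
    """Heavier preprocessing: strip URLs, mentions, and non-letter characters."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.sub(r"[^a-zàâçéèêëïîöôûùü\s]", " ", text)

# Placeholder data; a real run would use the Flemish tweet dataset and labels.
tweets = ["Wat een prachtige dag!", "Dit is echt verschrikkelijk slecht."] * 50
labels = [1, 0] * 50

for clean in (light_clean, heavy_clean):
    cleaned = [clean(t) for t in tweets]
    for name, vec in [("counts", CountVectorizer()), ("tf-idf", TfidfVectorizer())]:
        model = make_pipeline(vec, LogisticRegression(max_iter=1000))
        score = cross_val_score(model, cleaned, labels, cv=5, scoring="f1").mean()
        print(f"{clean.__name__:12s} {name:7s} F1 = {score:.3f}")
```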

    Predicting the demographics of Twitter users with programmatic weak supervision

    No full text
    Predicting the demographics of Twitter users has become a problem of great interest in the computational social sciences. However, the limited number of public datasets with ground-truth labels and the tremendous cost of hand-labeling make this task particularly challenging. Recently, programmatic weak supervision has emerged as a new framework to train classifiers on noisy data with minimal human labeling effort. In this paper, demographic prediction is framed for the first time as a programmatic weak supervision problem. A new three-step methodology for gender, age category, and location prediction is provided, which outperforms traditional programmatic weak supervision and is competitive with the state-of-the-art deep learning model. The study is performed in Flanders, a small Dutch-speaking European region, characterized by a limited number of user profiles and tweets. An evaluation conducted on an independent hand-labeled test set shows that the proposed methodology can be generalized to unseen users within the geographic area of interest.
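    The paper's three-step methodology is not reproduced here, but the programmatic-weak-supervision idea can be illustrated with a hedged sketch using the Snorkel library: hand-written labeling functions emit noisy gender labels from profile fields, and a label model combines their votes. The labeling rules, name lists, and DataFrame columns below are purely illustrative.

```python
# Hypothetical sketch of programmatic weak supervision for gender prediction.
# Labeling rules, name lists, and column names are illustrative only.
import pandas as pd
from snorkel.labeling import labeling_function, PandasLFApplier
from snorkel.labeling.model import LabelModel

ABSTAIN, FEMALE, MALE = -1, 0, 1
FEMALE_NAMES = {"an", "els", "marie"}   # toy lists; a real system would use
MALE_NAMES = {"jan", "piet", "tom"}     # curated first-name dictionaries

@labeling_function()
def lf_first_name(x):
    first = x.profile_name.split()[0].lower() if x.profile_name else ""
    if first in FEMALE_NAMES:
        return FEMALE
    if first in MALE_NAMES:
        return MALE
    return ABSTAIN

@labeling_function()
def lf_bio_keywords(x):
    bio = (x.bio or "").lower()
    if "mama van" in bio:
        return FEMALE
    if "papa van" in bio:
        return MALE
    return ABSTAIN

df = pd.DataFrame({
    "profile_name": ["Jan Peeters", "Marie Claes", "Tom V.", "An DW"],
    "bio": ["papa van twee", "", "", "mama van drie"],
})

# Apply the labeling functions and combine their noisy votes with a label model.
L_train = PandasLFApplier([lf_first_name, lf_bio_keywords]).apply(df)
label_model = LabelModel(cardinality=2, verbose=False)
label_model.fit(L_train, n_epochs=200, seed=42)
print(label_model.predict(L_train))
```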

    Topic modelling applied on innovation studies of Flemish companies

    No full text
    Mapping innovation in companies for the purpose of official statistics is usually done through business surveys. However, this traditional approach faces several drawbacks, such as a lack of responses, response bias, low frequency, and high costs. Alternatively, text-based models trained on web-scraped text from company websites have been developed to complement or substitute traditional business surveys. This paper utilises web scraping and text-based models to map business innovation in Flanders, with a focus on identifying different types of innovation through topic modelling. More specifically, the scraped web texts are used to identify innovative economic sectors or topics, and to classify firms into these topics using Top2Vec and Lbl2Vec. We conclude that both models can be successfully combined to discover topics (or sectors) and classify companies into these topics, which results in an additional parameter for mapping innovation in different regions.
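    As a hedged illustration of the topic-discovery step, running Top2Vec on scraped company texts might look like the sketch below; the corpus loader is hypothetical, the parameters are defaults, and the subsequent Lbl2Vec classification step used in the paper is not shown.

```python
# Illustrative Top2Vec topic discovery on web-scraped company texts.
from top2vec import Top2Vec

# Hypothetical loader returning a list of strings; Top2Vec needs at least a few
# hundred documents to form stable topic clusters.
documents = load_scraped_company_texts()

# Learn joint document/word embeddings and reduce them to dense topic clusters.
model = Top2Vec(documents, speed="learn", workers=4)

print("Number of discovered topics:", model.get_num_topics())
topic_words, word_scores, topic_nums = model.get_topics(5)
for words, num in zip(topic_words, topic_nums):
    print(f"Topic {num}: {', '.join(words[:10])}")

# Inspect the companies most central to one topic, e.g. a candidate sector.
docs, doc_scores, doc_ids = model.search_documents_by_topic(topic_num=0, num_docs=3)
```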

    Benchmark study for Flemish Twitter sentiment analysis

    No full text
    Microblogging websites such as Twitter have caused sentiment analysis research to increase in popularity over the past decade. However, most studies focus on the English language, which leaves other languages underrepresented. Therefore, we conduct a benchmark study on several modeling techniques for sentiment analysis with a new dataset containing Flemish tweets. The key contribution of our paper lies in its innovative experimental design, where we compare different preprocessing techniques and vector representations to find the best-performing combination for a Flemish dataset. We compare models belonging to four different categories: lexicon-based methods, traditional machine learning models, neural networks, and attention-based models. We find that more preprocessing leads to better results, but the best-performing vector representation approach depends on the applied model. Moreover, an immense gap is observed between the performance of the lexicon-based approaches and that of the other models. The traditional machine learning approaches and the neural networks produce similar results, but the attention-based model is the best-performing technique. Nevertheless, a tradeoff should be made between computational expenses and performance gains.
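    For the attention-based category, a hedged sketch of applying a pretrained transformer to Flemish tweets with the Hugging Face pipeline is shown below; the nlptown multilingual sentiment checkpoint is used purely for illustration and is not necessarily the model evaluated in the benchmark.

```python
# Illustrative attention-based sentiment scoring of Dutch/Flemish tweets.
# The checkpoint is a publicly available multilingual sentiment model used
# here only as an example; the benchmark's attention-based model may differ.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="nlptown/bert-base-multilingual-uncased-sentiment",
)

tweets = [
    "Wat een prachtige dag in Gent!",
    "De treinen hebben alweer vertraging, echt frustrerend.",
]
for tweet, result in zip(tweets, classifier(tweets)):
    print(f"{result['label']:>8s}  ({result['score']:.2f})  {tweet}")
```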

    Benchmarking conventional outlier detection methods

    No full text
    Nowadays, businesses in many industries face an increasing flow of data and information. Data are at the core of the decision-making process, hence it is vital to ensure that the data are of high quality and free of noise. Outlier detection methods aim to find unusual patterns in data and have applications in many practical domains. These methods employ different techniques, ranging from pure statistical tools to deep learning models that have gained popularity in recent years. Machine learning models are among the most popular outlier detection techniques. They have several characteristics which affect the potential of their usefulness in real-life scenarios. The goal of this paper is to add to the existing body of research on outlier detection by comparing the isolation forest, DBSCAN, and LOF techniques. Thus, we investigate the research question: which of these outlier detection models performs best in practical business applications? To this end, three models are built on 12 datasets and compared using 5 performance metrics. The final comparison of the models is based on McNemar's test, as well as on ranks per performance measure and on average. Three main conclusions can be made from the benchmarking study. First, the models considered in this research disagree differently, i.e., their type I and type II errors are not similar. Second, considering the time, AUPRC, and sensitivity metrics, the iForest model is ranked the highest. Hence, the iForest model is best when time performance is a key consideration, as well as when the opportunity costs of not detecting an outlier are high. Third, the DBSCAN model obtains the highest ranking along the F1 score and precision dimensions. This allows us to conclude that if raising many false alarms is not an important concern, the DBSCAN model is the best to employ.
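    A minimal sketch of such a comparison on a single synthetic dataset, assuming the scikit-learn implementations of the three methods and two of the reported metrics; the 12 datasets, parameter tuning, and McNemar-based ranking of the paper are not reproduced.

```python
# Illustrative comparison of isolation forest, DBSCAN, and LOF on synthetic data.
# Contamination rate, parameters, and evaluation are simplified placeholders.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.metrics import average_precision_score, f1_score

rng = np.random.default_rng(0)
X_inliers = rng.normal(0, 1, size=(950, 2))
X_outliers = rng.uniform(-6, 6, size=(50, 2))
X = np.vstack([X_inliers, X_outliers])
y_true = np.r_[np.zeros(950), np.ones(50)]  # 1 marks an outlier

# Isolation forest: -1 predictions are outliers.
iforest_pred = (IsolationForest(contamination=0.05, random_state=0)
                .fit_predict(X) == -1).astype(int)

# DBSCAN: points labelled -1 (noise) are treated as outliers.
dbscan_pred = (DBSCAN(eps=0.5, min_samples=5).fit_predict(X) == -1).astype(int)

# LOF: -1 predictions are outliers.
lof_pred = (LocalOutlierFactor(n_neighbors=20, contamination=0.05)
            .fit_predict(X) == -1).astype(int)

for name, pred in [("iForest", iforest_pred), ("DBSCAN", dbscan_pred), ("LOF", lof_pred)]:
    print(f"{name:8s} F1 = {f1_score(y_true, pred):.3f}  "
          f"AUPRC (binary preds) = {average_precision_score(y_true, pred):.3f}")
```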

    SEER: A Knapsack approach to Exemplar Selection for In-Context HybridQA

    No full text
    Question answering over hybrid contexts is a complex task, which requires the combination of information extracted from unstructured texts and structured tables in various ways. Recently, In-Context Learning demonstrated significant performance advances for reasoning tasks. In this paradigm, a large language model performs predictions based on a small set of supporting exemplars. The performance of In-Context Learning depends heavily on the selection procedure of the supporting exemplars, particularly in the case of HybridQA, where considering the diversity of reasoning chains and the large size of the hybrid contexts becomes crucial. In this work, we present Selection of ExEmplars for hybrid Reasoning (SEER), a novel method for selecting a set of exemplars that is both representative and diverse. The key novelty of SEER is that it formulates exemplar selection as a Knapsack Integer Linear Program. The Knapsack framework provides the flexibility to incorporate diversity constraints that prioritize exemplars with desirable attributes, and capacity constraints that ensure that the prompt size respects the provided capacity budgets. The effectiveness of SEER is demonstrated on FinQA and TAT-QA, two real-world benchmarks for HybridQA, where it outperforms previous exemplar selection methods.
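    SEER's exact integer linear program is not reproduced here, but a hedged sketch of the core idea, selecting exemplars that maximize similarity to the test question under a prompt-token budget and simple diversity constraints, could be written with the PuLP solver as follows; the similarity scores, token counts, and reasoning-type tags are made-up values.

```python
# Illustrative knapsack-style exemplar selection (not SEER's exact formulation).
# Similarity scores, token lengths, and reasoning-type tags are placeholders.
from pulp import LpBinary, LpMaximize, LpProblem, LpVariable, lpSum, PULP_CBC_CMD

exemplars = [
    # (similarity to test question, prompt tokens, reasoning type)
    (0.91, 320, "table"), (0.85, 410, "text"), (0.74, 150, "table"),
    (0.70, 500, "hybrid"), (0.66, 210, "text"), (0.52, 180, "hybrid"),
]
token_budget = 1000

prob = LpProblem("exemplar_selection", LpMaximize)
x = [LpVariable(f"pick_{i}", cat=LpBinary) for i in range(len(exemplars))]

# Objective: maximize total similarity of the selected exemplars.
prob += lpSum(score * x[i] for i, (score, _, _) in enumerate(exemplars))
# Capacity constraint: the selected exemplars must fit within the prompt budget.
prob += lpSum(tokens * x[i] for i, (_, tokens, _) in enumerate(exemplars)) <= token_budget
# Diversity constraints (illustrative): require at least one table-based and
# one text-based exemplar among the selected set.
prob += lpSum(x[i] for i, (_, _, t) in enumerate(exemplars) if t == "table") >= 1
prob += lpSum(x[i] for i, (_, _, t) in enumerate(exemplars) if t == "text") >= 1

prob.solve(PULP_CBC_CMD(msg=False))
print("Selected exemplars:", [i for i, var in enumerate(x) if var.value() == 1])
```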

    Evaluating text classification: a benchmark study

    No full text
    This paper presents an impartial and extensive benchmark for text classification involving five different text classification tasks, 20 datasets, 11 different model architectures, and 42,800 algorithm runs. The five text classification tasks are fake news classification, topic detection, emotion detection, polarity detection, and sarcasm detection. While in practice, especially in Natural Language Processing (NLP), research tends to focus on the most sophisticated models, we hypothesize that this is not always necessary. Therefore, our main objective is to investigate whether the largest state-of-the-art (SOTA) models are always preferred, or in which cases simple methods can compete with complex models, i.e., for which dataset specifications and classification tasks. We assess the performance of different methods with varying complexity, ranging from simple statistical and machine learning methods to pretrained transformers such as the robustly optimized BERT (Bidirectional Encoder Representations from Transformers) pretraining approach (RoBERTa). This comprehensive benchmark is lacking in the existing literature, where research mainly compares similar types of methods. Furthermore, with increasing awareness of the ecological impact of extensive computational resource usage, this comparison is both critical and timely. We find that, overall, bidirectional long short-term memory (LSTM) networks are ranked as the best-performing method, albeit not statistically significantly better than logistic regression and RoBERTa. Overall, we cannot conclude that simple methods perform worse, although this depends mainly on the classification task. Concretely, we find that for fake news classification and topic detection, simple techniques are the best-ranked models and, consequently, it is not necessary to train complicated neural network architectures for these classification tasks. Moreover, we also find a negative correlation between F1 performance and complexity for the smallest datasets (with dataset size below 10,000). Finally, the different models’ results are analyzed in depth to explain the model decisions, which is an increasing requirement in the field of text classification.
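    As a hedged sketch of the best-ranked model family, a minimal bidirectional LSTM text classifier in Keras might look as follows; the vocabulary size, sequence length, network dimensions, and placeholder corpus are illustrative and not the benchmark's actual configuration.

```python
# Minimal bidirectional LSTM text classifier (illustrative hyperparameters only).
import tensorflow as tf

# Placeholder corpus; any of the benchmark tasks (e.g. polarity detection) would
# supply real texts and binary labels here.
texts = ["great movie, loved it", "terrible plot and acting"] * 100
labels = [1, 0] * 100

# Turn raw strings into padded integer sequences.
vectorizer = tf.keras.layers.TextVectorization(max_tokens=10_000,
                                               output_sequence_length=50)
vectorizer.adapt(tf.constant(texts))
X = vectorizer(tf.constant(texts))
y = tf.constant(labels)

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10_000, output_dim=64, mask_zero=True),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=32, validation_split=0.2)
```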