Using Isolation Forest and Alternative Data Products to Overcome Ground Truth Data Scarcity for Improved Deep Learning-based Agricultural Land Use Classification Models
High-quality labelled datasets are a cornerstone in the development of deep learning models for land use classification. The high cost of data collection, the errors introduced during data mapping efforts, the lack of local knowledge, and the spatial variability of the data hinder the development of accurate and spatially transferable deep learning models in the context of agriculture. In this paper, we investigate the use of Isolation Forest (IF), an anomaly detection algorithm, to reduce noise in a large-scale, low-resolution alternative ground-truth dataset used to train land use deep learning models. We use a modest-sized, high-resolution, high-fidelity, manually collected ground-truth dataset to calibrate Isolation Forest parameters and evaluate our approach, highlighting the relatively low cost of the methodology. Our data-centric methodology demonstrates the efficacy of deep learning methods coupled with IF to create mid-resolution land-use models and map products for agriculture using an alternative ground-truth dataset. Moreover, we compare our deep learning approach with a traditional remote sensing algorithm and evaluate the spatial transferability of the created models. Finally, we reflect upon the lessons learnt and future work.
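To illustrate the core idea, here is a minimal sketch of how Isolation Forest could flag and drop likely-noisy samples from an alternative ground-truth dataset before model training. The feature layout, the per-class filtering strategy, and the contamination value are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch: per-class label-noise filtering with Isolation Forest.
# Features, labels, and the contamination value are illustrative assumptions;
# the paper calibrates IF parameters against a high-fidelity manual dataset.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
X_alt = rng.normal(size=(10_000, 8))     # per-sample features (e.g., spectral bands)
y_alt = rng.integers(0, 4, size=10_000)  # noisy land-use labels from the alternative dataset

# Fit one forest per labelled class: samples that look anomalous within their
# own class are treated as probable label noise and dropped before training.
keep = np.zeros(len(X_alt), dtype=bool)
for cls in np.unique(y_alt):
    idx = np.where(y_alt == cls)[0]
    forest = IsolationForest(contamination=0.1, random_state=0)
    keep[idx] = forest.fit_predict(X_alt[idx]) == 1  # +1 = inlier, -1 = anomaly

X_clean, y_clean = X_alt[keep], y_alt[keep]
print(f"kept {keep.sum()} of {len(keep)} samples for model training")
```

In this sketch, the contamination parameter stands in for the IF parameters that the paper calibrates against the high-fidelity manual dataset.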
A Deep Learning Model for Predicting Stock Prices in Tanzania
This research article was published in Engineering, Technology & Applied Science Research, Volume 13, Issue 2, Pages 10517-10522, April 2023.
Stock price prediction models help traders to reduce investment risk and choose the most profitable stocks.
Machine learning and deep learning techniques have been applied to develop various models. As there is a
lack of literature on efforts to utilize such techniques to predict stock prices in Tanzania, this study
attempted to fill this gap. This study selected active stocks from the Dar es Salaam Stock Exchange and
developed LSTM and GRU deep learning models to predict the next-day closing prices. The results showed
that LSTM had the highest prediction accuracy with an RMSE of 4.7524 and an MAE of 2.4377. This
study also examined whether accounting for the outstanding shares of each stock significantly
improves a joint model for predicting the closing prices of multiple stocks. Experimental results
with both models revealed that prediction accuracy improved significantly when the number of
outstanding shares of each stock was taken into account. The LSTM model achieved an RMSE of 10.4734
when the outstanding shares were not taken into account and 4.7524 when they were taken into account,
showing an improvement of 54.62%. Similarly, the GRU model achieved an RMSE of 12.4583 when outstanding
shares were not taken into account and 8.7162 when they were taken into account, showing an
improvement of 30.04%. The best model was implemented in a web-based prototype to make it accessible
to stockbrokers and investment advisors.
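To make the modelling setup concrete, here is a minimal sketch of a next-day closing-price LSTM that appends each stock's outstanding shares as a second input feature, in the spirit of the best-performing configuration above. The window length, layer sizes, and synthetic data are illustrative assumptions, not the paper's exact architecture or data.

```python
# Minimal sketch: next-day closing-price LSTM with outstanding shares as an
# extra input feature. Window length, layer sizes, and data are assumptions.
import numpy as np
import tensorflow as tf

WINDOW = 30  # days of price history per training sample

def make_windows(prices, shares_outstanding):
    """Build (X, y) pairs where each timestep carries [price, outstanding shares]."""
    X, y = [], []
    for t in range(len(prices) - WINDOW):
        features = np.stack(
            [prices[t:t + WINDOW], np.full(WINDOW, shares_outstanding)],
            axis=-1,
        )
        X.append(features)
        y.append(prices[t + WINDOW])  # next-day close
    return np.array(X), np.array(y)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(WINDOW, 2)),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1),  # predicted next-day closing price
])
model.compile(
    optimizer="adam",
    loss="mse",
    metrics=[tf.keras.metrics.RootMeanSquaredError(),
             tf.keras.metrics.MeanAbsoluteError()],
)

# Synthetic single-stock series; a joint model would concatenate windows from
# several stocks (and, in practice, scale both features before training).
prices = np.cumsum(np.random.randn(500)) + 100.0
X, y = make_windows(prices, shares_outstanding=2.5e8)
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
```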
HIGEA: An Intelligent Conversational Agent to Detect Caregiver Burden
Mental health disorders increasingly affect people worldwide. As a consequence, more
families and relatives find themselves acting as caregivers. Most often, these are untrained people
who experience loneliness and abandonment, and who often develop signs of depression (i.e., caregiver
burden syndrome). In this work, we present HIGEA, a digital system based on a conversational agent
that helps detect caregiver burden. The conversational agent naturally embeds psychological test
questions into informal conversations, which aims to increase adherence and avoid
user bias. A proof-of-concept is developed based on the popular Zarit Test, which is widely used to
assess caregiver burden. Preliminary results show that the system is useful and effective.
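For intuition, the sketch below shows how a burden-scale item might be embedded in casual dialogue and the free-text reply mapped onto a Likert-style score. The prompt phrasing, cue table, and scoring are purely illustrative assumptions, not HIGEA's actual dialogue logic.

```python
# Minimal sketch: embedding a burden-scale item in casual dialogue and mapping
# the free-text reply to a Likert-style score. Prompt phrasing, cue table, and
# scoring are illustrative assumptions, not HIGEA's actual dialogue logic.
ZARIT_ITEM = "Do you feel that you don't have enough time for yourself?"
CASUAL_PROMPT = "By the way, have you had any time just for yourself lately?"

# Coarse mapping from reply cues to a Zarit-style 0-4 frequency scale. The
# casual phrasing inverts the item, so never having time maps to the highest
# burden score.
SCORE_CUES = {
    "never": 4, "no time at all": 4, "rarely": 3,
    "sometimes": 2, "often": 1, "plenty": 0, "always": 0,
}

def score_reply(reply):
    """Return a 0-4 burden score if a known cue appears, else None (re-ask later)."""
    text = reply.lower()
    for cue, score in SCORE_CUES.items():
        if cue in text:
            return score
    return None

print(score_reply("Honestly, I rarely get a moment to myself."))  # -> 3
```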
Blacks is to Anger as Whites is to Joy? Understanding Latent Affective Bias in Large Pre-trained Neural Language Models
Transformer-based large Pre-trained Language Models (PLMs) have driven groundbreaking advances and substantial performance improvements in deep learning-based Natural Language Processing. The wide availability of unlabeled data within the human-generated data deluge, together with self-supervised learning strategies, has accelerated the success of large PLMs in language generation, language understanding, etc. At the same time, latent historical bias and unfairness towards a particular gender, race, etc., encoded intentionally or unintentionally into the corpora, harms and calls into question the utility and efficacy of large PLMs in many real-world applications, particularly for protected groups. In this paper, we present an extensive investigation into the existence of "Affective Bias" in large PLMs, i.e., any biased association of emotions such as anger, fear, and joy with a particular gender, race, or religion, with respect to the downstream task of textual emotion detection. We begin with a corpus-level analysis, searching for imbalanced distributions of affective words within a domain in the large-scale corpora used to pre-train and fine-tune PLMs. To then quantify affective bias in model predictions, we perform an extensive set of class-based and intensity-based evaluations using various bias evaluation corpora. Our results show the existence of statistically significant affective bias in PLM-based emotion detection systems, indicating a biased association of certain emotions with a particular gender, race, and religion.
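For concreteness, the sketch below illustrates a class-based probe in the spirit of this evaluation: an off-the-shelf emotion classifier is run over template sentences that differ only in a demographic term, and the predicted emotions are compared. The model checkpoint and templates are illustrative assumptions, not the paper's evaluation corpora or models.

```python
# Minimal sketch of a class-based affective-bias probe: compare the emotion
# predicted for template sentences that differ only in a demographic term.
# The model checkpoint and templates are illustrative assumptions.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="j-hartmann/emotion-english-distilroberta-base",  # assumed off-the-shelf emotion model
)

TEMPLATE = "The {group} person said nothing and walked away."
GROUPS = ["Black", "White", "Asian", "Hispanic"]

for group in GROUPS:
    pred = classifier(TEMPLATE.format(group=group))[0]
    print(f"{group:10s} -> {pred['label']} ({pred['score']:.3f})")

# A full evaluation would aggregate predictions over many templates per group
# and test class-distribution differences for statistical significance.
```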
CLUB Working Papers in Linguistics Volume 6
This sixth volume of the series "CLUB Working Papers in Linguistics" collects some of the contributions presented during the initiatives organized by the Circolo Linguistico dell'Università di Bologna in the 2020-2021 academic year. The first three essays, authored respectively by Elisa Corino (Università di Torino), Marina Benedetti (Università per Stranieri di Siena), and Andrea Sansò (Università dell'Insubria), come from the official programme. The following three contributions were originally presented at the Circolo's periodic seminars; these are the works of Silvia Brambilla and Idea Basile (Università di Bologna and Università Roma "La Sapienza"), Marta Maffia and Massimo Pettorino (Università di Napoli "L'Orientale"), and Anna Dall'Acqua (Università di Bologna and Injenia S.r.l.). The volume closes with an article by Ottavia Cepraga, winner of the 2021 "Una tesi in linguistica" prize.