20 research outputs found

    COINS RESEARCH SUMMER SCHOOL 2021 (ONLINE-ZOOM)

    The COINS summer school is a one-week intensive course for Ph.D. students in computer and information security and related fields. In 2021, the summer school was offered in cooperation with the UiA study centre in Metochi on Lesvos Island, Greece. However, due to COVID-19, the summer school sessions were conducted online via Zoom.

    Bloom’s Learning Outcomes’ Automatic Classification Using LSTM and Pretrained Word Embeddings

    Bloom’s taxonomy is a popular model for classifying educational learning objectives into learning levels across three domains: cognitive, affective, and psychomotor. Each domain is further divided into levels. The cognitive domain comprises the knowledge, comprehension, application, analysis, synthesis, and evaluation levels. In educational institutions, designing course learning outcomes (CLOs) according to Bloom’s levels and mapping assessment items onto the designed CLOs is an important task. Every semester, faculty and administrators read thousands of statements to complete the tedious mapping of CLOs and assessment items to Bloom’s levels in order to improve student learning. This paper proposes an LSTM-based deep learning model to classify CLOs and assessment items into the levels of Bloom’s cognitive domain. There have been some attempts in the literature to assign Bloom’s taxonomy categories automatically using a keyword-based approach, but this approach suffers from low accuracy and overlapping keywords. When we initially applied a keyword-based approach to our datasets, we achieved an overall accuracy of 55% in classifying CLOs and assessment items into Bloom’s taxonomy. The proposed model predicts the Bloom’s level for a CLO and for an assessment question item. Its architecture is simple compared with other deep learning models reported in the literature, and it achieves classification accuracies of 87% and 74% on CLOs and assessment question items, respectively. The proposed model obtained a 3% increase in overall accuracy compared with an existing study on the same task. To the best of our knowledge, this is the first attempt to apply deep learning to classifying educational objectives into Bloom’s levels.
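
    The keyword-overlap problem mentioned in this abstract can be illustrated with a minimal sketch. The keyword lists and the function name `keyword_levels` below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import re

# Hypothetical keyword lists per cognitive level (assumption, not from the paper).
BLOOM_KEYWORDS = {
    "knowledge": {"define", "list", "name", "recall"},
    "comprehension": {"explain", "summarize", "describe"},
    "application": {"apply", "use", "demonstrate", "solve"},
    "analysis": {"analyze", "compare", "differentiate"},
    "synthesis": {"design", "compose", "construct"},
    "evaluation": {"evaluate", "justify", "critique", "compare"},
}


def keyword_levels(statement):
    """Return every Bloom level whose keyword list matches the statement."""
    words = set(re.findall(r"[a-z]+", statement.lower()))
    return sorted(
        level for level, keywords in BLOOM_KEYWORDS.items() if words & keywords
    )


# "compare" appears in two lists, so the keyword rule cannot pick one level.
print(keyword_levels("Compare two sorting algorithms"))
print(keyword_levels("Design a database schema"))
```

    A statement that matches several lists, or none, is exactly the case a keyword rule cannot resolve on its own, which is one plausible reason for the low baseline accuracy the abstract reports.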

    Towards Understanding of User Perceptions for Smart Border Control Technologies using a Fine-Tuned Transformer Approach

    Smart Border Control (SBC) technologies became a hot topic in recent years when the European Union (EU) Commission announced the Smart Borders Package to improve the efficiency and security of border crossing points (BCPs). Although BCP technologies have potential benefits in terms of enabling the processing of travellers' data, they still pose acceptability and usability challenges when used by travellers. The success of a technology depends on user acceptance, and sentiment analysis is one of the primary techniques for measuring it. Although a variety of studies in the literature have used sentiment analysis to understand user acceptance in different domains, to the best of our knowledge no study has used it to measure the user acceptance of SBC technologies. Thus, in this study, we propose a fine-tuned transformer model along with an automatic sentiment label generation technique to perform sentiment analysis as a step towards gaining insight into user acceptance of BCP technologies. The results obtained in this study are promising, given that no training data is available from BCPs. The proposed approach was validated against the IMDB reviews dataset and achieved a weighted F1-score of 79% on the sentiment analysis task.
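
    For readers unfamiliar with the metric reported in this abstract, the following is a minimal sketch of a weighted F1-score: the per-class F1 averaged with weights proportional to each class's support. The function name `weighted_f1` and the toy labels are assumptions for illustration and do not reproduce the study's data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Toy labels (illustrative only).
y_true = ["pos", "pos", "pos", "neg"]
y_pred = ["pos", "pos", "neg", "neg"]
print(round(weighted_f1(y_true, y_pred), 4))
```

    Weighting by support means the majority class dominates the score, so the metric should be read alongside per-class results when classes are imbalanced.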

    Assessing the Usability of ChatGPT for Formal English Language Learning

    Emerging technologies have been constantly shaping the education domain; in particular, the use of artificial intelligence (AI) for language learning has attracted significant attention. Many AI tools are being used for learning foreign languages, in both formal and informal ways. Many studies have explored the potential of ChatGPT for education and language learning, but none of them has conducted an exploratory study assessing its usability. This paper assesses the usability of ChatGPT for formal English language learning. The study uses a standard questionnaire-based approach to collect participants' feedback on the usefulness and effectiveness of ChatGPT. Participants gave their feedback after performing a series of tasks related to formal English language learning with ChatGPT. The student participants selected for this study had diverse English language proficiency levels, education levels, and nationalities. The quantitative analysis of the participants' responses sheds light on their experience of the usability of ChatGPT for different English language learning tasks such as conversation, writing, grammar, and vocabulary. The findings are promising and indicate that ChatGPT is an effective tool for formal English language learning. Overall, this study contributes to the fast-growing research domain of emerging technologies for formal English language learning by conducting an in-depth assessment of the usability of ChatGPT in that setting.

    Towards Improved Classification Accuracy on Highly Imbalanced Text Dataset Using Deep Neural Language Models

    Data imbalance is a frequently occurring problem in classification tasks where the number of samples in one category exceeds the number in others. Quite often, the minority class data is of great importance, representing concepts of interest, and is challenging to obtain in real-life scenarios and applications. Consider a customer dataset for bank loans: the majority of instances belong to the non-defaulter class and only a small number of customers are labeled as defaulters, yet in such highly imbalanced datasets accuracy on the defaulter label matters more than on the non-defaulter label. A lack of sufficient data samples across all class labels results in data imbalance, causing poor classification performance when training the model. Synthetic data generation and oversampling techniques such as SMOTE and ADASYN can address this issue for statistical data, yet such methods suffer from overfitting and substantial noise. While such techniques have proved useful for synthetic numerical and image data generation using GANs, the effectiveness of approaches proposed for textual data, which can retain grammatical structure, context, and semantic information, has yet to be evaluated. In this paper, we address this issue by assessing text sequence generation algorithms coupled with grammatical validation on domain-specific, highly imbalanced datasets for text classification. We exploit the recently proposed GPT-2 and LSTM-based text generation models to introduce balance into highly imbalanced text datasets. The experiments presented in this paper on three highly imbalanced datasets from different domains show that the performance of the same deep neural network models improves by up to 17% when the datasets are balanced using generated text.

    The impact of synthetic text generation for sentiment analysis using GAN based models

    Data imbalance is a common issue in which the number of instances in one or more categories far exceeds that in the others, and the educational domain is no exception. The difficulty of collecting course feedback on a large scale and the lack of publicly available datasets in this domain limit models' performance, especially for deep neural network based models, which are data hungry. A model trained on such an imbalanced dataset would naturally favor the majority class. However, the minority class could be critical for decision-making in prediction systems, and therefore it is usually desirable to train a model with equally high class-level accuracy. This paper addresses the data imbalance issue for sentiment analysis of users' opinions on two educational feedback datasets using synthetic text generation deep learning models. Two state-of-the-art text generation GAN models, CatGAN and SentiGAN, are employed to synthesize the text used to balance the highly imbalanced datasets in this study. Particular emphasis is given to the diversity of the synthetically generated samples used to populate the minority classes. Experimental results on highly imbalanced datasets show a significant improvement in models' performance on CR23K and CR100K after balancing with synthetic data for the sentiment classification task.

    A Systematic Review on the Use of Emerging Technologies in Teaching English as an Applied Language at the University Level

    At present, emerging technologies, such as machine learning, deep learning, and various forms of artificial intelligence, are penetrating different fields of education, including foreign language education (FLE). Moreover, the current young generation was born into a technological environment and perceives technologies as an indispensable part of everyday life. However, young people mainly use technologies in their informal learning, and there is not much research into emerging technologies in FLE, namely in teaching and learning English as an applied language. Therefore, the purpose of this systematic review is to identify, bring together, compare, and analyze the technologies that are currently efficiently employed in foreign language teaching and learning and, based on the findings of the detected experimental studies, to provide specific pedagogical implications on how to use these technologies in the acquisition of English as an applied language at the university level. The methodology followed the PRISMA guidelines for systematic reviews and meta-analyses. The results of the detected experimental studies revealed a serious lack of empirical use of the latest technologies, such as chatbots or virtual reality (VR) devices, in foreign language (FL) education. Moreover, mobile apps focus merely on the development of FL vocabulary. The findings also indicate that although FL teachers might know about the latest technologies, such as neural machine translation, in theory, they do not know how to implement them in practice in their teaching. Therefore, this research suggests that teachers must be trained and pedagogically guided on how to purposefully implement these technologies in their FL classes to support traditional instruction and to identify what skills or language structures could be developed through their use. In addition, more experimental studies are needed to clearly establish the evidence for their usefulness in teaching a foreign language as an applied language.

    SentiUrdu-1M: A large-scale tweet dataset for Urdu text sentiment analysis using weakly supervised learning

    Low-resource languages are gaining much-needed attention with the advent of deep learning models and pre-trained word embeddings. Though spoken by more than 230 million people worldwide, Urdu is one such low-resource language; it has recently gained popularity online and is attracting a lot of attention and support from the research community. One challenge faced by such resource-constrained languages is the scarcity of publicly available large-scale datasets for conducting any meaningful study. In this paper, we address this challenge by collecting the first-ever large-scale Urdu Tweet Dataset for sentiment analysis and emotion recognition. The dataset consists of 1,140,821 tweets in the Urdu language. Manually labeling such a large number of tweets would be tedious, error-prone, and practically impossible; therefore, the paper also proposes a weakly supervised approach to label tweets automatically. The approach uses the emoticons within the tweets, in addition to SentiWordNet, to categorize the extracted tweets into positive, negative, and neutral categories. Baseline deep learning models are implemented to compute the accuracy of three labeling approaches, i.e., VADER, TextBlob, and our proposed weakly supervised approach. Unlike the weakly supervised labeling approach, VADER and TextBlob label most tweets as neutral and are highly correlated with each other. This is largely attributed to the fact that these models do not consider emoticons when assigning polarity.
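
    The emoticon-based part of the weak labeling described in this abstract can be sketched as follows. This is a simplified illustration only: the emoticon sets and the function name `weak_label` are hypothetical, and the SentiWordNet component of the paper's approach is omitted.

```python
# Hypothetical emoticon sets (assumption, not the paper's lexicon).
POSITIVE = {"😊", "😍", "👍", ":)"}
NEGATIVE = {"😢", "😡", "👎", ":("}


def weak_label(tweet):
    """Assign a polarity label by counting positive and negative emoticons."""
    pos = sum(tweet.count(e) for e in POSITIVE)
    neg = sum(tweet.count(e) for e in NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    # No emoticons, or a tie: fall back to neutral.
    return "neutral"


print(weak_label("great match today 😊👍"))
print(weak_label("stuck in traffic again 😡"))
print(weak_label("the meeting starts at noon"))
```

    Labels produced this way are noisy by construction, which is why they are called weak; a tool that ignores emoticons would see only the plain text of the first two examples.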

    Evaluating Polarity Trend Amidst the Coronavirus Crisis in Peoples’ Attitudes toward the Vaccination Drive

    It has been more than a year since the coronavirus (COVID-19) engulfed the whole world, disturbing daily routines, bringing down economies, and killing two million people across the globe at the time of writing. The pandemic brought the world together in a joint effort to find a cure and work toward developing a vaccine. Amid much anticipation, the first batch of vaccines started rolling out by the end of 2020, and many countries began the vaccination drive early on while others were still waiting for a successful trial. Social media, meanwhile, was bombarded with all sorts of positive and negative stories about the vaccine development and the evolving coronavirus situation. Many people were looking forward to the vaccines, while others were cautious because of the side effects and the conspiracy theories, resulting in mixed emotions. This study explores users’ tweets concerning the COVID-19 vaccine and the sentiments expressed on Twitter. It evaluates the polarity trend and its shift from the start of the coronavirus outbreak to the vaccination drive across six countries. The findings suggest that people in neighboring countries have shown quite similar attitudes toward vaccination, in contrast to their different reactions to the coronavirus outbreak.