8 research outputs found
K-MHaS: A Multi-label Hate Speech Detection Dataset in Korean Online News Comment
Online hate speech detection has become an important issue due to the growth
of online content, but resources in languages other than English are extremely
limited. We introduce K-MHaS, a new multi-label dataset for hate speech
detection that effectively handles Korean language patterns. The dataset
consists of 109k utterances from news comments and provides a multi-label
classification using 1 to 4 labels, and handles subjectivity and
intersectionality. We evaluate strong baseline experiments on K-MHaS using
Korean-BERT-based language models with six different metrics. KR-BERT with a
sub-character tokenizer outperforms others, recognizing decomposed characters
in each hate speech class.
Comment: Accepted by COLING 202
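As a rough illustration of what multi-label classification with 1 to 4 labels per utterance means in practice, the sketch below encodes label sets as multi-hot vectors and scores predictions with Hamming loss. The label names are hypothetical placeholders, not the K-MHaS label set, and Hamming loss is used here only as one common multi-label metric; the abstract does not name its six metrics.

```python
from typing import List, Sequence

# Hypothetical label inventory for illustration only; NOT the K-MHaS label set.
LABELS = ["label_a", "label_b", "label_c", "label_d", "label_e"]


def to_multi_hot(active: Sequence[str], labels: Sequence[str] = LABELS) -> List[int]:
    """Encode the active labels of one utterance as a 0/1 vector aligned with `labels`."""
    chosen = set(active)
    unknown = chosen - set(labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    # The abstract states that each utterance carries between 1 and 4 labels.
    if not 1 <= len(chosen) <= 4:
        raise ValueError("each utterance must carry 1 to 4 labels")
    return [1 if name in chosen else 0 for name in labels]


def hamming_loss(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of label slots that are predicted incorrectly, over all utterances."""
    total = sum(len(row) for row in y_true)
    if total == 0:
        raise ValueError("no label slots to score")
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total
```

A multi-hot target lets one utterance belong to several classes at once, which is how overlapping categories can be represented without forcing a single label.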
Session-Based Recommender System for Sustainable Digital Marketing
Many companies operate e-commerce websites to sell fashion products. Some customers shop with sustainability in mind, so these companies need to suggest appropriate fashion products to them. Recommender systems are key applications in such sustainable digital marketing strategies, and high recommendation performance is essential. This research aims to improve recommender performance by considering item-session and attribute-session information. We propose the Item Session-Based Recommender (ISBR) and the Attribute Session-Based Recommenders (ASBRs), which use item and attribute session data independently, and then the Feature-Weighted Session-Based Recommenders (FWSBRs), which combine multiple ASBRs under various feature weighting schemes. Our experimental results show that FWSBR with the chi-square feature weighting scheme outperforms ISBR, ASBRs, and the Collaborative Filtering Recommender (CFR). Notably, FWSBRs also overcome the cold-start item problem, a significant limitation of CFR and ISBR, without losing performance.
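One way to picture chi-square feature weighting is sketched below. It is an assumption-laden illustration, not the authors' implementation: each attribute gets a Pearson chi-square statistic measuring its association with an outcome, the statistics are normalized into weights, and per-attribute recommender scores are combined as a weighted sum. The function names, the (attribute value, outcome) input layout, and the weighted-sum combination are all hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple

Pair = Tuple[Hashable, Hashable]  # (attribute_value, outcome), hypothetical layout


def chi_square(pairs: Sequence[Pair]) -> float:
    """Pearson chi-square statistic for the contingency table built from `pairs`."""
    n = len(pairs)
    row = Counter(value for value, _ in pairs)      # attribute value totals
    col = Counter(outcome for _, outcome in pairs)  # outcome totals
    observed = Counter(pairs)                       # joint counts (0 if absent)
    stat = 0.0
    for value in row:
        for outcome in col:
            expected = row[value] * col[outcome] / n
            stat += (observed[(value, outcome)] - expected) ** 2 / expected
    return stat


def feature_weights(data: Dict[str, Sequence[Pair]]) -> Dict[str, float]:
    """Normalize per-attribute chi-square statistics so the weights sum to 1."""
    stats = {name: chi_square(pairs) for name, pairs in data.items()}
    total = sum(stats.values())
    if total == 0:  # no attribute is informative: fall back to uniform weights
        return {name: 1.0 / len(stats) for name in stats}
    return {name: stat / total for name, stat in stats.items()}


def combine_scores(
    per_attribute_scores: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> Dict[str, float]:
    """Weighted sum of item scores produced by each attribute-level recommender."""
    combined: Dict[str, float] = {}
    for attribute, scores in per_attribute_scores.items():
        for item, score in scores.items():
            combined[item] = combined.get(item, 0.0) + weights.get(attribute, 0.0) * score
    return combined
```

Because the combined score depends only on item attributes, an item with no interaction history can still be scored, which is consistent with the cold-start behavior the abstract reports.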
An empirical study on the effect of data sparsity and data overlap on cross domain collaborative filtering performance
Measuring the Effect of Mental Health on Type 2 Diabetes
There are many putative risk factors for type 2 diabetes (T2D), and the causal relationship between these factors and diabetes has been established. Socio-environmental and biological approaches are increasingly used to infer causality, and research is needed to understand the causal role of these factors in diabetes risk. Therefore, this study investigated the extent to which stress, taken as the treatment factor, raises the risk of diabetes through socio-environmental and biological factors. We present machine learning-based causal inference results generated using DoWhy, a Python library that provides a four-step causal inference method consisting of modeling, identification, estimation, and refutation steps. This study used 253,680 examples collected by the Behavioral Risk Factor Surveillance System (BRFSS), created a causal model based on a socio-environmental model, and tested the statistical significance of the obtained estimates. We identified several causal relationships and attempted various refutations. The results show that mental health problems increase the incidence of diabetes by about 15% (mean value: 0.146). The causal effect was evaluated based on the causal model, and the reliability of causal inference was proved through three refutation tests.
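The four steps named above (modeling, identification, estimation, refutation) can be sketched with the standard library alone. This sketch does not call DoWhy and does not use BRFSS data: the graph, the variable names `z`, `t`, `y`, the synthetic data generator, and the built-in effect size of 0.10 are all assumptions made for illustration. Identification is simplified to direct common causes, estimation uses stratified backdoor adjustment, and refutation uses a shuffled placebo treatment.

```python
import random
from collections import defaultdict
from statistics import mean

# Step 1, modeling: a hypothetical causal graph, parent -> children.
GRAPH = {"z": ["t", "y"], "t": ["y"]}


def simulate(n=20000, seed=0):
    """Synthetic rows where confounder z drives both treatment t and outcome y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5
        t = rng.random() < (0.7 if z else 0.3)
        y = rng.random() < 0.1 + 0.10 * t + 0.2 * z  # assumed true effect of t: 0.10
        rows.append({"z": int(z), "t": int(t), "y": int(y)})
    return rows


def identify_backdoor(graph, treatment, outcome):
    """Step 2, identification (simplified): direct common causes of both variables."""
    return sorted(
        node for node, children in graph.items()
        if treatment in children and outcome in children
    )


def estimate_ate(rows, treatment, outcome, adjust):
    """Step 3, estimation: size-weighted difference in means within each stratum."""
    strata = defaultdict(lambda: {0: [], 1: []})
    for row in rows:
        key = tuple(row[name] for name in adjust)
        strata[key][row[treatment]].append(row[outcome])
    weighted, used = 0.0, 0
    for groups in strata.values():
        if groups[0] and groups[1]:  # skip strata lacking a treated or control group
            size = len(groups[0]) + len(groups[1])
            weighted += (mean(groups[1]) - mean(groups[0])) * size
            used += size
    return weighted / used


def placebo_refute(rows, treatment, outcome, adjust, seed=1):
    """Step 4, refutation: shuffle the treatment; the estimate should fall near 0."""
    rng = random.Random(seed)
    shuffled = [row[treatment] for row in rows]
    rng.shuffle(shuffled)
    placebo_rows = [{**row, treatment: t} for row, t in zip(rows, shuffled)]
    return estimate_ate(placebo_rows, treatment, outcome, adjust)


if __name__ == "__main__":
    data = simulate()
    adjustment = identify_backdoor(GRAPH, "t", "y")
    print("adjustment set:", adjustment)
    print("adjusted estimate:", estimate_ate(data, "t", "y", adjustment))
    print("unadjusted estimate:", estimate_ate(data, "t", "y", []))
    print("placebo estimate:", placebo_refute(data, "t", "y", adjustment))
```

In this synthetic setup the unadjusted estimate is inflated by the confounder, while the adjusted estimate should land near the built-in effect and the placebo estimate near zero. A refutation test works in this spirit: if the estimate survives a deliberately broken treatment, the causal claim is suspect.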
