8 research outputs found

    K-MHaS: A Multi-label Hate Speech Detection Dataset in Korean Online News Comment

    Online hate speech detection has become an important issue due to the growth of online content, but resources in languages other than English are extremely limited. We introduce K-MHaS, a new multi-label dataset for hate speech detection that effectively handles Korean language patterns. The dataset consists of 109k utterances from news comments, assigns each utterance 1 to 4 labels in a multi-label classification scheme, and handles subjectivity and intersectionality. We run strong baseline experiments on K-MHaS using Korean-BERT-based language models and report six different metrics. KR-BERT with a sub-character tokenizer outperforms the others, recognizing decomposed characters in each hate speech class. Comment: Accepted by COLING 202
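
    The "decomposed characters" mentioned above can be illustrated with a minimal sketch of Hangul syllable decomposition, the step a sub-character tokenizer builds on. This is not KR-BERT's actual tokenizer: the function name `decompose` is hypothetical, and the code only applies the standard Unicode arithmetic for precomposed Hangul syllables.

```python
# Constants for precomposed Hangul syllables, as defined by the Unicode standard.
S_BASE = 0xAC00   # first precomposed syllable
L_BASE = 0x1100   # first leading consonant (choseong)
V_BASE = 0x1161   # first vowel (jungseong)
T_BASE = 0x11A7   # one before the first trailing consonant (jongseong)
V_COUNT, T_COUNT = 21, 28
S_COUNT = 19 * V_COUNT * T_COUNT  # 11172 precomposed syllables


def decompose(text: str) -> list[str]:
    """Split each Hangul syllable into its jamo; pass other characters through.

    Hypothetical helper for illustration only.
    """
    out = []
    for ch in text:
        idx = ord(ch) - S_BASE
        if 0 <= idx < S_COUNT:
            lead, rest = divmod(idx, V_COUNT * T_COUNT)
            vowel, tail = divmod(rest, T_COUNT)
            out.append(chr(L_BASE + lead))
            out.append(chr(V_BASE + vowel))
            if tail:  # tail == 0 means the syllable has no final consonant
                out.append(chr(T_BASE + tail))
        else:
            out.append(ch)
    return out
```

    A subword vocabulary built over this jamo sequence can share pieces between syllables that differ only in one consonant or vowel, which is one plausible reason a sub-character tokenizer helps on noisy comment text.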

    Session-Based Recommender System for Sustainable Digital Marketing

    Many companies operate e-commerce websites to sell fashion products. Some customers want to buy products with sustainability in mind, so companies need to suggest appropriate fashion products to those customers. Recommender systems are key applications in these sustainable digital marketing strategies, and high performance is the most necessary factor. This research aims to improve recommender systems' performance by considering item session and attribute session information. We propose the Item Session-Based Recommender (ISBR) and the Attribute Session-Based Recommenders (ASBRs), which use item and attribute session data independently, and then the Feature-Weighted Session-Based Recommenders (FWSBRs), which combine multiple ASBRs with various feature weighting schemes. Our experimental results show that FWSBR with the chi-square feature weighting scheme outperforms ISBR, ASBRs, and the Collaborative Filtering Recommender (CFR). In addition, FWSBRs overcome the cold-start item problem, a significant limitation of CFR and ISBR, without losing performance.

    Measuring the Effect of Mental Health on Type 2 Diabetes

    There are many putative risk factors for type 2 diabetes (T2D), and the causal relationship between these factors and diabetes has been established. Socio-environmental and biological approaches are increasingly used to infer causality, and research is needed to understand the causal role of these factors in diabetes risk. Therefore, this study investigated the extent to which stress, as the treatment factor, raises the risk of diabetes through socio-environmental and biological factors. We present machine learning-based causal inference results generated using DoWhy, a Python library that provides a four-step causal inference method consisting of modeling, identification, estimation, and refutation steps. This study used 253,680 examples collected by the Behavioral Risk Factor Surveillance System (BRFSS), created a causal model based on a socio-environmental model, and tested the statistical significance of the obtained estimates. We identified several causal relationships and attempted various refutations. The results show that mental health problems increase the incidence of diabetes by about 15% (mean value: 0.146). The causal effect was evaluated based on the causal model, and the reliability of the causal inference was confirmed through three refutation tests.
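
    The four steps named above (modeling, identification, estimation, refutation) can be sketched in plain Python. This is not the DoWhy API and not the study's model: the data are synthetic, the single binary confounder `z`, the effect size built into `simulate`, and all function names are assumptions made purely for illustration.

```python
import random


def simulate(n=20000, seed=0):
    """Step 1 (modeling): synthetic data under an assumed graph z -> t, z -> y, t -> y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                          # hypothetical confounder
        t = rng.random() < (0.6 if z else 0.2)          # treatment depends on z
        y = rng.random() < 0.10 + 0.15 * t + 0.20 * z   # outcome depends on t and z
        rows.append((z, t, y))
    return rows


def backdoor_ate(rows):
    """Steps 2-3 (identification, estimation): adjust for z via the backdoor criterion.

    Computes the treated-minus-control outcome rate within each stratum of z,
    then averages the differences weighted by the stratum's share of the data.
    """
    ate = 0.0
    for z in (False, True):
        stratum = [r for r in rows if r[0] == z]
        treated = [r[2] for r in stratum if r[1]]
        control = [r[2] for r in stratum if not r[1]]
        diff = sum(treated) / len(treated) - sum(control) / len(control)
        ate += diff * len(stratum) / len(rows)
    return ate


def placebo_refutation(rows, seed=1):
    """Step 4 (refutation): shuffle the treatment column; the effect should vanish."""
    rng = random.Random(seed)
    shuffled = [r[1] for r in rows]
    rng.shuffle(shuffled)
    placebo = [(z, t, y) for (z, _, y), t in zip(rows, shuffled)]
    return backdoor_ate(placebo)


rows = simulate()
print("estimated effect:", round(backdoor_ate(rows), 3))
print("placebo effect:  ", round(placebo_refutation(rows), 3))
```

    If the placebo estimate stayed close to the real estimate, that would suggest the estimate reflects something other than the treatment; a placebo estimate near zero is consistent with, but does not prove, a causal reading.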

    A Study on the Timing of Starting Pitcher Replacement Using Machine Learning
