53 research outputs found

Data-Driven Approach for Formality-Sensitive Machine Translation: Language-Specific Handling and Synthetic Data Generation

Author: Lee Seugnjun
Lim Heuiseok
Moon Hyeonseok
Park Chanjun
Publication date: 27/06/2023

In this paper, we introduce a data-driven approach for Formality-Sensitive Machine Translation (FSMT) that caters to the unique linguistic properties of four target languages. Our methodology centers on two core strategies: 1) language-specific data handling, and 2) synthetic data generation using large-scale language models and empirical prompt engineering. This approach demonstrates a considerable improvement over the baseline, highlighting the effectiveness of data-centric techniques. Our prompt engineering strategy further improves performance by producing superior synthetic translation examples.

Comment: Accepted for Data-centric Machine Learning Research (DMLR) Workshop at ICML 202

arXiv.org e-Print Archive

Alternative Speech: Complementary Method to Counter-Narrative for Better Discourse

Author: Jung Dahyun
Lee Seolhwa
Lee Seungyoon
Lim Heuiseok
Park Chanjun
Publication date: 25/01/2024

We introduce the concept of "Alternative Speech" as a new way to directly combat hate speech and to compensate for the limitations of counter-narratives. An alternative speech provides practical alternatives to hate speech in real-world scenarios by offering speech-level corrections to speakers while considering the surrounding context and encouraging speakers to reform. Further, an alternative speech can combat hate speech alongside counter-narratives, offering a useful tool to address social issues such as racial discrimination and gender inequality. We propose the new concept and provide detailed guidelines for constructing the necessary dataset. Through discussion, we demonstrate that combining alternative speech and counter-narrative can be a more effective strategy for combating hate speech, because alternative speech complements the specificity and guiding capacity of counter-narratives. This paper presents another perspective for dealing with hate speech, offering viable remedies to complement the constraints of current approaches to mitigating harmful bias.

Comment: Accepted for The First Workshop on Data-Centric AI (DCAI) at ICDM 202

arXiv.org e-Print Archive

A Self-Supervised Automatic Post-Editing Data Generation Tool

Author: Eo Sugyeong
Lee SeungJun
Lim Heuiseok
Moon Hyeonseok
Park Chanjun
Seo Jaehyung
Publication date: 09/06/2022

Building data for automatic post-editing (APE) requires extensive, expert-level human effort, as it involves an elaborate process of identifying errors in sentences and providing suitable revisions. Hence, we develop a self-supervised data generation tool, deployable as a web application, that minimizes human supervision and constructs personalized APE data from a parallel corpus for several language pairs with English as the target language. Data-centric APE research can be conducted using this tool, involving many language pairs that have not been studied thus far owing to the lack of suitable data.

Comment: Accepted for DataPerf workshop at ICML 202

arXiv.org e-Print Archive

A Study on the Development of Game-based Mind Wandering Judgment Model in Video Lecture-based Education

Author: Jo Jaechoon
Lim Heuiseok
Yang Yeongwook
Publication venue: 'Taiwan Association of Engineering and Technology Innovation'
Publication date: 11/10/2018

Although video lectures are efficient learning materials, they tend to be one-directional, driven solely by the lecturer. This one-sidedness has long been regarded as a problem of online education, and it makes it difficult to judge whether learners are actually learning. To overcome these limitations of existing learning materials, this paper proposes a minimum learning activity judgment model that automatically determines, through mind wandering judgment, whether learners are actually learning, and reports an experiment verifying its educational effect. The experimental results show that the video lecture class using the minimum learning activity judgment system was effective in improving academic achievement.

Taiwan Association of Engineering and Technology Innovation: E-Journals