LAnoBERT : System Log Anomaly Detection based on BERT Masked Language Model

Authors: Kang Pilsung, Kim Jina, Lee Yukyung
Publication date: 20/11/2021

The system log generated in a computer system refers to large-scale data that are collected simultaneously and used as the basic data for diagnosing simple errors and for detecting external adversarial intrusions or abnormal insider behavior. The aim of system log anomaly detection is to promptly identify anomalies while minimizing human intervention, which is a critical problem in industry. Previous studies performed anomaly detection after converting various forms of log data into standardized templates using a parser, a process that involves generating templates to refine the log keys. In particular, a template corresponding to each specific event must be defined in advance for all the log data, and information within the log key may be lost in the process. In this study, we propose LAnoBERT, a parser-free system log anomaly detection method that uses the BERT model, which exhibits excellent natural language processing performance. LAnoBERT trains the model through masked language modeling, a BERT-based pre-training method, and performs unsupervised anomaly detection using the masked language modeling loss per log key word during inference. In experiments on the benchmark log datasets HDFS and BGL, LAnoBERT outperformed previous methods as well as certain supervised learning-based models.

arXiv.org e-Print Archive

Box Office Forecasting considering Competitive Environment and Word-of-Mouth in Social Networks: A Case Study of Korean Film Market

Authors: Jungsik Hong, Pilsung Kang, Taegu Kim
Publication venue: Hindawi Limited
Publication date: 01/01/2017

Accurate box office forecasting models are developed by considering competition and word-of-mouth (WOM) effects in addition to screening-related information. Nationality, genre, ratings, and distributors of motion pictures running concurrently with the target motion picture are used to describe the competition, whereas the numbers of informative, positive, and negative mentions posted on social network services (SNS) are used to gauge the atmosphere spread by WOM. Among these candidate variables, only significant variables are selected by a genetic algorithm (GA), and machine learning algorithms are then trained on the selected variables to build forecasting models. The forecasts are combined to improve forecasting performance. Experimental results on the Korean film market show that the forecasting accuracy in early screening periods can be significantly improved by considering competition. In addition, WOM has a stronger influence on total box office forecasting. Considering both competition and WOM improves forecasting performance to a larger extent than when only one of them is considered.

Crossref

Directory of Open Access Journals

DSTEA: Improving Dialogue State Tracking via Entity Adaptive Pre-training

Authors: Bang Junseong, Kang Pilsung, Kim Misuk, Kim Takyoung, Lee Yukyung, Yoon Hoonsang
Publication date: 23/07/2023

Dialogue State Tracking (DST) is critical for comprehensively interpreting user and system utterances, thereby forming the cornerstone of efficient dialogue systems. Although past research has focused on enhancing DST performance through alterations to the model structure or the integration of additional features such as graph relations, these approaches often require additional pre-training with external dialogue corpora. In this study, we propose DSTEA, improving Dialogue State Tracking via Entity Adaptive pre-training, which enhances the encoder by intensively training key entities in dialogue utterances. DSTEA identifies these pivotal entities from input dialogues using four different methods: ontology information, named-entity recognition, spaCy, and the flair library. Subsequently, it employs selective knowledge masking to train the model effectively. Remarkably, DSTEA only requires pre-training without the direct infusion of extra knowledge into the DST model. This approach resulted in substantial performance improvements for four robust DST models on MultiWOZ 2.0, 2.1, and 2.2, with joint goal accuracy increasing by up to 2.69% (from 52.41% to 55.10%). Further validation of DSTEA's efficacy was provided through comparative experiments considering various entity types and different entity adaptive pre-training configurations such as masking strategy and masking rate.
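
The selective masking step described in this abstract can be sketched as follows. This is an illustration under assumptions, not the DSTEA code: the function name `selective_mask`, the two masking rates, and the fixed entity set are all hypothetical, and in practice the entity set would come from an ontology or a named-entity recognizer.

```python
import random

MASK = "[MASK]"


def selective_mask(tokens, entities, entity_rate=0.5, other_rate=0.1, seed=0):
    """Mask entity tokens more often than other tokens.

    Returns the masked token list and the labels to predict
    (the original token at masked positions, None elsewhere).
    The rates are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        rate = entity_rate if tok.lower() in entities else other_rate
        if rng.random() < rate:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


utterance = "i need a cheap hotel in cambridge".split()
entity_set = {"cheap", "hotel", "cambridge"}  # hypothetical entity list

# Extreme rates make the behaviour easy to see: only entities are masked.
masked, labels = selective_mask(utterance, entity_set, entity_rate=1.0, other_rate=0.0)
```

With intermediate rates, the masked language modeling objective still sees ordinary tokens but spends most of its capacity on the entities that matter for state tracking.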

arXiv.org e-Print Archive

Model eye imaging by closed-loop accumulation of single scattering (CLASS) microscopy

Authors: Choi Wonshik, Hong Jin Hee, Jung Yookyung, Kang Pilsung, Kwon Yongwoo
Publication venue: ECI Digital Archives
Publication date: 02/06/2019

‘Closed-loop accumulation of single scattering (CLASS)’ microscopy provides novel solutions to the problems of light scattering and aberration in optical imaging, increasing imaging depth while maintaining diffraction-limited resolution. This method has great potential to increase the imaging depth and resolution of current eye imaging. In this presentation, the strengths and weaknesses of CLASS microscopy relative to current adaptive optical microscopy will be discussed. Important factors in applying CLASS microscopy to eye imaging, and the possibility of imaging the retina in turbid conditions, will be discussed using a model eye.

Engineering Conferences International

Painsight: An Extendable Opinion Mining Framework for Detecting Pain Points Based on Online Customer Reviews

Authors: Kang Pilsung, Kho Yookyung, Kim Doyoon, Kim Jaehee, Kim Younsun, Lee Yukyung
Publication date: 03/06/2023

As the e-commerce market continues to expand and online transactions proliferate, customer reviews have emerged as a critical element in shaping the purchasing decisions of prospective buyers. Previous studies have endeavored to identify key aspects of customer reviews through the development of sentiment analysis models and topic models. However, extracting specific dissatisfaction factors remains a challenging task. In this study, we delineate the pain point detection problem and propose Painsight, an unsupervised framework for automatically extracting distinct dissatisfaction factors from customer reviews without relying on ground truth labels. Painsight employs pre-trained language models to construct sentiment analysis and topic models, leveraging attribution scores derived from model gradients to extract dissatisfaction factors. Upon application of the proposed methodology to customer review data spanning five product categories, we successfully identified and categorized dissatisfaction factors within each group, as well as isolated factors for each type. Notably, Painsight outperformed benchmark methods, achieving substantial performance enhancements and exceptional results in human evaluations. Comment: WASSA at ACL 202

arXiv.org e-Print Archive

GPU-Accelerated Stochastic Simulation of Biochemical Networks

Author: Pilsung KANG
Publication venue: Institute of Electronics, Information and Communications Engineers (IEICE)
Publication date: 01/01/2018

Crossref