49 research outputs found

    LAnoBERT : System Log Anomaly Detection based on BERT Masked Language Model

    The system log generated in a computer system refers to large-scale data that are collected simultaneously and used as the basic data for determining simple errors and detecting external adversarial intrusion or the abnormal behaviors of insiders. The aim of system log anomaly detection is to promptly identify anomalies while minimizing human intervention, which is a critical problem in the industry. Previous studies performed anomaly detection through algorithms after converting various forms of log data into a standardized template using a parser. These methods involved generating a template for refining the log key. In particular, a template corresponding to each specific event must be defined in advance for all the log data, and information within the log key may be lost in the process. In this study, we propose LAnoBERT, a parser-free system log anomaly detection method that uses the BERT model, which exhibits excellent natural language processing performance. LAnoBERT trains the model through masked language modeling, which is a BERT-based pre-training method, and performs unsupervised anomaly detection using the masked language modeling loss function per log key word during inference. In experiments on the benchmark log datasets HDFS and BGL, LAnoBERT outperformed previous methods, including certain supervised learning-based models.

    Box Office Forecasting considering Competitive Environment and Word-of-Mouth in Social Networks: A Case Study of Korean Film Market

    Accurate box office forecasting models are developed by considering competition and word-of-mouth (WOM) effects in addition to screening-related information. Nationality, genre, ratings, and distributors of motion pictures running concurrently with the target motion picture are used to describe the competition, whereas the numbers of informative, positive, and negative mentions posted on social network services (SNS) are used to gauge the atmosphere spread by WOM. Among these candidate variables, only significant variables are selected by a genetic algorithm (GA), and machine learning algorithms are then trained on the selected variables to build forecasting models. The forecasts are combined to improve forecasting performance. Experimental results on the Korean film market show that the forecasting accuracy in early screening periods can be significantly improved by considering competition. In addition, WOM has a stronger influence on total box office forecasting. Considering both competition and WOM improves forecasting performance to a larger extent than when only one of them is considered.

    DSTEA: Improving Dialogue State Tracking via Entity Adaptive Pre-training

    Dialogue State Tracking (DST) is critical for comprehensively interpreting user and system utterances, thereby forming the cornerstone of efficient dialogue systems. Although past research has focused on enhancing DST performance by altering the model structure or integrating additional features such as graph relations, these approaches often require additional pre-training with external dialogue corpora. In this study, we propose DSTEA, improving Dialogue State Tracking via Entity Adaptive pre-training, which enhances the encoder by intensively training on key entities in dialogue utterances. DSTEA identifies these pivotal entities in input dialogues using four different methods: ontology information, named-entity recognition, spaCy, and the flair library. It then employs selective knowledge masking to train the model effectively. Notably, DSTEA requires only pre-training, without the direct infusion of extra knowledge into the DST model. This approach yielded substantial performance improvements for four robust DST models on MultiWOZ 2.0, 2.1, and 2.2, with joint goal accuracy increasing by up to 2.69% (from 52.41% to 55.10%). DSTEA's efficacy was further validated through comparative experiments considering various entity types and different entity adaptive pre-training configurations, such as masking strategy and masking rate.

    Model eye imaging by closed-loop accumulation of single scattering (CLASS) microscopy

    ‘Closed-loop accumulation of single scattering (CLASS)’ microscopy provides novel solutions to the problems of light scattering and aberration in optical imaging, increasing imaging depth while maintaining diffraction-limited resolution. This method has great potential to increase the imaging depth and resolution of current eye imaging. In this presentation, the strengths and weaknesses of CLASS microscopy relative to current adaptive optical microscopy will be discussed. Important factors in applying CLASS microscopy to eye imaging and the possibility of imaging the retina in turbid conditions will be discussed using a model eye.

    Painsight: An Extendable Opinion Mining Framework for Detecting Pain Points Based on Online Customer Reviews

    As the e-commerce market continues to expand and online transactions proliferate, customer reviews have emerged as a critical element in shaping the purchasing decisions of prospective buyers. Previous studies have endeavored to identify key aspects of customer reviews through the development of sentiment analysis models and topic models. However, extracting specific dissatisfaction factors remains a challenging task. In this study, we delineate the pain point detection problem and propose Painsight, an unsupervised framework for automatically extracting distinct dissatisfaction factors from customer reviews without relying on ground truth labels. Painsight employs pre-trained language models to construct sentiment analysis and topic models, leveraging attribution scores derived from model gradients to extract dissatisfaction factors. Upon application of the proposed methodology to customer review data spanning five product categories, we successfully identified and categorized dissatisfaction factors within each group, as well as isolated factors for each type. Notably, Painsight outperformed benchmark methods, achieving substantial performance enhancements and exceptional results in human evaluations. Comment: WASSA at ACL 202

    GPU-Accelerated Stochastic Simulation of Biochemical Networks
