423 research outputs found

    A brute force tuning of training length for concept drift

    We present a brute-force approach to analyzing the concept drift behind time-sequence data. This approach, named SELECT, searches for the length of training data that minimizes the error metrics. In other words, SELECT searches the input sequence for the start point of a new concept. Unlike many related methods, SELECT does not require a pre-specified error threshold to detect drift. In addition, the visual analysis obtained from SELECT shows how significant a drift is. We test SELECT on two real-world datasets: stock price data and COVID-19 infection data. The experimental results show that SELECT can improve model performance on both datasets. In addition, the visual analysis reveals the points of significant drift, e.g., Lehman's collapse in the stock price data and the spread of variants in the COVID-19 data. These results show the effectiveness of the brute-force approach in analyzing concept drift.
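
The brute-force idea in this abstract can be sketched as follows. This is only an illustration, not the authors' implementation: the abstract does not name the forecasting model or the error metric, so the straight-line fit, the mean absolute error, and the names `fit_line` and `best_training_start` are all assumptions made for this example.

```python
from statistics import mean


def fit_line(ys, t0):
    """Least-squares line y = a + b*t over the times t0, t0+1, ... (assumed model)."""
    ts = range(t0, t0 + len(ys))
    t_bar, y_bar = mean(ts), mean(ys)
    var = sum((t - t_bar) ** 2 for t in ts)
    b = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys)) / var
    return y_bar - b * t_bar, b


def best_training_start(series, n_valid, min_train=3):
    """Try every training start index; score each on the last n_valid points.

    Returns ((start, error) with the lowest error, list of all (start, error)).
    """
    split = len(series) - n_valid
    valid_times = range(split, len(series))
    results = []
    for start in range(0, split - min_train + 1):
        a, b = fit_line(series[start:split], start)
        # Mean absolute error on the held-out tail (assumed error metric).
        mae = mean(abs(a + b * t - series[t]) for t in valid_times)
        results.append((start, mae))
    return min(results, key=lambda r: r[1]), results


# Toy sequence: flat until t = 20, then a new linear "concept".
series = [10.0] * 20 + [10.0 + 2.0 * (t - 20) for t in range(20, 40)]
best, results = best_training_start(series, n_valid=5)
```

Plotting the error in `results` against the start index gives a rough analogue of the visual analysis mentioned above: the error drops sharply once the training window starts after the drift point.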

    Emotional Similarity Word Embedding Model for Sentiment Analysis

    We propose a method for constructing a dictionary of emotional expressions, which is an indispensable language resource for sentiment analysis in Japanese. Furthermore, we propose a method for constructing a language model that reproduces emotional similarity between words, which has not yet been considered in conventional dictionaries and language models. In the proposed method, we pre-trained sentiment labels for the distributed representations of words. An intermediate feature vector was obtained from the pre-trained model. By learning an additional semantic label on this feature vector, we can construct an emotional semantic language model that embeds both emotion and semantics. To confirm the effectiveness of the proposed method, we conducted a simple experiment to retrieve similar emotional words using the constructed model. The results of this experiment showed that the proposed method can retrieve similar emotional words with higher accuracy than the conventional word-embedding model.

    Relations between Sleep Time and SNS Texts

    Sleeping habits are one of the major issues in today's healthcare. In this paper, we consider the problem of analyzing people's sleeping habits using social networking service (SNS) texts. As a first step toward predicting a user's sleeping time from SNS texts, we assume that the time span between the user's last post on one day and the first post on the next day can be used as a pseudo-indicator of the user's sleeping time, provided the user posts sufficiently frequently. We call such a tweet time span "pseudo-sleeping time" if the first tweet of the next day includes "Good morning" or similar words. We try to predict this pseudo-sleeping time from the text of the preceding tweet (i.e., the last tweet of the day). Preliminary experiments show that the tweet text contains some useful information for predicting the user's pseudo-sleeping time.
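
The definition of pseudo-sleeping time above can be made concrete with a small sketch. It is an illustration under stated assumptions, not the authors' pipeline: the greeting list `GREETINGS`, the use of calendar days as the day boundary, and the function name `pseudo_sleep_spans` are hypothetical choices made for this example.

```python
from datetime import datetime
from itertools import groupby

# Hypothetical keyword list; the abstract only says '"Good morning" or similar words'.
GREETINGS = ("good morning",)


def pseudo_sleep_spans(posts):
    """posts: list of (datetime, text) for one user.

    Returns a list of (last_text_of_day, span_in_hours) for each pair of
    consecutive calendar days whose first post on the second day is a greeting.
    """
    posts = sorted(posts, key=lambda p: p[0])
    days = [list(g) for _, g in groupby(posts, key=lambda p: p[0].date())]
    spans = []
    for prev_day, next_day in zip(days, days[1:]):
        last_time, last_text = prev_day[-1]
        first_time, first_text = next_day[0]
        consecutive = (first_time.date() - last_time.date()).days == 1
        if consecutive and any(g in first_text.lower() for g in GREETINGS):
            hours = (first_time - last_time).total_seconds() / 3600
            spans.append((last_text, hours))
    return spans


posts = [
    (datetime(2024, 1, 1, 23, 0), "Going to bed"),
    (datetime(2024, 1, 2, 7, 0), "Good morning everyone"),
    (datetime(2024, 1, 2, 22, 30), "So tired"),
    (datetime(2024, 1, 3, 8, 0), "Coffee time"),
]
spans = pseudo_sleep_spans(posts)
```

Each returned pair is one training example in the sense of the abstract: the text of the last tweet of the day is the input, and the span in hours is the prediction target.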

    Lower Perplexity is Not Always Human-Like

    In computational psycholinguistics, various language models have been evaluated against human reading behavior (e.g., eye movement) to build human-like computational models. However, most previous efforts have focused almost exclusively on English, despite the recent trend towards linguistic universals within the general community. In order to fill the gap, this paper investigates whether the established results in computational psycholinguistics can be generalized across languages. Specifically, we re-examine an established generalization -- the lower perplexity a language model has, the more human-like the language model is -- in Japanese, whose structure is typologically different from that of English. Our experiments demonstrate that this established generalization exhibits a surprising lack of universality; namely, lower perplexity is not always human-like. Moreover, this discrepancy between English and Japanese is further explored from the perspective of (non-)uniform information density. Overall, our results suggest that a cross-lingual evaluation will be necessary to construct human-like computational models. Comment: Accepted by ACL 202
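
For readers unfamiliar with the quantity under discussion, the sketch below computes perplexity from per-token probabilities using the standard definition (the exponentiated average negative log-probability). The per-token surprisal it exposes is the quantity typically compared against reading behavior in this line of work. The function names are illustrative and the probabilities are made up; nothing here is taken from the paper's experiments.

```python
import math


def surprisals(token_probs):
    """Per-token surprisal in bits: -log2 p(token | context)."""
    return [-math.log2(p) for p in token_probs]


def perplexity(token_probs):
    """Perplexity = 2 ** (mean surprisal in bits)."""
    s = surprisals(token_probs)
    return 2 ** (sum(s) / len(s))


# Made-up probabilities that a model assigns to each token of a sentence.
uniform = [0.25, 0.25, 0.25, 0.25]
ppl = perplexity(uniform)
```

A model that assigns every token probability 0.25 is exactly as uncertain as a uniform choice among four options, so its perplexity is 4.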

    Classification of Smartphone Application Reviews Using Small Corpus Based on Bidirectional LSTM Transformer

    This paper addresses the classification of review texts about a smartphone application posted on social media. We propose a high-performance binary classification method (positive/negative) for review texts, which uses the bidirectional long short-term memory (biLSTM) self-attentional Transformer and is based on the distributed representations created by unsupervised learning of a manually labelled small review corpus, a dictionary, and an unlabeled large review corpus. The proposed method obtained higher accuracy than existing methods such as StarSpace and Bidirectional Encoder Representations from Transformers (BERT).

    Efficient Model Selection for Predictive Pattern Mining Model by Safe Pattern Pruning

    Predictive pattern mining is an approach used to construct prediction models when the input is represented by structured data, such as sets, graphs, and sequences. The main idea behind predictive pattern mining is to build a prediction model by considering substructures, such as subsets, subgraphs, and subsequences (referred to as patterns), present in the structured data as features of the model. The primary challenge in predictive pattern mining lies in the exponential growth of the number of patterns with the complexity of the structured data. In this study, we propose the Safe Pattern Pruning (SPP) method to address the explosion of pattern numbers in predictive pattern mining. We also discuss how it can be effectively employed throughout the entire model building process in practical data analysis. To demonstrate the effectiveness of the proposed method, we conduct numerical experiments on regression and classification problems involving sets, graphs, and sequences.
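
The idea of using substructures as model features can be illustrated for the simplest case, sets. The sketch below enumerates every item subset up to a size limit and builds a binary feature matrix; it shows why the number of patterns grows so quickly, but it does not implement SPP itself, whose pruning rule is not described in this abstract. The function name `pattern_features` and the toy data are assumptions made for this example.

```python
from itertools import combinations


def pattern_features(transactions, max_size):
    """Enumerate item subsets (patterns) of size <= max_size that occur in at
    least one transaction, and return (patterns, binary feature matrix)."""
    items = sorted(set().union(*transactions))
    patterns = [
        frozenset(c)
        for k in range(1, max_size + 1)
        for c in combinations(items, k)
    ]
    # Keep only patterns contained in at least one transaction.
    patterns = [p for p in patterns if any(p <= t for t in transactions)]
    # Feature j of sample i is 1 if pattern j is a subset of transaction i.
    matrix = [[int(p <= t) for p in patterns] for t in transactions]
    return patterns, matrix


transactions = [{"a", "b"}, {"b", "c"}, {"a", "b", "c"}]
patterns, matrix = pattern_features(transactions, max_size=2)
```

Even with three items, raising `max_size` from 1 to 3 takes the feature count from 3 to 7; with n items the number of candidate subsets is 2^n - 1, which is the explosion a pruning method has to avoid enumerating.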

    Asymmetric desymmetrization of meso-diols by C(2)-symmetric chiral 4-pyrrolidinopyridines

    In this work, we developed C(2)-symmetric chiral nucleophilic catalysts that possess a pyrrolidinopyridine framework as the catalytic site. Some of these organocatalysts effectively promoted asymmetric desymmetrization of meso-diols via enantioselective acylation.

    Raman Scattering Investigation of Structural Transition in Ca5Ir3O12

    We report a study of the second-order phase transition at 105 K in the geometrically frustrated iridate Ca5Ir3O12 using a Raman scattering method. The Raman scattering spectra of a single crystal were measured from 4 K to room temperature. Ab initio phonon calculations that consider the spin–orbit interaction were also conducted and compared with the experimental spectra. Agreement between the theoretical and experimental results at room temperature is reasonably good. At room temperature, 6A′1+9E′+5E″ were assigned among the Raman active modes, 6A′1+13E′+6E″, based on the reported P6¯2m crystal structure. Below Ts, 23 additional peaks were observed, suggesting the appearance of a superlattice structure. The polarization dependence of Raman spectra below Ts indicates the existence of 6¯ symmetry. We observed at least one additional mode as a broad weak-intensity peak at temperatures higher than Ts. This suggests possible local distortion around the Ir ions, which would be expected for Ir ions with mixed valence states.