Search CORE

2 research outputs found

Development of an Entropy-Based Feature Selection Method and Analysis of Online Reviews on Real Estate

Authors: Carreón Elisa Claire Alemán
Hiraoka Toru
Horino Hiroki
Nonaka Hirofumi
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/04/2019

In recent years, the amount of data about real estate posted on the Internet has been increasing. In this study, to analyze user needs for real estate, we focus on "Mansion Community", a Japanese bulletin board system (hereinafter referred to as BBS) about Japanese real estate. We extract keywords based on the entropy value of each word and use them as features in a machine learning classifier to analyze 6 million posts on "Mansion Community". As a result, we achieved an F-measure of 0.69 and found that customers are particularly concerned about apartment facilities, access, and price.
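The abstract does not spell out how the entropy value of each word is computed. As a rough illustration only, one common formulation is the Shannon entropy of a word's distribution over post categories: words concentrated in one category have low entropy and make candidate keywords. The function names, the `min_count` filter, and the toy posts below are assumptions for this sketch, not the authors' method or data.

```python
import math
from collections import Counter, defaultdict

def word_entropy(labeled_posts):
    """Map each word to the entropy (in bits) of its distribution over labels.

    Assumption: entropy is taken over category labels; the abstract does not
    state which distribution the authors actually used.
    """
    counts = defaultdict(Counter)  # word -> label -> number of posts containing the word
    for text, label in labeled_posts:
        for word in set(text.lower().split()):
            counts[word][label] += 1

    entropy = {}
    for word, by_label in counts.items():
        total = sum(by_label.values())
        entropy[word] = sum(
            -(c / total) * math.log2(c / total) for c in by_label.values()
        )
    return entropy

def select_keywords(labeled_posts, k, min_count=2):
    """Return the k lowest-entropy words that occur in at least min_count posts."""
    counts = Counter()
    for text, _ in labeled_posts:
        counts.update(set(text.lower().split()))
    ent = word_entropy(labeled_posts)
    frequent = [w for w in ent if counts[w] >= min_count]
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(frequent, key=lambda w: (ent[w], w))[:k]

# Hypothetical toy posts, not drawn from "Mansion Community".
posts = [
    ("station is close to the apartment", "access"),
    ("the station is five minutes away", "access"),
    ("the price is too high", "price"),
    ("price of the apartment dropped", "price"),
]
keywords = select_keywords(posts, k=2)
```

The `min_count` filter is there because a word seen in only one post trivially has zero entropy; without it, rare words would crowd out informative ones.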

arXiv.org e-Print Archive

Automatic Domain Adaptation Outperforms Manual Domain Adaptation for Predicting Financial Outcomes

Authors: Breitkopf Nikolas
Schütze Hinrich
Sedinkina Marina
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 25/06/2020

In this paper, we automatically create sentiment dictionaries for predicting financial outcomes. We compare three approaches: (i) manual adaptation of the domain-general dictionary H4N, (ii) automatic adaptation of H4N and (iii) a combination consisting of first manual, then automatic adaptation. In our experiments, we demonstrate that the automatically adapted sentiment dictionary outperforms the previous state of the art in predicting the financial outcomes excess return and volatility. In particular, automatic adaptation performs better than manual adaptation. In our analysis, we find that annotation based on an expert's a priori belief about a word's meaning can be incorrect -- annotation should be performed based on the word's contexts in the target domain instead. Comment: Accepted at ACL201
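To make the idea of a sentiment dictionary concrete, the following minimal sketch scores a text by the fraction of its tokens that appear in a word list, and shows how removing one word from the list changes the score. The scoring function, the word lists, and the sample sentence are all hypothetical; they are not taken from H4N or from the paper, and the abstract does not describe the adaptation procedure itself.

```python
import re

def dictionary_score(text, dictionary):
    """Fraction of tokens in `text` that appear in `dictionary` (a set of lowercase words)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in dictionary for token in tokens) / len(tokens)

# Hypothetical toy word lists for illustration only.
general_negative = {"liability", "loss", "decline"}
# Assumed outcome of adaptation: "liability" is dropped because, in financial
# text, it is often an accounting term rather than a negative signal.
adapted_negative = {"loss", "decline"}

report = "The company reported a tax liability and a small loss"

general_score = dictionary_score(report, general_negative)
adapted_score = dictionary_score(report, adapted_negative)
```

Scores of this kind could then serve as input features to a model of a financial outcome; how the paper builds its predictor is not stated in the abstract.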

arXiv.org e-Print Archive