3 research outputs found

    Viewpoint-Based Opinion Clustering Using Dependency Relations Between Bunsetsu (Phrase Units)

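    The Web holds opinions on a wide range of topics, and the opinions on any one topic mix many different viewpoints. For example, opinions on the topic "nuclear power plants" mix viewpoints such as safety, energy, and health. Classifying opinions by viewpoint makes it easy to grasp and compare them viewpoint by viewpoint, and also gives clues for discovering opinions from new viewpoints. Few studies classify opinions by viewpoint, and most existing methods either fix the viewpoints in advance or do not consider differences between viewpoints. This study therefore proposes a clustering method that, without setting viewpoints in advance, automatically identifies and classifies the viewpoints suited to an opinion set by considering contextual information, in particular the dependency relations between nouns and verbs. Under the assumption that "differences in the viewpoints of opinions are reflected in differences in noun-verb dependency relations," the proposed method extracts pairs 〈N,V〉 of a noun N and a verb V from bunsetsu dependency relations and uses them for clustering. Specifically, from the bunsetsu dependency relations obtained from each opinion, it extracts pairs 〈N,V〉 of a noun and the verb it depends on. It then computes the similarity between the extracted 〈N,V〉 pairs from the noun-noun similarity and the verb-verb similarity, both calculated with Japanese WordNet and latent semantic indexing; in particular, the computation is designed so that the higher the similarity between the nouns N, the more strongly the similarity between the verbs V influences the 〈N,V〉 similarity. Finally, the similarity between opinions is computed from the 〈N,V〉 similarities, and hierarchical clustering with Ward's method is performed. In the evaluation experiments, classification performance was measured by how close the classifications produced by the proposed method and by conventional clustering methods were to a manual viewpoint-based classification of the opinion set. The results showed that the proposed method achieved higher classification performance than the conventional methods, demonstrating its usefulness. University of Electro-Communications 201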
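The 〈N,V〉 similarity described in this abstract can be sketched in a few lines. The abstract does not give the actual formula, so the weighting below (noun similarity scaling the contribution of verb similarity) is only one assumed form with the stated property, and the similarity tables are toy stand-ins for the Japanese WordNet and latent semantic indexing scores. The names `NOUN_SIM`, `VERB_SIM`, `word_sim`, `pair_sim`, and `opinion_sim` are all hypothetical, and the final Ward clustering step is not shown.

```python
# Toy word similarities in [0, 1]. These are illustrative values only,
# standing in for scores derived from Japanese WordNet and LSI.
NOUN_SIM = {
    frozenset(("reactor", "plant")): 0.8,
    frozenset(("reactor", "cost")): 0.1,
}
VERB_SIM = {
    frozenset(("explode", "leak")): 0.5,
}


def word_sim(table, a, b):
    """Look up a symmetric word similarity; identical words score 1.0."""
    if a == b:
        return 1.0
    return table.get(frozenset((a, b)), 0.0)


def pair_sim(p, q):
    """Similarity of two <N,V> pairs (assumed form, not the paper's formula).

    The derivative with respect to verb similarity is s_n / 2, so verb
    similarity matters more when the nouns are more similar.
    """
    s_n = word_sim(NOUN_SIM, p[0], q[0])
    s_v = word_sim(VERB_SIM, p[1], q[1])
    return s_n * (1.0 + s_v) / 2.0


def opinion_sim(pairs_a, pairs_b):
    """Symmetric average of best-match pair similarities (an assumption)."""
    if not pairs_a or not pairs_b:
        return 0.0
    best_a = [max(pair_sim(p, q) for q in pairs_b) for p in pairs_a]
    best_b = [max(pair_sim(q, p) for p in pairs_a) for q in pairs_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))
```

A matrix of `1 - opinion_sim(a, b)` values over all opinion pairs could then be passed to a hierarchical clustering routine; how the paper turns similarities into the input for Ward's method is not stated in the abstract.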

    Public Opinion on National Exam Policies in Indonesia

    Every new policy by the Indonesian government on National Examination (NE) implementation draws differing responses from the public. Since its introduction, the NE system has undergone many changes, but in recent years it has received serious criticism. As a result, the government abolished the system as a graduation determinant in 2014. This research analyzes public opinion, in the form of positive and negative sentiment toward NE policy, and the factors that drive those opinions. The data were obtained from online news media from 2012 to 2015. The results show that public sentiment fluctuates from year to year and depends on three important factors: political pressure, extreme events, and media coverage.

    Genre and Domain Dependencies in Sentiment Analysis

    Genre and domain influence an author's style of writing and therefore a text's characteristics. Natural language processing is prone to such variations in textual characteristics: it is said to be genre and domain dependent. This thesis investigates genre and domain dependencies in sentiment analysis. Its goal is to support the development of robust sentiment analysis approaches that work well and in a predictable manner under different conditions, i.e. for different genres and domains. Initially, we show that a prototypical approach to sentiment analysis -- viz. a supervised machine learning model based on word n-gram features -- performs differently on gold standards that originate from differing genres and domains, but performs similarly on gold standards that originate from resembling genres and domains. We show that these gold standards differ in certain textual characteristics, viz. their domain complexity. We find a strong linear relation between our approach's accuracy on a particular gold standard and its domain complexity, which we then use to estimate our approach's accuracy. Subsequently, we use certain textual characteristics -- viz. domain complexity, domain similarity, and readability -- in a variety of applications. Domain complexity and domain similarity measures are used to determine parameter settings in two tasks. Domain complexity guides us in model selection for in-domain polarity classification, viz. in decisions regarding word n-gram model order and word n-gram feature selection. Domain complexity and domain similarity guide us in domain adaptation. We propose a novel domain adaptation scheme and apply it to cross-domain polarity classification in semi- and unsupervised domain adaptation scenarios. Readability is used for feature engineering. We propose to adopt readability gradings, readability indicators as well as word and syntax distributions as features for subjectivity classification.
    Moreover, we generalize a framework for modeling and representing negation in machine learning-based sentiment analysis. This framework is applied to in-domain and cross-domain polarity classification. We investigate the relation between implicit and explicit negation modeling, the influence of negation scope detection methods, and the efficiency of the framework in different domains. Finally, we carry out a case study in which we transfer the core methods of our thesis -- viz. domain complexity-based accuracy estimation, domain complexity-based model selection, and negation modeling -- to a gold standard that originates from a genre and domain hitherto not used in this thesis.
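The accuracy estimation mentioned in this abstract rests on a linear relation between domain complexity and accuracy. A minimal sketch of that idea, assuming an ordinary least-squares line fitted on (complexity, accuracy) pairs from known gold standards, might look as follows. The abstract does not say how the thesis measures domain complexity or fits the relation, so the helper names `fit_line` and `estimate_accuracy` are hypothetical and no real measurements are implied.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_accuracy(complexity, slope, intercept):
    """Predict accuracy on an unseen gold standard from its complexity."""
    return slope * complexity + intercept
```

In use, `xs` would hold the domain complexity of each known gold standard and `ys` the accuracy observed on it; the fitted line then gives an estimate for a new gold standard whose complexity has been measured.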