12,056 research outputs found
Mining Comparison Opinions from Chinese Online Reviews for Restaurant Competitive Analysis
Comparison is widely used by consumers during the process of product evaluation in order to emphasize their preference, which can contribute to a proxy for product competitiveness analysis. This paper proposes a novel method for mining comparative sentences based on the achievements of linguistic study. The definition of comparative sentence subcategory is put forward and a mixed rule pool containing both artificial rules and CSR is set up. Besides, an entity dictionary is used to re-check the identification result which can ensure precise identification and classification of comparative sentences. Real online comments are collected from Dianping.com as experimental data. The result shows that the proposed method outperforms baseline methods in terms of identification precision. Based on the result, features and opinions of comparative sentences are mined. We then conducted sentiment analysis to calculate the sentimental score of comparison relations. Finally, a feature competitive network of restaurants is constructed
Role of sentiment classification in sentiment analysis: a survey
Through a survey of literature, the role of sentiment classification in sentiment analysis has been reviewed. The review identifies the research challenges involved in tackling sentiment classification. A total of 68 articles during 2015 – 2017 have been reviewed on six dimensions viz., sentiment classification, feature extraction, cross-lingual sentiment classification, cross-domain sentiment classification, lexica and corpora creation and multi-label sentiment classification. This study discusses the prominence and effects of sentiment classification in sentiment evaluation and a lot of further research needs to be done for productive results
User Review Analysis for Requirement Elicitation: Thesis and the framework prototype's source code
Online reviews are an important channel for requirement elicitation. However, requirement engineers face challenges when analysing online user reviews, such as data volumes, technical supports, existing techniques, and legal barriers. Juan Wang proposes a framework solving user review analysis problems for the purpose of requirement elicitation that sets up a channel from downloading user reviews to structured analysis data. The main contributions of her work are: (1) the thesis proposed a framework to solve the user review analysis problem for requirement elicitation; (2) the prototype of this framework proves its feasibility; (3) the experiments prove the effectiveness and efficiency of this framework.
This resource here is the latest version of Juan Wang's PhD thesis "User Review Analysis for Requirement Elicitation" and all the source code of the prototype for the framework as the results of her thesis
Text-based Sentiment Analysis and Music Emotion Recognition
Nowadays, with the expansion of social media, large amounts of user-generated
texts like tweets, blog posts or product reviews are shared online. Sentiment polarity
analysis of such texts has become highly attractive and is utilized in recommender
systems, market predictions, business intelligence and more. We also witness deep
learning techniques becoming top performers on those types of tasks. There are
however several problems that need to be solved for efficient use of deep neural
networks on text mining and text polarity analysis.
First of all, deep neural networks are data hungry. They need to be fed with
datasets that are big in size, cleaned and preprocessed as well as properly labeled.
Second, the modern natural language processing concept of word embeddings as a
dense and distributed text feature representation solves sparsity and dimensionality
problems of the traditional bag-of-words model. Still, there are various uncertainties
regarding the use of word vectors: should they be generated from the same dataset
that is used to train the model or it is better to source them from big and popular
collections that work as generic text feature representations? Third, it is not easy for
practitioners to find a simple and highly effective deep learning setup for various
document lengths and types. Recurrent neural networks are weak with longer texts
and optimal convolution-pooling combinations are not easily conceived. It is thus
convenient to have generic neural network architectures that are effective and can
adapt to various texts, encapsulating much of design complexity.
This thesis addresses the above problems to provide methodological and practical
insights for utilizing neural networks on sentiment analysis of texts and achieving
state of the art results. Regarding the first problem, the effectiveness of various
crowdsourcing alternatives is explored and two medium-sized and emotion-labeled
song datasets are created utilizing social tags. One of the research interests of Telecom
Italia was the exploration of relations between music emotional stimulation and
driving style. Consequently, a context-aware music recommender system that aims
to enhance driving comfort and safety was also designed. To address the second
problem, a series of experiments with large text collections of various contents and
domains were conducted. Word embeddings of different parameters were exercised
and results revealed that their quality is influenced (mostly but not only) by the
size of texts they were created from. When working with small text datasets, it is
thus important to source word features from popular and generic word embedding
collections. Regarding the third problem, a series of experiments involving convolutional
and max-pooling neural layers were conducted. Various patterns relating
text properties and network parameters with optimal classification accuracy were
observed. Combining convolutions of words, bigrams, and trigrams with regional
max-pooling layers in a couple of stacks produced the best results. The derived
architecture achieves competitive performance on sentiment polarity analysis of
movie, business and product reviews.
Given that labeled data are becoming the bottleneck of the current deep learning
systems, a future research direction could be the exploration of various data programming
possibilities for constructing even bigger labeled datasets. Investigation
of feature-level or decision-level ensemble techniques in the context of deep neural
networks could also be fruitful. Different feature types do usually represent complementary
characteristics of data. Combining word embedding and traditional text
features or utilizing recurrent networks on document splits and then aggregating the
predictions could further increase prediction accuracy of such models
Multi-modal post-editing of machine translation
As MT quality continues to improve, more and more translators switch from traditional translation from scratch to PE of MT output, which has been shown to save time and reduce errors. Instead of mainly generating text, translators are now asked to correct errors within otherwise helpful translation proposals, where repetitive MT errors make the process tiresome, while hard-to-spot errors make PE a cognitively demanding activity. Our contribution is three-fold: first, we explore whether interaction modalities other than mouse and keyboard could well support PE by creating and testing the MMPE translation environment. MMPE allows translators to cross out or hand-write text, drag and drop words for reordering, use spoken commands or hand gestures to manipulate text, or to combine any of these input modalities. Second, our interviews revealed that translators see value in automatically receiving additional translation support when a high CL is detected during PE. We therefore developed a sensor framework using a wide range of physiological and behavioral data to estimate perceived CL and tested it in three studies, showing that multi-modal, eye, heart, and skin measures can be used to make translation environments cognition-aware. Third, we present two multi-encoder Transformer architectures for APE and discuss how these can adapt MT output to a domain and thereby avoid correcting repetitive MT errors.Angesichts der stetig steigenden Qualität maschineller Übersetzungssysteme (MÜ) post-editieren (PE) immer mehr Übersetzer die MÜ-Ausgabe, was im Vergleich zur herkömmlichen Übersetzung Zeit spart und Fehler reduziert. Anstatt primär Text zu generieren, müssen Übersetzer nun Fehler in ansonsten hilfreichen Übersetzungsvorschlägen korrigieren. Dennoch bleibt die Arbeit durch wiederkehrende MÜ-Fehler mühsam und schwer zu erkennende Fehler fordern die Übersetzer kognitiv. Wir tragen auf drei Ebenen zur Verbesserung des PE bei: Erstens untersuchen wir, ob andere Interaktionsmodalitäten als Maus und Tastatur das PE unterstützen können, indem wir die Übersetzungsumgebung MMPE entwickeln und testen. MMPE ermöglicht es, Text handschriftlich, per Sprache oder über Handgesten zu verändern, Wörter per Drag & Drop neu anzuordnen oder all diese Eingabemodalitäten zu kombinieren. Zweitens stellen wir ein Sensor-Framework vor, das eine Vielzahl physiologischer und verhaltensbezogener Messwerte verwendet, um die kognitive Last (KL) abzuschätzen. In drei Studien konnten wir zeigen, dass multimodale Messung von Augen-, Herz- und Hautmerkmalen verwendet werden kann, um Übersetzungsumgebungen an die KL der Übersetzer anzupassen. Drittens stellen wir zwei Multi-Encoder-Transformer-Architekturen für das automatische Post-Editieren (APE) vor und erörtern, wie diese die MÜ-Ausgabe an eine Domäne anpassen und dadurch die Korrektur von sich wiederholenden MÜ-Fehlern vermeiden können.Deutsche Forschungsgemeinschaft (DFG), Projekt MMP
- …