    A Deep Learning based Model using Review Associated Feature Extraction Approach for Sentiment Analysis

    With the advancement of internet technologies, online forums, social media platforms and e-commerce sites have made it very easy to review products. Consumers use many mobile applications, websites and forums to share and circulate their opinions, experiences, ideas and views regarding products, brands and services. As a consequence, online user reviews have become a deciding factor for many consumers before they purchase an item. Sentiment analysis is a technique for extracting sentiments, feelings and insights from customer reviews and public texts. Many businesses therefore perform sentiment analysis to understand more thoroughly their customers' opinions and suggestions regarding their products and services. Furthermore, a number of researchers have a keen interest in classifying customer reviews into a set of labels using text classification techniques. The objective of this research work is to develop an approach that extracts review-associated features using Part-of-Speech (POS) tagging and to design a CNN model that classifies a review's sentiment as positive or negative. Natural Language Processing (NLP) techniques are used during data preprocessing to remove uninformative data from the reviews. A CNN deep learning model is used for sentiment classification, and the Amazon mobile reviews dataset is used for the experiment. In the experimental evaluation, the proposed model outperforms other models and achieves an accuracy of 97.23% on the Amazon mobile reviews dataset.
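
As a rough illustration of POS-based feature extraction, the sketch below keeps only tokens whose tag marks them as adjectives, adverbs, nouns or verbs. The abstract does not say which tags the authors keep, so the `KEEP_PREFIXES` tag set (Penn Treebank prefixes), the function name `extract_review_features` and the pre-tagged sample review are all assumptions made for this example; in practice the tags would come from a POS tagger.

```python
# Assumption: Penn Treebank tag prefixes for adjectives, adverbs, nouns and verbs.
# The paper's actual tag selection is not given in the abstract.
KEEP_PREFIXES = ("JJ", "RB", "NN", "VB")


def extract_review_features(tagged_tokens):
    """Return lowercased tokens whose POS tag starts with one of KEEP_PREFIXES."""
    return [
        token.lower()
        for token, tag in tagged_tokens
        if tag.startswith(KEEP_PREFIXES)
    ]


# Hypothetical review, already tagged (a real pipeline would call a POS tagger here).
tagged_review = [
    ("The", "DT"), ("battery", "NN"), ("is", "VBZ"),
    ("really", "RB"), ("great", "JJ"), (".", "."),
]

print(extract_review_features(tagged_review))
```

The retained tokens would then be encoded (for example as word indices or embeddings) before being passed to a classifier such as the CNN mentioned in the abstract.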

    Sentiment Analysis in Spanish for Improvement of Products and Services: A Deep Learning Approach

    Definition of a Controlled Language for Sentiment Analysis in Twitter for English Messages

    Sentiment analysis (SA), also known as opinion mining, aims to determine the polarity of a given text. It is the field of study that analyzes the emotional response of social network users in order to support decision making in social, economic, political, labor, and financial settings. Twitter is one of the social networks gaining the most popularity in sentiment analysis, since it is a social microblogging tool that lets users express opinions and ideas in short (280 characters), concise texts, which is ideal for drawing statistics on specific topics. However, the vocabulary used in social networks has problems inherent to natural language. Traditionally, computational linguistics solves such problems by using controlled languages. A controlled language is a subset of natural language with restrictions on terminology, which reduces ambiguity and adds precision for later analyses. Understanding the meaning of the words in a vocabulary requires the concepts of connotation and denotation. The connotation of a word includes subjective elements and implies interpretation. Denotation, on the other hand, is the formal and objective expression, i.e., the universal meaning given to a word. Regarding the function of words within a sentence, syntactic analysis describes how the words of the sentence are related and the grammatical category of each one. A constituency tree can be used to carry out the syntactic analysis. In opinion mining, data preprocessing is commonly used to remove noise and inconsistencies in order to prepare the data for later analysis. Preprocessing consists of the following steps: tokenization, removal of noisy data, removal of slang words, spell checking, and finally stemming. The approaches to sentiment analysis found in the literature include lexicon-based methods, which contain dictionary-based approaches. These approaches analyze certain elements of corpora from different social networks but, lacking a controlled language, they require greater intervention in and transformation of the message before its polarity is defined. This is because the natural language used in social networks has characteristics such as polysemy and synonymy that pose challenges for computational analysis, and it also contains noisy data such as emoticons, hashtags, special characters, hyperlinks, and HTML tags. As a result, part of the information that may be relevant when defining the polarity of a message is lost. In addition, dictionary-based approaches fail to relate terms and to analyze the context in which words are written. For this reason, this M.Sc. thesis proposes the definition of a controlled language for sentiment analysis in Twitter for messages in English. Its purpose is to transform user opinions into a structure that eases their classification into negative or positive polarities by means of syntactic rules, representing natural language in controlled texts that could be used to improve the sentiment analysis methods in the literature. Once the controlled language is implemented, the words acquire syntactic value and the terminology and the format of the information are standardized, so the messages are more precise and unambiguous and are therefore useful as a starting point for automating reasoning.
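
The preprocessing steps named in this abstract can be sketched with the Python standard library alone. This is a minimal illustration, not the thesis's pipeline: the slang list `SLANG`, the suffix list `SUFFIXES`, the toy stemmer `naive_stem` and the sample tweet are assumptions made here. For simplicity, noise is stripped before tokenizing, and spell checking is omitted because it needs a dictionary resource.

```python
import html
import re

# Illustrative assumptions, not taken from the thesis.
SLANG = {"lol", "omg", "idk"}
SUFFIXES = ("ing", "ed", "s")


def remove_noise(text):
    """Strip HTML tags, hyperlinks, hashtags/mentions and special characters."""
    text = html.unescape(text)
    text = re.sub(r"<[^>]+>", " ", text)           # HTML tags
    text = re.sub(r"https?://\S+", " ", text)      # hyperlinks
    text = re.sub(r"[@#]\w+", " ", text)           # mentions and hashtags
    return re.sub(r"[^A-Za-z0-9\s']", " ", text)   # emoticons, punctuation


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def naive_stem(token):
    """Toy suffix stripper standing in for a real stemmer such as Porter's."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(tweet):
    tokens = tokenize(remove_noise(tweet))
    tokens = [t for t in tokens if t not in SLANG]  # drop slang words
    return [naive_stem(t) for t in tokens]


print(preprocess("<b>LOL</b> loving the new phones!!! http://t.co/abc #happy :)"))
```

Note how the hashtag and the emoticon are discarded along with the rest of the noise, which is the kind of information loss the abstract attributes to preprocessing without a controlled language.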

    What Airbnb Reviews can Tell us? An Advanced Latent Aspect Rating Analysis Approach

    There is no doubt that the rapid growth of Airbnb has changed the lodging industry and tourists’ behaviors dramatically since the advent of the sharing economy. Airbnb welcomes customers and engages them by creating and providing unique travel experiences to “live like a local” through the delivery of lodging services. With the special experiences that Airbnb customers pursue, more investigation is needed to systematically examine the Airbnb customer lodging experience. Online reviews offer a representative look at individual customers’ personal and unique lodging experiences. Moreover, the overall ratings given by customers are reflections of their experiences with a product or service. Since customers take overall ratings into account in their purchase decisions, a study that bridges the customer lodging experience and the overall rating is needed. In contrast to traditional research methods, mining customer reviews has become a useful method to study customers’ opinions about products and services. User-generated reviews are a form of evaluation generated by peers that users post on business or other (e.g., third-party) websites (Mudambi & Schuff, 2010). The main purpose of this study is to identify the weights of latent lodging experience aspects that customers consider in order to form their overall ratings based on the eight basic emotions. This study applied both aspect-based sentiment analysis and the latent aspect rating analysis (LARA) model to predict the aspect ratings and determine the latent aspect weights. Specifically, this study extracted the innovative lodging experience aspects that Airbnb customers care about most by mining a total of 248,693 customer reviews from 6,946 Airbnb accommodations. Then, the NRC Emotion Lexicon with eight emotions was employed to assess the sentiments associated with each lodging aspect. By applying latent rating regression, the predicted aspect ratings were generated. 
With the aspect ratings, the aspect weights and the predicted overall ratings were calculated. It was suggested that the overall rating be assessed based on the sentiment words of five lodging aspects: communication, experience, location, product/service, and value. It was found that, compared with the aspects of location, product/service, and value, customers expressed less joy and more surprise than they did over the aspects of communication and experience. The LRR results demonstrate that Airbnb customers care most about a listing's location, followed by experience, value, communication, and product/service. The results also revealed that even listings with the same overall rating may have different predicted aspect ratings based on the different aspect weights. Finally, the LARA model demonstrated the different preferences between customers seeking expensive versus cheap accommodations. Understanding customer experience and its role in forming customer rating behavior is important. This study empirically confirms and expands the usefulness of LARA as the prediction model in deconstructing overall ratings into aspect ratings, and then further predicting aspect-level weights. This study makes meaningful academic contributions to the evolving customer behavior and customer experience research. It also benefits the shared-lodging industry through its development of pragmatic methods for establishing effective marketing strategies to improve customer perceptions and for creating personalized review filter systems.
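
The relationship between aspect ratings, aspect weights and the overall rating can be illustrated with a small calculation. In latent rating regression the overall rating is modeled around a weighted combination of the aspect ratings; the sketch below shows only that weighted sum, not the inference of the latent weights. The function name `overall_rating` and all numbers are hypothetical and are not taken from the study.

```python
# The five lodging aspects named in the abstract.
ASPECTS = ("communication", "experience", "location", "product/service", "value")


def overall_rating(aspect_ratings, aspect_weights):
    """Weighted average of aspect ratings (weights are normalized to sum to 1)."""
    total_weight = sum(aspect_weights[a] for a in ASPECTS)
    weighted = sum(aspect_weights[a] * aspect_ratings[a] for a in ASPECTS)
    return weighted / total_weight


# Hypothetical listing A: reviewers weight location most heavily.
ratings_a = {"communication": 4, "experience": 4, "location": 5,
             "product/service": 4, "value": 4}
weights_a = {"communication": 0.1, "experience": 0.2, "location": 0.4,
             "product/service": 0.1, "value": 0.2}

# Hypothetical listing B: reviewers weight experience most heavily.
ratings_b = {"communication": 4, "experience": 5, "location": 4,
             "product/service": 4, "value": 4}
weights_b = {"communication": 0.1, "experience": 0.4, "location": 0.2,
             "product/service": 0.1, "value": 0.2}

print(round(overall_rating(ratings_a, weights_a), 2))
print(round(overall_rating(ratings_b, weights_b), 2))
```

Both hypothetical listings end up with the same overall rating even though their aspect ratings and weights differ, which mirrors the abstract's observation about listings with identical overall ratings.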

    Artificial Intelligence for Online Review Platforms - Data Understanding, Enhanced Approaches and Explanations in Recommender Systems and Aspect-based Sentiment Analysis

    Epoch-making and ever faster technological progress provokes disruptive changes and poses pivotal challenges for individuals and organizations. In particular, artificial intelligence (AI) is a disruptive technology that offers tremendous potential for many fields such as information systems and electronic commerce. Therefore, this dissertation contributes to AI for online review platforms, aiming to enable the future for consumers, businesses and platforms by unveiling the potential of AI. To achieve this goal, the dissertation investigates six major research questions embedded in the triad of data understanding of online consumer reviews, enhanced approaches and explanations in recommender systems and aspect-based sentiment analysis.