Unsupervised intralingual and cross-lingual speaker adaptation for HMM-based speech synthesis using two-pass decision tree construction
Hidden Markov model (HMM)-based speech synthesis systems possess several advantages over concatenative synthesis systems. One such advantage is the relative ease with which HMM-based systems are adapted to speakers not present in the training dataset. Speaker adaptation methods used in the field of HMM-based automatic speech recognition (ASR) are adopted for this task. In the case of unsupervised speaker adaptation, previous work has used a supplementary set of acoustic models to estimate the transcription of the adaptation data. This paper first presents an approach to the unsupervised speaker adaptation task for HMM-based speech synthesis models which avoids the need for such supplementary acoustic models. This is achieved by defining a mapping between HMM-based synthesis models and ASR-style models, via a two-pass decision tree construction process. Second, it is shown that this mapping also enables unsupervised adaptation of HMM-based speech synthesis models without the need to perform linguistic analysis of the estimated transcription of the adaptation data. Third, this paper demonstrates how this technique lends itself to the task of unsupervised cross-lingual adaptation of HMM-based speech synthesis models, and explains the advantages of such an approach. Finally, listener evaluations reveal that the proposed unsupervised adaptation methods deliver performance approaching that of supervised adaptation.
Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN
Different types of sentences express sentiment in very different ways. Traditional sentence-level sentiment classification research focuses on a one-technique-fits-all solution or centers on only one special type of sentence. In this paper, we propose a divide-and-conquer approach which first classifies sentences into different types, then performs sentiment analysis separately on sentences from each type. Specifically, we find that sentences tend to be more complex if they contain more sentiment targets. Thus, we propose to first apply a neural network based sequence model to classify opinionated sentences into three types according to the number of targets appearing in a sentence. Each group of sentences is then fed into a one-dimensional convolutional neural network separately for sentiment classification. Our approach has been evaluated on four sentiment classification datasets and compared with a wide range of baselines. Experimental results show that: (1) sentence type classification can improve the performance of sentence-level sentiment analysis; (2) the proposed approach achieves state-of-the-art results on several benchmarking datasets.
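The divide-and-conquer routing described in this abstract can be sketched roughly as follows. This is an illustration only: the 0 / 1 / 2+ split, the type names, and the toy target counter and classifier are all assumptions standing in for the BiLSTM-CRF and CNN models, whose details the abstract does not give.

```python
from typing import Callable, Dict

def sentence_type(num_targets: int) -> str:
    """Map a target count to one of three types (assumed split: 0, 1, 2+)."""
    if num_targets == 0:
        return "no_target"
    if num_targets == 1:
        return "one_target"
    return "multi_target"

def classify_sentiment(
    sentence: str,
    count_targets: Callable[[str], int],
    classifiers: Dict[str, Callable[[str], str]],
) -> str:
    """Route a sentence to the classifier trained for its sentence type."""
    group = sentence_type(count_targets(sentence))
    return classifiers[group](sentence)

# Hypothetical stand-ins so the sketch runs without any trained model.
KNOWN_TARGETS = {"battery", "screen"}

def toy_count_targets(sentence: str) -> int:
    """Count tokens that match a tiny hand-written target list."""
    tokens = (tok.strip(".,!") for tok in sentence.lower().split())
    return sum(1 for tok in tokens if tok in KNOWN_TARGETS)

def toy_classifier(sentence: str) -> str:
    """Keyword rule used in place of a per-type CNN."""
    return "positive" if "great" in sentence.lower() else "negative"

TOY_CLASSIFIERS = {
    name: toy_classifier
    for name in ("no_target", "one_target", "multi_target")
}
```

In a real system each entry in `TOY_CLASSIFIERS` would be a separately trained model, which is the point of the split: each model only sees sentences of comparable complexity.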
Unsupervised and knowledge-poor approaches to sentiment analysis
Sentiment analysis focuses upon automatic classification of a document's sentiment (and more generally extraction of opinion from text). Ways of expressing sentiment have been shown to be dependent on what a document is about (domain-dependency). This complicates supervised methods for sentiment analysis which rely on extensive use of training data or linguistic resources that are usually either domain-specific or generic. Both kinds of resources prevent classifiers from performing well across a range of domains, as this requires appropriate in-domain (domain-specific) data.
This thesis presents a novel unsupervised, knowledge-poor approach to sentiment analysis aimed at creating a domain-independent and multilingual sentiment analysis system.
The approach extracts domain-specific resources from documents that are to be processed, and uses them for sentiment analysis. This approach does not require any training corpora, large sets of rules or generic sentiment lexicons, which makes it domain- and language-independent but at the same time able to utilise domain- and language-specific information.
The thesis describes and tests the approach, which is applied to different data, including customer reviews of various types of products, reviews of films and books, and news items; and to four languages: Chinese, English, Russian and Japanese. The approach is applied not only to binary sentiment classification, but also to three-way sentiment classification (positive, negative and neutral), subjectivity classification of documents and sentences, and to the extraction of opinion holders and opinion targets. Experimental results suggest that the approach is often a viable alternative to supervised systems, especially when applied to large document collections.
Efficient Utilization of Dependency Pattern and Sequential Covering for Aspect Extraction Rule Learning
The use of dependency rules for aspect extraction tasks in aspect-based sentiment analysis is a promising approach. One problem with this approach is incomplete rules. This paper presents an aspect extraction rule learning method that combines dependency rules with the Sequential Covering algorithm. Sequential Covering is known for constructing rules that increase the number of positive examples covered and decrease the number of negative ones. This property is vital to make sure that the rule set used has high performance, but not necessarily high coverage, which is a characteristic of the aspect extraction task. To test the new method, four datasets from four product domains were used, along with three baselines: Double Propagation, Aspectator, and a previous work by the authors. The results show that the proposed approach performed better than the three baseline methods on the F-measure metric, with the highest F-measure value at 0.633.
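As a rough illustration of the Sequential Covering idea mentioned in this abstract, the sketch below greedily picks the rule that covers the most positive and fewest negative examples, removes what it covers, and repeats. It is a generic textbook-style version, not the authors' method: a "rule" here is simply one feature label that must be present, and the scoring function, the `min_precision` threshold, and the toy data are all assumptions.

```python
from typing import FrozenSet, List, Tuple

# An example is (set of feature labels present, is it a true aspect?).
Example = Tuple[FrozenSet[str], bool]

def sequential_covering(
    examples: List[Example],
    candidate_rules: List[str],
    min_precision: float = 0.5,
) -> List[str]:
    """Greedily learn an ordered rule list; each rule is one required feature."""
    remaining = list(examples)
    learned: List[str] = []
    # Continue while at least one positive example is still uncovered.
    while any(label for _, label in remaining):
        best_rule, best_score = None, 0
        for rule in candidate_rules:
            if rule in learned:
                continue
            covered = [label for feats, label in remaining if rule in feats]
            if not covered:
                continue
            pos = sum(covered)
            neg = len(covered) - pos
            precision = pos / len(covered)
            score = pos - neg  # assumed scoring: reward positives, punish negatives
            if precision >= min_precision and score > best_score:
                best_rule, best_score = rule, score
        if best_rule is None:
            break  # no acceptable rule left; accept lower coverage
        learned.append(best_rule)
        # Drop every example the new rule covers before the next round.
        remaining = [(f, l) for f, l in remaining if best_rule not in f]
    return learned

# Toy data with hypothetical dependency labels as features.
TOY_EXAMPLES: List[Example] = [
    (frozenset({"amod", "nsubj"}), True),
    (frozenset({"amod"}), True),
    (frozenset({"dobj"}), True),
    (frozenset({"nsubj"}), False),
    (frozenset({"det"}), False),
    (frozenset({"dobj", "det"}), False),
]
TOY_RULES = ["amod", "nsubj", "dobj", "det"]
```

On the toy data only one rule is learned and one positive example stays uncovered, which mirrors the trade-off the abstract describes: the rule set is kept precise rather than being pushed towards full coverage.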
Latent Opinions Transfer Network for Target-Oriented Opinion Words Extraction
Target-oriented opinion words extraction (TOWE) is a new subtask of aspect-based sentiment analysis (ABSA), which aims to extract the corresponding opinion words for a given opinion target in a sentence. Recently, neural network methods have been applied to this task and have achieved promising results. However, the difficulty of annotation leaves TOWE datasets insufficient, which heavily limits the performance of neural models. By contrast, abundant review sentiment classification data are easily available at online review sites. These reviews contain substantial latent opinion information and semantic patterns. In this paper, we propose a novel model to transfer this opinion knowledge from resource-rich review sentiment classification datasets to the low-resource TOWE task. To address the challenges in the transfer process, we design an effective transformation method to obtain latent opinions, then integrate them into TOWE. Extensive experimental results show that our model achieves better performance compared to other state-of-the-art methods and significantly outperforms the base model without transferring opinion knowledge. Further analysis validates the effectiveness of our model.
Comment: Accepted by the 34th AAAI Conference on Artificial Intelligence (AAAI 2020)
Multilingual opinion mining
170 p. Every day a large amount of text is generated across online media. Much of that text contains opinions about a multitude of entities, products, services, etc. Given the growing need for automated means to analyse, process and exploit that information, sentiment analysis techniques have received a great deal of attention from industry and the scientific community over the last decade and a half. However, many of the techniques in use require supervised training on manually annotated examples, or other linguistic resources tied to a specific language or application domain. This limits the applicability of such techniques, since those resources and annotated examples are not easy to obtain. This thesis explores a series of methods for performing various automatic text analyses within the framework of sentiment analysis, including the automatic extraction of domain terms, of words that express opinion, of the sentiment polarity of those words (positive or negative), etc. Finally, it proposes and evaluates a method that combines continuous word embeddings with topic modelling inspired by Latent Dirichlet Allocation (LDA) to obtain an aspect-based sentiment analysis (ABSA) system that needs only a few seed words to process texts in a given language or domain. Adapting to another language or domain thus reduces to translating the corresponding seed words.
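To make the seed-word idea concrete, here is a minimal sketch of scoring a word's polarity by its embedding similarity to a few positive and negative seed words. It is not the thesis's method, which also involves LDA-style topic modelling; the function names, the scoring rule, and the two-dimensional toy vectors are assumptions chosen only so the example runs on its own.

```python
import math
from typing import Dict, List, Sequence

def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def seed_polarity(
    word: str,
    embeddings: Dict[str, List[float]],
    pos_seeds: List[str],
    neg_seeds: List[str],
) -> float:
    """Mean similarity to positive seeds minus mean similarity to negative seeds."""
    vec = embeddings[word]
    pos = sum(cosine(vec, embeddings[s]) for s in pos_seeds) / len(pos_seeds)
    neg = sum(cosine(vec, embeddings[s]) for s in neg_seeds) / len(neg_seeds)
    return pos - neg

# Hypothetical 2-d vectors; real embeddings would be learned from a corpus.
TOY_EMBEDDINGS = {
    "good": [1.0, 0.1],
    "bad": [-1.0, 0.1],
    "excellent": [0.9, 0.2],
    "awful": [-0.8, 0.3],
}
```

A positive score suggests the word sits closer to the positive seeds. Under this scheme, moving to another language would mean supplying embeddings for that language and translated seed lists, which matches the adaptation path the abstract describes.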