
    Multilabel Classification for News Article Using Long Short-Term Memory

    Multilabel text classification is the task of assigning a text to one or more categories. As with other machine learning tasks, multilabel classification performance suffers when labeled data is scarce, which makes semantic relationships difficult to capture. This work requires a multi-label text classification technique that can assign four labels to news articles, and deep learning is proposed to address it. Seven Long Short-Term Memory (LSTM) models are compared on a large-scale dataset, covering 1-, 2-, and 3-layer LSTM and Bidirectional LSTM configurations, to show that LSTM can achieve good performance in multi-label text classification. The 2-layer LSTM model reached a training accuracy of 96% and the highest testing accuracy of all models at 94.3%. Model 3, with a 1-layer LSTM, obtained average precision, recall, and F1-score values equal to its training accuracy of 94%, indicating that it performs well in both training and testing. The comparison among the seven proposed LSTM models shows that model 3 with a 1-layer LSTM is the best model.
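    A minimal sketch of a multi-label LSTM text classifier along the lines described above, assuming a TensorFlow/Keras setup; the vocabulary size, sequence length, layer widths, and other hyperparameters are illustrative assumptions, not the paper's reported configuration.

```python
# Minimal sketch of a 2-layer LSTM multi-label classifier (sigmoid outputs,
# one per label) for news articles with four labels. All sizes below are
# assumptions for illustration, not the paper's reported settings.
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 20_000   # assumed vocabulary size
MAX_LEN = 200         # assumed padded sequence length
NUM_LABELS = 4        # four news-article labels, as in the abstract

model = tf.keras.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB_SIZE, 128),
    layers.LSTM(64, return_sequences=True),          # first LSTM layer
    layers.LSTM(64),                                 # second LSTM layer
    layers.Dense(NUM_LABELS, activation="sigmoid"),  # independent label probabilities
])

# Sigmoid outputs with binary cross-entropy let each article take any
# subset of the four labels, which is what makes the task multi-label.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# A Bidirectional variant (one of the other compared configurations) would
# wrap the recurrent layers, e.g. layers.Bidirectional(layers.LSTM(64)).
```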

    CorrNet: Fine-grained emotion recognition for video watching using wearable physiological sensors

    Recognizing user emotions while they watch short-form videos anytime and anywhere is essential for facilitating video content customization and personalization. However, most works either classify a single emotion per video stimulus or are restricted to static, desktop environments. To address this, we propose a correlation-based emotion recognition algorithm (CorrNet) that recognizes the valence and arousal (V-A) of each instance (a fine-grained segment of signals) using only wearable physiological signals (e.g., electrodermal activity, heart rate). CorrNet takes advantage of features both inside each instance (intra-modality features) and between different instances for the same video stimulus (correlation-based features). We first test our approach on an indoor-desktop affect dataset (CASE), and thereafter on an outdoor-mobile affect dataset (MERCA), which we collected using a smart wristband and a wearable eye tracker. Results show that for subject-independent binary classification (high-low), CorrNet yields promising recognition accuracies: 76.37% and 74.03% for V-A on CASE, and 70.29% and 68.15% for V-A on MERCA. Our findings show that: (1) instance segment lengths between 1–4 s result in the highest recognition accuracies; (2) accuracies of laboratory-grade and wearable sensors are comparable, even under low sampling rates (≤64 Hz); and (3) large amounts of neutral V-A labels, an artifact of continuous affect annotation, result in varied recognition performance.
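    As a rough illustration of the two feature families the abstract mentions (intra-modality features computed inside each instance, and correlation-based features computed between instances of the same stimulus), the sketch below segments a single physiological channel and computes both. The segment length, statistics, and correlation summaries are assumptions for illustration, not CorrNet's actual implementation.

```python
# Simplified sketch of the two feature families described above, not the
# CorrNet implementation itself. Segment lengths, feature choices, and the
# correlation summaries are illustrative assumptions.
import numpy as np

def segment(signal, fs, seg_seconds=2.0):
    """Cut a 1-D physiological signal (e.g., EDA) into fixed-length instances."""
    step = int(fs * seg_seconds)
    n = len(signal) // step
    return signal[: n * step].reshape(n, step)

def intra_modality_features(instances):
    """Per-instance statistics computed inside each segment."""
    return np.stack([instances.mean(axis=1),
                     instances.std(axis=1),
                     instances.min(axis=1),
                     instances.max(axis=1)], axis=1)

def correlation_features(instances):
    """For each instance, summarize its correlation with every other
    instance of the same video stimulus."""
    corr = np.corrcoef(instances)          # (n_instances, n_instances)
    np.fill_diagonal(corr, np.nan)         # ignore self-correlation
    return np.stack([np.nanmean(corr, axis=1),
                     np.nanmax(corr, axis=1)], axis=1)

# Example with a synthetic 60 s trace sampled at 64 Hz (the abstract's low
# wearable sampling rate); real use would feed these features to a binary
# high/low valence-arousal classifier.
fs = 64
eda = np.random.default_rng(0).normal(size=fs * 60)
inst = segment(eda, fs, seg_seconds=2.0)   # 1-4 s segments worked best per the abstract
X = np.hstack([intra_modality_features(inst), correlation_features(inst)])
print(X.shape)                             # (n_instances, 6)
```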

    Multimodal approaches using deep learning and unimodal approaches using machine learning for emotion recognition in music

    Advisor: Profa. Dra. Denise Fukumi Tsunoda. Co-advisor: Profa. Dra. Marília Nunes Silva. Doctoral thesis, Universidade Federal do Paraná, Setor de Ciências Sociais Aplicadas, Programa de Pós-Graduação em Gestão da Informação. Defense: Curitiba, 30/08/2023. Includes references. Abstract: This research was conducted based on the understanding of the significance of the relationship between music and emotion in human life, spanning from leisure to scientific studies. Although the emotional organization of music is intrinsic to human nature, the automatic recognition of musical emotions faces challenges, manifesting as a complex theme in the retrieval of musical information. Within this context, the central purpose of this thesis was to investigate whether the adoption of multimodal approaches, involving information from different sources and deep learning architectures, can outperform unimodal approaches based on machine learning algorithms. This inquiry arose from the lack of multimodal strategies in the field and the prospect of improvement in classification results reported in related research.
With five specific objectives, this research addressed the identification of a cognitive model of emotions, the definition of modalities, the construction of multimodal databases, the comparison of deep learning architectures, and the comparative evaluation of multimodal approaches against unimodal approaches using traditional machine learning algorithms. The analysis of results demonstrated that multimodal approaches achieved superior performance in various classification scenarios compared to unimodal strategies. These findings contribute positively to the understanding of the effectiveness of multimodal approaches and deep learning architectures in the recognition of emotions in music. Additionally, the research emphasizes the need for attention to emotional models and metadata in online platforms, aiming to avoid biases and noise. This thesis offers relevant contributions to the field of music emotion recognition, particularly in the development of multimodal databases, the evaluation of deep learning architectures for tabular problems, experimental protocols, and approaches focused on musical cognition. The systematic comparison between multimodal and unimodal approaches highlights the advantages of the former, encouraging new research in this field.
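    As a rough sketch of the kind of comparison the thesis describes, the snippet below contrasts a unimodal baseline (traditional machine learning on one feature set) with an early-fusion multimodal model. The modalities, features, and classifiers are synthetic stand-ins chosen for illustration, not the thesis's actual data, feature sets, or deep learning architectures.

```python
# Illustrative unimodal-vs-multimodal comparison under cross-validation.
# Data is synthetic and the models are scikit-learn stand-ins for the
# real experimental setup described in the abstract.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
n = 400
X_audio = rng.normal(size=(n, 30))    # e.g., audio descriptors (assumed)
X_lyrics = rng.normal(size=(n, 20))   # e.g., lyric/text features (assumed)
y = rng.integers(0, 4, size=n)        # emotion classes (assumed labels)

# Unimodal baseline: traditional ML on a single modality.
uni = RandomForestClassifier(n_estimators=200, random_state=0)
uni_acc = cross_val_score(uni, X_audio, y, cv=5).mean()

# Multimodal approach: early fusion (feature concatenation) feeding a
# neural model, a simplified proxy for the thesis's deep learning setup.
X_multi = np.hstack([X_audio, X_lyrics])
multi = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
multi_acc = cross_val_score(multi, X_multi, y, cv=5).mean()

print(f"unimodal (audio only): {uni_acc:.3f}")
print(f"multimodal (audio + lyrics): {multi_acc:.3f}")
```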

    Multiple instance learning under real-world conditions

    Multiple instance learning (MIL) is a form of weakly-supervised learning that deals with data arranged in sets called bags. In MIL problems, a label is provided for bags, but not for each individual instance in the bag. Like other weakly-supervised frameworks, MIL is useful in situations where obtaining labels is costly. It is also useful in applications where instance labels cannot be observed individually. MIL algorithms learn from bags; however, prediction can be performed at instance and bag level. MIL has been used in several applications, from drug activity prediction to object localization in images. Real-world data poses many challenges to MIL methods. These challenges arise from different problem characteristics that are sometimes not well understood or even completely ignored. This causes MIL methods to perform unevenly and often fail in real-world applications. In this thesis, we propose methods for both classification levels under different working assumptions. These methods are designed to address challenging problem characteristics that arise in real-world applications. As a first contribution, we survey these characteristics that make MIL uniquely challenging. Four categories of characteristics are identified: the prediction level, the composition of bags, the data distribution types, and the label ambiguity. Each category is analyzed and related state-of-the-art MIL methods are surveyed. MIL applications are examined in light of these characteristics, and extensive experiments are conducted to show how these characteristics affect the performance of MIL methods. From these analyses and experiments, several conclusions are drawn and future research avenues are identified. Then, as a second contribution, we propose a method for bag classification which relies on the identification of positive instances to train an ensemble of instance classifiers. The bag classifier uses the predictions made on instances to infer bag labels. The method identifies positive instances by projecting the instances into random subspaces. Clustering is performed on the data in these subspaces, and positive instances are probabilistically identified based on the bag label of the instances in each cluster. Experiments show that the method achieves state-of-the-art performance while being robust to several characteristics identified in the survey. In some applications, the instances cannot be assigned to a positive or negative class. Bag classes are defined by a composition of different types of instances. In such cases, interrelations between instances convey the information used to discriminate between positive and negative bags. As a third contribution, we propose a bag classification method that learns under these conditions. The method is applied to predict speaker personality from speech signals represented as bags of instances. A sparse dictionary learning algorithm is used to learn a dictionary and encode instances. Encoded instances are embedded in a single feature vector summarizing the speech signal. Experimental results on real-world data reveal that the proposed method yields state-of-the-art accuracy while requiring less complexity than commonly used methods in the field. Finally, we propose two methods for querying bags in a multiple instance active learning (MIAL) framework. In this framework, the objective is to train a reliable instance classifier using a minimal amount of labeled data.
Single-instance methods are suboptimal in this framework because they do not account for the bag structure of MIL. The proposed methods address the problem from different angles: one aims at directly refining the decision boundary, while the other leverages instance and bag labels to query instances in the most promising clusters. Experiments are conducted in both inductive and transductive settings. Results on data from 3 application domains show that leveraging bag structure in this MIAL framework is important to effectively reduce the number of queries needed to attain a high level of classification accuracy. This thesis shows that real-world MIL problems pose a wide range of challenges. After an in-depth analysis, we show experimentally that these challenges have a profound impact on the performance of MIL algorithms. We propose methods to address some of these challenges and validate them on real-world data sets. We also identify future directions for research and remaining open problems.
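    A minimal sketch of the positive-instance identification idea described above (random subspace projections, clustering, and voting based on bag labels); the parameters, clustering algorithm, and scoring rule are simplifying assumptions rather than the thesis's exact procedure.

```python
# Minimal sketch: score each instance's likelihood of being positive by
# clustering random-subspace projections and voting with bag labels.
# Parameter values and the scoring rule are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

def positive_instance_scores(X, bags, bag_labels, n_subspaces=10,
                             subspace_dim=5, n_clusters=8, seed=0):
    """X          : (n_instances, n_features) instance matrix
    bags       : (n_instances,) bag index of each instance
    bag_labels : dict {bag index: 0 or 1}"""
    rng = np.random.default_rng(seed)
    scores = np.zeros(len(X))
    inst_bag_label = np.array([bag_labels[b] for b in bags], dtype=float)

    for _ in range(n_subspaces):
        # Project instances into a random feature subspace.
        dims = rng.choice(X.shape[1], size=subspace_dim, replace=False)
        clusters = KMeans(n_clusters=n_clusters, n_init=10,
                          random_state=seed).fit_predict(X[:, dims])
        # Within each cluster, the fraction of instances coming from
        # positive bags is used as a probabilistic positivity vote.
        for c in np.unique(clusters):
            members = clusters == c
            scores[members] += inst_bag_label[members].mean()

    return scores / n_subspaces   # average vote across subspaces

# Instances scoring high could then be used to train an ensemble of
# instance classifiers whose predictions are aggregated into bag labels.
```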