
    Audio Event Detection using Weakly Labeled Data

    Acoustic event detection is essential for content analysis and description of multimedia recordings. Most of the current literature on the topic learns detectors through fully supervised techniques that rely on strongly labeled data. However, the labels available for the majority of multimedia data are generally weak and do not provide sufficient detail for such methods to be employed. In this paper we propose a framework for learning acoustic event detectors using only weakly labeled data. We first show that audio event detection using weak labels can be formulated as a multiple-instance learning problem. We then suggest two frameworks for solving multiple-instance learning, one based on support vector machines and the other on neural networks. The proposed methods can help remove the time-consuming and expensive process of manually annotating data to facilitate fully supervised learning. Moreover, they can not only detect events in a recording but also provide the temporal locations of those events. This helps in obtaining a complete description of the recording and is notable because temporal information is not available in weakly labeled data in the first place. Comment: ACM Multimedia 201
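
The multiple-instance view described in this abstract can be illustrated with a minimal sketch. It assumes the standard MIL convention that a recording (the bag) is positive if at least one of its segments (the instances) is positive, so the recording-level score is the maximum segment score. The function names, the segment scores, and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def bag_score(instance_scores: List[float]) -> float:
    """Recording-level score: the max over segment scores (standard MIL assumption)."""
    return max(instance_scores)


def localize(instance_scores: List[float],
             segment_seconds: float,
             threshold: float = 0.5) -> List[Tuple[float, float]]:
    """Return (start, end) times of the segments whose score reaches the threshold."""
    return [
        (i * segment_seconds, (i + 1) * segment_seconds)
        for i, score in enumerate(instance_scores)
        if score >= threshold
    ]


# Hypothetical per-segment scores for a 5-second recording cut into 1-second segments.
scores = [0.1, 0.2, 0.9, 0.8, 0.3]
recording_score = bag_score(scores)    # weak (recording-level) decision
event_segments = localize(scores, 1.0)  # temporal localization from instance scores
```

Although only the recording-level label is used for training in this setting, the per-segment scores give an estimate of where in time the event occurs.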

    A Machine Learning Approach For Opinion Holder Extraction In Arabic Language

    Opinion mining aims at extracting useful subjective information from reliable amounts of text. Opinion holder recognition is a task that has not yet been considered for the Arabic language. This task essentially requires a deep understanding of clause structure. Unfortunately, the lack of a robust, publicly available Arabic parser further complicates the research. This paper presents pioneering research on opinion holder extraction in Arabic news that is independent of any lexical parser. We investigate constructing a comprehensive feature set to compensate for the lack of structural parsing output. The proposed feature set is adapted from previous work on English and coupled with our proposed semantic field and named entity features. Our feature analysis is based on Conditional Random Fields (CRF) and semi-supervised pattern recognition techniques. Different research models are evaluated via cross-validation experiments, achieving an F-measure of 54.03. We publicly release our corpus and lexicon to the opinion mining community to encourage further research.

    Classification of Animal Sound Using Convolutional Neural Network

    Recently, labeling of acoustic events has emerged as an active topic covering a wide range of applications. High-level semantic inference can be conducted based on the main audio effects to facilitate various content-based applications for analysis, efficient retrieval, and content management. This paper proposes a flexible convolutional neural network-based framework for animal audio classification. The work takes inspiration from various deep neural networks recently developed for multimedia classification. The model is driven by the idea of identifying the animal sound in the audio file by forcing the network to pay attention to the core audio effect present in the audio to generate a Mel-spectrogram. The designed framework achieves an accuracy of 98% when classifying animal audio on weakly labelled datasets. The aim of this research is to build a framework that can run even on a basic machine and does not require high-end devices to perform the classification.

    Learning to Identify Ambiguous and Misleading News Headlines

    Accuracy is one of the basic principles of journalism. However, it is increasingly hard to manage due to the diversity of news media. Some editors of online news tend to use catchy headlines which trick readers into clicking. These headlines are either ambiguous or misleading, degrading the reading experience of the audience. Thus, identifying inaccurate news headlines is a task worth studying. Previous work names these headlines "clickbaits" and mainly focuses on features extracted from the headlines, which limits performance because the consistency between headlines and news bodies is underappreciated. In this paper, we clearly redefine the problem and identify ambiguous and misleading headlines separately. We utilize class sequential rules to exploit structure information when detecting ambiguous headlines. For the identification of misleading headlines, we extract features based on the congruence between headlines and bodies. To make use of the large unlabeled data set, we apply a co-training method and gain an increase in performance. The experimental results show the effectiveness of our methods. We then use our classifiers to detect inaccurate headlines crawled from different sources and conduct a data analysis. Comment: Accepted by IJCAI 201

    Weakly and Partially Supervised Learning Frameworks for Anomaly Detection

    The automatic detection of abnormal events in surveillance footage is still a concern of the research community. Since protection is the primary purpose of installing video surveillance systems, monitoring public safety and responding rapidly enough to serve this purpose is a significant challenge even for humans. Human capacity has not kept pace with the increased use of surveillance systems, which require extensive supervision to identify unusual events that could put a person or company at risk. In addition, a substantial amount of labor and time is wasted because anomalous events are extremely unlikely compared to normal ones. Consequently, the need for an algorithm that automatically detects abnormal events has become crucial in video surveillance. Although the topic has been addressed by various research works published in the last decade, state-of-the-art performance is still unsatisfactory and far below what is required for an effective deployment of this kind of technology in fully unconstrained scenarios. The automatic detection of abnormal events remains a challenge for many reasons. To begin with, handling environmental diversity, the close resemblance between movements in different actions, and crowded scenarios, while taking into account all possible standard patterns that define a normal action, is undoubtedly difficult or impossible. Beyond the difficulty of solving these problems, the substantive problem lies in obtaining sufficient amounts of labeled abnormal samples, which is fundamental for computer vision algorithms. More importantly, obtaining an extensive set of different videos that satisfy the previously mentioned conditions is not a simple task. In addition to being laborious and time-consuming, defining the boundary between normal and abnormal actions is usually unclear.
In this work, the main objective is to provide several solutions to the problems mentioned above, by analyzing previous state-of-the-art methods and presenting an extensive overview to clarify the concepts employed in capturing normal and abnormal patterns. By exploring different strategies, we were also able to develop new approaches that consistently advance state-of-the-art performance. Moreover, we announce the availability of a new large-scale, first-of-its-kind dataset, fully annotated at the frame level, that covers a specific anomaly detection event with a wide diversity of fighting scenarios and can be freely used by the research community. With the purpose of requiring minimal supervision, this document describes two different proposals. The first method employs the recent technique of self-supervised learning to avoid the laborious task of annotation: the training set is autonomously labeled using an iterative learning framework composed of two independent experts that feed data to each other through a Bayesian framework. The second proposal explores a new method to learn an anomaly ranking model in the multiple-instance learning paradigm by leveraging weakly labeled videos, where the training labels are given at the video level. The experiments were conducted on several well-known datasets, and our solutions solidly outperform the state of the art. Additionally, as a proof-of-concept system, we present the results of real-world simulations collected in different environments to field-test our learned models.
The automatic detection of anomalous events in video surveillance footage remains a concern for the scientific community. Since protection is the main purpose of installing surveillance systems, the ability to monitor public safety, and to respond rapidly to serve that purpose, is a challenge even for humans.
Nowadays, with the growing use of video surveillance systems, human capacity has not kept the necessary pace, and excessive supervision is required to identify unusual events that put an individual or society at risk. Because the probability of an incident occurring is extremely low compared with normal events, a substantial amount of working time is wasted. Consequently, the need for an algorithm that automatically detects incidents has become crucial in video surveillance. Although the topic has been the subject of several scientific works published in the last decade, state-of-the-art performance remains unsatisfactory and below what is required for an efficient deployment of this kind of technology in fully spontaneous and unconstrained environments and scenarios. Despite all the research carried out in this area, automating incident detection remains a challenge for several reasons. To begin with, handling environmental diversity, the complex resemblance between the movements of distinct actions, and crowded scenes, while accounting for every pattern that defines a normal action, is undeniably difficult or impossible. Notwithstanding the difficulty of solving these problems, the fundamental obstacle is obtaining a sufficient number of instances labeled as abnormal, which is essential for computer vision algorithms. More importantly, obtaining a large set of different videos that satisfy the previously mentioned conditions is not a simple task. Beyond the effort and time spent, the boundary between normal and abnormal actions is often indistinct.
With these aspects in mind, the main objective of this work is to provide several solutions to the problems mentioned above, by analyzing state-of-the-art methods and presenting a comprehensive overview of them to clarify the concepts applied in capturing normal and abnormal patterns. Exploring different strategies also enabled us to develop new approaches that consistently improve state-of-the-art performance. Finally, we announce the availability of a new large-scale dataset, fully annotated at the frame level, for anomaly detection in a specific event type, with a wide diversity of fighting scenarios, which can be freely used by the scientific community. With the aim of requiring minimal supervision, this document describes two different proposals. The first method applies the recent technique of self-supervised learning to avoid the arduous task of annotation: the training set is labeled autonomously using an iterative learning framework composed of two independent neural networks that feed data to each other through a Bayesian framework. The second proposal explores a new method for learning an anomaly ranking model in the multiple-instance learning paradigm using weakly annotated videos, where the training labels are given at the video level. The experiments were conducted on several datasets, and our solutions solidly outperform the state of the art. Additionally, as a proof-of-concept system, we present the results of running our model in real-world simulations in different environments.
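
The second proposal in this abstract, an anomaly ranking model learned from video-level labels, can be illustrated with a minimal sketch. The abstract does not state the training objective, so the hinge-style ranking loss below is an assumption borrowed from common MIL ranking formulations: the highest-scoring segment of an anomalous video should outrank the highest-scoring segment of a normal video by a margin. The function name, the margin value, and the example scores are hypothetical.

```python
from typing import List


def mil_ranking_loss(anomalous_scores: List[float],
                     normal_scores: List[float],
                     margin: float = 1.0) -> float:
    """Hinge-style ranking loss between two bags (videos) of segment scores.

    Assumption: only video-level labels are known, so each bag is
    represented by its highest-scoring segment.
    """
    top_anomalous = max(anomalous_scores)  # most suspicious segment in the anomalous video
    top_normal = max(normal_scores)        # hardest false alarm in the normal video
    return max(0.0, margin - top_anomalous + top_normal)


# Hypothetical segment scores for one anomalous and one normal video.
loss = mil_ranking_loss([0.1, 0.9], [0.2, 0.3])
```

The loss is zero once the top anomalous segment exceeds the top normal segment by at least the margin, so no segment-level annotation is needed to drive the ranking.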