1,544 research outputs found
Audio Event Detection using Weakly Labeled Data
Acoustic event detection is essential for content analysis and description of
multimedia recordings. The majority of current literature on the topic learns
the detectors through fully-supervised techniques employing strongly labeled
data. However, the labels available for the majority of multimedia data are
generally weak and do not provide sufficient detail for such methods to be
employed. In this paper we propose a framework for learning acoustic event
detectors using only weakly labeled data. We first show that audio event
detection using weak labels can be formulated as a Multiple Instance Learning
problem. We then suggest two frameworks for solving multiple-instance learning,
one based on support vector machines, and the other on neural networks. The
proposed methods can help remove the time-consuming and expensive process
of manually annotating data to facilitate fully supervised learning. Moreover,
they can not only detect events in a recording but also provide the temporal
locations of those events. This helps in obtaining a complete description of
the recording and is notable since temporal information was never available
in the first place in weakly labeled data.
Comment: ACM Multimedia 201
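The multiple-instance view described in this abstract can be illustrated with a small sketch. It assumes the standard MIL convention (a recording is a "bag" of segment-level scores, and the bag is positive if at least one segment is); the function names, the 0.5 threshold, and the fixed segment length are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def bag_score(instance_scores: Sequence[float]) -> float:
    """Recording-level score under the standard MIL assumption:
    the bag is as positive as its most positive segment."""
    if not instance_scores:
        raise ValueError("a bag needs at least one instance")
    return max(instance_scores)


def localize(
    instance_scores: Sequence[float],
    segment_seconds: float,
    threshold: float = 0.5,  # hypothetical decision threshold
) -> List[Tuple[float, float]]:
    """Return (start, end) times of segments whose score reaches the threshold.
    This is how segment-level scores yield temporal locations even though
    training only ever saw recording-level (weak) labels."""
    return [
        (i * segment_seconds, (i + 1) * segment_seconds)
        for i, score in enumerate(instance_scores)
        if score >= threshold
    ]


# Scores a segment-level classifier might assign to five 1-second segments.
scores = [0.1, 0.2, 0.9, 0.8, 0.1]
print(bag_score(scores))      # 0.9
print(localize(scores, 1.0))  # [(2.0, 3.0), (3.0, 4.0)]
```

The segment scores themselves would come from whichever instance-level model is trained (the abstract mentions SVM-based and neural-network-based variants); this sketch only shows how bag-level decisions and temporal locations follow from them.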
A Machine Learning Approach For Opinion Holder Extraction In Arabic Language
Opinion mining aims at extracting useful subjective information from reliable
amounts of text. Opinion mining holder recognition is a task that has not yet
been considered for the Arabic language. This task essentially requires a deep
understanding of clause structure. Unfortunately, the lack of a robust,
publicly available Arabic parser further complicates the research. This paper
presents leading research on opinion holder extraction in Arabic news that is
independent of any lexical parser. We investigate constructing a
comprehensive feature set to compensate for the lack of structural parsing
outcomes. The proposed feature set is adapted from previous work on English
and coupled with our proposed semantic field and named entity features. Our
feature analysis is based on Conditional Random Fields (CRF) and
semi-supervised pattern recognition techniques. Different research models are
evaluated via cross-validation experiments, achieving an F-measure of 54.03.
We publicly release our research outcome corpus and lexicon to the opinion
mining community to encourage further research.
Classification of Animal Sound Using Convolutional Neural Network
Recently, labeling of acoustic events has emerged as an active topic covering a wide range of applications. High-level semantic inference can be conducted based on the main audio effects to facilitate various content-based applications for analysis, efficient recovery and content management. This paper proposes a flexible convolutional neural network-based framework for animal audio classification. The work takes inspiration from various deep neural networks recently developed for multimedia classification. The model identifies the animal sound in an audio file by forcing the network to pay attention to the core audio effect present in the audio in order to generate a Mel-spectrogram. The designed framework achieves an accuracy of 98% when classifying animal audio on weakly labelled datasets. The novelty of this research is a framework that can run even on a basic machine and does not necessarily require high-end devices for classification.
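Since the abstract above relies on Mel-spectrograms, the following sketch shows the frequency warping behind them. It uses the common HTK-style formula mel = 2595 * log10(1 + f / 700); whether the paper uses this exact variant is an assumption, and the helper names are hypothetical.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to the mel scale (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(f_min: float, f_max: float, n_bands: int) -> List[float]:
    """Edges (in Hz) of n_bands filters spaced evenly on the mel scale.
    Returns n_bands + 2 values: the lower edge, the band centres, the upper edge."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


# Edges are dense at low frequencies and sparse at high frequencies.
print([round(edge) for edge in mel_band_edges(0.0, 8000.0, 4)])
```

A Mel-spectrogram is then obtained by pooling the power spectrum of each short audio frame into these bands, which gives the image-like input a convolutional network can classify.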
Learning to Identify Ambiguous and Misleading News Headlines
Accuracy is one of the basic principles of journalism. However, it is
increasingly hard to manage due to the diversity of news media. Some editors of
online news tend to use catchy headlines which trick readers into clicking.
These headlines are either ambiguous or misleading, degrading the reading
experience of the audience. Thus, identifying inaccurate news headlines is a
task worth studying. Previous work names these headlines "clickbaits" and
mainly focuses on the features extracted from the headlines, which limits the
performance since the consistency between headlines and news bodies is
underappreciated. In this paper, we clearly redefine the problem and identify
ambiguous and misleading headlines separately. We utilize class sequential
rules to exploit structure information when detecting ambiguous headlines. For
the identification of misleading headlines, we extract features based on the
congruence between headlines and bodies. To make use of the large unlabeled
data set, we apply a co-training method and gain an increase in performance.
The experimental results show the effectiveness of our methods. We then use our
classifiers to detect inaccurate headlines crawled from different sources and
conduct a data analysis.
Comment: Accepted by IJCAI 201
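The idea of a congruence feature between headline and body can be illustrated with a deliberately simple sketch. The abstract does not specify which features are used, so the token-overlap ratio below is an assumed stand-in, and the function names are hypothetical.

```python
import re
from typing import Set


def tokens(text: str) -> Set[str]:
    """Lower-cased word tokens of a text, as a set."""
    return set(re.findall(r"\w+", text.lower()))


def headline_body_overlap(headline: str, body: str) -> float:
    """Fraction of headline tokens that also occur in the body.
    A low value hints that the headline talks about something the
    body does not support (one possible sign of a misleading headline)."""
    head = tokens(headline)
    if not head:
        return 0.0
    return len(head & tokens(body)) / len(head)


print(headline_body_overlap("Cats love fish", "My cats eat fish daily"))
```

In a real system such a score would be one feature among many passed to a classifier; on its own it ignores synonyms, negation, and word order.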
Weakly and Partially Supervised Learning Frameworks for Anomaly Detection
The automatic detection of abnormal events in surveillance footage is still a concern of the research community. Since protection is the primary purpose of installing video surveillance systems, monitoring public safety and responding rapidly enough to serve this purpose is a significant challenge even for humans. Human capacity has not kept pace with the increased use of surveillance systems, which require much supervision to identify unusual events that could put a person or company at risk; in addition, a substantial amount of labor and time is wasted because anomalous events are extremely unlikely compared to normal ones. Consequently, the need for an algorithm that automatically detects abnormal events has become crucial in video surveillance. Although this topic has been the subject of various research works published in the last decade, state-of-the-art performance is still unsatisfactory and far below what is required for effective deployment of this kind of technology in fully unconstrained scenarios. Despite all the research done in this area, the automatic detection of abnormal events remains a challenge for many reasons. Given the environmental diversity, the resemblance between movements in different actions, and crowded scenarios, taking into account all possible standard patterns to define a normal action is undoubtedly difficult or impossible. Beyond these difficulties, the substantive problem lies in obtaining sufficient amounts of labeled abnormal samples, which is fundamental for computer vision algorithms. More importantly, obtaining an extensive set of different videos that satisfy the previously mentioned conditions is not a simple task. In addition to the effort and time required, the boundary between normal and abnormal actions is usually unclear.
In this work, the main objective is to provide several solutions to the problems mentioned above, by analyzing previous state-of-the-art methods and presenting an extensive overview to clarify the concepts employed in capturing normal and abnormal patterns. By exploring different strategies, we were also able to develop new approaches that consistently advance state-of-the-art performance. Moreover, we announce the availability of a new large-scale, first-of-its-kind dataset fully annotated at the frame level, concerning a specific anomaly detection event with a wide diversity of fighting scenarios, that can be freely used by the research community. With the purpose of requiring minimal supervision, this document describes two different proposals. The first method employs the recent technique of self-supervised learning to avoid the laborious task of annotation: the training set is autonomously labeled using an iterative learning framework composed of two independent experts that feed data to each other through a Bayesian framework. The second proposal explores a new method to learn an anomaly ranking model in the multiple-instance learning paradigm by leveraging weakly labeled videos, where the training labels are given at the video level. The experiments were conducted on several well-known datasets, and our solutions solidly outperform the state of the art. Additionally, as a proof-of-concept system, we also present the results of real-world simulations collected in different environments to perform a field
test of our learned models.
The automatic detection of anomalous events in video surveillance footage remains a concern of the scientific community. Since protection is the main purpose of installing surveillance systems, the ability to monitor public safety, and to respond rapidly to serve that purpose, is a challenge even for humans. Nowadays, with the increasing use of video surveillance systems, human capacity has not kept the necessary pace, demanding excessive supervision to identify unusual events that put an entity or society at risk. Because the probability of an incident occurring is extremely low compared with normal events, a substantial amount of working time is wasted. Consequently, the need for an automatic incident detection algorithm has become crucial in video surveillance. Although it has been the subject of various scientific works published in the last decade, state-of-the-art performance remains unsatisfactory and below what is required for an efficient deployment of this kind of technology in fully spontaneous and unconstrained environments and scenarios. Despite all the research carried out in this area, automating incident detection is a challenge that persists for several reasons. Given the environmental diversity, the complexity of the resemblance between movements of distinct actions, and crowded scenarios, taking into account all the patterns needed to define a normal action is unquestionably difficult or impossible. Notwithstanding the difficulty of solving these problems, the fundamental obstacle lies in obtaining a sufficient number of instances labeled as abnormal, which is essential for computer vision algorithms. More importantly, obtaining a vast set of different videos capable of satisfying the previously mentioned conditions is not a simple task. In addition to the effort and time spent, the boundary between normal and abnormal actions is frequently unclear.
With these aspects in mind, the main objective of this work is to provide several solutions to the previously mentioned problems, focusing on the analysis of state-of-the-art methods and presenting a comprehensive overview of them to clarify the concepts applied in capturing normal and abnormal patterns. Exploring different strategies also enabled us to develop new approaches that consistently improve state-of-the-art performance. Finally, we announce the availability of a new large-scale dataset, fully annotated at the frame level, for anomaly detection in a specific event with a wide diversity of fighting scenarios, which can be freely used by the scientific community. In this document, with the purpose of requiring minimal supervision, two different proposals are described. The first method applies the recent technique of self-supervised learning to avoid the arduous task of annotation: the training set is labeled autonomously using an iterative learning framework composed of two independent neural networks that supply data to each other through a Bayesian framework. The second proposal explores a new method to learn an anomaly ranking model in the multiple-instance learning paradigm using weakly annotated videos, where the training set is labeled at the video level. The experiments were conducted on several datasets, and our solutions solidly outperform the state of the art. Additionally, as a proof-of-concept system, we present the results of running our model in real-world simulations in different environments.
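The second proposal in this abstract, an anomaly ranking model learned from video-level labels, can be sketched with a hinge-style ranking objective. This is a common formulation for MIL ranking and is assumed here for illustration; the abstract does not state the exact loss, and the function name and margin value are hypothetical.

```python
from typing import Sequence


def mil_ranking_loss(
    anomalous_bag: Sequence[float],
    normal_bag: Sequence[float],
    margin: float = 1.0,  # hypothetical margin
) -> float:
    """Hinge ranking loss between two videos, each a bag of segment scores.
    Only video-level labels are needed: the highest-scoring segment of a
    video labeled anomalous should outrank the highest-scoring segment
    of a video labeled normal by at least `margin`."""
    if not anomalous_bag or not normal_bag:
        raise ValueError("both bags need at least one segment score")
    return max(0.0, margin - max(anomalous_bag) + max(normal_bag))


# Segment scores for one anomalous and one normal video.
print(mil_ranking_loss([0.25, 0.75, 0.5], [0.25, 0.125]))  # 0.5
```

Minimizing this quantity over many video pairs pushes segment scores apart without any frame-level annotation, which is the appeal of the weakly supervised setting.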
- …