1,954 research outputs found
Fractals in the Nervous System: conceptual Implications for Theoretical Neuroscience
This essay is presented with two principal objectives in mind: first, to
document the prevalence of fractals at all levels of the nervous system, giving
credence to the notion of their functional relevance; and second, to draw
attention to the as yet still unresolved issues of the detailed relationships
among power law scaling, self-similarity, and self-organized criticality. As
regards criticality, I will document that it has become a pivotal reference
point in Neurodynamics. Furthermore, I will emphasize the not yet fully
appreciated significance of allometric control processes. For dynamic fractals,
I will assemble reasons for attributing to them the capacity to adapt task
execution to contextual changes across a range of scales. The final Section
consists of general reflections on the implications of the reviewed data, and
identifies what appear to be issues of fundamental importance for future
research in the rapidly evolving topic of this review
Weakly and Partially Supervised Learning Frameworks for Anomaly Detection
The automatic detection of abnormal events in surveillance footage is still a concern of the
research community. Since protection is the primary purpose of installing video surveillance systems, the monitoring capability to keep public safety, and its rapid response to
satisfy this purpose, is a significant challenge even for humans. Nowadays, human capacity has not kept pace with the increased use of surveillance systems, requiring much
supervision to identify unusual events that could put any person or company at risk, without ignoring the fact that there is a substantial waste of labor and time due to the extremely
low likelihood of occurring anomalous events compared to normal ones. Consequently,
the need for an automatic detection algorithm of abnormal events has become crucial in
video surveillance. Even being in the scope of various research works published in the last
decade, the state-of-the-art performance is still unsatisfactory and far below the required
for an effective deployment of this kind of technology in fully unconstrained scenarios.
Nevertheless, despite all the research done in this area, the automatic detection of abnormal events remains a challenge for many reasons. Starting by environmental diversity, the
complexity of movements resemblance in different actions, crowded scenarios, and taking into account all possible standard patterns to define a normal action is undoubtedly
difficult or impossible. Despite the difficulty of solving these problems, the substantive
problem lies in obtaining sufficient amounts of labeled abnormal samples, which concerning computer vision algorithms, is fundamental. More importantly, obtaining an extensive set of different videos that satisfy the previously mentioned conditions is not a
simple task. In addition to its effort and time-consuming, defining the boundary between
normal and abnormal actions is usually unclear.
Henceforward, in this work, the main objective is to provide several solutions to the
problems mentioned above, by focusing on analyzing previous state-of-the-art methods
and presenting an extensive overview to clarify the concepts employed on capturing normal and abnormal patterns. Also, by exploring different strategies, we were able to develop new approaches that consistently advance the state-of-the-art performance. Moreover, we announce the availability of a new large-scale first of its kind dataset fully annotated at the frame level, concerning a specific anomaly detection event with a wide diversity in fighting scenarios, that can be freely used by the research community. Along with
this document with the purpose of requiring minimal supervision, two different proposals
are described; the first method employs the recent technique of self-supervised learning
to avoid the laborious task of annotation, where the training set is autonomously labeled
using an iterative learning framework composed of two independent experts that feed
data to each other through a Bayesian framework. The second proposal explores a new
method to learn an anomaly ranking model in the multiple instance learning paradigm by
leveraging weakly labeled videos, where the training labels are done at the video-level. The
experiments were conducted in several well-known datasets, and our solutions solidly outperform the state-of-the-art. Additionally, as a proof-of-concept system, we also present the results of collected real-world simulations in different environments to perform a field
test of our learned models.A detecção automática de eventos anómalos em imagens de videovigilância permanece
uma inquietação por parte da comunidade cientÃfica. Sendo a proteção o principal
propósito da instalação de sistemas de vigilância, a capacidade de monitorização da segurança pública, e a sua rápida resposta para satisfazer essa finalidade, é uma adversidade
até para o ser humano. Nos dias de hoje, com o aumento do uso de sistemas de videovigilância, a capacidade humana não tem alcançado a cadência necessária, exigindo uma
supervisão exorbitante para a identificação de acontecimentos invulgares que coloquem
uma identidade ou sociedade em risco. O facto da probabilidade de se suceder um incidente ser extremamente reduzida comparada a eventualidades normais, existe um gasto
substancial de tempo de ofÃcio. Consequentemente, a necessidade para um algorÃtmo de
detecção automática de incidentes tem vindo a ser crucial em videovigilância. Mesmo
sendo alvo de vários trabalhos cientÃficos publicados na última década, o desempenho
do estado-da-arte continua insatisfatório e abaixo do requisitado para uma implementação eficiente deste tipo de tecnologias em ambientes e cenários totalmente espontâneos
e incontinentes. Porém, apesar de toda a investigação realizada nesta área, a automatização de detecção de incidentes é um desafio que perdura por várias razões. Começando
pela diversidade ambiental, a complexidade da semalhança entre movimentos de ações
distintas, cenários de multidões, e ter em conta todos os padrões para definir uma ação
normal, é indiscutivelmente difÃcil ou impossÃvel. Não obstante a dificuldade de resolução
destes problemas, o obstáculo fundamental consiste na obtenção de um número suficiente
de instâncias classificadas anormais, considerando algoritmos de visão computacional é
essencial. Mais importante ainda, obter um vasto conjunto de diferentes vÃdeos capazes de
satisfazer as condições previamente mencionadas, não é uma tarefa simples. Em adição
ao esforço e tempo despendido, estabelecer um limite entre ações normais e anormais é
frequentemente indistinto.
Tendo estes aspetos em consideração, neste trabalho, o principal objetivo é providenciar diversas soluções para os problemas previamente mencionados, concentrando na
análise de métodos do estado-da-arte e apresentando uma visão abrangente dos mesmos
para clarificar os conceitos aplicados na captura de padrões normais e anormais. Inclusive, a exploração de diferentes estratégias habilitou-nos a desenvolver novas abordagens
que aprimoram consistentemente o desempenho do estado-da-arte. Por último, anunciamos a disponibilidade de um novo conjunto de dados, em grande escala, totalmente anotado ao nÃvel da frame em relação à detecção de anomalias em um evento especÃfico com
uma vasta diversidade em cenários de luta, podendo ser livremente utilizado pela comunidade cientÃfica. Neste documento, com o propósito de requerer o mÃnimo de supervisão,
são descritas duas propostas diferentes; O primeiro método põe em prática a recente técnica de aprendizagem auto-supervisionada para evitar a árdua tarefa de anotação, onde o
conjunto de treino é classificado autonomamente usando uma estrutura de aprendizagem
iterativa composta por duas redes neuronais independentes que fornecem dados entre si através de uma estrutura Bayesiana. A segunda proposta explora um novo método para
aprender um modelo de classificação de anomalias no paradigma multiple-instance learning manuseando vÃdeos fracamente anotados, onde a classificação do conjunto de treino
é feita ao nÃvel do vÃdeo. As experiências foram concebidas em vários conjuntos de dados,
e as nossas soluções superam consolidamente o estado-da-arte. Adicionalmente, como
sistema de prova de conceito, apresentamos os resultados da execução do nosso modelo
em simulações reais em diferentes ambientes
Unsupervised representation learning in interactive environments
Extraire une représentation de tous les facteurs de haut niveau de l'état d'un agent à partir d'informations sensorielles de bas niveau est une tâche importante, mais difficile, dans l'apprentissage automatique. Dans ce memoire, nous explorerons plusieurs approches non supervisées pour apprendre ces représentations. Nous appliquons et analysons des méthodes d'apprentissage de représentations non supervisées existantes dans des environnements d'apprentissage par renforcement, et nous apportons notre propre suite d'évaluations et notre propre méthode novatrice d'apprentissage de représentations d'état.
Dans le premier chapitre de ce travail, nous passerons en revue et motiverons l'apprentissage non supervisé de représentations pour l'apprentissage automatique en général et pour l'apprentissage par renforcement. Nous introduirons ensuite un sous-domaine relativement nouveau de l'apprentissage de représentations : l'apprentissage auto-supervisé. Nous aborderons ensuite deux approches fondamentales de l'apprentissage de représentations, les méthodes génératives et les méthodes discriminatives. Plus précisément, nous nous concentrerons sur une collection de méthodes discriminantes d'apprentissage de représentations, appelées méthodes contrastives d'apprentissage de représentations non supervisées (CURL). Nous terminerons le premier chapitre en détaillant diverses approches pour évaluer l'utilité des représentations.
Dans le deuxième chapitre, nous présenterons un article de workshop dans lequel nous évaluons un ensemble de méthodes d'auto-supervision standards pour les problèmes d'apprentissage par renforcement. Nous découvrons que la performance de ces représentations dépend fortement de la dynamique et de la structure de l'environnement. À ce titre, nous déterminons qu'une étude plus systématique des environnements et des méthodes est nécessaire.
Notre troisième chapitre couvre notre deuxième article, Unsupervised State Representation Learning in Atari, où nous essayons d'effectuer une étude plus approfondie des méthodes d'apprentissage de représentations en apprentissage par renforcement, comme expliqué dans le deuxième chapitre. Pour faciliter une évaluation plus approfondie des représentations en apprentissage par renforcement, nous introduisons une suite de 22 jeux Atari entièrement labellisés. De plus, nous choisissons de comparer les méthodes d'apprentissage de représentations de façon plus systématique, en nous concentrant sur une comparaison entre méthodes génératives et méthodes contrastives, plutôt que les méthodes générales du deuxième chapitre choisies de façon moins systématique. Enfin, nous introduisons une nouvelle méthode contrastive, ST-DIM, qui excelle sur ces 22 jeux Atari.Extracting a representation of all the high-level factors of an agent’s state from level-level sensory information is an important, but challenging task in machine learning. In this thesis, we will explore several unsupervised approaches for learning these state representations. We apply and analyze existing unsupervised representation learning methods in reinforcement learning environments, as well as contribute our own evaluation benchmark and our own novel state representation learning method.
In the first chapter, we will overview and motivate unsupervised representation learning for machine learning in general and for reinforcement learning. We will then introduce a relatively new subfield of representation learning: self-supervised learning. We will then cover two core representation learning approaches, generative methods and discriminative methods. Specifically, we will focus on a collection of discriminative representation learning methods called contrastive unsupervised representation learning (CURL) methods. We will close the first chapter by detailing various approaches for evaluating the usefulness of representations.
In the second chapter, we will present a workshop paper, where we evaluate a handful of off-the-shelf self-supervised methods in reinforcement learning problems. We discover that the performance of these representations depends heavily on the dynamics and visual structure of the environment. As such, we determine that a more systematic study of environments and methods is required.
Our third chapter covers our second article, Unsupervised State Representation Learning in Atari, where we try to execute a more thorough study of representation learning methods in RL as motivated by the second chapter. To facilitate a more thorough evaluation of representations in RL we introduce a benchmark of 22 fully labelled Atari games. In addition, we choose the representation learning methods for comparison in a more systematic way by focusing on comparing generative methods with contrastive methods, instead of the less systematically chosen off-the-shelf methods from the second chapter. Finally, we introduce a new contrastive method, ST-DIM, which excels at the 22 Atari games
A new framework for deep learning video based Human Action Recognition on the edge
Nowadays, video surveillance systems are commonly found in most public and private spaces. These systems typically consist of a network of cameras that feed into a central node. However, the processing aspect is evolving towards distributed approaches, leveraging edge-computing. These distributed systems are capable of effectively addressing the detection of people or events at each individual node. Most of these systems, rely on the use of deep-learning and segmentation algorithms which enable them to achieve high performance, but usually with a significant computational cost, hindering real-time execution. This paper presents an approach for people detection and action recognition in the wild, optimized for running on the edge, and that is able to work in real-time, in an embedded platform. Human Action Recognition (HAR) is performed by using a Recurrent Neural Network (RNN), specifically a Long Short-Term Memory (LSTM). The input to the LSTM is an ad-hoc, lightweight feature vector obtained from the bounding box of each detected person in the video surveillance image. The resulting system is highly portable and easily scalable, providing a powerful tool for real-world video surveillance applications (in the wild and real-time action recognition). The proposal has been exhaustively evaluated and compared against other state-of-the-art (SOTA) proposals in five datasets, including four widely used (KTH, WEIZMAN, WVU, IXMAX) and a novel one (GBA) recorded in the wild, that includes several people performing different actions simultaneously. The obtained results validate the proposal, since it achieves SOTA accuracy within a much more complicated video surveillance real scenario, and using a lightweight embedded hardware.European CommissionAgencia Estatal de InvestigaciónUniversidad de Alcal
- …