Machine Learning Challenges of Biological Factors in Insect Image Data
The BIOSCAN project, led by the International Barcode of Life Consortium,
seeks to study changes in biodiversity on a global scale. One component of the
project is focused on studying the species interactions and dynamics of all
insects. In addition to genetically barcoding insects, over 1.5 million images
per year will be collected, each needing taxonomic classification. With the
immense volume of incoming images, relying solely on expert taxonomists to
label the images would be impossible; however, artificial intelligence and
computer vision technology may offer a viable high-throughput solution.
Additional tasks, such as manually weighing individual insects to determine
biomass, remain tedious and costly. Here again, computer vision may offer an
efficient and compelling alternative. While the use of computer vision methods
is appealing for addressing these problems, significant challenges resulting
from biological factors present themselves. These challenges are formulated in
the context of machine learning in this paper.
Comment: 4 pages, 3 figures. Submitted to the Journal of Computational Vision
and Imaging Systems.
Generative Causal Representation Learning for Out-of-Distribution Motion Forecasting
Conventional supervised learning methods typically assume i.i.d. samples and
are found to be sensitive to out-of-distribution (OOD) data. We propose
Generative Causal Representation Learning (GCRL) which leverages causality to
facilitate knowledge transfer under distribution shifts. While we evaluate the
effectiveness of our proposed method in human trajectory prediction models,
GCRL can be applied to other domains as well. First, we propose a novel causal
model that explains the generative factors in motion forecasting datasets using
features that are common across all environments and features that are
specific to each environment. Selection variables are used to determine which
parts of the model can be directly transferred to a new environment without
fine-tuning. Second, we propose an end-to-end variational learning paradigm to
learn the causal mechanisms that generate observations from features. GCRL is
supported by strong theoretical results that imply identifiability of the
causal model under certain assumptions. Experimental results on synthetic and
real-world motion forecasting datasets show the robustness and effectiveness of
our proposed method for knowledge transfer under zero-shot and low-shot
settings by substantially outperforming the prior motion forecasting models on
out-of-distribution prediction. Our code is available at
https://github.com/sshirahmad/GCRL
A Step Towards Worldwide Biodiversity Assessment: The BIOSCAN-1M Insect Dataset
In an effort to catalog insect biodiversity, we propose a new large dataset
of hand-labelled insect images, the BIOSCAN-Insect Dataset. Each record is
taxonomically classified by an expert, and also has associated genetic
information including raw nucleotide barcode sequences and assigned barcode
index numbers, which are genetically-based proxies for species classification.
This paper presents a curated million-image dataset, primarily to train
computer-vision models capable of providing image-based taxonomic assessment;
however, the dataset also presents compelling characteristics, the study of
which would be of interest to the broader machine learning community. Driven by
the biological nature inherent to the dataset, a characteristic long-tailed
class-imbalance distribution is exhibited. Furthermore, taxonomic labelling is
a hierarchical classification scheme, presenting a highly fine-grained
classification problem at lower levels. Beyond spurring interest in
biodiversity research within the machine learning community, progress on
creating an image-based taxonomic classifier will also further the ultimate
goal of all BIOSCAN research: to lay the foundation for a comprehensive survey
of global biodiversity. This paper introduces the dataset and explores the
classification task through the implementation and analysis of a baseline
classifier.
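A standard countermeasure for the long-tailed class imbalance this dataset exhibits is to reweight the training loss by inverse class frequency. The following is a minimal NumPy sketch of that idea, not a procedure taken from the paper; the function name and normalisation choice are illustrative assumptions.

```python
import numpy as np

def inverse_frequency_weights(labels, num_classes=None):
    """Per-class loss weights inversely proportional to class frequency,
    normalised so the weights of the classes present average to 1."""
    labels = np.asarray(labels)
    counts = np.bincount(labels, minlength=num_classes or labels.max() + 1)
    weights = np.zeros_like(counts, dtype=float)
    present = counts > 0
    weights[present] = 1.0 / counts[present]          # rare classes get larger weights
    weights[present] *= present.sum() / weights[present].sum()
    return weights

# A head class (0) is down-weighted, a tail class (1) up-weighted.
w = inverse_frequency_weights([0, 0, 0, 1])  # -> [0.5, 1.5]
```

Such weights would typically be passed to a weighted cross-entropy loss when training an image classifier on the long-tailed label distribution.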
Action in Mind: A Neural Network Approach to Action Recognition and Segmentation
Recognizing and categorizing human actions is an important task with applications in various fields such as human-robot interaction, video analysis, surveillance, video retrieval, health care systems and the entertainment industry. This thesis presents a novel computational approach to human action recognition through different implementations of multi-layer architectures based on artificial neural networks. Each system-level development is designed to solve a different aspect of the action recognition problem, including online real-time processing, action segmentation and the involvement of objects. The analysis of the experimental results is illustrated and described in six articles. The proposed action recognition architecture of this thesis is composed of several processing layers, including a preprocessing layer, an ordered vector representation layer and three layers of neural networks. It utilizes self-organizing neural networks such as Kohonen feature maps and growing grids as the main neural network layers. Thus the architecture presents a biologically plausible approach with certain features such as topographic organization of the neurons, lateral interactions, semi-supervised learning and the ability to represent a high-dimensional input space in lower-dimensional maps. For each level of development the system is trained on input data consisting of consecutive 3D body postures and tested on generalized input data that the system has never encountered before. The experimental results of the different system-level developments show that the system performs well, with quite high accuracy in recognizing human actions.
Online recognition of unsegmented actions with hierarchical SOM architecture
Automatic recognition of an online series of unsegmented actions requires a method for segmentation that determines when an action starts and when it ends. In this paper, a novel approach for recognizing unsegmented actions in online test experiments is proposed. The method uses self-organizing neural networks to build a three-layer cognitive architecture. The unique features of an action sequence are represented as a series of elicited key activations by the first-layer self-organizing map. An average length of a key activation vector is calculated for all action sequences in a training set and adjusted in learning trials to generate input patterns to the second-layer self-organizing map. The pattern vectors are clustered in the second layer, and the clusters are then labeled with an action identity in the third-layer neural network. The experimental results show that although the performance drops slightly in online experiments compared to the offline tests, the ability of the proposed architecture to deal with unsegmented action sequences, together with its online performance, makes the system more plausible and practical in real-world scenarios.
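The first-layer self-organizing map can be illustrated with a minimal NumPy sketch: a standard Kohonen training loop, plus a helper that maps a posture sequence to its series of best-matching units ("key activations"). This is a generic illustration of the technique, not the paper's implementation; the grid size, learning-rate and neighbourhood schedules are all assumptions.

```python
import numpy as np

def train_som(data, grid_shape=(8, 8), epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small self-organizing map; returns the (h, w, dim) weight grid."""
    rng = np.random.default_rng(seed)
    h, w = grid_shape
    dim = data.shape[1]
    weights = rng.normal(size=(h, w, dim))
    # Grid coordinates, used by the Gaussian neighbourhood function.
    coords = np.stack(np.meshgrid(np.arange(h), np.arange(w), indexing="ij"), axis=-1)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data:
            # Best-matching unit: the neuron whose weight is closest to the input.
            dists = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dists), (h, w))
            # Linearly decay learning rate and neighbourhood radius over training.
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 1e-3
            # Pull the BMU and its grid neighbours toward the input.
            grid_dist = np.linalg.norm(coords - np.array(bmu), axis=-1)
            influence = np.exp(-grid_dist**2 / (2 * sigma**2))[..., None]
            weights += lr * influence * (x - weights)
            step += 1
    return weights

def key_activations(weights, sequence):
    """Represent an action sequence as the series of elicited BMU grid positions."""
    h, w, _ = weights.shape
    out = []
    for x in sequence:
        d = np.linalg.norm(weights - x, axis=-1)
        out.append(np.unravel_index(np.argmin(d), (h, w)))
    return out
```

In the architecture described above, the BMU sequences produced by such a first layer would then be length-normalised and fed as pattern vectors to the second-layer map.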
Hierarchical growing grid networks for skeleton based action recognition
In this paper, a novel cognitive architecture for action recognition is developed by applying layers of growing grid neural networks. Using these layers makes the system capable of automatically arranging its representational structure. In addition to the expansion of the neural map during the growth phase, the system is provided with prior knowledge of the input space, which increases the processing speed of the learning phase. Apart from the two layers of growing grid networks, the architecture is composed of a preprocessing layer, an ordered vector representation layer and a one-layer supervised neural network. These layers are designed to solve the action recognition problem. The first-layer growing grid receives the input data of human actions, and the neural map generates an action pattern vector representing each action sequence by connecting the elicited activations of the trained map. The pattern vectors are then sent to the ordered vector representation layer to build the time-invariant input vectors of key activations for the second-layer growing grid. The second-layer growing grid categorizes the input vectors into the corresponding action clusters/sub-clusters, and finally the one-layer supervised neural network labels the shaped clusters with action labels. Three experiments using different datasets of actions show that the system is capable of learning to categorize the actions quickly and efficiently. The performance of the growing grid architecture is compared with the results from a system based on Self-Organizing Maps, showing that the growing grid architecture performs significantly better on the action recognition tasks.
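The growth phase can be illustrated by the canonical growing-grid step (in the style of Fritzke's growing grid): insert a full row or column of units next to the unit with the largest accumulated quantization error, initialising the new weights as the mean of the units they separate. A minimal NumPy sketch of that one step, not the paper's code:

```python
import numpy as np

def grow_grid(weights, errors):
    """One growth step of a growing grid.

    weights: (h, w, dim) weight grid; errors: (h, w) accumulated
    quantization error per unit. Inserts a row or column between the
    max-error unit and its most distant grid neighbour.
    """
    h, w, _ = weights.shape
    r, c = np.unravel_index(np.argmax(errors), errors.shape)
    # Grid neighbours of the max-error unit.
    neighbours = [(r + dr, c + dc)
                  for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                  if 0 <= r + dr < h and 0 <= c + dc < w]
    # Pick the neighbour whose weight vector is furthest in input space.
    nr, nc = max(neighbours,
                 key=lambda p: np.linalg.norm(weights[p] - weights[r, c]))
    if nr != r:  # vertical neighbour -> insert a new row between the two rows
        lo = min(r, nr)
        new_row = 0.5 * (weights[lo] + weights[lo + 1])
        return np.insert(weights, lo + 1, new_row, axis=0)
    lo = min(c, nc)  # horizontal neighbour -> insert a new column
    new_col = 0.5 * (weights[:, lo] + weights[:, lo + 1])
    return np.insert(weights, lo + 1, new_col, axis=1)
```

Repeating this step between training rounds lets the map expand where its quantization error is concentrated, which is what allows the architecture to arrange its representational structure automatically.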
Predicting the intended action using internal simulation of perception
This article proposes an architecture which allows the prediction of intention by internally simulating perceptual states represented by action pattern vectors. To this end, associative self-organising maps (A-SOM) are utilised to build a hierarchical cognitive architecture for recognition and simulation of skeleton-based human actions. The abilities of the proposed architecture in recognising and predicting actions are evaluated in experiments using three different datasets of 3D actions. Based on the experiments of this article, applying internally simulated perceptual states represented by action pattern vectors improves the performance of the recognition task in all experiments. Furthermore, internal simulation of perception addresses the problem of having limited access to the sensory input, as well as the future prediction of consecutive perceptual sequences. The performance of the system is compared and discussed with a similar architecture using self-organising maps (SOM).
An observer based fault detection and isolation in quadruple-tank process
In this study, a new strategy for fault detection and isolation is presented. This strategy is based on the design of a Luenberger observer, which is implemented via pole placement using linear matrix inequalities. Two residuals are formulated based on the state estimation error in order to be utilized in detecting and isolating faults occurring in the system. The fault detection problem is solved by monitoring changes in the residual values, and fault isolation is achieved by setting thresholds on the residuals according to the system's behavior under faulty conditions. The procedure is carried out in four simulation steps, with a certain number of faults occurring in the system at each step. The method is validated in simulation on a quadruple-tank process, where each faulty condition is modelled as a leak at the bottom of a tank in the process. Such a leak leads to an undesirable flow of liquid out of the tank, which results in a decrease in the tank's level. The simulation results presented in the paper show the applicability of this strategy.
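The residual-generation idea can be illustrated with a toy discrete-time example: a Luenberger observer tracks the plant, and the output residual stays near zero until an additive "leak" fault pushes it past a threshold. The system matrices, observer gain, fault size and threshold below are all illustrative assumptions, and a hand-picked fixed gain stands in for the paper's LMI-based pole placement.

```python
import numpy as np

# Toy plant x_{k+1} = A x_k (+ fault), measured output y_k = C x_k.
# Observer: xhat_{k+1} = A xhat_k + L (y_k - C xhat_k); residual r_k = y_k - C xhat_k.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
C = np.array([[1.0, 0.0]])
L = np.array([[0.5],
              [0.2]])  # places the eigenvalues of A - L C at ~0.74 and ~0.46 (stable)

def simulate(n_steps=100, fault_at=50, leak=-0.3, threshold=0.1):
    x = np.array([1.0, 1.0])   # true state (unknown to the observer)
    xhat = x.copy()            # observer initialised at the true state, so the
                               # residual isolates the fault effect in this demo
    residuals = []
    for k in range(n_steps):
        y = C @ x                          # measurement
        residuals.append((y - C @ xhat).item())
        xhat = A @ xhat + L @ (y - C @ xhat)   # observer update
        x = A @ x                               # plant update
        if k >= fault_at:
            x[0] += leak   # leak in tank 1 modelled as an additive state fault
    # Detection: flag the steps where the residual exceeds the threshold.
    alarms = [k for k, r in enumerate(residuals) if abs(r) > threshold]
    return residuals, alarms

residuals, alarms = simulate()  # residual is ~0 before the fault, then |r| grows past 0.1
```

Because the fault enters the plant after step 50, the residual first reflects it one sample later and then settles at a nonzero offset, which is what the thresholding exploits for detection.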
Action Recognition Online with Hierarchical Self-Organizing Maps
We present a hierarchical self-organizing map based system for online recognition of human actions. We have made a first evaluation of our system by training it on two different sets of recorded human actions, one set containing manner actions and one set containing result actions, and then tested it by letting a human performer carry out the actions online in real time in front of the system's 3D camera. The system successfully recognized more than 94% of the manner actions and most of the result actions carried out by the human performer.