Classification of protein interaction sentences via gaussian processes
The increase in the availability of protein interaction studies in textual format, coupled with the demand for easier access to the key results, has led to a need for text mining solutions. In the text processing pipeline, classification is a key step for extraction of small sections of relevant text. Consequently, for the task of locating protein-protein interaction sentences, we examine the use of a classifier which has rarely been applied to text, Gaussian processes (GPs). GPs are a non-parametric probabilistic analogue to the more popular support vector machines (SVMs). We find that GPs outperform the SVM and naïve Bayes classifiers on binary sentence data, whilst showing equivalent performance on abstract and multiclass sentence corpora. In addition, the lack of the margin parameter, which requires costly tuning, along with the principled multiclass extensions enabled by the probabilistic framework make GPs an appealing alternative worthy of further adoption.
Learning Intelligent Dialogs for Bounding Box Annotation
We introduce Intelligent Annotation Dialogs for bounding box annotation. We
train an agent to automatically choose a sequence of actions for a human
annotator to produce a bounding box in a minimal amount of time. Specifically,
we consider two actions: box verification, where the annotator verifies a box
generated by an object detector, and manual box drawing. We explore two kinds
of agents, one based on predicting the probability that a box will be
positively verified, and the other based on reinforcement learning. We
demonstrate that (1) our agents are able to learn efficient annotation
strategies in several scenarios, automatically adapting to the image
difficulty, the desired quality of the boxes, and the detector strength; (2) in
all scenarios the resulting annotation dialogs speed up annotation compared to
manual box drawing alone and box verification alone, while also outperforming
any fixed combination of verification and drawing in most scenarios; (3) in a
realistic scenario where the detector is iteratively re-trained, our agents
evolve a series of strategies that reflect the shifting trade-off between
verification and drawing as the detector grows stronger. Comment: This paper appeared at CVPR 201
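The probability-based agent described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single verification attempt followed by manual drawing if the box is rejected, and the timings and function names are hypothetical.

```python
def expected_time_verify_first(p_accept, t_verify, t_draw):
    """Expected annotation time if the annotator verifies first.

    Assumption: a rejected box is always followed by manual drawing.
    """
    return t_verify + (1.0 - p_accept) * t_draw


def choose_action(p_accept, t_verify, t_draw):
    """Pick the action with the lower expected annotation time."""
    if expected_time_verify_first(p_accept, t_verify, t_draw) < t_draw:
        return "verify"
    return "draw"
```

Under these assumptions the rule reduces to verifying whenever the acceptance probability exceeds the ratio of verification time to drawing time, which gives one simple reading of why the best strategy would shift as the detector grows stronger.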
Rain Removal in Traffic Surveillance: Does it Matter?
Varying weather conditions, including rainfall and snowfall, are generally
regarded as a challenge for computer vision algorithms. One proposed solution
to the challenges induced by rain and snowfall is to artificially remove the
rain from images or video using rain removal algorithms. It is the promise of
these algorithms that the rain-removed image frames will improve the
performance of subsequent segmentation and tracking algorithms. However, rain
removal algorithms are typically evaluated on their ability to remove synthetic
rain on a small subset of images. Currently, their behavior is unknown on
real-world videos when integrated with a typical computer vision pipeline. In
this paper, we review the existing rain removal algorithms and propose a new
dataset that consists of 22 traffic surveillance sequences under a broad
variety of weather conditions that all include either rain or snowfall. We
propose a new evaluation protocol that evaluates the rain removal algorithms on
their ability to improve the performance of subsequent segmentation, instance
segmentation, and feature tracking algorithms under rain and snow. If
successful, the de-rained frames of a rain removal algorithm should improve
segmentation performance and increase the number of accurately tracked
features. The results show that a recent single-frame-based rain removal
algorithm increases the segmentation performance by 19.7% on our proposed
dataset, but it eventually decreases the feature tracking performance and
shows mixed results with recent instance segmentation methods. However, the
best video-based rain removal algorithm improves the feature tracking accuracy
by 7.72%. Comment: Published in IEEE Transactions on Intelligent Transportation System
Deep Active Learning for Named Entity Recognition
Deep learning has yielded state-of-the-art performance on many natural
language processing tasks including named entity recognition (NER). However,
this typically requires large amounts of labeled data. In this work, we
demonstrate that the amount of labeled training data can be drastically reduced
when deep learning is combined with active learning. While active learning is
sample-efficient, it can be computationally expensive since it requires
iterative retraining. To speed this up, we introduce a lightweight architecture
for NER, viz., the CNN-CNN-LSTM model consisting of convolutional character and
word encoders and a long short-term memory (LSTM) tag decoder. The model
achieves nearly state-of-the-art performance on standard datasets for the task
while being computationally much more efficient than best performing models. We
carry out incremental active learning, during the training process, and are
able to nearly match state-of-the-art performance with just 25% of the
original training data.
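One round of the active learning selection described above might look like the following sketch. The scoring rule (length-normalised log-probability of the predicted tags) is an assumption chosen for illustration, as are the function names and the toy pool; the abstract does not specify the acquisition function.

```python
import math


def normalized_log_prob(token_probs):
    """Average log-probability of the predicted tags in one sentence."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)


def select_for_annotation(pool, budget):
    """Return the ids of the `budget` sentences the model is least sure about.

    pool maps a sentence id to the model's probabilities for its predicted tags.
    """
    ranked = sorted(pool, key=lambda sid: normalized_log_prob(pool[sid]))
    return ranked[:budget]


# Hypothetical unlabeled pool with made-up tag probabilities.
pool = {
    "s1": [0.99, 0.98, 0.97],
    "s2": [0.55, 0.60],
    "s3": [0.90, 0.40, 0.95, 0.90],
}
selected = select_for_annotation(pool, budget=2)
```

Normalising by sentence length keeps the selection from simply favouring long sentences. In a full loop, the selected sentences would be labeled, added to the training set, and the model updated before the next round.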
Learning to Transform Time Series with a Few Examples
We describe a semi-supervised regression algorithm that learns to transform one time series into another time series given examples of the transformation. This algorithm is applied to tracking, where a time series of observations from sensors is transformed to a time series describing the pose of a target. Instead of defining and implementing such transformations for each tracking task separately, our algorithm learns a memoryless transformation of time series from a few example input-output mappings. The algorithm searches for a smooth function that fits the training examples and, when applied to the input time series, produces a time series that evolves according to assumed dynamics. The learning procedure is fast and lends itself to a closed-form solution. It is closely related to nonlinear system identification and manifold learning techniques. We demonstrate our algorithm on the tasks of tracking RFID tags from signal strength measurements and recovering the pose of rigid objects, deformable bodies, and articulated bodies from video sequences. For these tasks, this algorithm requires significantly fewer examples than fully-supervised regression algorithms or semi-supervised learning algorithms that do not take the dynamics of the output time series into account.
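One way to write down an objective of the kind this abstract describes is sketched below. All symbols are assumptions introduced for illustration and are not taken from the paper: $\mathcal{L}$ indexes the few labeled examples $(x_i, y_i)$, $A$ encodes assumed linear dynamics, $\lVert f \rVert_{\mathcal{H}}$ is a smoothness penalty, and $\lambda_d, \lambda_s$ are trade-off weights.

```latex
\min_{f}\;
  \sum_{i \in \mathcal{L}} \lVert f(x_i) - y_i \rVert^2
  \;+\; \lambda_d \sum_{t=2}^{T} \lVert f(x_t) - A\, f(x_{t-1}) \rVert^2
  \;+\; \lambda_s \lVert f \rVert_{\mathcal{H}}^2
```

If $f$ is linear in its parameters, every term is quadratic in those parameters, so the minimiser is the solution of a linear system. That is consistent with the closed-form solution mentioned in the abstract.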
Predicting continuous conflict perception with Bayesian Gaussian processes
Conflict is one of the most important phenomena of social life, but it is still largely neglected by the computing community. This work proposes an approach
that detects common conversational social signals (loudness, overlapping speech,
etc.) and predicts the conflict level perceived by human observers in continuous,
non-categorical terms. The proposed regression approach is fully Bayesian and it
adopts Automatic Relevance Determination to identify the social signals that most influence the outcome of the prediction. The experiments are performed over the SSPNet Conflict Corpus, a publicly available collection of 1430 clips extracted from televised political debates (roughly 12 hours of material for 138 subjects in total). The results show that it is possible to achieve a correlation close to 0.8 between actual and predicted conflict perception.
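Automatic Relevance Determination is commonly realised with a kernel that has one length scale per input feature. The sketch below assumes a squared-exponential kernel; the feature names beyond those listed in the abstract and all numeric values are hypothetical, and this is not the authors' code.

```python
import math


def ard_rbf_kernel(x, y, length_scales, signal_variance=1.0):
    """Squared-exponential kernel with a separate length scale per feature."""
    sq = sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, length_scales))
    return signal_variance * math.exp(-0.5 * sq)


def rank_features(names, length_scales):
    """Rank features by relevance, taken here as the inverse squared length scale."""
    relevance = {n: 1.0 / (l * l) for n, l in zip(names, length_scales)}
    return sorted(relevance, key=relevance.get, reverse=True)


# Made-up length scales, standing in for values learned from data.
names = ["loudness", "overlapping_speech", "hypothetical_feature"]
ranking = rank_features(names, [0.5, 0.2, 50.0])
```

A feature with a very large learned length scale barely changes the kernel value, so it contributes little to the prediction. A ranking of this kind is how the most influential social signals would be read off.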
Transfer learning for multi-channel time-series Human Activity Recognition
Abstract for the PhD thesis
Methods of human activity recognition (HAR) have been developed for the purpose of automatically classifying recordings of human movements into a set of activities. Capturing, evaluating, and analysing sequential data to recognise human activities accurately is critical for many applications in pervasive and ubiquitous computing, e.g., mobile- or ambient-assisted living, smart homes, activities of daily living, health support and rehabilitation, sports, automotive surveillance, and Industry 4.0. For example, HAR is particularly interesting for optimisation in those industries where manual work remains dominant.
HAR takes as inputs signals from videos or from multi-channel time-series, e.g., human joint measurements from marker-based motion capturing systems and inertial measurements measured by wearables or on-body devices. Wearables have become relevant as they extend the potential of HAR beyond constrained or laboratory settings. This thesis focuses on HAR using multi-channel time-series.
Multi-channel time-series HAR is, in general, a challenging classification task. This is because human activities and movements show a large variation. Humans carry out semantically very distinct activities in a similar manner; conversely, they carry out similar activities in many different ways. Furthermore, multi-channel time-series HAR datasets suffer from the class-unbalance problem, with more samples of certain activities than others. This problem strongly depends on the annotation. Moreover, the definitions of human activities used for annotation are not standardised.
Methods based on Deep Neural Networks (DNNs) are prevalent for multi-channel time-series HAR. Nevertheless, the performance of DNNs has not increased as significantly as in other fields such as image classification or segmentation. DNNs present a low sample efficiency as they learn the temporal structure of activities entirely from data. Considering supervised DNNs, the scarcity of annotated data is the primary concern. Annotated data of human behaviour is scarce and costly to obtain. The annotation process demands enormous resources. Additionally, annotation reliability varies because annotations can be subject to human errors or unclear and non-elaborated annotation protocols.
Transfer learning has been used to cope with a limited amount of annotated data, overfitting, zero-shot learning or classification of unseen human activities, and the class-unbalance problem. Transfer learning can alleviate the problem of scarcity of annotated data. Learnt parameters and feature representations from a specific source domain are transferred to a target domain. Transfer learning extends the usability of large annotated data from source domains to related problems.
This thesis proposes a general transfer learning approach to improve automatic multi-channel time-series HAR. The proposed transfer learning method combines a semantic attribute representation of activities and a specific deep neural network. It handles situations where the source and target domains differ, i.e., the sensor space and the set of activities change, without needing a large amount of annotated data from the target domain.
The method considers different levels of transferability. First, an architecture handles a variety of dataset configurations with regard to the number of devices and their type; it creates fixed-size representations of sensor recordings that are representative of the human limbs. These networks process sequences of movements from the human limbs, either from poses or inertial measurements. Second, it introduces a search for semantic attribute representations that favourably represent signal segments for recognising human activities in unknown scenarios, as these scenarios only provide annotations of activities and lack human-annotated semantic attributes. Third, it covers transferability from data of a variety of source datasets. The method takes advantage of a large human-pose dataset as a source domain, which was created during the development of this thesis. Furthermore, synthetic inertial measurements are derived from sequences of human poses, either from a marker-based motion capturing system or from video-based and pose-based HAR datasets. The latter specifically use the pixel-coordinate annotations of human poses as multi-channel time-series data. Real inertial measurements and these synthetic measurements are then deployed as a source domain for parameter transfer learning.
Experimentation on different target datasets demonstrates that the proposed transfer learning method improves performance, most evidently when using only a proportion of their training material. This outcome suggests that the temporal convolutional filters are rather general as they learn local temporal relations of human movements related to the semantic attributes, independent of the number of devices and their type. A human-limb-oriented deep architecture and an evolutionary algorithm provide an off-the-shelf predictor of semantic attributes that can be deployed directly on a new target scenario. Closely related problems can be addressed directly by manually specifying the attribute-to-activity relations, without the need for a search through an evolutionary algorithm. Besides, the learnt convolutional filters are activity-class dependent. Hence, the classification performance on the activities shared among the datasets improves.
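The idea of classifying activities through manually given attribute-to-activity relations can be sketched as a nearest-neighbour lookup in attribute space. The attribute names, the activity table, and the squared-distance rule are all hypothetical choices for illustration, not the thesis's implementation.

```python
# Hypothetical attribute order: [legs_moving, arms_loaded, torso_still].
ACTIVITY_ATTRIBUTES = {
    "walking": [1, 0, 0],
    "carrying": [1, 1, 0],
    "standing": [0, 0, 1],
}


def classify_by_attributes(predicted, activity_attributes):
    """Return the activity whose attribute vector is closest to the prediction.

    `predicted` stands in for the attribute scores a network would output
    for one signal segment.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(
        activity_attributes,
        key=lambda act: squared_distance(predicted, activity_attributes[act]),
    )
```

Because only the table changes between scenarios, a new set of target activities could in principle be supported by editing the attribute-to-activity relations without retraining the attribute predictor.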