Discriminatively Trained Latent Ordinal Model for Video Classification
We study the problem of video classification for facial analysis and human
action recognition. We propose a novel weakly supervised learning method that
models the video as a sequence of automatically mined, discriminative
sub-events (e.g., onset and offset phase for "smile", running and jumping for
"highjump"). The proposed model is inspired by the recent works on Multiple
Instance Learning and latent SVM/HCRF -- it extends such frameworks to model
the ordinal aspect in the videos, approximately. We obtain consistent
improvements over relevant competitive baselines on four challenging and
publicly available video based facial analysis datasets for prediction of
expression, clinical pain and intent in dyadic conversations and on three
challenging human action datasets. We also validate the method with qualitative
results and show that they largely support the intuitions behind the method.
Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text
overlap with arXiv:1604.0150
Semi-WTC: A Practical Semi-supervised Framework for Attack Categorization through Weight-Task Consistency
Supervised learning has been widely used for attack categorization, requiring
high-quality data and labels. However, the data is often imbalanced and it is
difficult to obtain sufficient annotations. Moreover, supervised models are
subject to real-world deployment issues, such as defending against unseen
artificial attacks. To tackle these challenges, we propose a semi-supervised
fine-grained attack categorization framework consisting of an encoder and a
two-branch structure; the framework can be generalized to different
supervised models. The multilayer perceptron with residual connection is used
as the encoder to extract features and reduce the complexity. The Recurrent
Prototype Module (RPM) is proposed to train the encoder effectively in a
semi-supervised manner. To alleviate the data imbalance problem, we introduce
the Weight-Task Consistency (WTC) into the iterative process of RPM by
assigning larger weights to classes with fewer samples in the loss function. In
addition, to cope with new attacks in real-world deployment, we propose an
Active Adaption Resampling (AAR) method, which can better discover the
distribution of unseen sample data and adapt the parameters of the encoder.
Experimental results show that our model outperforms the state-of-the-art
semi-supervised attack detection methods with a 3% improvement in
classification accuracy and a 90% reduction in training time.
Comment: Tech repor
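The Semi-WTC abstract above describes assigning larger loss weights to classes with fewer samples but does not give the formula. As a minimal sketch of that general idea, assuming a plain inverse-frequency heuristic (the function name and the normalization are illustrative choices, not the paper's WTC definition):

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one loss weight per class; rarer classes get larger weights.

    Assumed heuristic: weight[c] = n_samples / (n_classes * count[c]),
    so a perfectly balanced dataset gives every class a weight of 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


if __name__ == "__main__":
    # Hypothetical imbalanced label set: 8 samples of one class, 2 of another.
    labels = ["dos"] * 8 + ["probe"] * 2
    print(inverse_frequency_weights(labels))
```

Such weights would typically multiply the per-sample loss term of the corresponding class during training.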
The iNaturalist Species Classification and Detection Dataset
Existing image classification datasets used in computer vision tend to have a
uniform distribution of images across object categories. In contrast, the
natural world is heavily imbalanced, as some species are more abundant and
easier to photograph than others. To encourage further progress in challenging
real world conditions we present the iNaturalist species classification and
detection dataset, consisting of 859,000 images from over 5,000 different
species of plants and animals. It features visually similar species, captured
in a wide variety of situations, from all over the world. Images were collected
with different camera types, have varying image quality, feature a large class
imbalance, and have been verified by multiple citizen scientists. We discuss
the collection of the dataset and present extensive baseline experiments using
state-of-the-art computer vision classification and detection models. Results
show that current non-ensemble-based methods achieve only 67% top-one
classification accuracy, illustrating the difficulty of the dataset.
Specifically, we observe poor results for classes with small numbers of
training examples, suggesting more attention is needed in low-shot learning.
Comment: CVPR 201
Multiclass insect counting through deep learning-based density maps estimation
The use of digital technologies and artificial intelligence techniques to automate some visual assessment processes in agriculture is now a reality. Image-based and, more recently, deep learning-based systems are being used in several applications. The main challenge for these applications is to perform well in real field conditions on images that are usually acquired with mobile devices and are therefore of limited quality. Pest control is one such problem in the field. Pest management strategies rely on identifying the level of infestation. Until now, this level has been established through a counting task performed manually by the field researcher.
Current models could not count appropriately because of the small size of the insects, and last year we presented a density-map-based algorithm that outperformed state-of-the-art methods for a single insect type. In this paper, we extend that work to a multiclass and multi-stadia approach. Concretely, the proposed algorithm has been tested in two use cases: on the one hand, it counts five different types of adult individuals over multiple crop leaves; on the other hand, it identifies four different stages of immatures over 2-cm leaf disks. In these leaf disks, some of the species are in different stadia, some of them micron-sized and difficult to identify even for the non-expert user.
The proposed method achieves good results in both cases. The model for counting adult insects on a leaf achieves an RMSE ranging from 0.89 to 4.47, an MAE ranging from 0.40 to 2.15, and an R2 ranging from 0.86 to 0.91 for 4 different species in their adult phase (BEMITA, FRANOC, MYZUPE and APHIGO) that may appear together on the same leaf. In addition, for FRANOC, two stadia, nymphs and adults, are considered. The model developed for counting BEMITA immatures in 2-cm disks obtains R2 values up to 0.98 for big nymphs. This solution was packaged in a Docker container and can be accessed from mobile devices through an app via a REST service. It has been tested in the wild under real conditions in different locations worldwide and over 14 different crops.
The authors would like to thank all the field researchers who generated the dataset, carried out the annotation process, performed the validation in the wild, and in general supported the work at Tecnalia and BASF, especially Javier Romero, Carlos Javier Jiménez, Amaia Ortiz, Aitor Alvarez and Jone Echazarra.
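The insect-counting abstract above reports RMSE, MAE and R2 for predicted counts. As a hedged illustration of how these three metrics are commonly computed from per-image true and predicted counts, assuming R2 is the coefficient of determination (the abstract does not say whether it is that or a squared correlation), with made-up example counts:

```python
import math


def count_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R2) for paired true and predicted counts."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                    # coefficient of determination
    return rmse, mae, r2


if __name__ == "__main__":
    # Hypothetical counts for four images; not data from the paper.
    print(count_metrics([2, 4, 6, 8], [3, 4, 5, 8]))
```

RMSE penalizes large miscounts more heavily than MAE, which is one reason counting papers often report both.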
Deep learning methods for knowledge base population
Knowledge bases store structured information about entities or concepts of the world and can be used in various applications, such as information retrieval or question answering. A major drawback of existing knowledge bases is their incompleteness. In this thesis, we explore deep learning methods for automatically populating them from text, addressing the following tasks: slot filling, uncertainty detection and type-aware relation extraction.
Slot filling aims at extracting information about entities from a large text corpus. The Text Analysis Conference yearly provides new evaluation data in the context of an international shared task. We develop a modular system to address this challenge. It was one of the top-ranked systems in the shared task evaluations in 2015. For its slot filler classification module, we propose contextCNN, a convolutional neural network based on context splitting. It improves the performance of the slot filling system by 5.0% micro and 2.9% macro F1. To train our binary and multiclass classification models, we create a dataset using distant supervision and reduce the number of noisy labels with a self-training strategy. For model optimization and evaluation, we automatically extract a labeled benchmark for slot filler classification from the manual shared task assessments from 2012-2014. We show that results on this benchmark are correlated with slot filling pipeline results with a Pearson's correlation coefficient of 0.89 (0.82) on data from 2013 (2014). The combination of patterns, support vector machines and contextCNN achieves the best results on the benchmark with a micro (macro) F1 of 51% (53%) on test. Finally, we analyze the results of the slot filling pipeline and the impact of its components.
For knowledge base population, it is essential to assess the factuality of the statements extracted from text. From the sentence "Obama was rumored to be born in Kenya", a system should not conclude that Kenya is the place of birth of Obama. Therefore, we address uncertainty detection in the second part of this thesis. We investigate attention-based models and make a first attempt to systematize the attention design space. Moreover, we propose novel attention variants: external attention, which incorporates an external knowledge source; k-max average attention, which only considers the vectors with the k maximum attention weights; and sequence-preserving attention, which maintains order information. Our convolutional neural network with external k-max average attention sets a new state of the art on a Wikipedia benchmark dataset with an F1 score of 68%. To the best of our knowledge, we are the first to integrate an uncertainty detection component into a slot filling pipeline. It improves precision by 1.4% and micro F1 by 0.4%.
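The abstract describes k-max average attention only as considering the vectors with the k maximum attention weights. A minimal sketch of one way to read that, assuming softmax-normalized weights and a weighted sum divided by k (both are assumptions; the thesis may define the scoring and averaging differently):

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def k_max_average_attention(vectors, scores, k):
    """Average the k vectors with the largest attention weights.

    Returns the pooled vector and the selected positions in sequence order.
    Assumption: each selected vector is scaled by its weight before averaging.
    """
    weights = softmax(scores)
    top = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]
    dim = len(vectors[0])
    pooled = [0.0] * dim
    for i in top:
        for d in range(dim):
            pooled[d] += weights[i] * vectors[i][d] / k
    return pooled, sorted(top)


if __name__ == "__main__":
    # Hypothetical 2-dimensional hidden states and raw attention scores.
    vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(k_max_average_attention(vecs, [0.0, 2.0, 1.0], k=2))
```

With k equal to the sequence length, this reduces to ordinary attention pooling scaled by 1/k.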
In the last part of the thesis, we investigate type-aware relation extraction with neural networks. We compare different models for joint entity and relation classification: pipeline models, jointly trained models and globally normalized models based on structured prediction. First, we show that using entity class prediction scores instead of binary decisions helps relation classification. Second, joint training clearly outperforms pipeline models on a large-scale distantly supervised dataset with fine-grained entity classes. It improves the area under the precision-recall curve from 0.53 to 0.66. Third, we propose a model with a structured prediction output layer, which globally normalizes the score of a triple consisting of the classes of two entities and the relation between them. It improves relation extraction results by 4.4% F1 on a manually labeled benchmark dataset. Our analysis shows that the model learns correct correlations between entity and relation classes. Finally, we are the first to use neural networks for joint entity and relation classification in a slot filling pipeline. The jointly trained model achieves the best micro F1 score (22%), while the neural structured prediction model performs best in terms of macro F1 (25%).
Collaborative-demographic hybrid for financial: product recommendation
Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
Due to the increased availability of mature data mining and analysis technologies supporting CRM
processes, several financial institutions are striving to leverage customer data and integrate insights
regarding customer behaviour, needs, and preferences into their marketing approach. As decision
support systems assisting marketing and commercial efforts, Recommender Systems applied to the
financial domain have been gaining increased attention. This thesis studies a Collaborative-
Demographic Hybrid Recommendation System, applied to the financial services sector, based on real
data provided by a Portuguese private commercial bank. This work establishes a framework to support
account managers’ advice on which financial product is most suitable for each of the bank’s corporate
clients. The recommendation problem is further developed by comparing the performance
of multi-output regression and multiclass classification prediction approaches. Experimental
results indicate that multiclass architectures are better suited for the prediction task, outperforming
alternative multi-output regression models on the evaluation metrics considered. In addition, multiclass
Feed-Forward Neural Networks, combined with Recursive Feature Elimination, are identified as the top-performing
algorithm, yielding a 10-fold cross-validated F1 Measure of 83.16% and achieving
corresponding Precision and Recall values of 84.34% and 85.29%, respectively. Overall, this study
provides important contributions for positioning the bank’s commercial efforts around customers’
future requirements. By allowing for a better understanding of customers’ needs and preferences, the
proposed Recommender allows for more personalized and targeted marketing contacts, leading to
higher conversion rates, corporate profitability, and customer satisfaction and loyalty.