Evaluation of Output Embeddings for Fine-Grained Image Classification
Image classification has advanced significantly in recent years with the
availability of large-scale image sets. However, fine-grained classification
remains a major challenge due to the annotation cost of large numbers of
fine-grained categories. This project shows that compelling classification
performance can be achieved on such categories even without labeled training
data. Given image and class embeddings, we learn a compatibility function such
that matching embeddings are assigned a higher score than mismatching ones;
zero-shot classification of an image proceeds by finding the label yielding the
highest joint compatibility score. We use state-of-the-art image features and
focus on different supervised attributes and unsupervised output embeddings
either derived from hierarchies or learned from unlabeled text corpora. We
establish a substantially improved state-of-the-art on the Animals with
Attributes and Caltech-UCSD Birds datasets. Most encouragingly, we demonstrate
that purely unsupervised output embeddings (learned from Wikipedia and improved
with fine-grained text) achieve compelling results, even outperforming the
previous supervised state-of-the-art. By combining different output embeddings,
we further improve results. Comment: @inproceedings {ARWLS15, title = {Evaluation of Output Embeddings for
Fine-Grained Image Classification}, booktitle = {IEEE Computer Vision and
Pattern Recognition}, year = {2015}, author = {Zeynep Akata and Scott Reed
and Daniel Walter and Honglak Lee and Bernt Schiele}
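The zero-shot prediction rule described in the abstract above (score every class embedding against the image embedding, then pick the highest-scoring label) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the bilinear form of the compatibility function, the names `theta_x`, `W` and `class_embeddings`, and the toy values are assumptions made for the example, and learning `W` is not shown.

```python
import numpy as np


def compatibility(theta_x, W, phi_y):
    """Assumed bilinear compatibility score: theta(x)^T W phi(y)."""
    return float(theta_x @ W @ phi_y)


def zero_shot_predict(theta_x, W, class_embeddings):
    """Return the index of the class whose embedding scores highest.

    theta_x:          image embedding, shape (d,)
    W:                compatibility matrix, shape (d, e)
    class_embeddings: one output embedding per class, shape (C, e)
    """
    # Project the image into the class-embedding space once,
    # then score all classes with a single matrix-vector product.
    projected = W.T @ theta_x
    scores = class_embeddings @ projected
    return int(np.argmax(scores))


# Toy usage: two unseen classes described only by their embeddings.
W = np.eye(2)
image = np.array([1.0, 0.0])
classes = np.array([[0.0, 1.0],   # class 0
                    [1.0, 0.0]])  # class 1
predicted = zero_shot_predict(image, W, classes)
```

Because a class is represented only by its output embedding, adding an unseen class amounts to appending a row to `class_embeddings`; no labeled images of that class are needed at prediction time.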
A knowledge-light approach to personalised and open-ended human activity recognition.
Human Activity Recognition (HAR) is a core component of clinical decision support systems that rely on activity monitoring for self-management of chronic conditions such as Musculoskeletal Disorders. The deployment success of such applications depends in part on their ability to adapt to individual variations in human movement and to accommodate a range of human activity classes. Research in personalised HAR aims to learn models that are sensitive to the subtle nuances in human movement, whilst Open-ended HAR learns models that can recognise activity classes outside the pre-defined set available at training. Current approaches to personalised HAR impose a data collection burden on the end user, whilst Open-ended HAR algorithms are heavily reliant on intermediary-level class descriptions. Instead of these 'knowledge-intensive' HAR algorithms, in this article we propose a 'knowledge-light' method. Specifically, we show how, by using a few seconds of raw sensor data obtained through micro-interactions with the end-user, we can effectively personalise HAR models and transfer recognition functionality to new activities with zero re-training of the model after deployment. We introduce a Personalised Open-ended HAR algorithm, MNZ, a user-context-aware Matching Network architecture, and evaluate it on 3 HAR data sources. Performance results show up to 48.9% improvement with personalisation and up to 18.3% improvement compared to the most common 'knowledge-intensive' Open-ended HAR algorithms.
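The idea of recognising a new activity from a few seconds of user-provided data, with no re-training, can be illustrated with a generic matching-style classifier. This is a hedged sketch and not the MNZ algorithm: the function name `matching_predict`, the use of cosine similarity with softmax attention, and the toy two-dimensional "embeddings" are all assumptions made for the example.

```python
import numpy as np


def matching_predict(query, support_x, support_y):
    """Label a query by attention-weighted voting over a small support set."""
    q = query / np.linalg.norm(query)
    s = support_x / np.linalg.norm(support_x, axis=1, keepdims=True)
    sims = s @ q                          # cosine similarity to each support example
    weights = np.exp(sims - sims.max())   # numerically stable softmax
    weights /= weights.sum()
    votes = {}
    for w, label in zip(weights, support_y):
        votes[label] = votes.get(label, 0.0) + float(w)
    return max(votes, key=votes.get)


# Support set collected from one user (hypothetical embeddings).
support_x = np.array([[1.0, 0.0], [0.0, 1.0]])
support_y = ["walk", "sit"]
first = matching_predict(np.array([0.9, 0.1]), support_x, support_y)

# A new activity is added by appending one example; nothing is re-trained.
support_x = np.vstack([support_x, [-1.0, 0.0]])
support_y = support_y + ["run"]
second = matching_predict(np.array([-0.8, 0.1]), support_x, support_y)
```

In this kind of design the classifier's knowledge of classes lives in the support set rather than in the model weights, which is what makes both personalisation and open-ended extension possible after deployment.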
Dynamic Rule Covering Classification in Data Mining with Cyber Security Phishing Application
Data mining is the process of discovering useful patterns in datasets using intelligent techniques to help users make decisions. A typical data mining task is classification, which involves predicting a target variable, known as the class, in previously unseen data based on models learnt from an input dataset. Covering is a well-known classification approach that derives models consisting of If-Then rules. Covering methods, such as PRISM, offer predictive performance competitive with other classical classification techniques such as greedy, decision tree and associative classification. Covering models are therefore appropriate decision-making tools, and users favour them when making decisions.
Despite the use of the Covering approach in data processing for different classification applications, it is acknowledged that this approach suffers from the noticeable drawback of inducing massive numbers of rules, making the resulting model large and unmanageable by users. This issue is attributed to the way Covering techniques induce rules: they keep adding items to the rule's body, despite the limited data coverage (the number of training instances that the rule classifies), until the rule has zero error. This excessive learning overfits the training dataset and also limits the applicability of Covering models in decision making, because managers normally prefer a summarised set of knowledge that they are able to control and comprehend rather than high-maintenance models. In practice, there should be a trade-off between the number of rules offered by a classification model and its predictive performance. Another issue associated with Covering models is the overlapping of training data among the rules, which happens when a rule's classified data are discarded during the rule discovery phase. Unfortunately, the impact of a rule's removed data on other potential rules is not considered by this approach. However, when the training data linked with a rule are removed, both the frequency and the rank of other rules' items that appeared in the removed data change. The impacted rules should maintain their true rank and frequency dynamically during the rule discovery phase rather than keeping the initial frequency computed from the original input dataset.
In response to these issues, a new dynamic learning technique based on Covering and rule induction, which we call Enhanced Dynamic Rule Induction (eDRI), is developed. eDRI has been implemented in Java and embedded in the WEKA machine learning tool. The developed algorithm incrementally discovers rules using primarily frequency and rule strength thresholds. In practice, these thresholds limit the search space for both items and potential rules by discarding any with insufficient data representation as early as possible, resulting in an efficient training phase. More importantly, eDRI substantially cuts down the number of scans of the training examples by continuously updating the frequency and strength parameters of potential rules whenever a rule is inserted into the classifier. In particular, for each derived rule, eDRI adjusts on the fly the frequencies and ranks of the remaining potential rules' items, specifically those that appeared within the deleted training instances of the derived rule. This gives a more realistic model with minimal rule redundancy, and makes the process of rule induction efficient and dynamic rather than static. Moreover, the proposed technique minimises the classifier's number of rules at preliminary stages by stopping learning when any rule does not meet the rule strength threshold, thereby minimising overfitting and ensuring a manageable classifier. Lastly, the eDRI prediction procedure not only prioritises the best-ranked rule for class forecasting of test data but also restricts the use of the default class rule, thus reducing the number of misclassifications.
These improvements guarantee smaller classification models that do not overfit the training dataset while maintaining their predictive performance. The eDRI-derived models particularly benefit users taking key business decisions, since they can provide a rich knowledge base to support their decision making. This is because these models are highly accurate, easy to understand, controllable and robust, i.e. they can be amended without drastic change. The applicability of eDRI has been evaluated on the hard problem of phishing detection. Phishing normally involves creating a well-designed fake website that closely resembles the website of an existing trusted business, aiming to trick users and illegally obtain their credentials, such as login information, in order to access their financial assets. The experimental results on large phishing datasets revealed that eDRI is highly useful as an anti-phishing tool, since it derived models of manageable size when compared with other traditional techniques without hindering classification performance. Further evaluation results on several other classification datasets from different domains, obtained from the University of California Data Repository, have corroborated eDRI's competitive performance with respect to accuracy, number of knowledge representations, training time and item space reduction. This makes the proposed technique not only efficient in inducing rules but also effective.
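The rule discovery loop described in the abstract above (thresholds on frequency and rule strength, removal of covered instances, and recounting item frequencies on the remaining data) can be sketched roughly as follows. This is a simplified illustration written for this listing, not the eDRI implementation in WEKA: the function names, the default threshold values and the toy phishing features are all assumptions.

```python
from collections import Counter


def covers(conds, features):
    """True if every condition in the rule body matches the instance."""
    return all(features.get(a) == v for a, v in conds.items())


def grow_rule(rows, min_freq, min_strength):
    """Grow one rule on the current rows, or return None if none qualifies."""
    conds, target, covered = {}, None, rows
    while True:
        counts, totals = Counter(), Counter()
        for feats, label in covered:
            for a, v in feats.items():
                if a in conds:
                    continue
                totals[(a, v)] += 1
                if target is None or label == target:
                    counts[(a, v, label)] += 1
        # Discard items with insufficient data representation early.
        candidates = [(counts[k] / totals[k[:2]], counts[k], k)
                      for k in counts if counts[k] >= min_freq]
        if not candidates:
            return None  # stop: no rule can reach the strength threshold
        strength, _, (a, v, label) = max(candidates, key=lambda c: (c[0], c[1]))
        conds[a], target = v, label
        covered = [r for r in covered if r[0].get(a) == v]
        if strength >= min_strength:
            return conds, target


def learn_rules(rows, min_freq=2, min_strength=0.8):
    """Covering loop: frequencies are recounted on the remaining rows only."""
    remaining, rules = list(rows), []
    while remaining:
        rule = grow_rule(remaining, min_freq, min_strength)
        if rule is None:
            break
        rules.append(rule)
        remaining = [r for r in remaining if not covers(rule[0], r[0])]
    return rules


# Hypothetical phishing features, for illustration only.
data = [
    ({"https": "no", "long_url": "yes"}, "phish"),
    ({"https": "no", "long_url": "no"}, "phish"),
    ({"https": "no", "long_url": "yes"}, "phish"),
    ({"https": "yes", "long_url": "no"}, "legit"),
    ({"https": "yes", "long_url": "no"}, "legit"),
    ({"https": "yes", "long_url": "yes"}, "legit"),
]
rules = learn_rules(data)
```

Because `grow_rule` always counts on the rows still uncovered, item frequencies reflect the data that remain after each rule is added, rather than the frequencies computed once from the original dataset. A production implementation would update the counts incrementally instead of recounting.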
A Wholistic View of Continual Learning with Deep Neural Networks: Forgotten Lessons and the Bridge to Active and Open World Learning
Current deep learning research is dominated by benchmark evaluation. A method
is regarded as favorable if it empirically performs well on the dedicated test
set. This mentality is seamlessly reflected in the resurfacing area of
continual learning, where consecutively arriving sets of benchmark data are
investigated. The core challenge is framed as protecting previously acquired
representations from being catastrophically forgotten due to the iterative
parameter updates. However, comparison of individual methods is nevertheless
treated in isolation from real world application and typically judged by
monitoring accumulated test set performance. The closed world assumption
remains predominant. It is assumed that during deployment a model is guaranteed
to encounter data that stems from the same distribution as used for training.
This poses a massive challenge as neural networks are well known to provide
overconfident false predictions on unknown instances and break down in the face
of corrupted data. In this work we argue that notable lessons from open set
recognition, the identification of statistically deviating data outside of the
observed dataset, and the adjacent field of active learning, where data is
incrementally queried such that the expected performance gain is maximized, are
frequently overlooked in the deep learning era. Based on these forgotten
lessons, we propose a consolidated view to bridge continual learning, active
learning and open set recognition in deep neural networks. Our results show
that this not only benefits each individual paradigm, but highlights the
natural synergies in a common framework. We empirically demonstrate
improvements when alleviating catastrophic forgetting, querying data in active
learning, selecting task orders, while exhibiting robust open world application
where previously proposed methods fail. Comment: 32 pages
LC an effective classification based association rule mining algorithm
Classification using association rules is a research field in data mining that primarily applies association rule discovery techniques to classification benchmarks. Many research studies in the literature have confirmed that classification using association tends to generate more predictive classification systems than traditional classification techniques such as probabilistic, statistical and decision tree approaches. In this thesis, we introduce a novel data mining algorithm based on classification using association, called "Looking at the Class" (LC), which can be used for mining a range of classification data sets. Known algorithms in the classification using association approach, such as the Classification based on Association rule (CBA) system and the Classification based on Predictive Association (CPAR) system, merge disjoint items in the rule learning step without considering class label similarity. In contrast, the proposed algorithm merges only items with identical class labels. This avoids many unnecessary item combinations during the rule learning step, and consequently results in large savings in computational time and memory.
Furthermore, the LC algorithm uses a novel prediction procedure that employs multiple rules, instead of a single rule, to make the prediction decision. The proposed algorithm has been evaluated thoroughly on real-world security data sets collected using an automated tool developed at Huddersfield University. The security application considered in this thesis is categorizing websites as legitimate or fake based on their features, which is a typical binary classification problem. Experiments have also been conducted on a number of UCI data sets, and the measures used for evaluation are classification accuracy, memory usage, and others. The results show that the LC algorithm outperformed traditional classification algorithms such as C4.5, PART and Naïve Bayes, as well as known classification based association algorithms like CBA, with respect to classification accuracy, memory usage, and execution time on most of the data sets we consider.
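The candidate-pruning idea in the abstract above (merge only items that share a class label) can be illustrated with a short sketch. It is not the LC algorithm itself: the function name `candidate_pairs`, the `same_class_only` flag, the mapping from each frequent item to a single class label, and the website features are assumptions made for the example.

```python
from itertools import combinations


def candidate_pairs(item_class, same_class_only=True):
    """Build 2-item candidates from frequent single items.

    item_class maps each frequent item to the class label it is associated with.
    With same_class_only=True, items are merged only when their labels agree.
    """
    pairs = []
    for a, b in combinations(sorted(item_class), 2):
        if same_class_only and item_class[a] != item_class[b]:
            continue  # skip merges that mix class labels
        pairs.append((a, b))
    return pairs


# Hypothetical website features and their associated classes.
items = {
    "has_ip": "fake",
    "long_url": "fake",
    "https": "legit",
    "old_domain": "legit",
}
pruned = candidate_pairs(items)
unpruned = candidate_pairs(items, same_class_only=False)
```

On this toy input the class-aware merge keeps 2 of the 6 possible pairs, which shows where the savings in time and memory described above would come from.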
Learning New Classes from Limited Data in Image Segmentation and Object Detection
The abstract is in the attachment.
Learning from limited labeled data - Zero-Shot and Few-Shot Learning
Human beings have the remarkable ability to recognize novel visual concepts after observing only few or zero examples of them. Deep learning, however, often requires a large amount of labeled data to achieve a good performance. Labeled instances are expensive, difficult and even infeasible to obtain because the distribution of training instances among labels naturally exhibits a long tail. Therefore, it is of great interest to investigate how to learn efficiently from limited labeled data.
This thesis concerns an important subfield of learning from limited labeled data, namely low-shot learning. The setting assumes the availability of many labeled examples from known classes, and the goal is to learn novel classes from only a few (few-shot learning) or zero (zero-shot learning) training examples of them. To this end, we have developed a series of multi-modal learning approaches to facilitate knowledge transfer from known classes to novel classes for a wide range of visual recognition tasks, including image classification, semantic image segmentation and video action recognition. More specifically, this thesis makes the following main contributions. First, as there is no agreed-upon zero-shot image classification benchmark, we define a new benchmark by unifying both the evaluation protocols and the data splits of publicly available datasets. Second, in order to tackle labeled data scarcity, we propose feature generation frameworks that synthesize data in the visual feature space for novel classes. Third, we extend zero-shot learning and few-shot learning to the semantic segmentation task and propose a challenging benchmark for it. We show that incorporating semantic information into a semantic segmentation network is effective in segmenting novel classes. Finally, we develop a better video representation for the few-shot video classification task and leverage weakly-labeled videos by means of an efficient retrieval method.
Max Planck Institute Informatics