69 research outputs found
Visual Transfer Learning: Informal Introduction and Literature Overview
Transfer learning techniques are important to handle small training sets and
to allow for quick generalization even from only a few examples. The following
paper is the introduction as well as the literature overview part of my thesis
related to the topic of transfer learning for visual recognition problems. Comment: part of my PhD thesis
Detecting cyberattacks in industrial control systems using online learning algorithms
Industrial control systems are critical to the operation of industrial
facilities, especially for critical infrastructures, such as refineries, power
grids, and transportation systems. Similar to other information systems, a
significant threat to industrial control systems is the attack from
cyberspace---the offensive maneuvers launched by "anonymous" in the digital
world that target computer-based assets with the goal of compromising a
system's functions or probing for information. Owing to the importance of
industrial control systems, and the possibly devastating consequences of being
attacked, significant endeavors have been attempted to secure industrial
control systems from cyberattacks. Among them are intrusion detection systems
that serve as the first line of defense by monitoring and reporting potentially
malicious activities. Classical machine-learning-based intrusion detection
methods usually generate prediction models by learning modest-sized training
samples all at once. Such an approach is not always applicable to industrial
control systems, as industrial control systems must process continuous control
commands with limited computational resources in a nonstop way. To satisfy such
requirements, we propose using online learning to learn prediction models from
the controlling data stream. We introduce several state-of-the-art online
learning algorithms categorically, and illustrate their efficacies on two
typically used testbeds---power system and gas pipeline. Further, we explore a
new cost-sensitive online learning algorithm to solve the class-imbalance
problem that is pervasive in industrial intrusion detection systems. Our
experimental results indicate that the proposed algorithm can achieve an
overall improvement in the detection rate of cyberattacks in industrial control
systems.
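The idea of cost-sensitive online learning mentioned in this abstract can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper; it is a plain perceptron whose mistake-driven update is scaled by a per-class cost, and the names `attack_cost` and `normal_cost` are assumptions introduced only for illustration.

```python
def train_cost_sensitive_perceptron(stream, n_features,
                                    attack_cost=5.0, normal_cost=1.0):
    """Single pass over (features, label) pairs; label is +1 (attack) or -1 (normal).

    Hypothetical sketch: mistakes on the rare attack class trigger a larger
    update than mistakes on the normal class.
    """
    w = [0.0] * n_features
    b = 0.0
    for x, y in stream:
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        if y * score <= 0:  # mistake (or zero margin): update the model
            step = attack_cost if y == 1 else normal_cost
            w = [wi + step * y * xi for wi, xi in zip(w, x)]
            b += step * y
    return w, b


def predict(w, b, x):
    """Return +1 (attack) or -1 (normal) for one feature vector."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
```

Each sample is seen once and then discarded, so memory use stays constant regardless of the length of the stream, which is the property that motivates online learning for resource-limited control systems.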
Open-Category Classification by Adversarial Sample Generation
In real-world classification tasks, it is difficult to collect training
samples from all possible categories of the environment. Therefore, when an
instance of an unseen class appears in the prediction stage, a robust
classifier should be able to tell that it is from an unseen class, instead of
classifying it to be any known category. In this paper, adopting the idea of
adversarial learning, we propose the ASG framework for open-category
classification. ASG generates positive and negative samples of seen categories
in an unsupervised manner via an adversarial learning strategy. With the
generated samples, ASG then learns to tell seen from unseen in a supervised
manner. Experiments performed on several datasets show the effectiveness of
ASG. Comment: Published in IJCAI 201
Open-Ended Learning of Visual and Multi-Modal Patterns
A common trend in machine learning and pattern classification research is the exploitation of massive amounts of information in order to achieve an increase in performance. In particular, learning from huge collections of data obtained from the web, and using multiple features generated from different sources, have led to a significant boost in performance on problems that have been considered very hard for several years. In this thesis, we present two ways of using this information to build learning systems with robust performance and some degree of autonomy. These ways are Cue Integration and Cue Exploitation, and constitute the two building blocks of this thesis. In the first block, we introduce several algorithms to answer the research question of how to optimally integrate multiple features. We first present a simple online learning framework, a wrapper algorithm based on the high-level integration approach in the cue integration literature. It can be implemented with existing online learning algorithms, and preserves the theoretical properties of the algorithms being used. We then extend the Multiple Kernel Learning (MKL) framework, where each feature is converted into a kernel and the system learns the cue integration classifier by solving a joint optimization problem. To make the problem practical, we have designed two new regularization functions that make it possible to optimize the problem efficiently. This results in the first online method for MKL. We also show two algorithms to solve the batch problem of MKL. Both of them have a guaranteed convergence rate. These approaches achieve state-of-the-art performance on several standard benchmark datasets, and are an order of magnitude faster than other MKL solvers. In the second block, we present two examples of how to exploit information between different sources, in order to reduce the effort of labeling a large amount of training data.
The first example is an algorithm to learn from partially annotated data, where each data point is tagged with a few possible labels. We show that it is possible to train a face classification system from data gathered from the Internet, without any human labeling, by automatically generating lists of possible labels from the captions of the images. Another example is under the transfer learning setting. The system uses existing models from potentially correlated tasks as experts, and transfers their outputs over the new incoming samples of a new learning task, where very few labeled data are available, to boost the performance.
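The high-level cue integration wrapper described in this abstract can be sketched roughly as follows, assuming a plain perceptron as the underlying online learner. The class names `Perceptron` and `HighLevelCueIntegrator` are hypothetical, and the thesis may combine the per-cue outputs differently; this only shows the two-layer structure, with one learner per cue and one learner on top of their scores.

```python
class Perceptron:
    """Basic online linear learner without a bias term."""

    def __init__(self, dim):
        self.w = [0.0] * dim

    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, y):
        # Mistake-driven update; y is +1 or -1.
        if y * self.score(x) <= 0:
            self.w = [wi + y * xi for wi, xi in zip(self.w, x)]


class HighLevelCueIntegrator:
    """One online learner per cue, plus a top-level learner on their scores."""

    def __init__(self, cue_dims):
        self.cue_learners = [Perceptron(d) for d in cue_dims]
        self.top = Perceptron(len(cue_dims))

    def _cue_scores(self, cues):
        return [m.score(x) for m, x in zip(self.cue_learners, cues)]

    def predict(self, cues):
        return 1 if self.top.score(self._cue_scores(cues)) > 0 else -1

    def update(self, cues, y):
        # The top level is trained on the scores produced before the
        # per-cue learners see the current label.
        scores = self._cue_scores(cues)
        self.top.update(scores, y)
        for m, x in zip(self.cue_learners, cues):
            m.update(x, y)
```

Because the wrapper only calls `score` and `update`, any online learner exposing those two methods could be substituted for the perceptron.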
Deep Structured Models for Large Scale Object Co-detection and Segmentation
Structured decisions are often required for a large variety of
image and scene understanding tasks in computer vision, such as
object detection, localization and semantic
segmentation. Structured prediction deals with
learning inherent structure by incorporating contextual
information from several images and multiple tasks. However, it
is very challenging when dealing with large scale image datasets
where performance is limited by high computational costs and
expressive power of the underlying representation learning
techniques. In this thesis,
we present efficient and effective deep structured models for
context-aware object detection, co-localization and
instance-level semantic segmentation.
First, we introduce a principled formulation for object
co-detection using a fully-connected conditional random field
(CRF). We build an explicit graph whose vertices represent object
candidates (instead of pixel values) and edges encode the object
similarity via simple, yet effective pairwise potentials. More
specifically, we design a weighted mixture of Gaussian kernels
for class-specific object similarity, and formulate kernel
weights estimation as a least-squares regression problem. Its
solution can therefore be obtained in closed-form. Furthermore,
in contrast with traditional co-detection approaches, it has been
shown that inference in such fully-connected CRFs can be
performed efficiently using an approximate mean-field method with
high-dimensional Gaussian filtering. This lets us effectively
leverage information in multiple images.
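As a rough illustration of mean-field inference in a fully-connected CRF over object candidates, the sketch below uses a single Gaussian kernel and a Potts-style agreement term. It computes the pairwise messages naively in O(n^2) per iteration, whereas the thesis relies on high-dimensional Gaussian filtering for efficiency. The function names, the `weight` and `bandwidth` parameters, and the single-kernel simplification are all assumptions.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_kernel(f_i, f_j, bandwidth=1.0):
    sq = sum((a - b) ** 2 for a, b in zip(f_i, f_j))
    return math.exp(-sq / (2.0 * bandwidth ** 2))


def mean_field(unary, features, n_iters=5, weight=1.0):
    """unary[i][l] is the cost of label l for candidate i.

    Returns approximate marginals q[i][l] after n_iters parallel updates.
    """
    n, n_labels = len(unary), len(unary[0])
    # Pairwise similarities between candidates (no self-connection).
    k = [[gaussian_kernel(features[i], features[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    q = [softmax([-u for u in row]) for row in unary]
    for _ in range(n_iters):
        new_q = []
        for i in range(n):
            # Potts-style message: agreeing with similar candidates is rewarded.
            msg = [weight * sum(k[i][j] * q[j][l] for j in range(n))
                   for l in range(n_labels)]
            new_q.append(softmax([msg[l] - unary[i][l]
                                  for l in range(n_labels)]))
        q = new_q
    return q
```

With two confident candidates and one ambiguous candidate that has similar features, the ambiguous one is pulled toward the label of its neighbours, which is the effect co-detection relies on.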
Next, we extend our class-specific co-detection framework to
multiple object categories. We model object candidates with rich,
high-dimensional features learned using a deep convolutional
neural network. In particular, our max-margin and direct-loss
structural boosting algorithms enable us to learn the most
suitable features that best encode pairwise similarity
relationships within our CRF framework. Furthermore, it
guarantees that the time and space complexity is O(n t) where n
is the total number of candidate boxes in the pool and t the
number of mean-field iterations.
Moreover, our experiments demonstrate the importance of learning
rich similarity measures to account for the contextual relations
across object classes and instances. However, all these methods
are based on precomputed object candidates (or proposals), thus
localization performance is limited by the quality of
bounding-boxes.
To address this, we present an efficient object proposal
co-generation technique that leverages the collective power of
multiple images. In particular, we design a deep neural network
layer that takes unary and pairwise features as input, builds a
fully-connected CRF and produces mean-field marginals as output.
It also lets us backpropagate the gradient through the entire network
by unrolling the iterations of CRF inference. Furthermore, this
layer simplifies the end-to-end learning, thus effectively
benefiting from multiple candidates to co-generate high-quality
object proposals.
Finally, we develop a multi-task strategy to jointly learn object
detection, localization and instance-level semantic segmentation
in a single network. In particular, we introduce a novel
representation based on the distance transform of the object
masks. To this end, we design a new residual-deconvolution
architecture that infers such a representation and decodes it
into the final binary object mask. We show that the predicted
masks can go beyond the scope of the bounding boxes and that the
multiple tasks can benefit from each other.
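To make the distance-transform representation of a mask concrete, here is a brute-force sketch. It is not the thesis's architecture or its exact encoding. It simply assigns each object pixel its Euclidean distance to the nearest background pixel and recovers the binary mask by thresholding; the function names and the thresholding rule are assumptions.

```python
import math


def distance_transform(mask):
    """mask: list of rows of 0/1 values.

    Returns, for every object pixel, the Euclidean distance to the nearest
    background pixel; background pixels get 0.0.
    """
    h, w = len(mask), len(mask[0])
    background = [(r, c) for r in range(h) for c in range(w)
                  if mask[r][c] == 0]
    dist = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c] == 1:
                if background:
                    dist[r][c] = min(math.hypot(r - br, c - bc)
                                     for br, bc in background)
                else:
                    dist[r][c] = float("inf")  # mask has no background at all
    return dist


def decode_mask(dist):
    """Recover the binary mask: a pixel is object if its distance is positive."""
    return [[1 if d > 0 else 0 for d in row] for row in dist]
```

The brute-force search is quadratic in the number of pixels and is meant only for small examples.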
In summary, in this thesis, we exploit the joint power of
multiple images as well as multiple tasks to improve
generalization performance of structured learning. Our novel deep
structured models, similarity learning techniques and
residual-deconvolution architecture can be used to make accurate
and reliable inference for key vision tasks. Furthermore, our
quantitative and qualitative experiments on large scale
challenging image datasets demonstrate the superiority of the
proposed approaches over the state-of-the-art methods.
Simple but Not Simplistic: Reducing the Complexity of Machine Learning Methods
Programa Oficial de Doutoramento en Computación. 5009V01
[Resumo]
The arrival of Big Data and the explosion of the Internet of Things have posed a
great challenge for Machine Learning researchers, making the learning process
even more complex. In the real world, machine learning problems usually have
inherent complexities, such as the intrinsic characteristics of the data, the
large number of samples, the high dimensionality of the input data, changes in
distribution between the training and test sets, etc. All these aspects are
important, and require new models that can cope with these situations. In this
thesis, all these problems were addressed, seeking to simplify the machine
learning process in the current scenario. First, a complexity analysis is
carried out to observe how complexity influences the classification task, and
whether applying a prior feature selection step can reduce this complexity.
Next, the simplification of the machine learning phase is addressed through the
divide-and-conquer philosophy, using a distributed approach. We then apply the
same philosophy to the feature selection process. Finally, we opt for a
different approach following the philosophy of Edge Computing, which allows the
data produced by Internet of Things devices to be processed closer to where they
were created. The proposed approaches demonstrated their ability to reduce the
complexity of traditional machine learning methods and, therefore, the
contribution of this thesis is expected to open the door to the development of
new machine learning methods that are simpler, more robust, and more
computationally efficient.
[Resumen]
The arrival of Big Data and the explosion of the Internet of Things have posed a
great challenge for Machine Learning researchers, making the learning process
even more complex. In the real world, machine learning problems generally have
inherent complexities, such as the intrinsic characteristics of the data, the
large number of samples, the high dimensionality of the input data, changes in
distribution between the training and test sets, etc. All these aspects are
important, and require new models that can cope with these situations. In this
thesis, all these problems have been addressed, seeking to simplify the machine
learning process in the current scenario. First, a complexity analysis is
carried out to observe how complexity influences the classification task, and
whether applying a prior feature selection step can reduce this complexity.
Then, the simplification of the machine learning phase is addressed through the
divide-and-conquer philosophy, using a distributed approach. Next, we apply the
same philosophy to the feature selection process. Finally, we opt for a
different approach following the philosophy of Edge Computing, which allows the
data produced by Internet of Things devices to be processed closer to where they
were created. The proposed approaches have demonstrated their ability to reduce
the complexity of traditional machine learning methods and, therefore, the
contribution of this thesis is expected to open the door to the development of
new machine learning methods that are simpler, more robust, and more
computationally efficient.
[Abstract]
The advent of Big Data and the explosion of the Internet of Things have brought
unprecedented challenges to Machine Learning researchers, making the learning task
more complex. Real-world machine learning problems usually have inherent complexities,
such as the intrinsic characteristics of the data, large number of instances,
high input dimensionality, dataset shift, etc. All these aspects matter, and call
for new models that can confront these situations. Thus, in this thesis, we have
addressed all these issues, simplifying the machine learning process in the current
scenario. First, we carry out a complexity analysis to see how it influences the
classification models, and if it is possible that feature selection might result in a
decrease of that complexity. Then, we address the process of simplifying learning
with the divide-and-conquer philosophy of the distributed approach. Later, we aim
to reduce the complexity of the feature selection preprocessing through the same
philosophy. Finally, we opt for a different approach following the current philosophy
of Edge Computing, which allows the data produced by Internet of Things devices
to be processed closer to where they were created. The proposed approaches have
demonstrated their capability to reduce the complexity of traditional machine learning
algorithms, and thus it is expected that the contribution of this thesis will open
the doors to the development of new machine learning methods that are simpler,
more robust, and more computationally efficient.
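The divide-and-conquer, distributed approach to learning mentioned in this abstract can be sketched in a few lines. The nearest-centroid base learner, the round-robin partitioning, and the majority-vote combination are assumptions chosen for brevity, not the methods used in the thesis.

```python
from collections import Counter


def train_centroids(samples):
    """samples: list of (features, label). Returns {label: centroid}."""
    sums, counts = {}, Counter()
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_centroids(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda y: sum((a - b) ** 2
                                        for a, b in zip(model[y], x)))


def train_distributed(samples, n_partitions):
    """Split the data round-robin and train one local model per partition."""
    shards = [samples[i::n_partitions] for i in range(n_partitions)]
    return [train_centroids(s) for s in shards if s]


def predict_distributed(models, x):
    """Combine the local models by majority vote."""
    votes = Counter(predict_centroids(m, x) for m in models)
    return votes.most_common(1)[0][0]
```

Each local model only ever sees its own shard, so training could run on separate nodes, with only the small per-class centroids exchanged at prediction time.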