745 research outputs found
No More Pesky Learning Rates
The performance of stochastic gradient descent (SGD) depends critically on
how learning rates are tuned and decreased over time. We propose a method to
automatically adjust multiple learning rates so as to minimize the expected
error at any one time. The method relies on local gradient variations across
samples. In our approach, learning rates can increase as well as decrease,
making it suitable for non-stationary problems. Using a number of convex and
non-convex learning tasks, we show that the resulting algorithm matches the
performance of SGD or other adaptive approaches with their best settings
obtained through systematic search, and effectively removes the need for
learning rate tuning.
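The variance-based adaptation described above can be sketched roughly as follows. This is a simplified illustration, not the paper's exact vSGD rule (which also tracks a curvature estimate); all names are ours:

```python
import numpy as np

def adaptive_sgd_step(theta, grad_fn, g_avg, g2_avg, tau=10.0):
    """One SGD step with a per-parameter rate derived from gradient
    statistics. Simplified sketch of the variance-based idea: the full
    method also maintains a diagonal curvature estimate, omitted here."""
    g = grad_fn(theta)
    # exponential moving averages of the gradient and squared gradient
    g_avg = g_avg + (g - g_avg) / tau
    g2_avg = g2_avg + (g * g - g2_avg) / tau
    # rate is near 1 when gradients agree across samples (signal >> noise)
    # and shrinks automatically when they are mostly noise; it can both
    # grow and decay, so no manual schedule is needed
    lr = g_avg ** 2 / (g2_avg + 1e-12)
    return theta - lr * g, g_avg, g2_avg
```

On a noisy quadratic, the rate grows while the gradient signal dominates and collapses once the iterate reaches the noise floor, mirroring the "increase as well as decrease" behaviour claimed in the abstract.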
Learning Learning Algorithms
Machine learning models rely on data to learn a given task, and depending on the diversity of the task's elements and the design objectives, large amounts of data may be required for good performance, which in turn can sharply increase learning time and computational cost. Although most machine learning models today are trained on GPUs (graphics processing units) to speed up training, many still require a huge amount of training time, depending on the dataset, to attain good performance.
This study looks into learning learning algorithms, popularly known as metalearning: methods that aim not only to improve learning speed but also model performance, while requiring less data and spanning multiple tasks. The concept involves training a model that constantly learns to learn novel tasks at a fast rate from previously learned tasks.
The review of related work focuses on optimization-based methods, and more precisely on MAML (Model Agnostic MetaLearning), for two reasons: first, it is one of the most popular state-of-the-art metalearning methods; second, this thesis focuses on creating a MAML-based method, called MAML-DBL, that uses an adaptive learning rate technique with dynamic bounds, enabling quick convergence at the beginning of the training process and good generalization towards the end.
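The MAML structure the thesis builds on can be sketched in a few lines. This is a first-order illustration under toy assumptions (the `Task` class and quadratic losses are ours, not from the thesis):

```python
import numpy as np

class Task:
    """Toy task: minimize 0.5 * ||theta - target||^2 (illustrative stand-in
    for a few-shot task with support/query splits)."""
    def __init__(self, target):
        self.target = np.asarray(target, dtype=float)
    def support_grad(self, theta):   # gradient on the support set
        return theta - self.target
    def query_grad(self, theta):     # gradient on the query set
        return theta - self.target

def maml_meta_step(theta, tasks, inner_lr=0.01, meta_lr=0.001):
    """One first-order MAML meta-update (sketch): adapt to each task with
    a single inner gradient step, then move the shared initialization
    toward parameters that adapt well. Second-order terms are dropped."""
    meta_grad = np.zeros_like(theta)
    for task in tasks:
        adapted = theta - inner_lr * task.support_grad(theta)  # inner loop
        meta_grad += task.query_grad(adapted)                  # outer-loss grad
    return theta - meta_lr * meta_grad / len(tasks)
```

The inner loop produces task-specific parameters; the outer update pulls the shared initialization toward a point from which all tasks adapt quickly.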
The proposed MAML variant aims to prevent learning rates from vanishing during training and from slowing down towards the end, where dense features are prevalent, although further hyperparameter tuning may be necessary for improved performance on some models or where sparse features are prevalent.
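The "dynamic bounds" idea can be sketched in AdaBound style: clip an adaptive per-parameter rate into bounds that start wide and tighten toward a single rate. The constants and bound schedule below are illustrative assumptions, not MAML-DBL's exact rule:

```python
import numpy as np

def bounded_rate(g2_avg, step, base_lr=0.01, final_lr=0.1, gamma=1e-3):
    """Adaptive per-parameter rate clipped into dynamic bounds that start
    wide (fully adaptive, fast early convergence) and tighten toward a
    single SGD-like rate (better late-stage generalization)."""
    raw = base_lr / (np.sqrt(g2_avg) + 1e-8)             # Adam-like adaptive rate
    lower = final_lr * (1.0 - 1.0 / (gamma * step + 1.0))
    upper = final_lr * (1.0 + 1.0 / (gamma * step))
    return np.clip(raw, lower, upper)                    # rate cannot vanish
```

Early in training the clip is effectively inactive and the adaptive rate acts freely; as `step` grows, both bounds converge to `final_lr`, which is what prevents the rate from vanishing late in training.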
MAML-DBL and MAML were tested on the datasets most commonly used for metalearning models. Based on the experimental results, the proposed method showed competitive performance on some of the models and even outperformed the baseline in some of the tests carried out.
The results obtained with both MAML-DBL (on one of the datasets) and MAML show that metalearning methods are highly recommendable solutions whenever good performance, less data, and a multi-task or versatile model are required or desired.
Metaheuristic design of feedforward neural networks: a review of two decades of research
Over the past two decades, feedforward neural network (FNN) optimization has been a key interest among researchers and practitioners across multiple disciplines. FNN optimization is often viewed from various perspectives: the optimization of weights, network architecture, activation nodes, learning parameters, learning environment, etc. Researchers have adopted such different viewpoints mainly to improve the FNN's generalization ability. Gradient-descent algorithms such as backpropagation have been widely applied to optimize FNNs; their success is evident from the FNN's application to numerous real-world problems. However, due to the limitations of gradient-based optimization methods, metaheuristic algorithms, including evolutionary algorithms, swarm intelligence, etc., are still widely explored by researchers aiming to obtain a generalized FNN for a given problem. This article attempts to summarize a broad spectrum of FNN optimization methodologies, including conventional and metaheuristic approaches. It also tries to connect the various research directions that have emerged from FNN optimization practice, such as evolving neural networks (NNs), cooperative coevolution NNs, complex-valued NNs, deep learning, extreme learning machines, quantum NNs, etc. Additionally, it poses interesting research challenges for future work to cope with the present information-processing era.
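The gradient-free weight optimization the review surveys can be illustrated with a minimal evolutionary search over a tiny FNN's weight vector. This skeleton is illustrative only; real metaheuristics (GA, PSO, DE) add crossover, velocities, or differential mutation on top of it, and the network shape and constants are our assumptions:

```python
import numpy as np

def forward(w, X):
    """Tiny 2-2-1 feedforward net; w packs all weights and biases (9 values)."""
    W1, b1 = w[:4].reshape(2, 2), w[4:6]
    W2, b2 = w[6:8].reshape(2, 1), w[8:9]
    return np.tanh(X @ W1 + b1) @ W2 + b2

def evolve(X, y, pop=30, gens=300, sigma=0.3, seed=0):
    """Minimal (1+lambda) evolutionary search over the weight vector,
    i.e. gradient-free FNN training: mutate the incumbent, evaluate
    fitness (MSE), and keep any improvement."""
    rng = np.random.default_rng(seed)
    best = rng.normal(size=9)
    best_err = np.mean((forward(best, X) - y) ** 2)
    for _ in range(gens):
        for _ in range(pop):
            cand = best + rng.normal(scale=sigma, size=9)  # mutate
            err = np.mean((forward(cand, X) - y) ** 2)     # fitness
            if err < best_err:                             # select
                best, best_err = cand, err
    return best, best_err
```

Because selection only ever accepts improvements, the error is monotonically non-increasing; on a small nonlinear problem such as XOR, the search typically beats any linear or constant predictor without ever computing a gradient.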
- …