Efficient approaches for escaping higher order saddle points in non-convex optimization
Local search heuristics for non-convex optimizations are popular in applied
machine learning. However, in general it is hard to guarantee that such
algorithms even converge to a local minimum, due to the existence of
complicated saddle point structures in high dimensions. Many functions have
degenerate saddle points such that the first and second order derivatives
cannot distinguish them from local optima. In this paper we use higher order
derivatives to escape these saddle points: we design the first efficient
algorithm guaranteed to converge to a third order local optimum (while existing
techniques are at most second order). We also show that it is NP-hard to extend
this further to finding fourth order local optima.
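As an illustration of a degenerate saddle point, the sketch below uses a toy function chosen for this note, f(x, y) = x^2 + y^3; it is not taken from the paper. At the origin the gradient vanishes and the second derivative is non-negative in every direction, so first and second order tests cannot rule out a local minimum. Only the third derivative along the y-axis reveals a descent direction. The helper name `directional_derivatives` and the step size `h` are assumptions, and the finite-difference estimates are approximate.

```python
def directional_derivatives(f, point, direction, h=1e-2):
    """Central finite-difference estimates of g'(0), g''(0), g'''(0),
    where g(t) = f(point + t * direction)."""
    def g(t):
        return f(*(p + t * d for p, d in zip(point, direction)))

    d1 = (g(h) - g(-h)) / (2 * h)
    d2 = (g(h) - 2 * g(0.0) + g(-h)) / h**2
    d3 = (g(2 * h) - 2 * g(h) + 2 * g(-h) - g(-2 * h)) / (2 * h**3)
    return d1, d2, d3


def f(x, y):
    # Toy function (an assumption of this sketch): degenerate saddle at (0, 0).
    return x**2 + y**3


origin = (0.0, 0.0)
along_x = directional_derivatives(f, origin, (1.0, 0.0))
along_y = directional_derivatives(f, origin, (0.0, 1.0))

print("along x: d1=%.4f d2=%.4f d3=%.4f" % along_x)
print("along y: d1=%.4f d2=%.4f d3=%.4f" % along_y)
# Moving a small distance in the -y direction lowers f below f(0, 0).
print("f(0, -0.1) < f(0, 0):", f(0.0, -0.1) < f(0.0, 0.0))
```

Along the y-axis the first and second derivatives are (numerically) zero while the third is not, which is the kind of information a third order method can exploit and a second order method cannot.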
On the Principle of Least Symmetry Breaking in Shallow ReLU Models
We consider the optimization problem associated with fitting two-layer ReLU
networks with respect to the squared loss, where labels are assumed to be
generated by a target network. Focusing first on standard Gaussian inputs, we
show that the structure of spurious local minima detected by stochastic
gradient descent (SGD) is, in a well-defined sense, the least loss of
symmetry with respect to the target weights. A closer look at the analysis
indicates that this principle of least symmetry breaking may apply to a broader
range of settings. Motivated by this, we conduct a series of experiments which
corroborate this hypothesis for different classes of non-isotropic non-product
distributions, smooth activation functions and networks with a few layers.
Levenberg-Marquardt Algorithm for Mackey-Glass Chaotic Time Series Prediction
For decades, Mackey-Glass chaotic time series prediction has attracted growing
attention. When a multilayer perceptron is used to predict the Mackey-Glass
chaotic time series, training amounts to minimizing a loss function. It is well
known that the loss decreases rapidly at the beginning of learning but
converges very slowly once the parameters are near the minimum. To overcome
this problem, we introduce the Levenberg-Marquardt algorithm (LMA). First, we
briefly introduce the multilayer perceptron, including its structure and the
model approximation method. Second, we introduce the LMA and discuss how to
implement it. Finally, an illustrative example shows the prediction efficiency
of the LMA. Simulations show that the LMA gives more accurate predictions than
the gradient descent method.
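The damped Gauss-Newton update at the core of Levenberg-Marquardt can be sketched on a much smaller problem than the one in the abstract. The example below fits a two-parameter curve y = a * exp(b * x) to noiseless synthetic data; it is not a multilayer perceptron and does not use the Mackey-Glass series. The model, the data, the starting point, and the damping schedule (divide by 10 after an accepted step, multiply by 10 after a rejected one) are all assumptions of this sketch.

```python
import math


def loss(a, b, xs, ys):
    """Half the sum of squared residuals of the model a * exp(b * x)."""
    return 0.5 * sum((a * math.exp(b * x) - y) ** 2 for x, y in zip(xs, ys))


def levenberg_marquardt(xs, ys, params, damping=1e-3, iterations=100):
    a, b = params
    current = loss(a, b, xs, ys)
    for _ in range(iterations):
        # Accumulate J^T J (2x2, symmetric) and J^T r (2-vector).
        h_aa = h_ab = h_bb = g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = a * e - y                # residual
            j_a, j_b = e, a * x * e      # d(residual)/da, d(residual)/db
            h_aa += j_a * j_a
            h_ab += j_a * j_b
            h_bb += j_b * j_b
            g_a += j_a * r
            g_b += j_b * r
        # Solve (J^T J + damping * I) delta = -J^T r in closed form.
        m_aa, m_bb = h_aa + damping, h_bb + damping
        det = m_aa * m_bb - h_ab * h_ab
        d_a = (-g_a * m_bb + g_b * h_ab) / det
        d_b = (g_a * h_ab - g_b * m_aa) / det
        candidate = loss(a + d_a, b + d_b, xs, ys)
        if candidate < current:
            # Accept the step and trust the Gauss-Newton model more.
            a, b, current = a + d_a, b + d_b, candidate
            damping /= 10.0
        else:
            # Reject the step and move closer to small gradient steps.
            damping *= 10.0
    return (a, b), current


xs = [0.25 * i for i in range(9)]
ys = [2.0 * math.exp(-1.5 * x) for x in xs]   # synthetic data, a=2.0, b=-1.5
fitted, final_loss = levenberg_marquardt(xs, ys, (1.0, 0.0))
print(fitted, final_loss)
```

With small damping the step approaches a Gauss-Newton step, which converges quickly near the minimum; with large damping it behaves like a short gradient descent step, which is safer far from it. Production code would normally rely on an existing solver rather than a hand-written loop like this one.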
Non-attracting Regions of Local Minima in Deep and Wide Neural Networks
Understanding the loss surface of neural networks is essential for the design
of models with predictable performance and their success in applications.
Experimental results suggest that sufficiently deep and wide neural networks
are not negatively impacted by suboptimal local minima. Despite recent
progress, the reason for this outcome is not fully understood. Could deep
networks have very few, if any, suboptimal local optima? Or could all of
them be equally good? We provide a construction to show that suboptimal local
minima (i.e., non-global ones), even though degenerate, exist for fully
connected neural networks with sigmoid activation functions. The local minima
obtained by our construction belong to a connected set of local solutions that
can be escaped from via a non-increasing path on the loss curve. For extremely
wide neural networks of decreasing width after the wide layer, we prove that
every suboptimal local minimum belongs to such a connected set. This provides a
partial explanation for the successful application of deep neural networks. In
addition, we also characterize under what conditions the same construction
leads to saddle points instead of local minima for deep neural networks.
Object Detection and Tracking Using Machine Learning Methods
Bachelor thesis: 158 p., 83 fig., 6 tables, 2 appendices, 42 sources.
NEURAL NETWORKS, OBJECT DETECTION AND TRACKING,
CONVOLUTIONAL NEURAL NETWORKS, TRACKING SOFTWARE.
The object of research is the detection and tracking of objects using
peripheral devices (various video cameras).
When working with video material, people often need to detect and classify the
objects in the current frame. This is necessary in many areas of human
activity. One example is an autonomous driving system: before the artificial
intelligence can make driving decisions, it must clearly understand which
objects it is encountering at that moment.
A further task may then arise: tracking the entire history of one or more
objects in the video, or more precisely, the trajectories of the targets that
appear in it. Object tracking is thus a logical continuation of the detection
task.
The goal of the work is to develop a program that uses existing models for
object detection and tracking while making optimal use of resources.
Ideally, the program should perform the task in real time, and the model should
be light enough in terms of image processing time that no additional resources
need to be spent on setting up servers, i.e. essentially all work is performed
by the peripheral devices alone.
The software is written in the Python programming language. The YOLO model was
implemented in combination with the DeepSort algorithm. The practical result of
the work is an object detection and tracking system.
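The step from per-frame detections to trajectories can be sketched with a deliberately simplified tracker. The code below is not DeepSort and not the thesis implementation: it swaps in a greedy intersection-over-union (IoU) matcher in place of DeepSort's association step, and it takes hand-written bounding boxes instead of detector output. The class name `GreedyIouTracker`, the threshold value, and the box format `(x1, y1, x2, y2)` are assumptions of this sketch.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class GreedyIouTracker:
    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}        # track id -> box from the previous frame
        self.trajectories = {}  # track id -> list of box centres
        self.next_id = 0

    def update(self, detections):
        """Assign a track id to each detection; returns ids in input order."""
        # All (score, track id, detection index) pairs, best overlap first.
        pairs = sorted(
            ((iou(box, det), tid, i)
             for tid, box in self.tracks.items()
             for i, det in enumerate(detections)),
            reverse=True,
        )
        assigned, used_tracks = {}, set()
        for score, tid, i in pairs:
            if score < self.iou_threshold:
                break
            if tid in used_tracks or i in assigned:
                continue
            assigned[i] = tid
            used_tracks.add(tid)

        new_tracks, ids = {}, []
        for i, det in enumerate(detections):
            tid = assigned.get(i)
            if tid is None:          # unmatched detection starts a new track
                tid = self.next_id
                self.next_id += 1
            new_tracks[tid] = det
            centre = ((det[0] + det[2]) / 2.0, (det[1] + det[3]) / 2.0)
            self.trajectories.setdefault(tid, []).append(centre)
            ids.append(tid)
        self.tracks = new_tracks     # unmatched tracks are dropped immediately
        return ids


frames = [
    [(0.0, 0.0, 10.0, 10.0), (50.0, 50.0, 60.0, 60.0)],
    [(1.0, 1.0, 11.0, 11.0), (51.0, 50.0, 61.0, 60.0)],
    [(52.0, 50.0, 62.0, 60.0), (2.0, 2.0, 12.0, 12.0)],  # detection order swapped
]
tracker = GreedyIouTracker()
ids_per_frame = [tracker.update(frame) for frame in frames]
print(ids_per_frame)
print(tracker.trajectories[0])
```

Each object keeps its id even when the detector reports the boxes in a different order, and `trajectories` accumulates the path of every target. A tracker of this kind loses identities as soon as an object is missed for one frame or two objects overlap heavily, which is the sort of weakness that motion models and appearance features are meant to address.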