Statistical Learning Theory of Quasi-Regular Cases
Many learning machines, such as normal mixtures and layered neural networks,
are not regular but singular statistical models, because the map from a
parameter to a probability distribution is not one-to-one. Conventional
statistical asymptotic theory cannot be applied to such learning machines,
because the likelihood function cannot be approximated by any normal
distribution. Recently, a new statistical theory has been established based on
algebraic geometry, and it clarified that the generalization and training
errors are determined by two birational invariants: the real log canonical
threshold and the singular fluctuation. However, their concrete values remain
unknown in general. In the present paper, we propose a new concept in
statistical learning theory, the quasi-regular case. A quasi-regular case is
not regular but singular; nevertheless, it behaves like a regular case. In
fact, we prove that in a quasi-regular case the two birational invariants are
equal to each other, so that the symmetry of the generalization and training
errors holds. Moreover, the concrete values of the two birational invariants
are obtained explicitly, which makes the quasi-regular case useful for the
study of statistical learning theory.
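For context, the role of the two invariants can be made concrete. In Watanabe's singular learning theory, on which this abstract builds, the expected Bayes generalization error G_n and training error T_n admit standard 1/n asymptotics in terms of the real log canonical threshold λ and the singular fluctuation ν. A minimal sketch of these standard relations (background results, not taken from this paper) and of the symmetry that follows when λ = ν:

```latex
% Background from singular learning theory: G_n is the Bayes
% generalization error, T_n the Bayes training error, n the sample
% size, \lambda the real log canonical threshold, \nu the singular
% fluctuation.  In a regular model, \lambda = \nu = d/2 with d the
% parameter dimension.
\begin{align*}
  \mathbb{E}[G_n] &= \frac{\lambda}{n} + o\!\left(\tfrac{1}{n}\right), &
  \mathbb{E}[T_n] &= \frac{\lambda - 2\nu}{n} + o\!\left(\tfrac{1}{n}\right).
\end{align*}
% If \lambda = \nu, as the abstract asserts for the quasi-regular
% case, the training-error coefficient becomes \lambda - 2\nu = -\lambda:
\begin{align*}
  \mathbb{E}[G_n] &= \frac{\lambda}{n} + o\!\left(\tfrac{1}{n}\right), &
  \mathbb{E}[T_n] &= -\frac{\lambda}{n} + o\!\left(\tfrac{1}{n}\right),
\end{align*}
% so the two errors are symmetric about zero, which is the "symmetry of
% the generalization and training errors" claimed in the abstract.
```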
Machine learning: statistical physics based theory and smart industry applications
The increasing computational power and the availability of data have made it possible to train ever-bigger artificial neural networks. These so-called deep neural networks have been used for impressive applications, such as advanced driver assistance and support in medical diagnoses. However, various vulnerabilities have been revealed, and many open questions remain about the workings of neural networks. Theoretical analyses are therefore essential for further progress. One current question is: why do networks with Rectified Linear Unit (ReLU) activation seemingly perform better than networks with sigmoidal activation?

We contribute to the answer to this question by comparing ReLU networks with sigmoidal networks in diverse theoretical learning scenarios. In contrast to analysing specific datasets, we use theoretical models built with methods from statistical physics, which give the typical learning behaviour for chosen model scenarios. We analyse the learning behaviour both on a fixed dataset and on a data stream in the presence of a changing task. The emphasis is on the analysis of the network's transition to a state in which specific concepts have been learnt. We find significant benefits of ReLU networks: they exhibit continuous increases in their performance and adapt more quickly to changing tasks (see the simulation sketch after this abstract).

In the second part of the thesis we treat applications of machine learning: we design a quick quality-control method for material in a production line and study the relationship with product faults. Furthermore, we introduce a methodology for the interpretable classification of time series data.
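To illustrate the kind of scenario the thesis analyses theoretically, here is a minimal teacher-student simulation comparing ReLU with a sigmoidal (erf) activation under online gradient descent. This is a hedged sketch, not the thesis's actual model or results: the dimensions, learning rate, and the single-neuron architecture are assumptions chosen only for the demo.

```python
# Illustrative sketch (not from the thesis): online teacher-student
# learning with ReLU vs. sigmoidal (erf) activation, the class of
# scenarios studied with statistical-physics methods.  All parameter
# values below are assumptions chosen for the demonstration.
import numpy as np
from scipy.special import erf

rng = np.random.default_rng(0)

N = 500          # input dimension
STEPS = 20_000   # number of online examples (one SGD step each)
ETA = 0.5        # learning rate

# activation g and its derivative g'
ACTIVATIONS = {
    "relu":    (lambda h: np.maximum(h, 0.0),
                lambda h: (h > 0).astype(float)),
    "sigmoid": (lambda h: erf(h / np.sqrt(2)),
                lambda h: np.sqrt(2 / np.pi) * np.exp(-h**2 / 2)),
}

def generalization_error(w_student, w_teacher, g, n_test=2000):
    """Monte Carlo estimate of the squared generalization error."""
    X = rng.standard_normal((n_test, N))
    y_t = g(X @ w_teacher / np.sqrt(N))
    y_s = g(X @ w_student / np.sqrt(N))
    return 0.5 * np.mean((y_s - y_t) ** 2)

for name, (g, g_prime) in ACTIVATIONS.items():
    w_teacher = rng.standard_normal(N)
    w_student = rng.standard_normal(N) * 0.1   # weak initial overlap
    for _ in range(STEPS):
        x = rng.standard_normal(N)             # one fresh example per step
        h_s = x @ w_student / np.sqrt(N)
        h_t = x @ w_teacher / np.sqrt(N)
        # online SGD on the per-example squared error 0.5*(g(h_s)-g(h_t))^2
        delta = (g(h_s) - g(h_t)) * g_prime(h_s)
        w_student -= ETA / np.sqrt(N) * delta * x
    print(f"{name:8s} eps_g ~ {generalization_error(w_student, w_teacher, g):.4f}")
```

Tracking the generalization error along the trajectory (rather than only at the end) is how one would visualise the transitions and adaptation speed the abstract refers to; the statistical-physics treatment in the thesis replaces such simulations with typical-case analysis in the large-N limit.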