When we are given a set of data with which to train a neural network, the size of the input vectors tells us how many inputs our network requires. For the outputs we have some freedom, but our choices are limited. For instance, in a two-class classification problem we can choose a single output (where +1 indicates one class and −1 the other), or two outputs, one per class, where the output with the larger value indicates the predicted class.

After deciding on the number of outputs, there remains the question of how many hidden units and how many layers to use. As we have already seen, there is currently no simple method of settling these issues, and laborious cross-validation (ie training each candidate network on one data set and checking its behaviour on another data set) is usually employed to select the best-performing of several proposed networks. This is not very satisfactory, and in this lecture we will examine approaches that allow the data themselves to determine the network structure. There are two basic approaches:

(i) Start with a network that you are sure is big enough to accommodate the problem, train it, and then identify and remove the neural elements and connections that contribute little or nothing to the solution. This is called network pruning.

(ii) Commence with a very small network and allow it to grow to accommodate the learning problem. This is called network construction.

As we shall see, the network construction methods lead to a variety of network structures.
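To make the pruning idea concrete, the following is a minimal sketch (the function name, the toy weight matrix, and the 50% pruning fraction are all illustrative, not taken from the lecture) of the simplest pruning criterion: after training, connections whose weights have small absolute value are assumed to contribute little to the solution, so the smallest-magnitude fraction of them is removed by setting those weights to zero.

```python
import numpy as np

def prune_by_magnitude(weights, fraction):
    """Zero out roughly the given fraction of weights with smallest |w|.

    Returns the pruned weight matrix and a boolean mask of the
    connections that survive (True = kept).
    """
    flat = np.abs(weights).ravel()
    k = int(fraction * flat.size)  # number of connections to remove
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    # k-th smallest magnitude serves as the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

# Toy "trained" weight matrix (hidden units x inputs).
W = np.array([[0.90, -0.05, 0.40],
              [0.02, -1.20, 0.03]])
W_pruned, mask = prune_by_magnitude(W, fraction=0.5)
print(int(mask.sum()))  # 3 of the 6 connections survive
```

In practice the mask would be kept and applied during any further training, so that pruned connections cannot revive; more sophisticated criteria (such as estimating each weight's effect on the error) follow the same remove-and-retrain pattern.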