Using Multiobjective Genetic Programming to Infer Logistic Polynomial Regression Models [and] Experimental Supplement
Abstract. In designing non-linear classifiers, there are important trade-offs to be made between predictive accuracy and model comprehensibility or complexity. We introduce the use of Genetic Programming to generate logistic polynomial models, a relatively comprehensible non-linear parametric model; describe an efficient two-stage algorithm consisting of GP structure design and Quasi-Newton coefficient setting; demonstrate that Niched Pareto Multiobjective Genetic Programming can be used to discover a range of classifiers with different complexity versus “performance” trade-offs; introduce a technique to integrate a new “ROC (Receiver Operating Characteristic) dominance” concept into the multiobjective setting; and suggest some modifications to the Niched Pareto GA for use in Genetic Programming. The technique successfully generates classifiers with diverse complexity and performance characteristics.
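The multiobjective selection step the abstract describes rests on Pareto dominance over competing objectives. As a hypothetical illustration (the objective pairing of model complexity against classification error is an assumption, not taken from the paper), a minimal dominance test and front filter might look like this:

```python
def dominates(a, b):
    """Return True if candidate a Pareto-dominates candidate b.

    Each candidate is a tuple of objectives to be minimised,
    e.g. (model_complexity, classification_error) -- an assumed
    pairing for illustration only.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(population):
    """Keep only candidates not dominated by any other candidate."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q != p)]

candidates = [(3, 0.20), (5, 0.10), (7, 0.10), (4, 0.25)]
front = pareto_front(candidates)  # (7, 0.10) and (4, 0.25) are dominated
```

A niched Pareto GA additionally applies sharing within the front so that classifiers spread across the complexity/performance trade-off rather than clustering at one point.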
Feature selection using genetic algorithms and probabilistic neural networks
Selection of input variables is a key stage in building predictive models, and an important form of data mining. As exhaustive evaluation of potential input sets using full non-linear models is impractical, it is necessary to use simple fast-evaluating models and heuristic selection strategies. This paper discusses a fast, efficient, and powerful nonlinear input selection procedure using a combination of Probabilistic Neural Networks and repeated bitwise gradient descent. The algorithm is compared with forward elimination, backward elimination and genetic algorithms using a selection of real-world data sets. The algorithm has comparable performance and greatly reduced execution time with respect to these alternative approaches. It is demonstrated empirically that reliable results cannot be gained using any of these approaches without the use of resampling.
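The "repeated bitwise gradient descent" search can be read as greedy local search over feature-subset bit strings: repeatedly flip the single bit that most improves a fast evaluation score until no flip helps. A minimal sketch under that reading (the scoring function below is a toy stand-in, not the paper's Probabilistic Neural Network):

```python
def bitwise_descent(score, n_features, start=None):
    """Greedy local search over feature-subset bit vectors.

    score(mask) returns an error to minimise; any fast surrogate
    model could be plugged in (the paper uses a Probabilistic
    Neural Network, which this sketch does not implement).
    """
    mask = list(start or [0] * n_features)
    best = score(mask)
    improved = True
    while improved:
        improved = False
        for i in range(n_features):
            mask[i] ^= 1                  # flip one feature in/out
            s = score(mask)
            if s < best:
                best, improved = s, True  # keep the flip
            else:
                mask[i] ^= 1              # revert the flip
    return mask, best

# toy score: the "right" subset is features {0, 2}
target = [1, 0, 1, 0]
err = lambda m: sum(a != b for a, b in zip(m, target))
mask, best = bitwise_descent(err, 4)
```

Each pass costs one model evaluation per feature, which is why a fast-evaluating model matters; repeating the descent from different starting masks guards against local minima.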
Training feedforward neural networks using orthogonal iteration of the Hessian eigenvectors
Introduction
Training algorithms for Multilayer Perceptrons optimize the set of W weights and biases, w, so as to minimize an error function, E, applied to a set of N training patterns. The well-known back propagation algorithm combines an efficient method of estimating the gradient of the error function in weight space, ∇E = g, with a simple gradient descent procedure to adjust the weights, Δw = −ηg. More efficient algorithms maintain the gradient estimation procedure, but replace the update step with a faster non-linear optimization strategy [1].
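The update Δw = −ηg can be sketched directly; here on a toy quadratic error whose gradient is known in closed form (the learning rate and error function are illustrative choices, not from the paper):

```python
def gradient_descent_step(w, grad, lr=0.1):
    """One plain gradient-descent update: w <- w - eta * g."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]

# minimise E(w) = w0^2 + w1^2, whose gradient is (2*w0, 2*w1)
w = [1.0, -2.0]
for _ in range(100):
    w = gradient_descent_step(w, [2 * w[0], 2 * w[1]])
# w converges towards the minimum at the origin
```

In a real network the closed-form gradient is replaced by the back-propagated estimate g, but the update step itself is exactly this simple, which is what the faster optimization strategies replace.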
Efficient non-linear optimization algorithms are based upon second order approximation [2]. When sufficiently close to a minimum the error surface is approximately quadratic, the shape being determined by the Hessian matrix. Bishop [1] presents a detailed discussion of the properties and significance of the Hessian matrix. In principle, if sufficiently close to a minimum it is possible to move directly to the minimum using the Newton step, −H⁻¹g. In practice, the Newton step is not used as H⁻¹ is very expensive to evaluate; in addition, when not sufficiently close to a minimum, the Newton step may cause a disastrously poor step to be taken. Second order algorithms either build up an approximation to H⁻¹, or construct a search strategy that implicitly exploits its structure without evaluating it; they also either take precautions to prevent steps that lead to a deterioration in error, or explicitly reject such steps.
In applying non-linear optimization algorithms to neural networks, a key consideration is the high-dimensional nature of the search space. Neural networks with thousands of weights are not uncommon. Some algorithms have O(W²) or O(W³) memory or execution times, and are hence impracticable in such cases. It is desirable to identify algorithms that have limited memory requirements, particularly algorithms where one may trade memory usage against convergence speed.
The paper describes a new training algorithm that has scalable memory requirements, which may range from O(W) to O(W²), although in practice the useful range is limited to lower complexity levels. The algorithm is based upon a novel iterative estimation of the principal eigen-subspace of the Hessian, together with a quadratic step estimation procedure.
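The flavour of such an eigen-subspace estimate can be conveyed with classical orthogonal iteration, which needs only matrix-vector products and k vectors of storage; this is a generic sketch of that standard technique, not the paper's algorithm:

```python
def orthogonal_iteration(matvec, dim, k, iters=200):
    """Estimate the top-k eigenvectors of a symmetric matrix.

    Only matrix-vector products are required (matvec), so the
    matrix itself is never stored -- storage is O(k * dim), the
    kind of memory/accuracy trade-off discussed in the text.
    Returns a list of k orthonormal vectors.
    """
    import random
    random.seed(0)
    Q = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(k)]
    for _ in range(iters):
        Z = [matvec(q) for q in Q]       # multiply each vector by the matrix
        Q = []
        for z in Z:                      # Gram-Schmidt re-orthonormalisation
            for q in Q:
                dot = sum(a * b for a, b in zip(q, z))
                z = [a - dot * b for a, b in zip(z, q)]
            norm = sum(a * a for a in z) ** 0.5
            Q.append([a / norm for a in z])
    return Q

# diagonal test matrix with eigenvalues 5, 2, 1
A = [[5.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
mv = lambda v: [sum(r * x for r, x in zip(row, v)) for row in A]
Q = orthogonal_iteration(mv, 3, 2)
# leading vector converges to +/- e1, second to +/- e2
```

For a Hessian, matvec would be supplied by a Hessian-vector product computed from the network, so H never needs to be formed.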
It is shown that the new algorithm has convergence time comparable to conjugate gradient descent, and may be
preferable if early stopping is used as it converges more quickly during the initial phases.
Section 2 overviews the principles of second order training algorithms. Section 3 introduces the new algorithm. Section 4 discusses some experiments to confirm the algorithm's performance; Section 5 concludes the paper.
Clarifying Pokhran-II in a multilinguistic setting
In May of 1998, India conducted its second nuclear test after a period of 24 years. This second test, known as Pokhran-II, caught the world by surprise and 17 days later it was followed by Pakistan’s first test of a nuclear device. The international community sought clarification for these developments and Indian and Pakistani leaders issued messages to explain their respective country’s rationale for testing. This report, focusing on the statements of then Indian Prime Minister Atal Bihari Vajpayee, argues that India’s motivations for testing can only be fully understood through consultation with both English and Hindi language statements. Four established models that explain why nations develop and test nuclear weapons are used to parse these sources to determine the contrast of rationale between each language. These models – gaining “security” from external threat, the interests of “domestic politics”, nuclear weapons as one of the “norms” indicating modernity, and the centering of victimhood and entitlement in “post-imperial ideology” – are variously represented across the statements. This shows that the complete picture of why the tests were conducted can only be seen by studying statements made in both languages. The implications of these findings suggest that attempts to clarify events that originate in multilinguistic settings should be made via consultation with sources in all of the languages that constitute the setting of the original event.
Nitrogen distribution by globin
This and other experiences with the tryptophane method of Fürth and Nobel led us to doubt seriously the reliability of quantitative data obtained by its application. When, therefore, just as we completed our work with it, Folin and Looney (6) described another and apparently better method of determination, a method based upon a different color reaction and capable moreover of convenient combination with a quantitative procedure for tyrosine, it seemed to us worth while to review the problem again. With the aid of this newer method we have now determined the tryptophane and tyrosine content of two series of globin preparations, and have, we believe, settled fairly decisively the proportion of these amino-acids yielded by the pure protein. We have also taken occasion to determine by the method of Van Slyke the general distribution of nitrogen in the globin molecule.
Application of the self-organising map to trajectory classification
This paper presents an approach to the problem of automatically classifying events detected by video surveillance systems; specifically, of detecting unusual or suspicious movements. Approaches to this problem typically involve building complex 3D-models in real-world coordinates to provide trajectory information for the classifier. In this paper we show that analysis of trajectories may be carried out in a model-free fashion, using self-organising feature map neural networks to learn the characteristics of normal trajectories, and to detect novel ones. Trajectories are represented using positional and first and second order motion information, with moving-average smoothing. This allows novelty detection to be applied on a point-by-point basis in real time, and permits both instantaneous motion and whole trajectory motion to be subjected to novelty detection.
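The per-point representation described above (position plus first and second order motion after moving-average smoothing) can be sketched as follows; window size and the exact difference scheme are assumptions for illustration:

```python
def trajectory_features(points, window=3):
    """Per-point feature vectors for a 2-D trajectory.

    Each feature holds smoothed position, first-order motion
    (velocity) and second-order motion (acceleration), computed
    from moving-average-smoothed positions. A sketch of the
    representation the abstract describes, not the paper's exact one.
    """
    half = window // 2
    sm = []
    for i in range(len(points)):                     # moving-average smoothing
        seg = points[max(0, i - half):i + half + 1]
        sm.append((sum(p[0] for p in seg) / len(seg),
                   sum(p[1] for p in seg) / len(seg)))
    feats = []
    for i in range(2, len(sm)):
        vx, vy = sm[i][0] - sm[i-1][0], sm[i][1] - sm[i-1][1]   # velocity
        ax = sm[i][0] - 2 * sm[i-1][0] + sm[i-2][0]             # acceleration
        ay = sm[i][1] - 2 * sm[i-1][1] + sm[i-2][1]
        feats.append((sm[i][0], sm[i][1], vx, vy, ax, ay))
    return feats

track = [(float(t), 0.0) for t in range(6)]  # steady motion along x
f = trajectory_features(track)
# interior points have velocity (1, 0) and zero acceleration
```

Feeding such vectors to a self-organising map lets the quantisation error of each incoming point serve as a per-point novelty score.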
A comparison of crossover operators in neural network feature selection with multiobjective evolutionary algorithms
Genetic algorithms are often employed for neural network feature selection. The efficiency of the search for a good subset of features depends on the capability of the recombination operator to construct building blocks which perform well, based on existing genetic material. In this paper, a commonality-based crossover operator is employed in a multiobjective evolutionary setting. The operator has two main characteristics: first, it exploits the concept that common schemata are more likely to form useful building blocks; second, the offspring produced are similar to their parents in terms of the subset size they encode. The performance of the novel operator is compared against that of uniform, 1- and 2-point crossover, in feature selection with probabilistic neural networks.
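The two characteristics of the operator can be sketched concretely: copy the bits the parents agree on, then resolve the disagreements so the child's subset size stays between the parents' sizes. This is a sketch of the idea, not the paper's exact operator:

```python
import random

def commonality_crossover(p1, p2, rng=random):
    """Commonality-preserving crossover for feature-subset bit strings.

    Bits the parents share are copied unchanged (common schemata are
    kept); disagreeing bits are set so the child's subset size falls
    between the two parents' subset sizes.
    """
    child, open_positions = [], []
    for a, b in zip(p1, p2):
        if a == b:
            child.append(a)                 # preserve common genetic material
        else:
            child.append(0)
            open_positions.append(len(child) - 1)
    # pick a subset size between the parents' sizes, then fill
    # that many of the disagreeing positions at random
    lo, hi = sorted((sum(p1), sum(p2)))
    n_set = rng.randint(lo, hi) - sum(child)
    for i in rng.sample(open_positions, n_set):
        child[i] = 1
    return child

rng = random.Random(1)
child = commonality_crossover([1, 1, 0, 0, 1], [1, 0, 1, 0, 1], rng)
# bits 0, 3 and 4 match the parents; child subset size is 3, like both parents
```

Because common bits are never disturbed, schemata shared by fit parents survive recombination, which is the building-block argument made above.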
Scene modelling using an adaptive mixture of Gaussians in colour and space
We present an integrated pixel segmentation and region tracking algorithm, designed for indoor environments. Visual monitoring systems often use frame differencing techniques to independently classify each image pixel as either foreground or background. Typically, this level of processing does not take account of the global image structure, resulting in frequent misclassification. We use an adaptive Gaussian mixture model in colour and space to represent background and foreground regions of the scene. This model is used to probabilistically classify observed pixel values, incorporating the global scene structure into pixel-level segmentation. We evaluate our system over 4 sequences and show that it successfully segments foreground pixels and tracks major foreground regions as they move through the scene.
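The classification step can be sketched with a joint colour-and-space feature per pixel, (x, y, r, g, b), scored against labelled mixture components; the isotropic covariances and the example component values are assumptions for illustration, and the paper's model is adaptive and richer than this:

```python
import math

def gaussian_pdf(x, mean, var):
    """Isotropic Gaussian density in len(x) dimensions."""
    d = len(x)
    sq = sum((a - m) ** 2 for a, m in zip(x, mean))
    return math.exp(-sq / (2 * var)) / ((2 * math.pi * var) ** (d / 2))

def classify_pixel(feat, components):
    """Assign a (x, y, r, g, b) feature to the most probable class.

    components: list of (label, weight, mean, var), label 'fg' or 'bg'.
    Summing weighted densities per label gives a mixture score that
    couples spatial position with colour, so nearby regions inform
    each pixel's decision.
    """
    scores = {}
    for label, w, mean, var in components:
        scores[label] = scores.get(label, 0.0) + w * gaussian_pdf(feat, mean, var)
    return max(scores, key=scores.get)

comps = [
    ('bg', 0.7, (50, 50, 200, 200, 200), 400.0),  # grey background region
    ('fg', 0.3, (52, 48, 30, 30, 120), 400.0),    # dark blue foreground blob
]
label = classify_pixel((51, 49, 35, 28, 115), comps)
```

Because the means live in colour *and* space, a pixel near a tracked foreground region is pulled towards 'fg' even when its colour alone is ambiguous; the full system would also update the component parameters online.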
Accurate methods for manually marking retinal vessel widths
This paper compares two manual measurement techniques for measuring retinal vessel segment widths: the kick-points technique and the edge marking technique. An image set of 164 clear, high-resolution segments was used. The kick-points approach uses kick points marked by observers along interpolated cross-sectional intensity profile graphs; the edge marking method allows observers to nominate the edges on a zoomed-up image, and interpolates edge positions. The edge-marking method provides more precise measurements than the kick-points method, but these are subject to more inter-observer variability; we speculate that this result is due to differing observer perceptions of the edge location.