Exploiting the accumulated evidence for gene selection in microarray gene expression data
Significant effort has recently gone into applying machine learning methods to multidisciplinary problems in cancer classification using microarray gene expression data. Feature subset selection methods can play an important role in the modeling process, since these tasks are characterized by a large number of features and few observations, which makes modeling a non-trivial undertaking. In this scenario, it is extremely important to select genes by taking into account their possible interactions with other gene subsets. This paper shows that, by accumulating the evidence in favour of (or against) each gene along the search process, the obtained gene subsets may constitute better solutions, in terms of predictive accuracy, gene subset size, or both. The proposed technique is extremely simple and adds negligible computational overhead. Postprint (published version)
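As a rough illustration of accumulating evidence for or against each gene during a search, the following Python sketch scores each gene by comparing the subsets that contained it with those that did not. This is not the paper's algorithm: the function name `accumulate_evidence`, the mean-difference evidence measure, and the toy scores are all assumptions made for illustration.

```python
from typing import Iterable, List, Sequence, Tuple


def accumulate_evidence(
    evaluations: Iterable[Tuple[Sequence[int], float]], n_genes: int
) -> List[float]:
    """Return one evidence value per gene (hypothetical measure).

    `evaluations` holds (gene_subset, score) pairs visited by any search
    procedure. Evidence for a gene is the mean score of subsets that
    include it minus the mean score of subsets that exclude it.
    """
    in_sum = [0.0] * n_genes  # summed scores of subsets containing the gene
    in_cnt = [0] * n_genes    # number of subsets containing the gene
    total, n_evals = 0.0, 0
    for subset, score in evaluations:
        total += score
        n_evals += 1
        for gene in set(subset):
            in_sum[gene] += score
            in_cnt[gene] += 1

    evidence = []
    for gene in range(n_genes):
        out_cnt = n_evals - in_cnt[gene]
        if in_cnt[gene] == 0 or out_cnt == 0:
            evidence.append(0.0)  # no basis for comparison yet
        else:
            mean_in = in_sum[gene] / in_cnt[gene]
            mean_out = (total - in_sum[gene]) / out_cnt
            evidence.append(mean_in - mean_out)
    return evidence


def rank_genes(evidence: Sequence[float]) -> List[int]:
    """Gene indices ordered from most to least favourable evidence."""
    return sorted(range(len(evidence)), key=lambda g: evidence[g], reverse=True)
```

In this toy form, the evidence can be recomputed after every evaluated subset at a cost linear in the subset size, which is in the spirit of the "negligible overhead" claim, although the actual bookkeeping used in the paper may differ.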
An evolutionary approach to the optimisation of autonomous pod distribution for application in an urban transportation service
For autonomous vehicles (AVs), which are called "pods" when deployed in urban areas, to be used as part of a commercially viable low-cost urban transport system, they will need to operate efficiently. One way to achieve efficiency is to minimise the time vehicles spend not serving users. To reduce this wasted time, this paper presents a novel approach to distributing AVs within an urban environment. Our approach uses evolutionary computation, in the form of a genetic algorithm (GA), applied to a simulation of an intelligent transportation service operating in the city of Coventry, UK. The goal of the GA is to optimise the distribution of pods so as to reduce user waiting time. To test the algorithm, real-world transport data was obtained for Coventry and processed to generate user demand patterns. Results from the study showed a 30% increase in the number of successful journeys completed in a 24-hour period, compared to a random distribution. These findings could yield significant benefits for fleet management companies, including increased profits per day, lower capital cost, and better energy efficiency. The algorithm could also be adapted to any service offering pick-up and drop-off points, including package delivery and transportation of goods.
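The general shape of a GA that distributes a fixed fleet across pick-up points can be sketched as follows. This is a toy under stated assumptions, not the study's implementation: the fitness function `served_demand` replaces the paper's transport simulation with a simple cap of demand by available pods, and the operators, population size, and station demands are all hypothetical.

```python
import random


def served_demand(allocation, demand):
    # Toy fitness (assumption): users served at a station are capped
    # by the number of pods placed there.
    return sum(min(a, d) for a, d in zip(allocation, demand))


def random_allocation(n_pods, n_stations, rng):
    alloc = [0] * n_stations
    for _ in range(n_pods):
        alloc[rng.randrange(n_stations)] += 1
    return alloc


def mutate(alloc, rng):
    # Move one pod from a non-empty station to a random station.
    # Requires at least one pod in the fleet.
    child = list(alloc)
    src = rng.choice([i for i, a in enumerate(child) if a > 0])
    child[src] -= 1
    child[rng.randrange(len(child))] += 1
    return child


def crossover(a, b, n_pods, rng):
    # Uniform crossover, then repair so the fleet size stays constant.
    child = [rng.choice(pair) for pair in zip(a, b)]
    while sum(child) > n_pods:
        i = rng.choice([i for i, c in enumerate(child) if c > 0])
        child[i] -= 1
    while sum(child) < n_pods:
        child[rng.randrange(len(child))] += 1
    return child


def evolve(demand, n_pods, pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    pop = [random_allocation(n_pods, len(demand), rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda a: served_demand(a, demand), reverse=True)
        elite = pop[: pop_size // 5]  # elitism: the best survive unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(pop[: pop_size // 2], 2)
            children.append(mutate(crossover(p1, p2, n_pods, rng), rng))
        pop = elite + children
    return max(pop, key=lambda a: served_demand(a, demand))
```

Calling `evolve([5, 1, 0, 4], n_pods=10)` returns an allocation of ten pods over four hypothetical stations. Because the elite is carried over unchanged, the best fitness never decreases from one generation to the next; replacing `served_demand` with a call into a simulator would bring the sketch closer to the setting described above.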
Estimation and Regularization Techniques for Regression Models with Multidimensional Prediction Functions
Boosting is one of the most important methods for fitting
regression models and building prediction rules from
high-dimensional data. A notable feature of boosting is that the
technique has a built-in mechanism for shrinking coefficient
estimates and selecting variables. This regularization mechanism
makes boosting a suitable method for analyzing data characterized by
small sample sizes and large numbers of predictors. We extend the
existing methodology by developing a boosting method for prediction
functions with multiple components. Such multidimensional functions
occur in many types of statistical models, for example in count data
models and in models involving outcome variables with a mixture
distribution. As will be demonstrated, the new algorithm is suitable
for both the estimation of the prediction function and
regularization of the estimates. In addition, nuisance parameters
can be estimated simultaneously with the prediction function.
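The shrinkage and variable-selection behaviour mentioned in this abstract can be seen in a minimal sketch of standard component-wise L2 boosting for a single prediction function. It does not implement the multidimensional extension the abstract proposes; the function name, the step length `nu`, and the assumption that predictor columns are centered are illustrative choices.

```python
def componentwise_l2_boost(X, y, n_steps=200, nu=0.1):
    """Component-wise L2 boosting with simple linear base learners.

    Assumes the columns of X are centered (illustrative simplification).
    In each step only the single best-fitting predictor is updated, and
    only by a fraction `nu` of its least-squares coefficient.
    """
    n, p = len(X), len(X[0])
    intercept = sum(y) / n
    coef = [0.0] * p
    resid = [yi - intercept for yi in y]
    cols = [[row[j] for row in X] for j in range(p)]
    sq_norm = [sum(v * v for v in col) for col in cols]

    for _ in range(n_steps):
        best_j, best_beta, best_gain = None, 0.0, 0.0
        for j in range(p):
            if sq_norm[j] == 0:
                continue
            dot = sum(c * r for c, r in zip(cols[j], resid))
            gain = dot * dot / sq_norm[j]  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_beta, best_gain = j, dot / sq_norm[j], gain
        if best_j is None:  # no predictor improves the fit any further
            break
        coef[best_j] += nu * best_beta
        resid = [r - nu * best_beta * c for r, c in zip(resid, cols[best_j])]
    return intercept, coef
```

Stopping after few steps leaves the coefficients shrunken toward zero, and predictors that are never selected keep a coefficient of exactly zero, which is how early stopping acts as both regularization and variable selection in this simplified setting.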
Using numerical plant models and phenotypic correlation space to design achievable ideotypes
Numerical plant models can predict how modifications of plant traits
resulting from genetic variation affect plant performance, by simulating
physiological processes and their interaction with the environment.
Optimization methods complement those models to design ideotypes, i.e. ideal
values of a set of plant traits resulting in optimal adaptation for given
combinations of environment and management, mainly through the maximization of
a performance criterion (e.g. yield, light interception). As use of simulation
models gains momentum in plant breeding, numerical experiments must be
carefully engineered to provide accurate and attainable results, rooting them
in biological reality. Here, we propose a multi-objective optimization
formulation that includes a metric of performance, returned by the numerical
model, and a metric of feasibility, accounting for correlations between traits
based on field observations. We applied this approach to two contrasting
models: a process-based crop model of sunflower and a functional-structural
plant model of apple trees. In both cases, the method successfully
characterized key plant traits and identified a continuum of optimal solutions,
ranging from the most feasible to the most efficient. The present study thus
provides successful proof of concept for this enhanced modeling approach, which
identified paths for desirable trait modification, including direction and
intensity. Comment: 25 pages, 5 figures, 2017, Plant, Cell and Environment
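The "continuum of optimal solutions" between the most feasible and the most efficient ideotypes corresponds to a Pareto front over the two objectives. The sketch below shows only the non-dominated filtering step, assuming both performance and feasibility are to be maximized; the candidate values are made up, and the optimizer and metrics used in the study are not reproduced here.

```python
def dominates(q, p):
    """True if q is at least as good as p on every objective and
    strictly better on at least one (all objectives maximized)."""
    return all(qk >= pk for qk, pk in zip(q, p)) and any(
        qk > pk for qk, pk in zip(q, p)
    )


def pareto_front(points):
    """Return the points not dominated by any other point."""
    return [
        p
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical (performance, feasibility) scores for candidate ideotypes.
candidates = [(0.9, 0.2), (0.7, 0.6), (0.4, 0.9), (0.5, 0.5), (0.3, 0.3)]
front = pareto_front(candidates)
```

Here `front` keeps the three candidates that trade performance against feasibility and drops the two that another candidate beats on both metrics.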