177 research outputs found
Some Guidelines For Using Nonparametric Methods For Modeling Data From Response Surface Designs
Traditional response surface methodology focuses on modeling responses using parametric models, with designs chosen to balance cost against adequate estimation of parameters and prediction in the design space. Using nonparametric smoothing to approximate the response surface offers both opportunities and problems. This article explores some conditions under which these methods can be appropriately used to increase the flexibility of the surfaces modeled. The Box and Draper (1987) printing ink study is used to illustrate the methods.
A More Efficient Way Of Obtaining A Unique Median Estimate For Circular Data
The procedure for computing the sample circular median occasionally leads to a non-unique estimate of the population circular median, since there can sometimes be two or more diameters that divide the data equally and have the same circular mean deviation. A modification in the computation of the sample median is suggested, which not only eliminates this non-uniqueness problem, but is also computationally easier and faster to work with than the existing alternative.
Effect Of Position Of An Outlier On The Influence Curve Of The Measures Of Preferred Direction For Circular Data
Circular or angular data occur in many fields of applied statistics. A common problem of interest in circular data is estimating a preferred direction and its corresponding distribution. It is complicated by the wrap-around effect on the circle, which exists because there is no natural minimum or maximum. The usual statistics employed for linear data are inappropriate for directional data, as they do not account for its circular nature. The robustness of the three common choices for summarizing the preferred direction (the sample circular mean, the sample circular median, and a circular analog of the Hodges-Lehmann estimator) is evaluated via their influence functions.
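The wrap-around effect mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the paper: the function name `circular_mean` and the two sample angles are assumptions chosen for illustration. It compares the ordinary arithmetic mean with the standard sample circular mean (the direction of the resultant vector) for two directions that sit just either side of 0 degrees.

```python
import math


def circular_mean(angles_rad):
    """Sample circular mean: direction of the summed unit vectors, in [0, 2*pi)."""
    s = sum(math.sin(a) for a in angles_rad)
    c = sum(math.cos(a) for a in angles_rad)
    return math.atan2(s, c) % (2 * math.pi)


# Illustrative (made-up) data: two directions close to north, 350 and 10 degrees.
angles_deg = [350.0, 10.0]

# The arithmetic mean ignores the wrap-around and points due south.
linear_mean_deg = sum(angles_deg) / len(angles_deg)

# The circular mean lands next to 0 degrees (equivalently 360 degrees).
circ_mean_deg = math.degrees(circular_mean([math.radians(a) for a in angles_deg]))

print(f"arithmetic mean: {linear_mean_deg:.1f} degrees")
print(f"circular mean:   {circ_mean_deg:.6f} degrees (mod 360)")
```

Because of floating-point rounding, the circular mean may print as a value very close to either 0 or 360; both describe the same direction.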
I-optimal or G-optimal: Do We Have to Choose?
When optimizing an experimental design for good prediction performance based on an assumed second-order response surface model, it is common to focus on a single optimality criterion: either G-optimality, for best worst-case prediction precision, or I-optimality, for best average prediction precision. In this article, we illustrate how using particle swarm optimization to construct a Pareto front of non-dominated designs that balance these two criteria yields some highly desirable results. In most scenarios, there are designs that simultaneously perform well for both criteria. Seeing alternative designs that vary in how they balance G- and I-efficiency provides experimenters with choices that allow selection of a better match for their study objectives. We provide an extensive repository of Pareto fronts with designs for 17 common experimental scenarios for 2 (design size N = 6 to 12), 3 (N = 10 to 16) and 4 (N = 15, 17, 20) experimental factors. These, when combined with a detailed strategy for how to efficiently analyze, assess, and select between alternatives, provide the reader with the tools to select the ideal design with a tailored balance between G- and I-optimality for their own experimental situations.
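The idea of a Pareto front of non-dominated designs can be sketched as follows. This is only the non-dominated filtering step, not the particle swarm search the abstract describes, and the efficiency pairs are invented for illustration rather than taken from the paper's repository. The function name `pareto_front` is likewise an assumption.

```python
def pareto_front(points):
    """Return the points not dominated by any other point (both criteria maximized)."""
    front = []
    for i, p in enumerate(points):
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and (q[0] > p[0] or q[1] > p[1])
            for j, q in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical (G-efficiency, I-efficiency) pairs for five candidate designs.
candidates = [(0.90, 0.70), (0.85, 0.80), (0.80, 0.78), (0.70, 0.95), (0.60, 0.90)]

front = pareto_front(candidates)
for g_eff, i_eff in front:
    print(f"G-eff = {g_eff:.2f}, I-eff = {i_eff:.2f}")
```

With these made-up values, three designs survive the filter; the other two are each beaten on both criteria by some other candidate. An experimenter would then choose among the surviving designs according to how much weight worst-case versus average prediction precision deserves in the study.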
How to Host a Data Competition: Statistical Advice for Design and Analysis of a Data Competition
Data competitions rely on real-time leaderboards to rank competitor entries
and stimulate algorithm improvement. While such competitions have become quite
popular and prevalent, particularly in supervised learning formats, their
implementations by the host are highly variable. Without careful planning, a
supervised learning competition is vulnerable to overfitting, where the winning
solutions are so closely tuned to the particular set of provided data that they
cannot generalize to the underlying problem of interest to the host. This paper
outlines some important considerations for strategically designing relevant and
informative data sets to maximize the learning outcome from hosting a
competition based on our experience. It also describes a post-competition
analysis that enables robust and efficient assessment of the strengths and
weaknesses of solutions from different competitors, as well as greater
understanding of the regions of the input space that are well-solved. The
post-competition analysis, which complements the leaderboard, uses exploratory
data analysis and generalized linear models (GLMs). The GLMs not only expand
the range of results we can explore, they also provide more detailed analysis
of individual sub-questions including similarities and differences between
algorithms across different types of scenarios, universally easy or hard
regions of the input space, and different learning objectives. When coupled
with a strategically planned data generation approach, the methods provide
richer and more informative summaries to enhance the interpretation of results
beyond just the rankings on the leaderboard. The methods are illustrated with a
recently completed competition to evaluate algorithms capable of detecting,
identifying, and locating radioactive materials in an urban environment.
A genetic algorithm with memory for mixed discrete-continuous design optimization
This paper describes a new approach for reducing the number of fitness function
evaluations required by a genetic algorithm (GA) for optimization problems with
mixed continuous and discrete design variables. The proposed additions to the GA
make the search more effective and rapidly improve the fitness value from generation
to generation. The additions involve memory as a function of both discrete and
continuous design variables, multivariate approximation of the fitness function in
terms of several continuous design variables, and localized search based on the
multivariate approximation. The approximation is demonstrated for the minimum
weight design of a composite cylindrical shell with grid stiffeners.
A Genetic Algorithm for Mixed Integer Nonlinear Programming Problems Using Separate Constraint Approximations
This paper describes a new approach for reducing the number of fitness and constraint function evaluations required by a genetic algorithm (GA) for optimization problems with mixed continuous and discrete design variables. The proposed additions to the GA make the search more effective and rapidly improve the fitness value from generation to generation. The additions involve memory as a function of both discrete and continuous design variables, and multivariate approximation of the individual functions' responses in terms of several continuous design variables. The approximation is demonstrated for the minimum weight design of a composite cylindrical shell with grid stiffeners.
A designed screening study with prespecified combinations of factor settings
In many applications, the experimenter has limited options about what factor combinations can be chosen for a designed study. Consider a screening study for a production process involving five input factors whose levels have been previously established. The goal of the study is to understand the effect of each factor on the response, a variable that is expensive to measure and results in destruction of the part. From an inventory of available parts with known factor values, we wish to identify a best collection of factor combinations with which to estimate the factor effects. Though the observational nature of the study cannot establish a causal relationship involving the response and the factors, the study can increase understanding of the underlying process. The study can also help determine where investment should be made to control input factors during production that will maximally influence the response. Because the factor combinations are observational, the chosen model matrix will be nonorthogonal and will not allow independent estimation of factor effects. In this manuscript we borrow principles from design of experiments to suggest an 'optimal' selection of factor combinations. Specifically, we consider precision of model parameter estimates, the issue of replication, and the ability to detect lack of fit and to estimate two-factor interactions. Through an example, we present strategies for selecting a subset of factor combinations that simultaneously balance multiple objectives, conduct a limited sensitivity analysis, and provide practical guidance for implementing our techniques across a variety of quality engineering disciplines.
Does Work Affect Personality? A Study in Horses
It has been repeatedly hypothesized that job characteristics are related to changes in personality in humans, but personality models often still omit the effects of life experience. Demonstrating reciprocal relationships between personality and work remains a challenge, though, because in humans many other influential factors may interfere. This study investigates this relationship by comparing the emotional reactivity of horses that differed only by their type of work. Horses are remarkable animal models for investigating this question, as they share with humans working activities and their potential difficulties, such as “interpersonal” conflicts or “suppressed emotions”. An earlier study showed that different types of work could be associated with different chronic behavioural disorders. Here, we hypothesised that type of work would affect horses' personality. Therefore, over one hundred adult horses, differing only by their work characteristics, were presented with standardised behavioural tests. Subjects lived under the same conditions (same housing, same food), were of the same sex (geldings), were mostly of one of two breeds, and had not been genetically selected for their current type of work. This is, to our knowledge, the first time that a direct relationship between type of work and personality traits has been investigated. Our results show that horses from different types of work differ not as much in their overall emotional levels as in the ways they express emotions (i.e. behavioural profile). The extremes were dressage horses, which presented the highest excitation components, and voltige horses, which were the quietest. The horses' type of work was decided by the stall managers, mostly on the basis of their jumping abilities, but unconscious choice based on individual behavioural characteristics cannot be totally excluded. Further research would require manipulating type of work. Our results nevertheless agree with reports on humans and suggest that more attention should be given to work characteristics when evaluating personalities.