Analysis of variance in soil research: let the analysis fit the design
Sound design for experiments on soil is based on two fundamental principles: replication and randomization. Replication enables investigators to detect and measure contrasts between treatments against the backdrop of natural variation. Random allocation of experimental treatments to units enables effects to be estimated without bias and hypotheses to be tested. For inferential tests of effects to be valid, an analysis of variance (anova) of the experimental data must match exactly the experimental design. Completely randomized designs are usually inefficient. Blocking will usually increase precision, and its role must be recognized as a unique entry in an anova table. Factorial designs enable questions on two or more factors and their interactions to be answered simultaneously, and split-plot designs may enable investigators to combine factors that require disparate amounts of land for each treatment. Each such design has its unique correct anova; no other anova will do. One outcome of an anova is a test of significance. If it turns out to be positive then the investigator may examine the contrasts between treatments to discover which themselves are significant. Those contrasts should have been ones in which the investigator was interested at the outset and which the experiment was designed to test. Post-hoc testing of all possible contrasts is deprecated as unsound, although the procedures may guide an investigator to further experimentation. Examples of the designs with simulated data and programs in GenStat and R for the analyses of variance are provided as File S1.
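As a rough illustration of why blocking must appear as its own entry in the anova table, the sketch below partitions the sums of squares for a randomized complete block design. It is not taken from File S1 (which provides GenStat and R programs); the function name `rcbd_anova` and the yield figures are hypothetical, made up only for this example, and plain Python is used so that nothing beyond the standard library is assumed.

```python
# Hypothetical data: rows are blocks, columns are treatments.
# These numbers are invented for illustration only.
yields = [
    [4.0, 5.0, 6.0],
    [5.0, 6.0, 7.5],
    [6.5, 7.0, 8.5],
    [7.0, 8.5, 9.0],
]


def rcbd_anova(table):
    """Partition the total sum of squares for a randomized complete block design."""
    b = len(table)       # number of blocks
    t = len(table[0])    # number of treatments
    grand = sum(sum(row) for row in table) / (b * t)

    block_means = [sum(row) / t for row in table]
    treat_means = [sum(table[i][j] for i in range(b)) / b for j in range(t)]

    ss_total = sum((y - grand) ** 2 for row in table for y in row)
    ss_block = t * sum((m - grand) ** 2 for m in block_means)
    ss_treat = b * sum((m - grand) ** 2 for m in treat_means)
    ss_resid = ss_total - ss_block - ss_treat

    df_block, df_treat = b - 1, t - 1
    df_resid = df_block * df_treat

    ms_treat = ss_treat / df_treat
    # F ratio when the analysis matches the blocked design.
    f_blocked = ms_treat / (ss_resid / df_resid)
    # F ratio if the blocks are (wrongly) ignored: the block sum of squares
    # and its degrees of freedom are absorbed into the residual.
    f_ignoring_blocks = ms_treat / ((ss_resid + ss_block) / (df_resid + df_block))

    return {
        "ss_total": ss_total,
        "ss_block": ss_block,
        "ss_treat": ss_treat,
        "ss_resid": ss_resid,
        "df_block": df_block,
        "df_treat": df_treat,
        "df_resid": df_resid,
        "f_blocked": f_blocked,
        "f_ignoring_blocks": f_ignoring_blocks,
    }


result = rcbd_anova(yields)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With these invented figures the block-to-block variation is large, so the treatment F ratio is much larger when blocks have their own line in the table than when they are folded into the residual. The second ratio is shown only for contrast; for data from a blocked experiment it is not a valid alternative analysis.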
The Statistician and the Computer
This paper reviews the impact of the computer on the analysis and interpretation of data. It suggests the need for professional statisticians to recognize that almost all future analysis of data will be carried out by non-statisticians with the aid of statistical program packages. Therefore, the emphasis of statistical training for scientists, engineers, administrators and decision-makers should be on the design of data collection and the choice of appropriate methods of analysis. Both in the teaching of statistics and in the development of computer programs for statistical analysis there are important and urgent tasks to be addressed by professional statisticians.
Multivariate analysis of a reference collection of elm leaves
Analysis of leaf measurements of a reference collection of elm leaves suggests that the leaves, collected from trees which were selected by R. H. Richens as spanning the range of elm in Europe, fall into two primary groups. The first of these groups is made up of the representatives of the Wych elm (Ulmus glabra), which is known to be a distinct species. The second, larger, group is made up of an assemblage of subgroups that represent English elm (Ulmus procera), Ulmus minor and hybrids between Ulmus minor and Ulmus glabra. A minimum variance cluster analysis suggests that these subgroups are reasonably distinct, and discriminant analysis, logistic regression and a genetic algorithm are used to help identify the subgroups.
The Importance of Exploratory Data-Analysis Before the Use of Sophisticated Procedures
This note summarises a re-analysis of a set of data used as an example of the fitting of a reduced-rank multivariate regression in joint toxicity experiments. It illustrates the point that exploratory data analysis should always be used before embarking on more complex statistical procedures.