Chapter 14
(AST301) Design and Analysis of Experiments II
14 Nested and Split-Plot Designs
This chapter introduces two important types of experimental designs, the nested design and the split-plot design.
Both of these designs find reasonably widespread application in the industrial use of designed experiments.
They also frequently involve one or more random factors, and so some of the concepts introduced in Chapter 13 will find application here.
14.1 The Two-Stage Nested Design
In a nested design, the levels of one factor ($B$) are similar to but not identical to each other at different levels of another factor ($A$). Factor $B$ is then said to be nested under factor $A$.
Consider a company that purchases material from three suppliers
- The material comes in batches
- Is the purity of the material uniform?
Experimental design
- Select four batches at random from each supplier
- Make three purity determinations from each batch
If this were a factorial, then batch 1 would always refer to the same batch, batch 2 would always refer to the same batch, and so on. This is clearly not the case because the batches from each supplier are unique for that particular supplier.
Sometimes we may not know whether a factor is crossed in a factorial arrangement or nested. If the levels of the factor can be renumbered arbitrarily as in Figure 14.2, then the factor is nested.
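The renumbering idea can be illustrated with a small check on how the data are labelled. This is only a sketch: the function name `is_nested` and the labelling scheme are hypothetical and not part of the notes. If each physical batch is given its own identifier, every batch label occurs under exactly one supplier, which is what nesting means.

```python
def is_nested(pairs):
    """Return True if every B label occurs under exactly one level of A.

    `pairs` is an iterable of (a_level, b_label) tuples.
    """
    parent = {}
    for a_level, b_label in pairs:
        # setdefault records the first A level seen for this B label
        if parent.setdefault(b_label, a_level) != a_level:
            return False  # same B label appears under two A levels
    return True


# Batches numbered 1..4 within each of 3 suppliers: labels are reused
reused = [(s, b) for s in (1, 2, 3) for b in (1, 2, 3, 4)]
# Same layout, but each physical batch gets its own identifier (1..12)
unique = [(s, (s - 1) * 4 + b) for s in (1, 2, 3) for b in (1, 2, 3, 4)]

print(is_nested(reused))  # labels alone make the layout look crossed
print(is_nested(unique))  # 12 distinct batches, each under one supplier
```

The two labellings describe the same experiment. Only the second makes it obvious that "batch 1" of supplier 1 has nothing to do with "batch 1" of supplier 2.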
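Statistical Analysis
- Statistical model for two-stage nested design
$$
y_{ijk} = \mu + \tau_i + \beta_{j(i)} + \epsilon_{(ij)k}, \qquad i = 1, \ldots, a; \; j = 1, \ldots, b; \; k = 1, \ldots, n
$$
The notation $j(i)$ indicates that the $j$th level of factor $B$ is nested under the $i$th level of factor $A$.
- Factors $A$ and $B$ could be fixed and/or random
- This is a balanced nested design, as there is an equal number of levels of $B$ within each level of $A$ and an equal number of replicates.
- No $AB$ interaction term appears in the model, because each level of $B$ occurs within only one level of $A$.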
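Decomposition of sum of squares
$$
\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk}-\bar{y}_{...})^2
= bn\sum_{i=1}^{a}(\bar{y}_{i..}-\bar{y}_{...})^2
+ n\sum_{i=1}^{a}\sum_{j=1}^{b}(\bar{y}_{ij.}-\bar{y}_{i..})^2
+ \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}(y_{ijk}-\bar{y}_{ij.})^2
$$
that is, $SS_T = SS_A + SS_{B(A)} + SS_E$.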
What are the corresponding degrees of freedom and expressions of mean squares?
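If the errors are NID$(0, \sigma^2)$, each sum of squares in the decomposition, divided by its degrees of freedom, gives independently distributed mean squares, and the ratio of any two of these mean squares follows an $F$ distribution.
The appropriate statistics for testing the effects of factors $A$ and $B$ depend on whether $A$ and $B$ are fixed or random.
If factors $A$ and $B$ are fixed, we assume that $\sum_{i=1}^{a}\tau_i = 0$ and $\sum_{j=1}^{b}\beta_{j(i)} = 0$ for $i = 1, \ldots, a$.
That is, the $A$ treatment effects sum to zero, and the $B$ treatment effects sum to zero within each level of $A$.
If $A$ and $B$ are random, we assume that $\tau_i$ is NID$(0, \sigma_\tau^2)$ and $\beta_{j(i)}$ is NID$(0, \sigma_\beta^2)$.
Mixed models with $A$ fixed and $B$ random are also widely encountered.
The expected mean squares for these three situations are given in the next table.
$$
\begin{array}{llll}
\hline
E(MS) & A \text{ fixed}, B \text{ fixed} & A \text{ fixed}, B \text{ random} & A \text{ random}, B \text{ random} \\
\hline
E(MS_A) & \sigma^2 + \dfrac{bn\sum\tau_i^2}{a-1} & \sigma^2 + n\sigma_\beta^2 + \dfrac{bn\sum\tau_i^2}{a-1} & \sigma^2 + n\sigma_\beta^2 + bn\sigma_\tau^2 \\
E(MS_{B(A)}) & \sigma^2 + \dfrac{n\sum\sum\beta_{j(i)}^2}{a(b-1)} & \sigma^2 + n\sigma_\beta^2 & \sigma^2 + n\sigma_\beta^2 \\
E(MS_E) & \sigma^2 & \sigma^2 & \sigma^2 \\
\hline
\end{array}
$$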
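- If the levels of $A$ and $B$ are fixed, $H_0: \tau_i = 0$ is tested by $MS_A/MS_E$ and $H_0: \beta_{j(i)} = 0$ is tested by $MS_{B(A)}/MS_E$
- If $A$ is a fixed factor and $B$ is random, $H_0: \tau_i = 0$ is tested by $MS_A/MS_{B(A)}$ and $H_0: \sigma_\beta^2 = 0$ is tested by $MS_{B(A)}/MS_E$
- Finally, if both $A$ and $B$ are random factors, $H_0: \sigma_\tau^2 = 0$ is tested by $MS_A/MS_{B(A)}$ and $H_0: \sigma_\beta^2 = 0$ is tested by $MS_{B(A)}/MS_E$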
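The choice of denominator in the three cases above can be written as a small helper. This is a minimal sketch, not a routine from any library: the function name `nested_f_ratios`, the `model` labels, and the mean squares in the example are all hypothetical.

```python
def nested_f_ratios(ms_a, ms_b_in_a, ms_e, model):
    """Return the F ratios for A and B(A) in a two-stage nested design.

    model: "fixed"  (A and B fixed),
           "mixed"  (A fixed, B random),
           "random" (A and B random).
    """
    if model not in ("fixed", "mixed", "random"):
        raise ValueError("model must be 'fixed', 'mixed', or 'random'")
    # A is tested against MS_E only when B is fixed; otherwise against MS_B(A)
    denom_a = ms_e if model == "fixed" else ms_b_in_a
    # B(A) is tested against MS_E in all three cases
    return {"A": ms_a / denom_a, "B(A)": ms_b_in_a / ms_e}


# Hypothetical mean squares, chosen only to make the arithmetic easy to follow
ms_a, ms_b_in_a, ms_e = 9.0, 6.0, 2.0
for model in ("fixed", "mixed", "random"):
    print(model, nested_f_ratios(ms_a, ms_b_in_a, ms_e, model))
```

With the same mean squares, the ratio for $A$ drops from 4.5 to 1.5 once $B$ is treated as random, which is why the fixed or random status of the nested factor matters for the conclusion about $A$.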
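Computing formulas for the sums of squares are given below, where a dot in a subscript denotes the total over that index:
$$
SS_A = \frac{1}{bn}\sum_{i=1}^{a} y_{i..}^2 - \frac{y_{...}^2}{abn}
$$
$$
SS_{B(A)} = \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}^2 - \frac{1}{bn}\sum_{i=1}^{a} y_{i..}^2
$$
$$
SS_E = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}^2 - \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}^2
$$
$$
SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{...}^2}{abn}
$$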
Example 14.1
Consider a company that buys raw material in batches from three different suppliers. The purity of this raw material varies considerably, which causes problems in manufacturing the finished product. We wish to determine whether the variability in purity is attributable to differences between the suppliers. Four batches of raw material are selected at random from each supplier, and three determinations of purity are made on each batch.
This is, of course, a two-stage nested design. The data, after coding by subtracting 93, are shown below.
The sums of squares are computed as follows
- There is no difference in purity among suppliers, but significant difference in purity among batches (within suppliers)
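The computation behind this conclusion can be sketched in plain Python. Two assumptions are made here and should be checked against the original table: the coded data values are transcribed from the textbook version of Example 14.1 (the data table is not reproduced in these notes), and suppliers are treated as fixed with batches random. The names `purity` and `nested_anova` are hypothetical.

```python
# Coded purity data (purity - 93): purity[supplier][batch] = three determinations.
# ASSUMPTION: values transcribed from the textbook version of Example 14.1,
# because the data table is not reproduced in these notes.
purity = [
    [[1, -1, 0], [-2, -3, -4], [-2, 0, 1], [1, 4, 0]],   # supplier 1
    [[1, -2, -3], [0, 4, 2], [-1, 0, -2], [0, 3, 2]],    # supplier 2
    [[2, 4, 0], [-2, 0, 2], [1, -1, 2], [3, 2, 1]],      # supplier 3
]


def nested_anova(y):
    """Sums of squares, df, and mean squares for a balanced two-stage nested design."""
    a, b, n = len(y), len(y[0]), len(y[0][0])
    cell_totals = [[sum(cell) for cell in row] for row in y]      # y_ij.
    a_totals = [sum(row) for row in cell_totals]                  # y_i..
    correction = sum(a_totals) ** 2 / (a * b * n)                 # y_...^2 / abn
    raw = sum(v * v for row in y for cell in row for v in cell)
    between_a = sum(t * t for t in a_totals) / (b * n)
    between_cells = sum(t * t for row in cell_totals for t in row) / n
    ss = {"A": between_a - correction,
          "B(A)": between_cells - between_a,
          "E": raw - between_cells}
    df = {"A": a - 1, "B(A)": a * (b - 1), "E": a * b * (n - 1)}
    ms = {k: ss[k] / df[k] for k in ss}
    return ss, df, ms


ss, df, ms = nested_anova(purity)
for source in ("A", "B(A)", "E"):
    print(f"{source:5s} SS={ss[source]:7.2f} df={df[source]:2d} MS={ms[source]:5.2f}")

# Suppliers fixed, batches random: test A against B(A), and B(A) against error
f_suppliers = ms["A"] / ms["B(A)"]
f_batches = ms["B(A)"] / ms["E"]
print(f"F(suppliers) = {f_suppliers:.2f}, F(batches) = {f_batches:.2f}")
```

Under these assumptions the supplier ratio is about 0.97 on (2, 9) degrees of freedom and the batch ratio is about 2.94 on (9, 24) degrees of freedom. Compared with the tabulated 5% points (about 4.26 and 2.30), this agrees with the conclusion stated above.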
Diagnostic checking
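For the model
$$
y_{ijk} = \mu + \tau_i + \beta_{j(i)} + \epsilon_{(ij)k},
$$
the estimates of the parameters are
$$
\hat{\mu} = \bar{y}_{...}, \qquad \hat{\tau}_i = \bar{y}_{i..} - \bar{y}_{...}, \qquad \hat{\beta}_{j(i)} = \bar{y}_{ij.} - \bar{y}_{i..}
$$
The fitted model
$$
\hat{y}_{ijk} = \hat{\mu} + \hat{\tau}_i + \hat{\beta}_{j(i)} = \bar{y}_{ij.}
$$
The residuals
$$
e_{ijk} = y_{ijk} - \hat{y}_{ijk} = y_{ijk} - \bar{y}_{ij.}
$$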
ANOVA indicates that there is statistically significant batch-to-batch variability. But, is the variability within batches the same for all suppliers? The plot of residuals versus supplier can help us answer this.
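A minimal sketch of how the residuals could be grouped by supplier before plotting is given here. The residual is taken as each observation minus its batch mean, $e_{ijk} = y_{ijk} - \bar{y}_{ij.}$. The function name and the toy data are hypothetical and serve only to show the grouping.

```python
from statistics import mean


def residuals_by_level_of_a(y):
    """Residuals e_ijk = y_ijk - ybar_ij., grouped by level i of factor A."""
    grouped = []
    for row in y:                      # one level of A (e.g. one supplier)
        res = []
        for cell in row:               # one level of B within it (e.g. one batch)
            cell_mean = mean(cell)
            res.extend(v - cell_mean for v in cell)
        grouped.append(res)
    return grouped


# Hypothetical toy data: 2 suppliers x 2 batches x 2 determinations
toy = [[[1, 3], [2, 2]], [[0, 4], [5, 7]]]
grouped = residuals_by_level_of_a(toy)
for i, res in enumerate(grouped, start=1):
    spread = max(res) - min(res)
    print(f"supplier {i}: residuals {res}, range {spread}")
```

A clearly wider spread of residuals for one supplier than for the others would suggest that the within-batch variability is not the same for all suppliers.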
Estimates of Variance Components
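For the random effects case, the analysis of variance method can be used to estimate the variance components $\sigma^2$, $\sigma_\beta^2$, and $\sigma_\tau^2$:
$$
\hat{\sigma}^2 = MS_E, \qquad \hat{\sigma}_\beta^2 = \frac{MS_{B(A)} - MS_E}{n}, \qquad \hat{\sigma}_\tau^2 = \frac{MS_A - MS_{B(A)}}{bn}
$$
Many applications of nested designs involve a mixed model, with the main factor ($A$) fixed and the nested factor ($B$) random.
We estimate the variance components as
$$
\hat{\sigma}^2 = MS_E, \qquad \hat{\sigma}_\beta^2 = \frac{MS_{B(A)} - MS_E}{n}
$$
From the analysis in Example 14.1 (in Table 14.4), we know that the supplier effects $\tau_i$ do not differ significantly from zero, whereas the batch-to-batch variance component $\sigma_\beta^2$ is greater than zero.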
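The analysis of variance estimators can be sketched as a small function. The mean squares used in the example are assumed, rounded values intended to correspond to Table 14.4, which is not reproduced in these notes. The function name `variance_components` and its arguments are hypothetical.

```python
def variance_components(ms_a, ms_b_in_a, ms_e, b, n, a_is_random=True):
    """ANOVA-method estimates for a balanced two-stage nested design.

    b = levels of B within each level of A, n = replicates per level of B.
    """
    est = {
        "sigma2": ms_e,                          # error variance
        "sigma2_beta": (ms_b_in_a - ms_e) / n,   # variance of B within A
    }
    if a_is_random:
        # Only meaningful when A is a random factor
        est["sigma2_tau"] = (ms_a - ms_b_in_a) / (b * n)
    return est


# ASSUMPTION: rounded mean squares taken to match Table 14.4 (not shown here);
# suppliers fixed, batches random, b = 4 batches, n = 3 determinations.
est = variance_components(7.53, 7.77, 2.64, b=4, n=3, a_is_random=False)
print({name: round(value, 2) for name, value in est.items()})
```

The method of moments does not constrain the estimates to be non-negative. If $MS_{B(A)} < MS_E$, or $MS_A < MS_{B(A)}$ in the random case, the corresponding estimate comes out negative, and this sketch returns it unchanged.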
14.2 The General $m$-Stage Nested Design
Suppose a foundry wishes to investigate the hardness of two different formulations of a metal alloy. Three heats of each alloy formulation are prepared, two ingots are selected at random from each heat for testing, and two hardness measurements are made on each ingot. In this experiment, heats are nested under the levels of the factor alloy formulation, and ingots are nested under the levels of the factor heats. Thus, this is a three-stage nested design with two replicates.
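The model for the general three-stage nested design is
$$
y_{ijkl} = \mu + \tau_i + \beta_{j(i)} + \gamma_{k(ij)} + \epsilon_{(ijk)l}, \qquad i = 1, \ldots, a; \; j = 1, \ldots, b; \; k = 1, \ldots, c; \; l = 1, \ldots, n
$$
where $\tau_i$ is the effect of the $i$th alloy formulation, $\beta_{j(i)}$ is the effect of the $j$th heat within the $i$th alloy, $\gamma_{k(ij)}$ is the effect of the $k$th ingot within the $j$th heat and $i$th alloy, and $\epsilon_{(ijk)l}$ is the usual error term.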
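The degrees of freedom of a balanced nested design follow a simple pattern: each stage contributes (number of parent cells) times (levels minus one). The sketch here applies this to the foundry experiment, with 2 formulations, 3 heats, 2 ingots, and 2 measurements. The function name `nested_df` is hypothetical.

```python
from math import prod


def nested_df(levels, n):
    """Degrees of freedom for a balanced nested design.

    levels = [a, b, c, ...] gives the number of levels at each stage
    (each nested within the previous one); n = replicates per innermost cell.
    Returns (df per stage, error df, total df).
    """
    stage_df = []
    for stage, size in enumerate(levels):
        parents = prod(levels[:stage])       # cells at the stage above (1 at the top)
        stage_df.append(parents * (size - 1))
    error_df = prod(levels) * (n - 1)
    total_df = prod(levels) * n - 1
    return stage_df, error_df, total_df


stage_df, error_df, total_df = nested_df([2, 3, 2], n=2)
print(stage_df, error_df, total_df)  # [1, 4, 6] 12 23
```

The stage and error degrees of freedom add up to the total, $1 + 4 + 6 + 12 = 23 = abcn - 1$, which is a quick consistency check on any nested ANOVA table.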
Exercise
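Given a three-stage nested design
$$
y_{ijkl} = \mu + \tau_i + \beta_{j(i)} + \gamma_{k(ij)} + \epsilon_{(ijk)l}
$$
Find degrees of freedom and expressions of expected mean squares for the following situations:
- All the factors $A$, $B$, and $C$ are fixed
- Factors $A$ and $B$ are fixed, and $C$ is random
- Factor $A$ is fixed, and $B$ and $C$ are random
- All the factors $A$, $B$, and $C$ are random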
14.3 Designs with Both Nested and Factorial Factors
Occasionally in a multifactor experiment, some factors are arranged in a factorial layout and other factors are nested.
We sometimes call these designs nested–factorial designs. The statistical analysis of one such design with three factors is illustrated in the following example.
Assume that fixtures and layouts are fixed and operators are random. This gives a mixed model (use the restricted form).
14.4 Split-plot design
The split-plot is a multifactor experiment where it is not possible to completely randomize the order of the runs
A paper manufacturer is interested in examining the effect of pulp preparation method and cooking temperature on the tensile strength of the paper
- Three pulp preparation methods
- Four different temperatures
- Each replicate requires 12 runs
- The experimenters want to use three replicates
- How many batches of pulp are required?
- Pulp preparation methods is a hard-to-change factor
Consider an alternate experimental design:
- In replicate 1: select a pulp preparation method, prepare a batch
- Divide the batch into four sections or samples, and assign one of the temperature levels to each
- Repeat for each pulp preparation method
- Conduct replicate 2 similarly
- Conduct replicate 3 similarly
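The procedure above can be sketched as a randomization routine. This is an illustration under stated assumptions rather than the authors' procedure: the order of the preparation methods is randomized within each replicate, and the temperatures are randomized within each batch. The level labels (`M1`..`M3`, `T1`..`T4`) and the function name are placeholders.

```python
import random


def split_plot_layout(methods, temperatures, replicates, seed=None):
    """Return a list of (replicate, method, temperature order) tuples, one per batch."""
    rng = random.Random(seed)
    layout = []
    for rep in range(1, replicates + 1):
        method_order = list(methods)
        rng.shuffle(method_order)            # order of batches within the replicate
        for method in method_order:
            temp_order = list(temperatures)
            rng.shuffle(temp_order)          # temperatures assigned to the four sections
            layout.append((rep, method, temp_order))
    return layout


layout = split_plot_layout(["M1", "M2", "M3"], ["T1", "T2", "T3", "T4"],
                           replicates=3, seed=1)
for rep, method, temps in layout:
    print(rep, method, temps)
```

Each printed row is one batch of pulp. Note that the temperatures are only shuffled inside a batch, so the 12 runs of a replicate are never put in a single random order.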
Each replicate (sometimes called a block) has been divided into three parts, called the whole plots
Pulp preparation method is the whole plot treatment
Each whole plot has been divided into four subplots or split-plots
Temperature is the subplot treatment
Generally, the hard-to-change factor is assigned to the whole plots
This design requires only 9 batches of pulp (assuming three replicates)
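The saving in batches is simple arithmetic, sketched here with a hypothetical helper. It assumes that a completely randomized factorial would need a fresh batch of pulp for every run, whereas the split-plot design needs one batch per whole plot.

```python
def batches_required(n_methods, n_temperatures, n_replicates):
    """Return (batches for a completely randomized design, batches for the split-plot)."""
    # ASSUMPTION: a completely randomized design uses one new batch per run
    completely_randomized = n_methods * n_temperatures * n_replicates
    # Split-plot: one batch per preparation method per replicate
    split_plot = n_methods * n_replicates
    return completely_randomized, split_plot


crd, sp = batches_required(n_methods=3, n_temperatures=4, n_replicates=3)
print(crd, sp)  # 36 9
```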
This is not a randomized block design with three levels of pulp preparation and four levels of cooking temperature (why?)
To be a randomized block design, the order of the runs within a block would have to be completely randomized. That is not the case in our example, where we only randomize the order of the cooking temperatures within each pulp preparation method.
This design is known as a split-plot design: each replicate (block) is divided into three whole plots (pulp preparation methods), and each whole plot is divided into four subplots (cooking temperatures).
Because the whole-plot treatments are confounded with the whole plots, whereas the subplot treatments are not confounded, it is best to assign the treatment of greatest interest to the subplots, if possible.
Split-plot design can be viewed as two experiments combined or superimposed on each other
One experiment has the whole plot factor applied to the large experimental units (factor whose level is hard to change) and the other experiment has the subplot factor applied to the smaller units (factor whose level is easy to change)
In general, split plots within the same whole plot will be more similar to each other than split plots in different whole plots.
Within-whole-plot comparisons will therefore generally be more precise than between-whole-plot comparisons, i.e., estimates of the subplot factor ($B$) and interaction ($AB$) effects will be more precise than estimates of the whole-plot factor ($A$) effects.
If the levels of all factors are easy to change, split-plot designs are recommended only when there is considerably less interest in one or more of the treatment factors.
Analysis of Split-plot design
In the statistical analysis of split-plot designs, we must take into account the presence of two different sizes of experimental units used to test the effect of whole plot treatment and split-plot treatment.
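Factor $A$ effects are estimated using the whole plots, and factor $B$ and the $AB$ interaction effects are estimated using the split plots.
The linear model for the split-plot design is
$$
y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \gamma_k + (\tau\gamma)_{ik} + (\beta\gamma)_{jk} + (\tau\beta\gamma)_{ijk} + \epsilon_{ijk}, \qquad i = 1, \ldots, r; \; j = 1, \ldots, a; \; k = 1, \ldots, b
$$
where $\tau_i$, $\beta_j$, and $(\tau\beta)_{ij}$ represent the whole plot and correspond, respectively, to replicates, main treatments (factor $A$), and whole-plot error (replicates $\times A$), and $\gamma_k$, $(\tau\gamma)_{ik}$, $(\beta\gamma)_{jk}$, and $(\tau\beta\gamma)_{ijk}$ represent the subplot and correspond, respectively, to the subplot treatment (factor $B$), the replicates $\times B$ and $AB$ interactions, and the subplot error (replicates $\times AB$).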
The expected mean squares for the split-plot design, with replicates random and main treatments and subplot treatments fixed, are shown below
Note from Table 14.18 that the subplot error (4.24) is less than the whole-plot error (9.07).
This is the usual case in split-plot designs because the subplots are generally more homogeneous than the whole plots.
This results in two different error structures for the experiment.
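The two error structures can be made concrete by listing the degrees of freedom and the denominator used for each test. The sketch assumes the model with replicates random and both treatment factors fixed, in which $A$ is tested against the whole-plot error, $B$ against the replicates $\times B$ interaction, and $AB$ against the subplot error. The function name is hypothetical.

```python
def split_plot_skeleton(r, a, b):
    """Rows of (source, df, denominator used in the F test or None)."""
    return [
        ("Replicates",                         r - 1,                   None),
        ("A (whole-plot treatment)",           a - 1,                   "Whole-plot error"),
        ("Whole-plot error (Replicates x A)",  (r - 1) * (a - 1),       None),
        ("B (subplot treatment)",              b - 1,                   "Replicates x B"),
        ("Replicates x B",                     (r - 1) * (b - 1),       None),
        ("AB",                                 (a - 1) * (b - 1),       "Subplot error"),
        ("Subplot error (Replicates x AB)",    (r - 1) * (a - 1) * (b - 1), None),
    ]


# Paper example: r = 3 replicates, a = 3 preparation methods, b = 4 temperatures
rows = split_plot_skeleton(r=3, a=3, b=4)
for source, df, denom in rows:
    tested = f"tested against {denom}" if denom else ""
    print(f"{source:36s} df={df:2d}  {tested}")
print("Total df:", sum(df for _, df, _ in rows))  # 35 = 3*3*4 - 1
```

The whole-plot error has only 4 degrees of freedom here, against 12 for the subplot error, which is a second reason (besides the larger mean square) why tests on the whole-plot factor are less sensitive.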
Because the subplot treatments are compared with greater precision, it is preferable to assign the treatment we are most interested in to the subplots, if possible.
The split-plot design has an agricultural heritage, with the whole plots usually being large areas of land and the subplots being smaller areas of land within the large areas.
For example, several varieties of a crop could be planted in different fields (whole plots), one variety to a field. Then each field could be divided into, say, four subplots, and each subplot could be treated with a different type of fertilizer.
Here the crop varieties are the main treatments and the different fertilizers are the subtreatments.