Chapter 5
(AST301) Design and Analysis of Experiments II
5 Introduction to Factorial Designs
5.1 Basic Definitions and Principles
Factorial design
Factorial designs deal with the experiments that involve the study of two or more factors.
In factorial design, all possible combinations of the levels of the factors are investigated in each complete replication.
- E.g. if there are $a$ levels of factor $A$ and $b$ levels of factor $B$, each replicate contains all $ab$ treatment combinations.
The effect of a factor is defined to be the change in response produced by a change in the level of the factor.
This is frequently called a main effect because it refers to the primary factors of interest in the experiment.
Example 5.1
For example, consider the simple experiment in Figure 5.1.
This is a two-factor factorial experiment with both design factors at two levels. We have called these levels “low” and “high” and denoted them “−” and “+,” respectively.
The main effect of factor $A$ in this two-level design can be thought of as the difference between the average response at the low level of $A$ and the average response at the high level of $A$. Numerically, this is
$$A = \frac{40 + 52}{2} - \frac{20 + 30}{2} = 21$$
That is, increasing factor $A$ from the low level to the high level causes an average response increase of 21 units.
Similarly, the main effect of $B$ is
$$B = \frac{30 + 52}{2} - \frac{20 + 40}{2} = 11$$
If the factors appear at more than two levels, the above procedure must be modified because there are other ways to define the effect of a factor. This point is discussed more completely later.
Example 5.2
In some experiments, we may find that the difference in response between the levels of one factor is not the same at all levels of the other factors.
When this occurs, there is an interaction between the factors. For example, consider the two-factor factorial experiment shown in Figure 5.2.
At the low level of factor $B$ (or $B^-$), the $A$ effect is
$$A = 50 - 20 = 30$$
At the high level of factor $B$ (or $B^+$), the $A$ effect is
$$A = 12 - 40 = -28$$
Because the effect of $A$ depends on the level chosen for factor $B$, we see that there is interaction between $A$ and $B$.
The magnitude of the interaction effect is the average difference in these two $A$ effects, or $AB = (-28 - 30)/2 = -29$. Clearly, the interaction is large in this experiment.
- An interaction is the failure of one factor to produce the same effect on the response at different levels of another factor.
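The effect definitions in Examples 5.1 and 5.2 can be checked with a short sketch. The helper name `two_level_effects` is hypothetical, and the four corner responses are assumed to be those shown in Figures 5.1 and 5.2 of Montgomery's text; the Figure 5.1 values reproduce the main effects of 21 and 11 quoted in these notes.

```python
def two_level_effects(y_ll, y_hl, y_lh, y_hh):
    """Return the (A, B, AB) effects for a single-replicate 2x2 factorial.

    Arguments are the responses at (A low, B low), (A high, B low),
    (A low, B high) and (A high, B high), in that order.
    """
    # Main effect: average at the high level minus average at the low level.
    a_effect = (y_hl + y_hh) / 2 - (y_ll + y_lh) / 2
    b_effect = (y_lh + y_hh) / 2 - (y_ll + y_hl) / 2
    # Interaction: half the difference between the A effect at high B
    # and the A effect at low B.
    ab_effect = ((y_hh - y_lh) - (y_hl - y_ll)) / 2
    return a_effect, b_effect, ab_effect


# Assumed corner responses for the two figures (illustrative inputs).
fig_5_1 = two_level_effects(20, 40, 30, 52)
fig_5_2 = two_level_effects(20, 50, 40, 12)

print("Figure 5.1 (A, B, AB):", fig_5_1)
print("Figure 5.2 (A, B, AB):", fig_5_2)
```

With these inputs the interaction in Figure 5.1 is small relative to the main effects, while in Figure 5.2 it dominates them, which is why the main effects alone would be misleading there.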
Illustrating interaction
- These ideas may be illustrated graphically. Figure 5.3 plots the response data in Figure 5.1 against factor $A$ for both levels of factor $B$.
  - Note that the $B^-$ and $B^+$ lines are approximately parallel, indicating a lack of interaction between factors $A$ and $B$.
- Similarly, Figure 5.4 plots the response data in Figure 5.2. Here we see that the $B^-$ and $B^+$ lines are not parallel.
  - This indicates an interaction between factors $A$ and $B$.
There is another way to illustrate the concept of interaction. Suppose that both of our design factors are quantitative (such as temperature, pressure, time).
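Then a regression model representation of the two-factor factorial experiment could be written as
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \epsilon$$
- where $y$ is the response, the $\beta$'s are parameters to be estimated, the variables $x_1$ and $x_2$ are defined on a coded scale from −1 to +1 (the low and high levels of $A$ and $B$), and $x_1 x_2$ represents the interaction between $x_1$ and $x_2$.
The parameter estimates in this regression model turn out to be related to the effect estimates.
For the experiment shown in Figure 5.1 we found the main effects of $A$ and $B$ to be $A = 21$ and $B = 11$.
The least squares estimates of $\beta_1$ and $\beta_2$ are one-half the value of the corresponding main effect (more on this later), so $\hat\beta_1 = 21/2 = 10.5$ and $\hat\beta_2 = 11/2 = 5.5$.
Now suppose that the interaction contribution to this experiment was not negligible, that is, the coefficient $\beta_{12}$ was not small.
Figure 5.6 presents the response surface and contour plot for the model
$$\hat y = 35.5 + 10.5 x_1 + 5.5 x_2 + 8 x_1 x_2$$
- Interaction is a form of curvature in the underlying response surface model for the experiment.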
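The link between effect estimates and regression coefficients can be illustrated with a minimal least squares fit. This sketch uses NumPy and assumes the four corner responses of Figure 5.1 are 20, 40, 30 and 52 (the values that give $A = 21$ and $B = 11$); the variable names are illustrative.

```python
import numpy as np

# Coded factor levels (-1 = low, +1 = high) for the four runs.
x1 = np.array([-1.0, 1.0, -1.0, 1.0])   # factor A
x2 = np.array([-1.0, -1.0, 1.0, 1.0])   # factor B
y = np.array([20.0, 40.0, 30.0, 52.0])  # assumed responses from Figure 5.1

# Model matrix for y = b0 + b1*x1 + b2*x2 + b12*x1*x2.
X = np.column_stack([np.ones(4), x1, x2, x1 * x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2, b12 = beta_hat

print("intercept:", round(b0, 3))
print("b1, b2, b12:", round(b1, 3), round(b2, 3), round(b12, 3))
# Each slope is one-half of the corresponding effect estimate.
print("2*b1, 2*b2:", round(2 * b1, 3), round(2 * b2, 3))
```

Because the coded columns are orthogonal, each coefficient is simply a contrast of the responses divided by four, which is where the "one-half of the effect" relationship comes from.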
5.2 The advantage of factorials
- Two factors both at two levels:
- Two possible designs
One-factor-at-a-time design
Because experimental error is present, it is desirable to take two observations, say, at each treatment combination and estimate the effects of the factors using average responses.
Thus, a total of six observations are required.
Effects of factors $A$ and $B$ can be obtained, but the interaction cannot be calculated.
Two-factor factorial design
Using just four observations, two estimates of the $A$ effect can be made. Similarly, two estimates of the $B$ effect can be made.
These two estimates of each main effect could be averaged to produce average main effects that are just as precise as those from the single-factor experiment.
Main effects of $A$ and $B$ and their interaction can be calculated.
Factorial designs are more efficient than one-factor-at-a-time experiments.
- For this example, the relative efficiency of the factorial design to the one-factor-at-a-time experiment is (6/4) = 1.5.
- Generally, this relative efficiency will increase as the number of factors increases.
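One way to see why the relative efficiency grows is to count runs under a simple assumption: for $k$ two-level factors, a one-factor-at-a-time plan uses $k + 1$ treatment combinations, each replicated $2^{k-1}$ times to match the precision of the $2^k$ factorial. The function below is a sketch under that counting assumption, not a formula stated in these notes.

```python
def relative_efficiency(k):
    """Runs needed by one-factor-at-a-time divided by runs in a 2^k factorial.

    Assumes two-level factors and that each one-factor-at-a-time treatment
    combination is replicated 2**(k - 1) times to match the precision of
    the factorial main-effect estimates.
    """
    factorial_runs = 2 ** k
    one_factor_runs = (k + 1) * 2 ** (k - 1)
    return one_factor_runs / factorial_runs


for k in range(2, 7):
    print(k, "factors:", relative_efficiency(k))
```

For $k = 2$ this gives 6/4 = 1.5, matching the example above, and under the stated assumption the ratio simplifies to $(k + 1)/2$.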
A factorial design is necessary when interactions may be present to avoid misleading conclusions.
Factorial designs allow the effects of a factor to be estimated at several levels of the other factors, yielding conclusions that are valid over a range of experimental conditions.
5.3 Two-Factor factorial design
An Example
The simplest types of factorial designs involve only two factors or sets of treatments.
Factor $A$ has $a$ levels and factor $B$ has $b$ levels, so there are $ab$ treatment combinations in total, and each treatment combination is replicated $n$ times.
Battery life experiment
An engineer is designing a battery for use in a device that will be subjected to some extreme variations in temperature. The only design parameter that he can select at this point is the plate material for the battery, and he has three possible choices. When the device is manufactured and is shipped to the field, the engineer has no control over the temperature extremes that the device will encounter, and he knows from experience that temperature will probably affect the effective battery life. However, temperature can be controlled in the product development laboratory for the purposes of a test.
The engineer decides to test all three plate materials at three temperature levels - 15, 70, and 125°F - because these temperature levels are consistent with the product end-use environment.
Four batteries are tested at each combination of plate material and temperature, and all 36 tests are run in random order.
The design and the resulting observed battery life data (in hours) are given below.
Important questions:
What effects do material type and temperature have on the life of the battery?
Is there a choice of material that would give uniformly long life regardless of temperature?
General notations
$y_{ijk}$ denotes the observed response when factor $A$ is at the $i$th level $(i = 1, \ldots, a)$ and factor $B$ is at the $j$th level $(j = 1, \ldots, b)$ for the $k$th replicate $(k = 1, \ldots, n)$.
The order in which the $abn$ observations are taken is selected at random so that this design is a completely randomized design.
Data layout
Modelling data
The observations in a factorial experiment can be described by a model. There are several ways to write the model for a factorial experiment.
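The means model
$$y_{ijk} = \mu_{ij} + \epsilon_{ijk}, \quad i = 1, \ldots, a; \; j = 1, \ldots, b; \; k = 1, \ldots, n$$
where $\mu_{ij}$ is the mean corresponding to the treatment combination of the $i$th level of factor $A$ and the $j$th level of factor $B$, and $\epsilon_{ijk}$ is the random error term.
The treatment model (or effects model)
$$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk} \tag{5.1}$$
where $\mu$ is the overall mean, $\tau_i$ is the effect of the $i$th level of factor $A$, $\beta_j$ is the effect of the $j$th level of factor $B$, and $(\tau\beta)_{ij}$ is the interaction between $\tau_i$ and $\beta_j$.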
We could also use a regression model as in Section 5.1 (particularly useful when one or more of the factors in the experiment are quantitative).
Throughout most of this chapter we will use the effects model (Equation 5.1) with an illustration of the regression model in Section 5.5.
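In the two-factor factorial, both row and column factors (or treatments), $A$ and $B$, are of equal interest.
Specifically, we are interested in testing hypotheses about the equality of row treatment effects, say
$$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0 \quad \text{vs.} \quad H_1: \text{at least one } \tau_i \neq 0$$
and the equality of column treatment effects, say
$$H_0: \beta_1 = \beta_2 = \cdots = \beta_b = 0 \quad \text{vs.} \quad H_1: \text{at least one } \beta_j \neq 0$$
We are also interested in determining whether row and column treatments interact:
$$H_0: (\tau\beta)_{ij} = 0 \text{ for all } i, j \quad \text{vs.} \quad H_1: \text{at least one } (\tau\beta)_{ij} \neq 0$$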
Statistical analysis of the fixed effects model
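Notations
Let $y_{i..}$ denote the total of all observations under the $i$th level of factor $A$, $y_{.j.}$ the total under the $j$th level of factor $B$, $y_{ij.}$ the total in the $ij$th cell, and $y_{...}$ the grand total. The corresponding averages are
$$\bar y_{i..} = \frac{y_{i..}}{bn}, \quad \bar y_{.j.} = \frac{y_{.j.}}{an}, \quad \bar y_{ij.} = \frac{y_{ij.}}{n}, \quad \bar y_{...} = \frac{y_{...}}{abn}$$
Decomposing total variation
Total corrected sum of squares can be expressed as
$$\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar y_{...})^2 = bn\sum_{i=1}^{a} (\bar y_{i..} - \bar y_{...})^2 + an\sum_{j=1}^{b} (\bar y_{.j.} - \bar y_{...})^2 + n\sum_{i=1}^{a}\sum_{j=1}^{b} (\bar y_{ij.} - \bar y_{i..} - \bar y_{.j.} + \bar y_{...})^2 + \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar y_{ij.})^2$$
or, symbolically,
$$SS_T = SS_A + SS_B + SS_{AB} + SS_E \tag{5.2}$$
- Equation 5.2 is the fundamental ANOVA equation for the two-factor factorial.
The number of degrees of freedom associated with each sum of squares
- $SS_A$: $a - 1$
- $SS_B$: $b - 1$
- $SS_{AB}$: $(a - 1)(b - 1)$
- $SS_E$: $ab(n - 1)$
- $SS_T$: $abn - 1$
- Each sum of squares divided by its degrees of freedom is a mean square, e.g. $MS_A = SS_A/(a - 1)$.
Expected value of the mean squares
$$E(MS_A) = \sigma^2 + \frac{bn\sum_{i=1}^{a}\tau_i^2}{a - 1}, \quad E(MS_B) = \sigma^2 + \frac{an\sum_{j=1}^{b}\beta_j^2}{b - 1}$$
$$E(MS_{AB}) = \sigma^2 + \frac{n\sum_{i=1}^{a}\sum_{j=1}^{b}(\tau\beta)_{ij}^2}{(a - 1)(b - 1)}, \quad E(MS_E) = \sigma^2$$
- Under the null hypotheses of no treatment effects and no interaction, $MS_A$, $MS_B$, $MS_{AB}$, and $MS_E$ all estimate $\sigma^2$. If there is a significant treatment effect, then the corresponding mean square will be larger than $MS_E$.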
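Cochran’s theorem
Let $Z_i \sim NID(0, 1)$ for $i = 1, \ldots, \nu$ and
$$\sum_{i=1}^{\nu} Z_i^2 = Q_1 + Q_2 + \cdots + Q_s$$
where $s \le \nu$ and $Q_i$ has $\nu_i$ degrees of freedom $(i = 1, \ldots, s)$. Then $Q_1, \ldots, Q_s$ are independent chi-square random variables with $\nu_1, \ldots, \nu_s$ degrees of freedom, respectively, if and only if $\nu = \nu_1 + \nu_2 + \cdots + \nu_s$.
Distributions of sum of squares
For the observed response $y_{ijk} \sim N(\mu + \tau_i + \beta_j + (\tau\beta)_{ij}, \sigma^2)$, independently, we have $SS_E/\sigma^2 \sim \chi^2_{ab(n-1)}$.
If $H_0: \tau_i = 0$ for all $i$ is true, then $SS_A/\sigma^2 \sim \chi^2_{a-1}$.
Since $SS_A/\sigma^2 \sim \chi^2_{a-1}$ and $SS_E/\sigma^2 \sim \chi^2_{ab(n-1)}$, and $SS_A$ and $SS_E$ are independent, then
$$F_0 = \frac{MS_A}{MS_E} \sim F_{a-1,\, ab(n-1)}$$
Similarly, under the corresponding null hypotheses,
$$\frac{MS_B}{MS_E} \sim F_{b-1,\, ab(n-1)}, \quad \frac{MS_{AB}}{MS_E} \sim F_{(a-1)(b-1),\, ab(n-1)}$$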
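Analysis of variance table

| Source of variation | Sum of squares | Degrees of freedom | Mean square | $F_0$ |
|---|---|---|---|---|
| $A$ treatments | $SS_A$ | $a - 1$ | $MS_A = SS_A/(a - 1)$ | $MS_A/MS_E$ |
| $B$ treatments | $SS_B$ | $b - 1$ | $MS_B = SS_B/(b - 1)$ | $MS_B/MS_E$ |
| Interaction | $SS_{AB}$ | $(a - 1)(b - 1)$ | $MS_{AB} = SS_{AB}/[(a - 1)(b - 1)]$ | $MS_{AB}/MS_E$ |
| Error | $SS_E$ | $ab(n - 1)$ | $MS_E = SS_E/[ab(n - 1)]$ | |
| Total | $SS_T$ | $abn - 1$ | | |

Manual computing formulas for sum of squares
$$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} y_{ijk}^2 - \frac{y_{...}^2}{abn}$$
$$SS_A = \frac{1}{bn}\sum_{i=1}^{a} y_{i..}^2 - \frac{y_{...}^2}{abn}, \quad SS_B = \frac{1}{an}\sum_{j=1}^{b} y_{.j.}^2 - \frac{y_{...}^2}{abn}$$
$$SS_{\text{Subtotals}} = \frac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}^2 - \frac{y_{...}^2}{abn}$$
$$SS_{AB} = SS_{\text{Subtotals}} - SS_A - SS_B, \quad SS_E = SS_T - SS_{\text{Subtotals}}$$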
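The battery design experiment
Let $y_{ijk}$ denote the observed lifetime of the battery corresponding to the $k$th replication $(k = 1, \ldots, 4)$ of the treatment combination of material type $i$ (treatment $A$, $i = 1, 2, 3$) and temperature $j$ (treatment $B$, $j = 1, 2, 3$).
Consider the effects model
$$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}$$
where $\mu$ is the overall mean, $\tau_i$ and $\beta_j$ are the effects of the $i$th level of factor $A$ and the $j$th level of factor $B$, and $(\tau\beta)_{ij}$ is the interaction between the $i$th level of factor $A$ and the $j$th level of factor $B$.
The random error term $\epsilon_{ijk}$ is assumed to be normally distributed with mean $0$ and constant variance $\sigma^2$.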
The battery design experiment (anova table)
There is a significant interaction between material type and temperature.
The main effects of material type and temperature are also significant.
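The sums of squares behind these conclusions can be reproduced with a short NumPy sketch of the two-factor decomposition. The data array is an assumption: it is the battery life table as given in Montgomery's text, typed in here because the table itself is not shown in these notes. The function name `two_way_anova` is illustrative.

```python
import numpy as np

# life[i, j, k]: material type i, temperature j (15, 70, 125 F), replicate k.
# Assumed values, taken from the battery life table in Montgomery's text.
life = np.array([
    [[130, 155, 74, 180], [34, 40, 80, 75], [20, 70, 82, 58]],
    [[150, 188, 159, 126], [136, 122, 106, 115], [25, 70, 58, 45]],
    [[138, 110, 168, 160], [174, 120, 150, 139], [96, 104, 82, 60]],
], dtype=float)


def two_way_anova(y):
    """Sums of squares, degrees of freedom, mean squares and F ratios
    for a balanced two-factor fixed effects design with n replicates."""
    a, b, n = y.shape
    grand = y.mean()
    row = y.mean(axis=(1, 2))   # factor A level averages
    col = y.mean(axis=(0, 2))   # factor B level averages
    cell = y.mean(axis=2)       # cell averages
    ss = {
        "A": b * n * np.sum((row - grand) ** 2),
        "B": a * n * np.sum((col - grand) ** 2),
        "AB": n * np.sum((cell - row[:, None] - col[None, :] + grand) ** 2),
        "E": np.sum((y - cell[:, :, None]) ** 2),
    }
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1), "E": a * b * (n - 1)}
    ms = {key: ss[key] / df[key] for key in ss}
    f0 = {key: ms[key] / ms["E"] for key in ("A", "B", "AB")}
    return ss, df, ms, f0


ss, df, ms, f0 = two_way_anova(life)
ss_total = np.sum((life - life.mean()) ** 2)

for key in ("A", "B", "AB", "E"):
    print(key, round(ss[key], 2), df[key], round(ms[key], 2))
print("F ratios:", {key: round(val, 2) for key, val in f0.items()})
print("SS_T:", round(ss_total, 2))
```

Each $F$ ratio would then be compared with the appropriate upper percentage point of the $F$ distribution with $ab(n - 1) = 27$ denominator degrees of freedom.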
The battery design experiment (interpreting results)
- The significant interaction is indicated by the lack of parallelism of the lines
- In general, longer life is attained at the low temperature, regardless of material type.
- Changing from low to intermediate temperature, battery life with material type 3 increases, whereas it decreases for types 1 and 2.
- From intermediate to high temperature, battery life decreases for material types 2 and 3, and is essentially unchanged for type 1.
- Material type 3 seems to give the best result if we want less loss of effective life as the temperature changes.
The battery design experiment (multiple comparisons)
One of the goals of the experiment is to identify the best treatment combination. In two-factor factorial experiment, significance of interaction plays an important role in selecting the best treatment combination.
When interaction is not significant, multiple comparison methods can be used to identify the best level for each factor separately.
When interaction is significant, the best level of one factor needs to be identified at each level of the other factor, e.g. comparisons between the means of factor $A$ can be obtained for a specific level of factor $B$ by applying Tukey’s test.
Suppose we are interested in detecting differences among the means of the three material types.
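Because interaction is significant, we make this comparison at just one level of temperature, say level 2 (70°F).
The three material type averages at 70°F arranged in ascending order are
$$\bar y_{12.} = 57.25 \ (\text{material type 1}), \quad \bar y_{22.} = 119.75 \ (\text{material type 2}), \quad \bar y_{32.} = 145.75 \ (\text{material type 3})$$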
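A minimal sketch of the Tukey comparison at 70°F follows. Several inputs are assumptions rather than values stated in these notes: the cell averages (57.25, 119.75, 145.75), the error mean square $MS_E = 675.21$ with 27 degrees of freedom, and the tabulated studentized range value $q_{0.05}(3, 27) \approx 3.50$. The critical difference is $T_{0.05} = q_{0.05}(3, 27)\sqrt{MS_E/n}$.

```python
import math

# Assumed inputs (see the lead-in): cell averages at 70 F, MS_E, and the
# tabulated studentized range percentage point q_0.05(3, 27).
cell_means_70f = {"material 1": 57.25, "material 2": 119.75, "material 3": 145.75}
ms_error = 675.21
n = 4            # replicates per cell
q_crit = 3.50

# Tukey critical difference for comparing two cell averages.
t_crit = q_crit * math.sqrt(ms_error / n)

# Compare every pair, larger mean listed first.
names = sorted(cell_means_70f, key=cell_means_70f.get)
significant = {}
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        low, high = names[i], names[j]
        diff = cell_means_70f[high] - cell_means_70f[low]
        significant[(high, low)] = diff > t_crit
        print(f"{high} vs {low}: diff = {diff:.2f}, significant = {diff > t_crit}")

print("critical difference:", round(t_crit, 2))
```

Under these assumptions, material types 2 and 3 do not differ significantly at 70°F, while both give significantly longer life than material type 1.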
Model adequacy checking
Before making the conclusions from the analysis of variance, the adequacy of the underlying model should be checked using residual analysis (e.g. checking normality, independence, constant variance, etc.).
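The residuals for the two-factor factorial model are $e_{ijk} = y_{ijk} - \hat y_{ijk}$. Because the fitted value is the cell average, $\hat y_{ijk} = \bar y_{ij.}$, this becomes
$$e_{ijk} = y_{ijk} - \bar y_{ij.}$$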
Different tools of residual analysis:
- q-q normal plot of residuals
- Plot of residuals against fitted values
- Plot of residuals against different factors separately
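As a sketch of how the residuals feeding these plots could be computed, the snippet below subtracts each cell average from its observations. The battery life array is an assumed input (the data table from Montgomery's text, which is not reproduced in these notes), and the variable names are illustrative.

```python
import numpy as np

# life[i, j, k]: material type i, temperature j (15, 70, 125 F), replicate k.
# Assumed values from the battery life table in Montgomery's text.
life = np.array([
    [[130, 155, 74, 180], [34, 40, 80, 75], [20, 70, 82, 58]],
    [[150, 188, 159, 126], [136, 122, 106, 115], [25, 70, 58, 45]],
    [[138, 110, 168, 160], [174, 120, 150, 139], [96, 104, 82, 60]],
], dtype=float)

# Fitted value for every observation is its cell average.
fitted = life.mean(axis=2, keepdims=True)
residuals = life - fitted

# Residuals sum to zero within every cell by construction.
print(np.round(residuals.sum(axis=2), 10))

# Locate the residual that is largest in absolute value.
idx = np.unravel_index(np.argmax(np.abs(residuals)), residuals.shape)
print("largest residual:", residuals[idx], "at (material, temperature, replicate)", idx)

# Flattened pairs ready for a residuals-versus-fitted plot.
pairs = np.column_stack([np.broadcast_to(fitted, life.shape).ravel(), residuals.ravel()])
```

The `pairs` array can be passed to any plotting tool; the sorted `residuals.ravel()` values are what a normal q-q plot would use.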
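Estimating model parameters
The effects model for two-factor factorial is
$$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk} \tag{5.3}$$
- The parameters may be estimated by least squares. Because the model has $1 + a + b + ab$ parameters to be estimated, there are $1 + a + b + ab$ normal equations.
Using the method of Section 3.9, it is not difficult to show that the normal equations are
$$\mu: \quad abn\hat\mu + bn\sum_{i=1}^{a}\hat\tau_i + an\sum_{j=1}^{b}\hat\beta_j + n\sum_{i=1}^{a}\sum_{j=1}^{b}\widehat{(\tau\beta)}_{ij} = y_{...} \tag{5.14a}$$
$$\tau_i: \quad bn\hat\mu + bn\hat\tau_i + n\sum_{j=1}^{b}\hat\beta_j + n\sum_{j=1}^{b}\widehat{(\tau\beta)}_{ij} = y_{i..}, \quad i = 1, \ldots, a \tag{5.14b}$$
$$\beta_j: \quad an\hat\mu + n\sum_{i=1}^{a}\hat\tau_i + an\hat\beta_j + n\sum_{i=1}^{a}\widehat{(\tau\beta)}_{ij} = y_{.j.}, \quad j = 1, \ldots, b \tag{5.14c}$$
$$(\tau\beta)_{ij}: \quad n\hat\mu + n\hat\tau_i + n\hat\beta_j + n\widehat{(\tau\beta)}_{ij} = y_{ij.}, \quad i = 1, \ldots, a; \; j = 1, \ldots, b \tag{5.14d}$$
The effects model (Equation 5.3) is an overparameterized model.
Notice that the $a$ equations in Equation 5.14b sum to Equation 5.14a and that the $b$ equations of Equation 5.14c sum to Equation 5.14a.
Also, summing Equation 5.14d over $j$ for a particular $i$ will give Equation 5.14b, and summing Equation 5.14d over $i$ for a particular $j$ will give Equation 5.14c.
Therefore, there are $a + b + 1$ linear dependencies in this system of equations, and no unique solution will exist.
In order to obtain a solution, we impose the constraints
$$\sum_{i=1}^{a}\hat\tau_i = 0 \tag{5.15a}$$
$$\sum_{j=1}^{b}\hat\beta_j = 0 \tag{5.15b}$$
$$\sum_{i=1}^{a}\widehat{(\tau\beta)}_{ij} = 0, \quad j = 1, \ldots, b \tag{5.15c}$$
$$\sum_{j=1}^{b}\widehat{(\tau\beta)}_{ij} = 0, \quad i = 1, \ldots, a \tag{5.15d}$$
Equations 5.15a and 5.15b constitute two constraints, whereas Equations 5.15c and 5.15d form $a + b - 1$ independent constraints. Therefore, we have $a + b + 1$ total constraints, the number needed.
Applying these constraints, the normal equations (Equations 5.14) simplify considerably, and we obtain the solution
$$\hat\mu = \bar y_{...}, \quad \hat\tau_i = \bar y_{i..} - \bar y_{...}, \quad \hat\beta_j = \bar y_{.j.} - \bar y_{...}, \quad \widehat{(\tau\beta)}_{ij} = \bar y_{ij.} - \bar y_{i..} - \bar y_{.j.} + \bar y_{...}$$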
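The constrained least squares solution can be computed directly from the row, column, cell and grand averages. The sketch below does this for the battery life data and checks the sum-to-zero constraints numerically. The data array is an assumed input (the table from Montgomery's text, not shown in these notes), and the variable names are illustrative.

```python
import numpy as np

# life[i, j, k]: material type i, temperature j (15, 70, 125 F), replicate k.
# Assumed values from the battery life table in Montgomery's text.
life = np.array([
    [[130, 155, 74, 180], [34, 40, 80, 75], [20, 70, 82, 58]],
    [[150, 188, 159, 126], [136, 122, 106, 115], [25, 70, 58, 45]],
    [[138, 110, 168, 160], [174, 120, 150, 139], [96, 104, 82, 60]],
], dtype=float)

cell = life.mean(axis=2)                      # cell averages
mu_hat = life.mean()                          # overall mean estimate
tau_hat = life.mean(axis=(1, 2)) - mu_hat     # material type effects
beta_hat = life.mean(axis=(0, 2)) - mu_hat    # temperature effects
# Interaction effects: what is left of each cell average after the
# overall mean and both main effects are removed.
tb_hat = cell - mu_hat - tau_hat[:, None] - beta_hat[None, :]

# Fitted values rebuild the cell averages exactly.
fitted = mu_hat + tau_hat[:, None] + beta_hat[None, :] + tb_hat

print("mu_hat:", round(mu_hat, 3))
print("tau_hat:", np.round(tau_hat, 3))
print("beta_hat:", np.round(beta_hat, 3))
print("interaction estimates:\n", np.round(tb_hat, 3))
```

The estimates satisfy the four sets of constraints automatically, because each is defined as a deviation from an average over the index being summed.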
The MLE
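The log-likelihood function
$$\ell = -\frac{abn}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n}\left(y_{ijk} - \mu - \tau_i - \beta_j - (\tau\beta)_{ij}\right)^2$$
Here the parameters $\mu$, $\tau_i$, $\beta_j$, and $(\tau\beta)_{ij}$ enter only through the sum of squares, so maximizing $\ell$ with respect to them is equivalent to minimizing the least squares criterion.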
Therefore, the solutions obtained (through LSE) are also MLE.
Choice of sample size
In any experimental design problem, a critical decision is the choice of sample size — that is, determining the number of replicates to run.
Generally, if the experimenter is interested in detecting small effects, more replicates are required than if the experimenter is interested in detecting large effects.
OC (Operating Characteristic) curve
An operating characteristic (OC) curve is a plot of the type II error probability $\beta$ of a statistical test for a particular sample size versus a parameter that reflects the extent to which the null hypothesis is false.
These curves can be used to guide the experimenter in selecting the number of replicates so that the design will be sensitive to important potential differences in the treatments.
The operating characteristic curves in Appendix Chart V (Montgomery book) can be used to assist the experimenter in determining an appropriate sample size (number of replicates, $n$) for a two-factor factorial design.
Curves are available for $\alpha = 0.05$ and $\alpha = 0.01$ and a range of degrees of freedom for numerator and denominator.
OC curve for two-factor factorial design
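OC curve parameters for Chart V of the Appendix for the two-factor factorial, fixed effects model
- For the two-factor factorial design, the appropriate value of the parameter $\Phi^2$ and the numerator and denominator degrees of freedom are shown in the table.

| Factor | $\Phi^2$ | Numerator degrees of freedom | Denominator degrees of freedom |
|---|---|---|---|
| $A$ | $\dfrac{bn\sum_{i=1}^{a}\tau_i^2}{a\sigma^2}$ | $a - 1$ | $ab(n - 1)$ |
| $B$ | $\dfrac{an\sum_{j=1}^{b}\beta_j^2}{b\sigma^2}$ | $b - 1$ | $ab(n - 1)$ |
| $AB$ | $\dfrac{n\sum_{i=1}^{a}\sum_{j=1}^{b}(\tau\beta)_{ij}^2}{\sigma^2[(a - 1)(b - 1) + 1]}$ | $(a - 1)(b - 1)$ | $ab(n - 1)$ |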
To determine $\Phi^2$, we need to know the actual values of the treatment means on which the sample size decision should be based. If we know them, we can use the formulas of the table and proceed as in the one-factor design. But a set of actual treatment means is not available most of the time.
An alternate approach is to select a sample size such that if the difference between any two treatment means exceeds a specified value, the null hypothesis should be rejected.
For example, if the difference in any two row (Factor $A$) means is $D$, then the minimum value of $\Phi^2$ is
$$\Phi^2 = \frac{nbD^2}{2a\sigma^2}$$
If the difference in any two column (Factor $B$) means is $D$, then the minimum value of $\Phi^2$ is
$$\Phi^2 = \frac{naD^2}{2b\sigma^2}$$
Finally, the minimum value of $\Phi^2$ corresponding to a difference of $D$ between any two interaction effects is
$$\Phi^2 = \frac{nD^2}{2\sigma^2[(a - 1)(b - 1) + 1]}$$
OC curve (Battery design experiment)
To illustrate the use of these equations, consider the battery design experiment.
Suppose that before running the experiment we decide that the null hypothesis should be rejected with a high probability if the difference in mean battery life between any two temperatures is as great as 40 hours.
Thus a difference of $D = 40$ has engineering significance, and if we assume that the standard deviation of battery life is approximately 25 hours, then the corresponding equation gives
$$\Phi^2 = \frac{naD^2}{2b\sigma^2} = \frac{n(3)(40)^2}{2(3)(25)^2} = 1.28n$$
as the minimum value of $\Phi^2$.
Assuming that $\alpha = 0.05$, we can use Appendix Chart V to construct the following table:

| $n$ | $\Phi^2$ | $\Phi$ | $\nu_1$ = numerator degrees of freedom | $\nu_2$ = error degrees of freedom | $\beta$ |
|---|---|---|---|---|---|
| 2 | 2.56 | 1.60 | 2 | 9 | 0.45 |
| 3 | 3.84 | 1.96 | 2 | 18 | 0.18 |
| 4 | 5.12 | 2.26 | 2 | 27 | 0.06 |

Note that $n = 4$ replicates give a $\beta$ risk of about 0.06, or approximately a 94 percent chance of rejecting the null hypothesis if the difference in mean battery life at any two temperature levels is as large as 40 hours.
Thus, we conclude that four replicates are enough to provide the desired sensitivity as long as our estimate of the standard deviation of battery life is not seriously in error.
If in doubt, the experimenter could repeat the above procedure with other values of $\sigma$ to determine the effect of mis-estimating this parameter on the sensitivity of the design.
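The chart lookup itself is manual, but the quantities needed to enter the chart can be tabulated with a few lines of code. This sketch assumes the column-factor formula $\Phi^2 = naD^2/(2b\sigma^2)$ with $a = b = 3$, $D = 40$ and $\sigma = 25$; the function name is illustrative, and the $\beta$ values still have to be read from the OC chart.

```python
import math


def min_phi_squared_column(n, a, b, diff, sigma):
    """Minimum Phi^2 for a difference `diff` between two column (factor B) means."""
    return n * a * diff ** 2 / (2 * b * sigma ** 2)


a, b = 3, 3          # levels of material type and temperature
diff, sigma = 40.0, 25.0

rows = []
for n in (2, 3, 4):
    phi2 = min_phi_squared_column(n, a, b, diff, sigma)
    nu1 = b - 1                # numerator degrees of freedom for factor B
    nu2 = a * b * (n - 1)      # error degrees of freedom
    rows.append((n, phi2, math.sqrt(phi2), nu1, nu2))
    print(f"n={n}: Phi^2={phi2:.2f}, Phi={math.sqrt(phi2):.2f}, nu1={nu1}, nu2={nu2}")
```

Rerunning the loop with a different `sigma` shows directly how sensitive the required number of replicates is to the assumed standard deviation.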
Assumption of no interaction in a two-factor model
Occasionally, an experimenter feels that a two-factor model without interaction is appropriate, say
$$y_{ijk} = \mu + \tau_i + \beta_j + \epsilon_{ijk}$$
The statistical analysis of a two-factor factorial model without interaction is straightforward. The following table presents the analysis of the battery design experiment, assuming no interaction.
As noted previously, both main effects are significant. However, as soon as a residual analysis is performed for these data, it becomes clear that the no-interaction model is inadequate.
For the two-factor model without interaction, the fitted values are
$$\hat y_{ijk} = \bar y_{i..} + \bar y_{.j.} - \bar y_{...}$$
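To see why the no-interaction model fails for the battery data, one can compare the additive fitted values with the cell averages. The sketch below does this with NumPy; the data array is an assumed input (the battery life table from Montgomery's text), and the variable names are illustrative.

```python
import numpy as np

# life[i, j, k]: material type i, temperature j (15, 70, 125 F), replicate k.
# Assumed values from the battery life table in Montgomery's text.
life = np.array([
    [[130, 155, 74, 180], [34, 40, 80, 75], [20, 70, 82, 58]],
    [[150, 188, 159, 126], [136, 122, 106, 115], [25, 70, 58, 45]],
    [[138, 110, 168, 160], [174, 120, 150, 139], [96, 104, 82, 60]],
], dtype=float)
a, b, n = life.shape

grand = life.mean()
row = life.mean(axis=(1, 2))
col = life.mean(axis=(0, 2))
cell = life.mean(axis=2)

# Fitted values under the additive (no-interaction) model.
fitted_additive = row[:, None] + col[None, :] - grand

# Cell average minus additive fit: systematic patterns here signal interaction.
lack_of_fit = cell - fitted_additive
print(np.round(lack_of_fit, 2))

# Error sum of squares and degrees of freedom when interaction is left out.
ss_error_additive = np.sum((life - fitted_additive[:, :, None]) ** 2)
df_error_additive = a * b * n - a - b + 1
print("SS_E:", round(ss_error_additive, 2), "df:", df_error_additive)
```

The error sum of squares of the additive model equals the interaction sum of squares plus the pure error sum of squares of the full model, so any real interaction inflates the error term.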
One observation per cell
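Occasionally, one encounters a two-factor experiment with only a single replicate, that is, only one observation per cell. If there are two factors and only one observation per cell, the effects model is
$$y_{ij} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ij}, \quad i = 1, \ldots, a; \; j = 1, \ldots, b$$
The analysis of variance for this situation is shown in Table 5.9, assuming that both factors are fixed.
From examining the expected mean squares, we see that the error variance $\sigma^2$ is not estimable; that is, the two-factor interaction effect $(\tau\beta)_{ij}$ and the experimental error cannot be separated in any obvious manner.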
5.4 The general factorial design
General factorial design
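The two-factor factorial design may be extended to the general case where there are $a$ levels of factor $A$, $b$ levels of factor $B$, $c$ levels of factor $C$, and so on, arranged in a factorial experiment.
In general, there will be $abc \cdots n$ total observations if there are $n$ replicates of the complete experiment.
Note that we must have at least two replicates $(n \ge 2)$ to determine a sum of squares due to error if all possible interactions are included in the model.
For example, consider the three-factor analysis of variance model:
$$y_{ijkl} = \mu + \tau_i + \beta_j + \gamma_k + (\tau\beta)_{ij} + (\tau\gamma)_{ik} + (\beta\gamma)_{jk} + (\tau\beta\gamma)_{ijk} + \epsilon_{ijkl}$$
with $i = 1, \ldots, a$; $j = 1, \ldots, b$; $k = 1, \ldots, c$; $l = 1, \ldots, n$.
Assuming that $A$, $B$, and $C$ are fixed, each main effect and interaction is tested by dividing its mean square by the error mean square $MS_E$.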
Soft drink bottling problem
A soft drink bottler is interested in obtaining more uniform fill heights in the bottles produced by his manufacturing process. The filling machine theoretically fills each bottle to the correct target height, but in practice, there is variation around this target, and the bottler would like to understand the sources of this variability better and eventually reduce it.
The process engineer can control three variables during the filling process: the percent carbonation (A), the operating pressure in the filler (B), and the bottles produced per minute or the line speed (C). For purposes of an experiment, the engineer can control carbonation at three levels: 10, 12, and 14 percent. She chooses two levels for pressure (25 and 30 psi) and two levels for line speed (200 and 250 bpm)
The engineer decides to run two replicates of a factorial design in these three factors, with all 24 runs taken in random order. The response variable observed is the average deviation from the target fill height observed in a production run of bottles at each set of conditions. Positive deviations are fill heights above the target, whereas negative deviations are fill heights below the target. The circled numbers are the three-way cell totals
From the ANOVA we see that the percentage of carbonation, operating pressure, and line speed significantly affect the fill volume.
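The carbonation-pressure interaction is only marginally significant, indicating some interaction between these factors.
To assist in the practical interpretation of this experiment, the following figure presents plots of the three main effects and the $AB$ (carbonation-pressure) interaction.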
The main effect plots are just graphs of the marginal response averages at the levels of the three factors. Notice that all three variables have positive main effects; that is, increasing the variable moves the average deviation from the fill target upward.
The interaction between carbonation and pressure is fairly small, as shown by the similar shape of the two curves.
5.5 Fitting Response Curves and Surfaces
The ANOVA always treats all of the factors in the experiment as if they were qualitative or categorical. Many experiments involve at least one quantitative factor. It can be useful to fit a response curve to the levels of a quantitative factor so that the experimenter has an equation that relates the response to the factor.
This equation might be used for interpolation, that is, for predicting the response at factor levels between those actually used in the experiment. When at least two factors are quantitative, we can fit a response surface for predicting $y$ at various combinations of the design factors. In general, linear regression methods are used to fit these models to the experimental data.
The battery design experiment
Consider the battery life experiment described previously. The factor temperature is quantitative, and the material type is qualitative.
Furthermore, there are three levels of temperature.
Consequently, we can compute a linear and a quadratic temperature effect to study how temperature affects the battery life.
Because material type is a qualitative factor there is an equation for predicted life as a function of temperature for each material type.
Figure 5.18 shows the response curves generated by these three prediction equations.
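A minimal sketch of how such response curves could be obtained is to fit a separate quadratic in temperature for each material type. This uses `numpy.polyfit` on the uncoded temperatures, which may differ from the coding used for the prediction equations behind Figure 5.18, and the data array is an assumed input (the battery life table from Montgomery's text).

```python
import numpy as np

temps = np.array([15.0, 70.0, 125.0])   # temperature levels in degrees F

# life[i, j, k]: material type i, temperature j, replicate k (assumed values).
life = np.array([
    [[130, 155, 74, 180], [34, 40, 80, 75], [20, 70, 82, 58]],
    [[150, 188, 159, 126], [136, 122, 106, 115], [25, 70, 58, 45]],
    [[138, 110, 168, 160], [174, 120, 150, 139], [96, 104, 82, 60]],
], dtype=float)

# Temperature for each of the 12 observations of one material type,
# ordered to match life[i].ravel().
x = np.repeat(temps, life.shape[2])

curves = {}
for i in range(life.shape[0]):
    # Coefficients are returned highest power first: [quadratic, linear, constant].
    curves[i + 1] = np.polyfit(x, life[i].ravel(), deg=2)

# Interpolated predictions at an untested temperature, e.g. 100 F.
for material, coefs in curves.items():
    print("material", material, "predicted life at 100 F:",
          round(float(np.polyval(coefs, 100.0)), 1))
```

With three temperature levels a quadratic is the highest-order polynomial that can be fitted, so each curve passes exactly through the three cell averages of its material type; predictions between levels are interpolations and should be treated with corresponding caution.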
5.6 Blocking in a factorial design
Example 5.6
An engineer is studying methods for improving the ability to detect targets on a radar scope. Two factors she considers to be important are the amount of background noise, or ground clutter, on the scope and the type of filter placed over the screen. An experiment is designed using 3 levels of ground clutter and 2 filter types.
Because of operator availability, it is convenient to select an operator and keep him or her at the scope until all the necessary runs have been made. Furthermore, operators differ in their skill and ability to use the scope. Consequently, it seems logical to use the operators as blocks.
Four operators are randomly selected. Once an operator is chosen, the order in which the six treatment combinations are run is randomly determined.
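Thus, we have a $3 \times 2$ factorial experiment run in a randomized complete block design, with the four operators as blocks.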
Both ground clutter level and filter type are significant at the 1 percent level, whereas their interaction is significant only at the 10 percent level.
Thus, we conclude that both ground clutter level and the type of scope filter used affect the operator’s ability to detect the target, and there is some evidence of mild interaction between these factors.
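The structure of the analysis for a factorial run in complete blocks can be sketched by counting degrees of freedom. The sketch assumes the usual blocked effects model $y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \delta_k + \epsilon_{ijk}$, where $\delta_k$ is the effect of the $k$th block; the function name is illustrative.

```python
def blocked_factorial_df(a, b, n_blocks):
    """Degrees of freedom for an a x b factorial run in n_blocks complete blocks.

    Assumes one observation per treatment combination in each block and
    no block-by-treatment interaction terms in the model.
    """
    df = {
        "blocks": n_blocks - 1,
        "A": a - 1,
        "B": b - 1,
        "AB": (a - 1) * (b - 1),
        "error": (a * b - 1) * (n_blocks - 1),
    }
    df["total"] = a * b * n_blocks - 1
    return df


# Example 5.6: 3 ground clutter levels, 2 filter types, 4 operators as blocks.
df = blocked_factorial_df(3, 2, 4)
for source, value in df.items():
    print(source, value)
```

The error degrees of freedom here come from the block-by-treatment variation, which is why the number of blocks directly controls how precisely the treatment effects can be tested.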
Exercise
The following output was obtained from a computer program that performed a two-factor ANOVA on a factorial experiment.
- Fill in the blanks in the ANOVA table. You can use bounds on the p-values.
- How many levels were used for factor $B$?
- How many replicates of the experiment were performed?
- What conclusions would you draw about this experiment?
Practice exercises from Montgomery book:
5.41, 5.42, 5.43, 5.44, 5.45, 5.46