Chapter 13

(AST301) Design and Analysis of Experiments II

Author

Md Rasel Biswas

13 Experiments with Random Factors

13.1 Introduction

Throughout most of this book we have assumed that the factors in an experiment were fixed factors, that is, the levels of the factors used by the experimenter were the specific levels of interest.

The implication of this, of course, is that the statistical inferences made about these factors are confined to the specific levels studied.

That is, if three material types are investigated as in the battery life experiment of Example 5.1, our conclusions are valid only about those specific material types.


A variation of this occurs when the factor or factors are quantitative. In these situations, we often use a regression model relating the response to the factors to predict the response over the region spanned by the factor levels used in the experimental design.

Several examples of this were presented in Chapters 5 through 9. In general, with a fixed effect, we say that the inference space of the experiment is the specific set of factor levels investigated.


In some experimental situations, the factor levels are chosen at random from a larger population of possible levels, and the experimenter wishes to draw conclusions about the entire population of levels, not just those that were used in the experimental design.

In this situation, the factor is said to be a random factor.

The random-effects model was introduced in Chapter 3 for a single-factor experiment, where we used it to develop the analysis of variance for random effects and to estimate variance components.


For example: a company has 50 machines that make cardboard cartons for canned goods, and they want to understand the variation in strength of the cartons.

They choose ten machines at random from the 50 and make 40 cartons on each machine, assigning 400 lots of feedstock cardboard at random to the ten chosen machines.

The resulting cartons are tested for strength. This is a completely randomized design, with ten treatments and 400 units.
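To make the design concrete, a minimal simulation of this sampling structure is sketched below; the overall mean and the machine and error standard deviations are invented for illustration, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(42)

# Carton study structure: 10 machines sampled at random, 40 cartons each.
# mu, sigma_tau, and sigma are hypothetical values, not from the text.
n_machines, n_cartons = 10, 40
mu = 100.0          # assumed overall mean strength
sigma_tau = 3.0     # assumed machine-to-machine standard deviation
sigma = 5.0         # assumed within-machine standard deviation

machine_effects = rng.normal(0.0, sigma_tau, size=n_machines)  # tau_i
strength = (mu
            + machine_effects[:, None]  # each machine's effect, broadcast
            + rng.normal(0.0, sigma, size=(n_machines, n_cartons)))  # eps_ij

print(strength.shape)  # 10 treatments x 40 units = 400 observations
```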


Fixed Effects Model

$$y_{ij} = \mu + \tau_i + \epsilon_{ij}, \qquad \epsilon_{ij} \sim N(0, \sigma^2)$$

  • $\mu$: overall mean
  • $\tau_i$: fixed effect of treatment $i$
  • $\epsilon_{ij}$: random error
  • $\tau_i$ are fixed unknown parameters

Random Effects Model

$$y_{ij} = \mu + \tau_i + \epsilon_{ij}, \qquad \tau_i \sim N(0, \sigma_\tau^2), \quad \epsilon_{ij} \sim N(0, \sigma^2)$$

  • $\mu$: overall mean
  • $\tau_i$: random effect of treatment $i$
  • $\epsilon_{ij}$: random error
  • $\tau_i$ are random variables
  • Notice that we still decompose the model into: overall mean ($\mu$), treatment effect ($\tau_i$), random error ($\epsilon_{ij}$)

  • Why don’t the fixed-effects assumptions make sense in the random-effects model?


1. Treatment levels are not fixed but randomly sampled

  • In the fixed-effects model, the treatment levels (e.g., different brands, machines, or methods) are specifically chosen and of interest.

  • In the random-effects model, these levels are assumed to be a random sample from a larger population of possible treatments.

  • Therefore, estimating individual treatment effects ($\tau_i$) is less meaningful; we care more about the variation among treatments, not their specific values.


2. The focus shifts from estimation to generalization

  • In fixed-effects, we want to compare specific treatment effects.

  • In random-effects, we aim to generalize to the broader population of treatments.

  • So, we’re more interested in estimating variance components (like $\sigma_\tau^2$) to understand how much treatments vary, not just how they differ.


3. Inference is about variance components

  • In random-effects, variability in treatment levels is treated as another source of random variation.

  • This affects how we partition the total variance and how we perform statistical inference (like testing and confidence intervals).


In this chapter, we focus on methods for the design and analysis of factorial experiments with random factors.

In Chapter 14, we will present nested and split-plot designs, two situations where random factors are frequently encountered in practice.

Review: Random Effects Model

  • The random-effects model is defined only for random factors, e.g. $$y_{ij} = \mu + \tau_i + \epsilon_{ij}, \quad i = 1, \ldots, a; \; j = 1, \ldots, n,$$ where both $\tau_i$ and $\epsilon_{ij}$ are random variables ($\tau_i$ is not a parameter), which are assumed to follow $N(0, \sigma_\tau^2)$ and $N(0, \sigma^2)$, respectively.

  • $\tau_i$ and $\epsilon_{ij}$ are independent

  • Variance structure $$\operatorname{cov}(y_{ij}, y_{i'j'}) = \begin{cases} \sigma_\tau^2 + \sigma^2 & \text{if } i = i',\, j = j' \\ \sigma_\tau^2 & \text{if } i = i',\, j \neq j' \\ 0 & \text{if } i \neq i' \end{cases}$$

  • $\sigma_\tau^2$ and $\sigma^2$ are known as variance components


The parameters of the random-effects model are the overall mean $\mu$, the error variance $\sigma^2$, and the variance of the treatment effects $\sigma_\tau^2$; the treatment effects $\tau_i$ are random variables, not parameters.

We want to make inferences about these parameters; we are not so interested in making inferences about the $\tau_i$'s and $\epsilon_{ij}$'s.

Typical inferences would be point estimates or confidence intervals for the variance components, or a test of the null hypothesis that the treatment variance $\sigma_\tau^2$ is 0.


  • The hypothesis considered for the fixed-effects model, $H_0:$ no difference between treatment levels, is no longer useful for the random-effects model

  • For the random-effects model, the hypothesis of no treatment effects is defined as $$H_0: \sigma_\tau^2 = 0 \quad \text{vs} \quad H_1: \sigma_\tau^2 > 0$$


  • For the random-effects model, the sum of squares identity $SS_T = SS_{\text{Treat}} + SS_E$ remains valid

  • It can be shown that $$E(MS_{\text{Treat}}) = \sigma^2 + n\sigma_\tau^2 \quad \text{and} \quad E(MS_E) = \sigma^2$$

  • Under the null hypothesis $H_0: \sigma_\tau^2 = 0$, the statistic $$F_0 = \frac{MS_{\text{Treat}}}{MS_E}$$ follows an $F$-distribution with $(a-1)$ and $a(n-1)$ degrees of freedom


  • Besides hypothesis testing, estimation of the random-effects parameters is also of interest when analyzing random-effects models

  • We have $E(MS_{\text{Treat}}) = \sigma^2 + n\sigma_\tau^2$ and $E(MS_E) = \sigma^2$, so unbiased estimators of $\sigma^2$ and $\sigma_\tau^2$ are $$\hat\sigma^2 = MS_E \quad \text{and} \quad \hat\sigma_\tau^2 = \frac{MS_{\text{Treat}} - MS_E}{n}$$


  • A CI for $\sigma^2$ can be constructed using the result $$\frac{a(n-1)MS_E}{\sigma^2} \sim \chi^2_{a(n-1)}$$

  • We can write the $100(1-\alpha)\%$ CI as $$\Pr\left[\chi^2_{a(n-1),\,\alpha/2} \le \frac{a(n-1)MS_E}{\sigma^2} \le \chi^2_{a(n-1),\,1-\alpha/2}\right] = 1-\alpha \;\;\Rightarrow\;\; \Pr\left[\frac{a(n-1)MS_E}{\chi^2_{a(n-1),\,1-\alpha/2}} \le \sigma^2 \le \frac{a(n-1)MS_E}{\chi^2_{a(n-1),\,\alpha/2}}\right] = 1-\alpha$$
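As a numerical sketch, SciPy's chi-square quantiles give this interval directly; here we plug in the error mean square from the loom example in this section ($MS_E = 1.90$, with $a = 4$ looms and $n = 4$ observations per loom).

```python
from scipy import stats

# 95% CI for sigma^2 from a(n-1)*MSE/sigma^2 ~ chi-square with a(n-1) df
a, n, ms_error, alpha = 4, 4, 1.90, 0.05
df = a * (n - 1)

lower = df * ms_error / stats.chi2.ppf(1 - alpha / 2, df)  # divide by upper quantile
upper = df * ms_error / stats.chi2.ppf(alpha / 2, df)      # divide by lower quantile

print(round(lower, 2), round(upper, 2))
```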


  • A CI for $\sigma_\tau^2$ is not straightforward, but it is easy to obtain CIs for $\sigma_\tau^2/(\sigma^2+\sigma_\tau^2)$ and $\sigma_\tau^2/\sigma^2$ using the result $$\frac{MS_{\text{Treat}}/(n\sigma_\tau^2+\sigma^2)}{MS_E/\sigma^2} \sim F_{a-1,\,a(n-1)}$$
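One standard way to use this result (a sketch, again with the loom example's mean squares) first bounds the ratio $\sigma_\tau^2/\sigma^2$ and then transforms those bounds to the proportion $\sigma_\tau^2/(\sigma_\tau^2+\sigma^2)$.

```python
from scipy import stats

# Loom example: MS_Treat = 29.73, MS_E = 1.90, a = 4, n = 4
a, n, alpha = 4, 4, 0.05
ms_treat, ms_error = 29.73, 1.90
df1, df2 = a - 1, a * (n - 1)

f0 = ms_treat / ms_error
# 95% bounds for sigma_tau^2 / sigma^2, derived from the F result above
L = (f0 / stats.f.ppf(1 - alpha / 2, df1, df2) - 1) / n
U = (f0 / stats.f.ppf(alpha / 2, df1, df2) - 1) / n

# Transform to bounds for sigma_tau^2 / (sigma_tau^2 + sigma^2)
lower, upper = L / (1 + L), U / (1 + U)
print(round(lower, 3), round(upper, 3))
```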

Example 3.11

A textile company weaves a fabric on a large number of looms. It would like the looms to be homogeneous so that it obtains a fabric of uniform strength. The process engineer suspects that, in addition to the usual variation in strength within samples of fabric from the same loom, there may also be significant variations in strength between looms. To investigate this, she selects four looms at random and makes four strength determinations on the fabric manufactured on each loom. This experiment is run in random order, and the data obtained are shown in Table 3.17.


The standard ANOVA partition of the sum of squares is appropriate. There is nothing new in terms of computing.


From the ANOVA, we conclude that the looms in the plant differ significantly.

The variance components are estimated by $\hat\sigma^2 = 1.90$ and $$\hat\sigma_\tau^2 = \frac{29.73 - 1.90}{4} = 6.96$$

Therefore, the variance of any observation on strength is estimated by $\hat\sigma_y^2 = \hat\sigma^2 + \hat\sigma_\tau^2 = 1.90 + 6.96 = 8.86$. Most of this variability is attributable to differences between looms.
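The arithmetic above is easy to verify directly:

```python
# Loom example: MS_Treat = 29.73, MS_E = 1.90, n = 4 observations per loom
n = 4
ms_treat, ms_error = 29.73, 1.90

sigma2_hat = ms_error                        # error variance component
sigma_tau2_hat = (ms_treat - ms_error) / n   # loom-to-loom variance component
total = sigma2_hat + sigma_tau2_hat          # variance of a single observation

print(round(sigma_tau2_hat, 2), round(total, 2))
print(round(sigma_tau2_hat / total, 2))      # share attributable to looms
```

Roughly 79% of the estimated variability is between looms, consistent with the conclusion above.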

13.2 The Two-Factor Factorial with Random Factors

  • Two factors $A$ and $B$; $a$ levels of $A$ and $b$ levels of $B$ are randomly selected for the experiment. The model is $$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk},$$ where $\tau_i$, $\beta_j$, $(\tau\beta)_{ij}$, and $\epsilon_{ijk}$ are random

  • Assumptions: $\tau_i \sim N(0, \sigma_\tau^2)$, $\beta_j \sim N(0, \sigma_\beta^2)$, $(\tau\beta)_{ij} \sim N(0, \sigma_{\tau\beta}^2)$, $\epsilon_{ijk} \sim N(0, \sigma^2)$

  • $V(y_{ijk}) = \sigma_\tau^2 + \sigma_\beta^2 + \sigma_{\tau\beta}^2 + \sigma^2$

  • Hypotheses of interest: $$\text{(a) } H_0: \sigma_\tau^2 = 0 \text{ vs } H_1: \sigma_\tau^2 > 0 \qquad \text{(b) } H_0: \sigma_\beta^2 = 0 \text{ vs } H_1: \sigma_\beta^2 > 0 \qquad \text{(c) } H_0: \sigma_{\tau\beta}^2 = 0 \text{ vs } H_1: \sigma_{\tau\beta}^2 > 0$$


  • The form of the test statistics depends on the expected mean squares

  • Expected mean squares: $$E(MS_A) = \sigma^2 + n\sigma_{\tau\beta}^2 + bn\sigma_\tau^2 \qquad E(MS_B) = \sigma^2 + n\sigma_{\tau\beta}^2 + an\sigma_\beta^2 \qquad E(MS_{AB}) = \sigma^2 + n\sigma_{\tau\beta}^2 \qquad E(MS_E) = \sigma^2$$

  • Test statistic for $H_0: \sigma_{\tau\beta}^2 = 0$: $$F_0 = \frac{MS_{AB}}{MS_E} \sim F_{(a-1)(b-1),\; ab(n-1)}$$

  • Test statistic for $H_0: \sigma_\tau^2 = 0$: $$F_0 = \frac{MS_A}{MS_{AB}} \sim F_{(a-1),\; (a-1)(b-1)}$$

  • Test statistic for $H_0: \sigma_\beta^2 = 0$: $$F_0 = \frac{MS_B}{MS_{AB}} \sim F_{(b-1),\; (a-1)(b-1)}$$

Notice that these test statistics are not the same as those used if both factors A and B are fixed.

The expected mean squares are always used as a guide to test statistic construction.

In many experiments involving random factors, interest centers at least as much on estimating the variance components as on hypothesis testing.


  • Estimates of the variance components: $$\hat\sigma^2 = MS_E \qquad \hat\sigma_{\tau\beta}^2 = \frac{MS_{AB} - MS_E}{n} \qquad \hat\sigma_\tau^2 = \frac{MS_A - MS_{AB}}{bn} \qquad \hat\sigma_\beta^2 = \frac{MS_B - MS_{AB}}{an}$$
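A minimal sketch of these estimators as code; the level counts and mean squares below are placeholders, not values from the text.

```python
# ANOVA (method-of-moments) estimators, two-factor random model
a, b, n = 3, 4, 2                                 # hypothetical level counts
ms_a, ms_b, ms_ab, ms_e = 80.0, 60.0, 12.0, 4.0   # hypothetical mean squares

sigma2_hat = ms_e
sigma_tb2_hat = (ms_ab - ms_e) / n       # interaction component
sigma_t2_hat = (ms_a - ms_ab) / (b * n)  # factor A component
sigma_b2_hat = (ms_b - ms_ab) / (a * n)  # factor B component

print(sigma2_hat, sigma_tb2_hat, sigma_t2_hat, sigma_b2_hat)
```

Note that these moment estimators can come out negative, which usually suggests the corresponding component is negligible.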

A Measurement Systems Capability Study

  • A Measurement System Capability Study (also called Gauge R&R study, where R&R stands for Repeatability and Reproducibility) is a key part of quality control and process improvement — especially in manufacturing and lab settings.

  • Gauge R&R study evaluates how much variation in your measurement data is coming from:

    • The actual process or product you’re measuring

    • The measurement system itself (which includes the instrument and the operator)

  • In short, it tells you: “Can we trust our measurement system?”


Main Goals

  • Assess how precise and reliable your measurements are
  • Quantify measurement error
  • Determine whether your measurement system is suitable for use in a process control or quality monitoring environment

Two Key Components

  1. Repeatability Variation when the same operator measures the same item multiple times using the same instrument.

  2. Reproducibility Variation between operators (or appraisers), i.e., when different people measure the same item using the same instrument.


Basic Experimental Setup

To perform a Gauge R&R study, you typically:

  • Choose n parts from the process (covering the process range)

  • Have m operators

  • Each operator measures each part r times (repeated measures)


(Example 13.1)

A typical gauge R&R experiment is shown in Table 13.1. An instrument or gauge is used to measure a critical dimension on a part.

Twenty parts have been selected from the production process, and three randomly selected operators measure each part twice with this gauge.

The order in which the measurements are made is completely randomized, so this is a two-factor factorial experiment with design factors parts and operators, with 2 replications.

Both parts and operators are random factors. So, we’re more interested in estimating variance components than testing specific factor levels.



Let $$y_{ijk} = \mu + P_i + O_j + (PO)_{ij} + \epsilon_{ijk}$$

Where:

  • $y_{ijk}$: the $k$-th measurement of part $i$ by operator $j$
  • $\mu$: overall mean
  • $P_i \sim N(0, \sigma_P^2)$: random effect of the $i$-th part
  • $O_j \sim N(0, \sigma_O^2)$: random effect of the $j$-th operator
  • $(PO)_{ij} \sim N(0, \sigma_{PO}^2)$: interaction between part and operator
  • $\epsilon_{ijk} \sim N(0, \sigma^2)$: repeatability (pure measurement error)


Estimating Variance Components

Using the method of moments, we can estimate:

$$\hat\sigma^2 \text{ (repeatability)} = 0.99$$
$$\hat\sigma_P^2 \text{ (part variation)} = \frac{62.39 - 0.71}{(3)(2)} = 10.28$$
$$\hat\sigma_O^2 \text{ (operator variation)} = \frac{1.31 - 0.71}{(20)(2)} = 0.015$$
$$\hat\sigma_{PO}^2 \text{ (part-operator interaction)} = \frac{0.71 - 0.99}{2} = -0.14$$


Since the interaction is not significant (and its variance component estimate is negative), the reduced model is $$y_{ijk} = \mu + P_i + O_j + \epsilon_{ijk}$$


$$\hat\sigma_P^2 = \frac{62.39 - 0.88}{(3)(2)} = 10.25 \qquad \hat\sigma_O^2 \text{ (reproducibility)} = \frac{1.31 - 0.88}{(20)(2)} = 0.0108 \qquad \hat\sigma^2 \text{ (repeatability)} = 0.88$$


Finally, we can estimate the variance of the gauge as the sum of the variance component estimates $\hat\sigma^2$ and $\hat\sigma_O^2$: $$\hat\sigma_{\text{gauge}}^2 = \hat\sigma^2 + \hat\sigma_O^2 = 0.88 + 0.0108 = 0.8908$$

The variability in the gauge appears small relative to the variability in the product.

This is generally a desirable situation, implying that the gauge is capable of distinguishing among different grades of product.
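The gauge-study arithmetic can be reproduced directly from the mean squares quoted above:

```python
# Reproducing the gauge R&R arithmetic of Example 13.1
# (p = 20 parts, o = 3 operators, r = 2 measurements each)
p, o, r = 20, 3, 2

# Full-model mean squares from the ANOVA
ms_parts, ms_oper, ms_po, ms_e = 62.39, 1.31, 0.71, 0.99

sigma_po2 = (ms_po - ms_e) / r  # negative -> interaction dropped
print(round(sigma_po2, 2))

# Reduced model (no interaction): pooled error mean square is 0.88
ms_e_red = 0.88
sigma_p2 = (ms_parts - ms_e_red) / (o * r)  # part-to-part variation
sigma_o2 = (ms_oper - ms_e_red) / (p * r)   # reproducibility
sigma_gauge2 = ms_e_red + sigma_o2          # repeatability + reproducibility

print(round(sigma_p2, 2), round(sigma_o2, 4), round(sigma_gauge2, 4))
```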

13.3 The Two-Factor Mixed Model

  • Suppose the levels of factor $A$ are fixed and the levels of factor $B$ are random
  • The two-factor mixed model can be expressed as $$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk},$$ where $\tau_i$ is fixed, and $\beta_j$, $(\tau\beta)_{ij}$, and $\epsilon_{ijk}$ are random
  • Assumptions: $\beta_j \sim N(0, \sigma_\beta^2)$, $\epsilon_{ijk} \sim N(0, \sigma^2)$, $(\tau\beta)_{ij} \sim N\!\left(0, \frac{a-1}{a}\sigma_{\tau\beta}^2\right)$
  • Restrictions: $\sum_i \tau_i = 0$, $\sum_i (\tau\beta)_{ij} = 0$ for each $j$
  • This type of mixed model is known as the restricted mixed model

  • The expected values of the mean squares are $$E(MS_A) = \sigma^2 + n\sigma_{\tau\beta}^2 + \frac{bn\sum_i \tau_i^2}{a-1} \qquad E(MS_B) = \sigma^2 + an\sigma_\beta^2 \qquad E(MS_{AB}) = \sigma^2 + n\sigma_{\tau\beta}^2 \qquad E(MS_E) = \sigma^2$$
  • Test statistic for $H_0: \tau_i = 0 \;\forall i$: $$F_0 = \frac{MS_A}{MS_{AB}} \sim F_{a-1,\;(a-1)(b-1)}$$
  • Test statistic for $H_0: \sigma_\beta^2 = 0$: $$F_0 = \frac{MS_B}{MS_E} \sim F_{b-1,\;ab(n-1)}$$
  • Test statistic for $H_0: \sigma_{\tau\beta}^2 = 0$: $$F_0 = \frac{MS_{AB}}{MS_E} \sim F_{(a-1)(b-1),\;ab(n-1)}$$

In the mixed model, it is possible to estimate the fixed factor effects as before: $$\hat\mu = \bar{y}_{\ldots} \qquad \hat\tau_i = \bar{y}_{i..} - \bar{y}_{\ldots}, \quad i = 1, 2, \ldots, a$$ The variance components can be estimated using the analysis of variance method by equating the expected mean squares to their observed values: $$\hat\sigma_\beta^2 = \frac{MS_B - MS_E}{an} \qquad \hat\sigma_{\tau\beta}^2 = \frac{MS_{AB} - MS_E}{n} \qquad \hat\sigma^2 = MS_E$$
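A small simulated sketch of the fixed-effect estimates in this model; all parameter values and data-generating settings below are invented for illustration.

```python
import numpy as np

# Sketch: A fixed (a = 3 levels), B random (b = 5 levels), n = 4 replicates.
# True fixed effects tau sum to zero; B and AB effects are drawn at random.
rng = np.random.default_rng(1)
a, b, n = 3, 5, 4
tau = np.array([-2.0, 0.0, 2.0])                   # fixed A effects

y = (10.0 + tau[:, None, None]
     + rng.normal(0, 1.5, size=(1, b, 1))          # random B effects
     + rng.normal(0, 0.8, size=(a, b, 1))          # random AB interaction
     + rng.normal(0, 1.0, size=(a, b, n)))         # error

mu_hat = y.mean()
tau_hat = y.mean(axis=(1, 2)) - mu_hat             # tau_i = ybar_i.. - ybar_...

print(np.round(tau_hat, 2))
```

Because every level of $A$ sees the same sample of $B$ levels, the $\hat\tau_i$ contrasts are unaffected by the shared $B$ effects and sum to zero by construction.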


  • Unrestricted mixed model: no restriction on the random-effects terms: $$y_{ijk} = \mu + \alpha_i + \gamma_j + (\alpha\gamma)_{ij} + \epsilon_{ijk},$$ where the $\alpha_i$'s are fixed effects such that $\sum_i \alpha_i = 0$, $\gamma_j \sim N(0, \sigma_\gamma^2)$, $(\alpha\gamma)_{ij} \sim N(0, \sigma_{\alpha\gamma}^2)$, and $\epsilon_{ijk} \sim N(0, \sigma^2)$.
  • The expected mean squares are $$E(MS_A) = \sigma^2 + n\sigma_{\alpha\gamma}^2 + \frac{bn\sum_i \alpha_i^2}{a-1} \qquad E(MS_B) = \sigma^2 + n\sigma_{\alpha\gamma}^2 + an\sigma_\gamma^2 \qquad E(MS_{AB}) = \sigma^2 + n\sigma_{\alpha\gamma}^2 \qquad E(MS_E) = \sigma^2$$

13.4 Rules for Expected Mean Squares

An important part of any experimental design problem is conducting the analysis of variance.

This involves determining the sum of squares for each component in the model and the number of degrees of freedom associated with each sum of squares.
This involves determining the sum of squares for each component in the model and number of degrees of freedom associated with each sum of squares.

To construct appropriate test statistics, the expected mean squares must be determined.


By examining the expected mean squares, one may develop the appropriate statistic for testing hypotheses about any model parameter.

The test statistic is a ratio of mean squares that is chosen such that the expected value of the numerator mean square differs from the expected value of the denominator mean square only by the variance component or the fixed factor in which we are interested.


  • Rule 1. The error term in the model is $\epsilon_{(ij\cdots)m}$, where the subscript $m$ denotes the replication subscript. For the two-factor model, this rule implies that the error term is $\epsilon_{(ij)k}$. The variance component associated with $\epsilon_{(ij\cdots)m}$ is $\sigma^2$.

  • Rule 2. In addition to an overall mean ($\mu$) and an error term, the model contains all the main effects and any interactions that the experimenter assumes exist. If all possible interactions between $k$ factors exist, then there are $\binom{k}{2}$ two-factor interactions, $\binom{k}{3}$ three-factor interactions, $\ldots$, and one $k$-factor interaction. If one of the factors in a term appears in parentheses, then there is no interaction between that factor and the other factors in that term.

  • Rule 3. For each term in the model, divide the subscripts into three classes:

    • live: those subscripts that are present in the term and are not in parentheses
    • dead: those subscripts that are present in the term and are in parentheses
    • absent: those subscripts that are present in the model but not in that particular term

E.g., for the two-factor fixed-effects model: in $(\tau\beta)_{ij}$, $i$ and $j$ are live and $k$ is absent; in $\epsilon_{(ij)k}$, $k$ is live and $i$ and $j$ are dead

(We haven’t seen models with dead subscripts, but we will encounter such models later.)


  • Rule 4. Degrees of freedom. The number of degrees of freedom for any term in the model is the product of the number of levels associated with each dead subscript and the number of levels minus one associated with each live subscript.

E.g., the number of degrees of freedom associated with $(\tau\beta)_{ij}$ is $(a-1)(b-1)$, and the number of degrees of freedom associated with $\epsilon_{(ij)k}$ is $ab(n-1)$.

The number of degrees of freedom for error is obtained by subtracting the sum of all other degrees of freedom from N1, where N is the total number of observations.
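Rule 4 can be written as a tiny helper; the live/dead classification from Rule 3 is supplied by hand here, and the level counts are placeholders.

```python
from math import prod

def term_df(live_levels, dead_levels):
    """Rule 4: product of (levels - 1) over live subscripts
    times levels over dead subscripts."""
    return prod(l - 1 for l in live_levels) * prod(dead_levels)

a, b, n = 3, 4, 2  # any level counts work; these are placeholders
print(term_df([a, b], []))   # (tau*beta)_ij -> (a-1)(b-1)
print(term_df([n], [a, b]))  # eps_(ij)k    -> a*b*(n-1)
```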


  • Rule 5. Each term in the model has either a variance component (random effect) or a fixed factor (fixed effect) associated with it.

If an interaction term contains at least one random effect, the entire term is considered random

A variance component has Greek letters as subscripts to identify the particular random effect, e.g. $\sigma_\beta^2$ is the variance component corresponding to the random factor $B$

A fixed effect is always represented by the sum of squares of the model components associated with that factor divided by its degrees of freedom, e.g. $\sum_i \tau_i^2/(a-1)$ for factor $A$ when it is fixed


  • Rule 6. There is an expected mean square for each model component. The expected mean square for error is E(MSE)=σ2.

In the case of the restricted model, for every other model term the expected mean square contains

  • $\sigma^2$, plus
  • either the variance component or the fixed-effect component for that term, plus
  • those components for all other model terms that contain the effect in question and that involve no interactions with other fixed effects.

The coefficient of each variance component or fixed effect is the number of observations at each distinct value of that component.


To illustrate for the case of the two-factor fixed effects model, consider finding the interaction expected mean square, E(MSAB).

  • The expected mean square will contain only the fixed-effect component for the $AB$ interaction (because no other model terms contain $AB$) plus $\sigma^2$; the fixed-effect component for $AB$ is multiplied by $n$ because there are $n$ observations at each distinct value of the interaction component (the $n$ observations in each cell).
  • Thus, the expected mean square for $AB$ is $$E(MS_{AB}) = \sigma^2 + \frac{n\sum_{i=1}^{a}\sum_{j=1}^{b}(\tau\beta)_{ij}^2}{(a-1)(b-1)}$$

  • As another illustration from the two-factor fixed-effects model, the expected mean square for the main effect of $A$ is $$E(MS_A) = \sigma^2 + \frac{bn\sum_{i=1}^{a}\tau_i^2}{a-1}$$ The multiplier in the numerator is $bn$ because there are $bn$ observations at each level of $A$. The $AB$ interaction term is not included in the expected mean square because, although it contains the effect in question ($A$), factor $B$ is a fixed effect.

To illustrate how Rule 6 applies to a model with random effects, consider the two-factor random model. The expected mean square for the $AB$ interaction is $$E(MS_{AB}) = \sigma^2 + n\sigma_{\tau\beta}^2$$ and the expected mean square for the main effect of $A$ is $$E(MS_A) = \sigma^2 + n\sigma_{\tau\beta}^2 + bn\sigma_\tau^2$$ Note that the variance component for the $AB$ interaction is included because $A$ appears in $AB$ and $B$ is a random effect.


Two-factor fixed-effects model: $$E(MS_A) = \sigma^2 + \frac{bn\sum_{i=1}^{a}\tau_i^2}{a-1} \qquad E(MS_B) = \sigma^2 + \frac{an\sum_{j=1}^{b}\beta_j^2}{b-1} \qquad E(MS_{AB}) = \sigma^2 + \frac{n\sum_{i=1}^{a}\sum_{j=1}^{b}(\tau\beta)_{ij}^2}{(a-1)(b-1)} \qquad E(MS_E) = \sigma^2$$


Two-factor random model: $$E(MS_A) = \sigma^2 + n\sigma_{\tau\beta}^2 + bn\sigma_\tau^2 \qquad E(MS_B) = \sigma^2 + n\sigma_{\tau\beta}^2 + an\sigma_\beta^2 \qquad E(MS_{AB}) = \sigma^2 + n\sigma_{\tau\beta}^2 \qquad E(MS_E) = \sigma^2$$


Restricted form of the two-factor mixed model ($A$ fixed): $$E(MS_A) = \sigma^2 + n\sigma_{\tau\beta}^2 + \frac{bn\sum_{i=1}^{a}\tau_i^2}{a-1} \qquad E(MS_B) = \sigma^2 + an\sigma_\beta^2 \qquad E(MS_{AB}) = \sigma^2 + n\sigma_{\tau\beta}^2 \qquad E(MS_E) = \sigma^2$$


Rule 6 is easily modified to give expected mean squares for the unrestricted form of the mixed model: simply include the term for the effect in question, plus all the terms that contain this effect, provided at least one of the factors involved is random.

Unrestricted form of two factor mixed model.

$$E(MS_A) = \sigma^2 + n\sigma_{\tau\beta}^2 + \frac{bn\sum_{i=1}^{a}\tau_i^2}{a-1} \qquad E(MS_B) = \sigma^2 + n\sigma_{\tau\beta}^2 + an\sigma_\beta^2 \qquad E(MS_{AB}) = \sigma^2 + n\sigma_{\tau\beta}^2 \qquad E(MS_E) = \sigma^2$$

13.5 Approximate F-Tests

Consider a three-factor factorial experiment with a levels of factor A, b levels of factor B, c levels of factor C, and n replicates.

First, assume that all the factors are fixed: $$y_{ijkl} = \mu + \tau_i + \beta_j + \gamma_k + (\tau\beta)_{ij} + (\tau\gamma)_{ik} + (\beta\gamma)_{jk} + (\tau\beta\gamma)_{ijk} + \epsilon_{ijkl}, \quad \begin{cases} i = 1, 2, \ldots, a \\ j = 1, 2, \ldots, b \\ k = 1, 2, \ldots, c \\ l = 1, 2, \ldots, n \end{cases}$$ The analysis of this design is given below.



Now, assume that all three factors are random. The three-factor random-effects model is $$y_{ijkl} = \mu + \tau_i + \beta_j + \gamma_k + (\tau\beta)_{ij} + (\tau\gamma)_{ik} + (\beta\gamma)_{jk} + (\tau\beta\gamma)_{ijk} + \epsilon_{ijkl}$$ Assumptions:

  • $\tau_i \sim N(0, \sigma_\tau^2)$, $\beta_j \sim N(0, \sigma_\beta^2)$, $\gamma_k \sim N(0, \sigma_\gamma^2)$
  • $(\tau\beta)_{ij} \sim N(0, \sigma_{\tau\beta}^2)$, $(\tau\gamma)_{ik} \sim N(0, \sigma_{\tau\gamma}^2)$, $(\beta\gamma)_{jk} \sim N(0, \sigma_{\beta\gamma}^2)$
  • $(\tau\beta\gamma)_{ijk} \sim N(0, \sigma_{\tau\beta\gamma}^2)$
  • $\epsilon_{ijkl} \sim N(0, \sigma^2)$
  • All the random effects are pairwise independent

The expected mean squares assuming that all the factors are random are

  • What is the test statistic for $H_0: \sigma_\tau^2 = 0$?

  • For the three-factor random-effects model there is no exact test statistic for certain effects; e.g., for $H_0: \sigma_\tau^2 = 0$ one possible test statistic is $$F_0 = \frac{MS_A}{MS_{ABC}}, \qquad \frac{E(MS_A)}{E(MS_{ABC})} = \frac{\sigma^2 + cn\sigma_{\tau\beta}^2 + bn\sigma_{\tau\gamma}^2 + n\sigma_{\tau\beta\gamma}^2 + bcn\sigma_\tau^2}{\sigma^2 + n\sigma_{\tau\beta\gamma}^2},$$ which would be useful only if the interaction components $\sigma_{\tau\beta}^2$ and $\sigma_{\tau\gamma}^2$ were negligible.

  • If we cannot assume that certain interactions are negligible and we need to make inferences about effects for which exact tests do not exist, Satterthwaite's method can be used.

  • Satterthwaite's method uses linear combinations of mean squares, for example $$MS' = MS_r + \cdots + MS_s \qquad MS'' = MS_u + \cdots + MS_v,$$ chosen so that $E(MS') - E(MS'')$ is equal to a multiple of the effect (the model parameter or variance component) considered in the null hypothesis.


Then the test statistic would be $$F = \frac{MS'}{MS''},$$ which is distributed approximately as $F_{p,q}$, where $$p = \frac{(MS_r + \cdots + MS_s)^2}{MS_r^2/f_r + \cdots + MS_s^2/f_s} \qquad q = \frac{(MS_u + \cdots + MS_v)^2}{MS_u^2/f_u + \cdots + MS_v^2/f_v}$$ In $p$ and $q$, $f_i$ is the number of degrees of freedom associated with the mean square $MS_i$.


For our example, to test the null hypothesis $H_0: \sigma_\tau^2 = 0$ we can use the test statistic $$F = \frac{MS_A + MS_{ABC}}{MS_{AB} + MS_{AC}}, \qquad \frac{E(MS_A + MS_{ABC})}{E(MS_{AB} + MS_{AC})} = \frac{2\sigma^2 + cn\sigma_{\tau\beta}^2 + bn\sigma_{\tau\gamma}^2 + 2n\sigma_{\tau\beta\gamma}^2 + bcn\sigma_\tau^2}{2\sigma^2 + cn\sigma_{\tau\beta}^2 + bn\sigma_{\tau\gamma}^2 + 2n\sigma_{\tau\beta\gamma}^2}$$


  • Under $H_0$, the statistic $F$ follows approximately an $F$-distribution with $p$ and $q$ degrees of freedom, where $$p = \frac{(MS_A + MS_{ABC})^2}{MS_A^2/f_A + MS_{ABC}^2/f_{ABC}} \qquad q = \frac{(MS_{AB} + MS_{AC})^2}{MS_{AB}^2/f_{AB} + MS_{AC}^2/f_{AC}}$$ and $f_A$ is the degrees of freedom associated with $MS_A$, and similarly for the other mean squares.
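Putting the pieces together as a sketch: with placeholder mean squares and degrees of freedom (not values from the text), Satterthwaite's statistic and its approximate degrees of freedom can be computed; SciPy's F distribution accepts the fractional df the formulas produce.

```python
from scipy import stats

# Placeholder mean squares and dfs for testing H0: sigma_tau^2 = 0
ms = {"A": 40.0, "AB": 12.0, "AC": 10.0, "ABC": 6.0}
f = {"A": 2, "AB": 6, "AC": 4, "ABC": 12}

num = ms["A"] + ms["ABC"]    # MS'  = MS_A + MS_ABC
den = ms["AB"] + ms["AC"]    # MS'' = MS_AB + MS_AC
F = num / den

# Satterthwaite's approximate degrees of freedom
p = num**2 / (ms["A"]**2 / f["A"] + ms["ABC"]**2 / f["ABC"])
q = den**2 / (ms["AB"]**2 / f["AB"] + ms["AC"]**2 / f["AC"])
p_value = stats.f.sf(F, p, q)

print(round(F, 2), round(p, 1), round(q, 1), round(p_value, 3))
```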