Graphical display of data for two-way analysis of variance

Produces a rotatable graphic (controlled by the mouse) to display all data points for any two way analysis of variance.

Usage

granova.2w(data, formula = NULL, fit = "linear", ident = FALSE, 
       offset = NULL, ...)

Arguments

data: An N x 3 dataframe. (If it is a matrix, it will be converted to a dataframe.) Column 1 must contain response values or scores for all groups, N in all; columns 2 and 3 should be factors (or will be coerced to factors) showing levels of the two treatments. If rows are named, then for ident= TRUE, points can be identified with those labels, otherwise the row number of data is used. Note that factor levels will (generally) be reordered.
formula: Optional formula used by aov to produce the summary 2-way ANOVA table provided as output. Not used in the scatterplot.
fit: Defines whether the fitted surface will be linear (default) or some more complicated surface, e.g., quadratic, or smooth; see below.
ident: Logical, if TRUE allows interactive identification of individual points using rownames of data on graphic. If rownames are not provided then 1:N is used. Click and hold right mouse button while dragging over point. Right click white space to end.
offset: Number; if NULL then default for identify3d is used.
...: Optional arguments to be passed to scatter3d.

Details

The function depicts data points graphically in a window using the row by column set-up for a two-way ANOVA; the graphic is rotatable, controlled by the mouse. Data-based contrasts (cf. description for one-way ANOVA: granova.1w) are used to ensure a flat surface – corresponding to an additive fit (if fit = linear; see below) – for all cells. Points are displayed vertically (initially) with respect to the fitting surface. In particular, (dark blue) spheres are used to show data points for all groups. The mean for each cell is shown as a white sphere. The graphic is based on rgl and scatter3d; the graphic display can be zoomed in and out by scrolling, where the mouse is used to rotate the entire figure in a 3d representation. The row and column (factor A and B) effects have been used for spacing of the cells on the margins of the fitting surface. As noted, the first column of the input data frame must be response values (scores); the second and third columns should be integers that identify levels of the A and B factors respectively. Based on the row and column means, factor levels are first ordered (from small to large) separately for the row and column means; levels are assumed not to be ordered at the outset.

The function scatter3d is used from car (thanks, John Fox). The value of fit is passed to scatter3d and determines the surface fit to the data. The default value of fit is linear, so that interactions may be seen as departures of the cell means from a flat surface. It is possible to replace linear with any of quadratic, smooth, or additive; see help for scatter3d for details. Note in particular that a formula specified by the user (or the default) has no direct effect on the graphic, but is reflected in the console output.

For data sets above about 300 or 400 points, the default sphere size (set by sphere.size) can be quite small. The optional argument sphere.size = 2 or a similar value will increase the size of the spheres. However, the sphere sizes possible are discrete.

The table of counts for the cell means is printed (with respect the the reordered rows and columns); similarly, the table of cell means is printed (also, based on reordered rows and columns). Finally, numerical summary results derived from function aov are also printed. Although the function accommodates the case where cell counts are not all the same, or when the data are unbalanced with respect to the A & B factors, the surface can be misleading, especially in highly unbalanced data. Machine memory for this function has caused problems with some larger data sets. The authors would appreciate reports of problems or successes with larger data sets.

Note

Right click on the graphic to terminate identify and return the output from the function.

Value

Returns a list with four components:

A.effects: Reordered factor A (second column of data) effects (deviations of A-level means from grand mean)
B.effects: Reordered factor B (third column of data) effects (deviations of B-level means from grand mean)
CellCounts.Reordered: Cell sizes for all A-level, B-level combinations, with rows/columns reordered according to A.effects and B.effects.
CellMeans.Reordered: Means for all cells, i.e., A-level, B-level combinations, with rows/columns reordered according to A.effects and B.effects
anova.summary: Summary aov results, based on input data

References

Fundamentals of Exploratory Analysis of Variance, Hoaglin D., Mosteller F. and Tukey J. eds., Wiley, 1991.

Author

Robert M. Pruzek RMPruzek@yahoo.com

James E. Helmreich James.Helmreich@Marist.edu

Examples

 
# using the R dataset warpbreaks; see documentation 
#(first surface flat since fit = 'linear' (default); 
#second surface shows curvature)
granova.2w(warpbreaks)
#> Loading required namespace: rgl
#> Loading required namespace: mgcv
#> Loading required namespace: MASS
#> $wool.effects
#>     B     A 
#> -2.89  2.89 
#> 
#> $tension.effects
#>     H     M     L 
#> -6.48 -1.76  8.24 
#> 
#> $CellCounts.Reordered
#>     tension
#> wool H M L
#>    B 9 9 9
#>    A 9 9 9
#> 
#> $CellMeans.Reordered
#>     tension
#> wool    H    M    L
#>    B 18.8 28.8 28.2
#>    A 24.6 24.0 44.6
#> 
#> $aov.summary
#>              Df Sum Sq Mean Sq F value   Pr(>F)    
#> wool          1    451   450.7   3.765 0.058213 .  
#> tension       2   2034  1017.1   8.498 0.000693 ***
#> wool:tension  2   1003   501.4   4.189 0.021044 *  
#> Residuals    48   5745   119.7                     
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
granova.2w(warpbreaks, formula = breaks ~ wool + tension)
#> $wool.effects
#>     B     A 
#> -2.89  2.89 
#> 
#> $tension.effects
#>     H     M     L 
#> -6.48 -1.76  8.24 
#> 
#> $CellCounts.Reordered
#>     tension
#> wool H M L
#>    B 9 9 9
#>    A 9 9 9
#> 
#> $CellMeans.Reordered
#>     tension
#> wool    H    M    L
#>    B 18.8 28.8 28.2
#>    A 24.6 24.0 44.6
#> 
#> $aov.summary
#>             Df Sum Sq Mean Sq F value  Pr(>F)   
#> wool         1    451   450.7   3.339 0.07361 . 
#> tension      2   2034  1017.1   7.537 0.00138 **
#> Residuals   50   6748   135.0                   
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
granova.2w(warpbreaks, formula = breaks ~ wool + tension, 
fit = 'quadratic')
#> $wool.effects
#>     B     A 
#> -2.89  2.89 
#> 
#> $tension.effects
#>     H     M     L 
#> -6.48 -1.76  8.24 
#> 
#> $CellCounts.Reordered
#>     tension
#> wool H M L
#>    B 9 9 9
#>    A 9 9 9
#> 
#> $CellMeans.Reordered
#>     tension
#> wool    H    M    L
#>    B 18.8 28.8 28.2
#>    A 24.6 24.0 44.6
#> 
#> $aov.summary
#>             Df Sum Sq Mean Sq F value  Pr(>F)   
#> wool         1    451   450.7   3.339 0.07361 . 
#> tension      2   2034  1017.1   7.537 0.00138 **
#> Residuals   50   6748   135.0                   
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 

# Randomly generated data
resp <- rnorm(80, 0, .25) + rep(c(0, .2, .4, .6), ea = 20)
f1 <- rep(1:4, ea = 20)
f2 <- rep(rep(1:5, ea = 4), 4)
rdat1 <- cbind(resp, f1, f2)
granova.2w(rdat1)
#> $f1.effects
#>      1      2      3      4 
#> -0.362 -0.123  0.134  0.350 
#> 
#> $f2.effects
#>       2       4       1       5       3 
#> -0.1190 -0.0214  0.0251  0.0570  0.0585 
#> 
#> $CellCounts.Reordered
#>    f2
#> f1  2 4 1 5 3
#>   1 4 4 4 4 4
#>   2 4 4 4 4 4
#>   3 4 4 4 4 4
#>   4 4 4 4 4 4
#> 
#> $CellMeans.Reordered
#>    f2
#> f1         2       4       1      5      3
#>   1 -0.16200 -0.1200 -0.0421 0.0288 0.0318
#>   2 -0.00569  0.0618  0.2770 0.2120 0.3860
#>   3  0.32700  0.4620  0.5810 0.4160 0.4270
#>   4  0.59900  0.7460  0.5200 0.8070 0.6240
#> 
#> $aov.summary
#>             Df Sum Sq Mean Sq F value   Pr(>F)    
#> f1           3  5.730  1.9098  25.026 1.26e-10 ***
#> f2           4  0.352  0.0879   1.152    0.341    
#> f1:f2       12  0.524  0.0437   0.573    0.855    
#> Residuals   60  4.579  0.0763                     
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#
rdat2 <- cbind(rnorm(64, 10, 2), sample(1:4, 64, repl = TRUE), 
   sample(1:3, 64, repl = TRUE))
granova.2w(rdat2)
#> $X2.effects
#>      2      3      4      1 
#> -0.377 -0.118 -0.046  0.603 
#> 
#> $X3.effects
#>      3      2      1 
#> -0.278 -0.127  0.485 
#> 
#> $CellCounts.Reordered
#>    X3
#> X2  3 2 1
#>   2 4 8 5
#>   3 4 9 3
#>   4 5 7 4
#>   1 6 3 6
#> 
#> $CellMeans.Reordered
#>    X3
#> X2      3     2     1
#>   2  7.96 10.10 11.00
#>   3  9.59  9.99 11.10
#>   4  9.47 10.80  9.93
#>   1 11.90  8.76 10.80
#> 
#> $aov.summary
#>             Df Sum Sq Mean Sq F value Pr(>F)  
#> X2           3   8.13   2.711   0.830 0.4834  
#> X3           2   5.57   2.784   0.853 0.4322  
#> X2:X3        6  45.75   7.625   2.335 0.0452 *
#> Residuals   52 169.81   3.266                 
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#
#
# \donttest{
data(poison)
#Raw Survival Time as outcome measure:
granova.2w(poison[, c(4, 1, 2)])
#> $Poison.effects
#>    III     II      I 
#> -0.203  0.065  0.138 
#> 
#> $Treatment.effects
#>       A       C       D       B 
#> -0.1650 -0.0869  0.0548  0.1970 
#> 
#> $CellCounts.Reordered
#>       Treatment
#> Poison A C D B
#>    III 4 4 4 4
#>    II  4 4 4 4
#>    I   4 4 4 4
#> 
#> $CellMeans.Reordered
#>       Treatment
#> Poison     A     C     D     B
#>    III 0.210 0.235 0.325 0.335
#>    II  0.320 0.375 0.668 0.815
#>    I   0.412 0.568 0.610 0.880
#> 
#> $aov.summary
#>                  Df Sum Sq Mean Sq F value   Pr(>F)    
#> Poison            2 1.0330  0.5165  23.222 3.33e-07 ***
#> Treatment         3 0.9212  0.3071  13.806 3.78e-06 ***
#> Poison:Treatment  6 0.2501  0.0417   1.874    0.112    
#> Residuals        36 0.8007  0.0222                     
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
# Now with quadratic surface (helpful for this poor metric):
granova.2w(poison[, c(4, 1, 2)], fit = 'quadratic') 
#> $Poison.effects
#>    III     II      I 
#> -0.203  0.065  0.138 
#> 
#> $Treatment.effects
#>       A       C       D       B 
#> -0.1650 -0.0869  0.0548  0.1970 
#> 
#> $CellCounts.Reordered
#>       Treatment
#> Poison A C D B
#>    III 4 4 4 4
#>    II  4 4 4 4
#>    I   4 4 4 4
#> 
#> $CellMeans.Reordered
#>       Treatment
#> Poison     A     C     D     B
#>    III 0.210 0.235 0.325 0.335
#>    II  0.320 0.375 0.668 0.815
#>    I   0.412 0.568 0.610 0.880
#> 
#> $aov.summary
#>                  Df Sum Sq Mean Sq F value   Pr(>F)    
#> Poison            2 1.0330  0.5165  23.222 3.33e-07 ***
#> Treatment         3 0.9212  0.3071  13.806 3.78e-06 ***
#> Poison:Treatment  6 0.2501  0.0417   1.874    0.112    
#> Residuals        36 0.8007  0.0222                     
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#
#Inverse of Survival Time as outcome measure 
#(actually rate of survival, a better version of response, clearly):
granova.2w(poison[, c(5, 1, 2)])
#> $Poison.effects
#>      I     II    III 
#> -0.822 -0.353  1.170 
#> 
#> $Treatment.effects
#>      B      D      C      A 
#> -0.760 -0.461  0.325  0.897 
#> 
#> $CellCounts.Reordered
#>       Treatment
#> Poison B D C A
#>    I   4 4 4 4
#>    II  4 4 4 4
#>    III 4 4 4 4
#> 
#> $CellMeans.Reordered
#>       Treatment
#> Poison    B    D    C    A
#>    I   1.16 1.69 1.86 2.49
#>    II  1.39 1.70 2.71 3.27
#>    III 3.03 3.09 4.26 4.80
#> 
#> $aov.summary
#>                  Df Sum Sq Mean Sq F value   Pr(>F)    
#> Poison            2  34.88  17.439  72.637 2.31e-13 ***
#> Treatment         3  20.41   6.805  28.344 1.38e-09 ***
#> Poison:Treatment  6   1.57   0.262   1.091    0.387    
#> Residuals        36   8.64   0.240                     
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#Now curvature is minimal (confirming adequacy of 
#linear model fit for this metric):
granova.2w(poison[, c(5, 1, 2)], fit = 'quadratic') 
#> $Poison.effects
#>      I     II    III 
#> -0.822 -0.353  1.170 
#> 
#> $Treatment.effects
#>      B      D      C      A 
#> -0.760 -0.461  0.325  0.897 
#> 
#> $CellCounts.Reordered
#>       Treatment
#> Poison B D C A
#>    I   4 4 4 4
#>    II  4 4 4 4
#>    III 4 4 4 4
#> 
#> $CellMeans.Reordered
#>       Treatment
#> Poison    B    D    C    A
#>    I   1.16 1.69 1.86 2.49
#>    II  1.39 1.70 2.71 3.27
#>    III 3.03 3.09 4.26 4.80
#> 
#> $aov.summary
#>                  Df Sum Sq Mean Sq F value   Pr(>F)    
#> Poison            2  34.88  17.439  72.637 2.31e-13 ***
#> Treatment         3  20.41   6.805  28.344 1.38e-09 ***
#> Poison:Treatment  6   1.57   0.262   1.091    0.387    
#> Residuals        36   8.64   0.240                     
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#
#Ranked Version of Inverse:
granova.2w(poison[, c(6, 1, 2)])
#> $Poison.effects
#>      I     II    III 
#> -0.741 -0.259  1.000 
#> 
#> $Treatment.effects
#>      B      D      C      A 
#> -0.650 -0.337  0.284  0.704 
#> 
#> $CellCounts.Reordered
#>       Treatment
#> Poison B D C A
#>    I   4 4 4 4
#>    II  4 4 4 4
#>    III 4 4 4 4
#> 
#> $CellMeans.Reordered
#>       Treatment
#> Poison    B    D    C    A
#>    I   1.11 1.67 1.82 2.39
#>    II  1.36 1.69 2.72 3.15
#>    III 3.04 3.10 3.78 4.04
#> 
#> $aov.summary
#>                  Df Sum Sq Mean Sq F value   Pr(>F)    
#> Poison            2 25.868  12.934  67.098 7.18e-13 ***
#> Treatment         3 13.349   4.450  23.085 1.65e-08 ***
#> Poison:Treatment  6  1.452   0.242   1.256    0.302    
#> Residuals        36  6.939   0.193                     
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
# }