Even though psychologists are often interested in comparing mean values between conditions, it can also be of interest whether the dispersion, i.e., the variance of the measurements, differs between conditions, even if just to test the validity of the assumption of homogeneous variances that many statistical procedures make. As far as I can tell, there does not yet seem to be an accessible, well-established solution for comparing variances using a Bayes factor.1 Here, I discuss two such options.

Independent variances

The Levene test is a well-known test for comparing several variances; it is commonly used to check the assumption of variance “homogeneity” before conducting an ANOVA. However, the logic of significance testing forbids interpreting a non-significant result as evidence of variance equality; we may only conclude the absence of variance homogeneity when the p value is small. Wouldn’t it be nice to have a test that can yield direct evidence in favor of variance homogeneity? This is what a Bayes factor can do.

Apparently, the output of a Levene test can conveniently be converted to a Bayes factor.2 According to Wikipedia, the “Levene test is equivalent to a 1-way between-groups […] (ANOVA) with the dependent variable being the absolute value of the difference between a score and the mean of the group to which the score belongs […].” Thus, the Levene test is an ANOVA, and there are known methods to compute Bayes factors for ANOVA designs. Because I think this functionality is needed, I implemented it in R. You can install my small package varBF as follows:

# if the package `remotes` is not available, type:
# install.packages("remotes")
remotes::install_github("m-Py/varBF")

The package varBF calls the function oneWayAOV.Fstat() from the awesome R package BayesFactor (Morey & Rouder, 2015), which converts an F value from a one-way ANOVA into a Bayes factor.
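
To make this concrete, here is a minimal sketch of how such a Levene-to-Bayes-factor computation can be done by hand. This is my reading of the approach, so indepvarBF() may differ in its details:

# Sketch: the Levene test is a one-way ANOVA on the absolute deviations
# from the group means; the resulting F value is converted to a Bayes factor
levene_bf_sketch <- function(x, y) {
  abs_dev <- c(abs(x - mean(x)), abs(y - mean(y)))
  group <- factor(rep(c("x", "y"), c(length(x), length(y))))
  fit <- anova(lm(abs_dev ~ group))
  # convert the F value to a Bayes factor (assumes a balanced design)
  BayesFactor::oneWayAOV.Fstat(
    F = fit["group", "F value"],
    N = length(x),   # observations per group
    J = 2,           # number of groups
    simple = TRUE    # return the Bayes factor as a plain number
  )
}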

As an example, let’s assume we have collected lots of data, e.g., 2,000 normally distributed observations across two independent groups:

# Generate data from a normal distribution with mean = 100 
# and standard deviation = 15
group1 <- rnorm(1000, mean = 100, sd = 15)
group2 <- rnorm(1000, mean = 100, sd = 15)
sd(group1)
## [1] 15.61635
sd(group2)
## [1] 14.82636

Here, the true variances (and hence the standard deviations) are the same, and the standard deviations observed in the two samples are also quite similar to each other. Let’s test whether we also obtain convincing statistical evidence favoring the null hypothesis by computing a Bayes factor:

library(varBF)
indepvarBF(group1, group2)
## [1] 0.07301679

The result is the Bayes factor \(BF_{10}\), i.e., the probability of the data (here: the dissimilarity of the variances) under the alternative hypothesis that the variances differ, relative to the probability of the data under the null hypothesis of equal variances.3 To assess the degree to which the data favor the null hypothesis, we compute the inverse Bayes factor \(BF_{01}\) as follows:

1 / indepvarBF(group1, group2)
## [1] 13.69548

Ergo, the data favor the null hypothesis over the alternative hypothesis by a factor of approximately 14. The data yield evidence that the variances are equal; the classical Levene test could not have given this result.
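
For contrast, here is what the classical Levene test would say about the same data; a small sketch, assuming the car package is installed. However large the resulting p value is, it cannot be interpreted as evidence for equal variances:

# Classical Levene test on the same data (requires the `car` package)
scores <- c(group1, group2)
grouping <- factor(rep(c("group1", "group2"), c(length(group1), length(group2))))
car::leveneTest(scores, grouping, center = mean)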

Next, let’s check out an example of how the function behaves when the true variances are different:4

group1 <- rnorm(200, mean = 100, sd = 15)
group2 <- rnorm(200, mean = 100, sd = 20)
sd(group1)
## [1] 15.06566
sd(group2)
## [1] 21.60039
indepvarBF(group1, group2)
## [1] 2351.344

Here, the true standard deviations are 15 and 20, and I generated 200 data points from each distribution. The Bayes factor \(BF_{10}\) favoring the alternative hypothesis is approximately 2351. Note that the Bayes factor may vary considerably when repeating the process.
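
One quick way to get a feeling for this variability is to rerun the simulation a few times, for instance as follows (the exact values will differ on every run):

# Repeat the simulation several times to see how much the Bayes factor varies
replicate(5, {
  g1 <- rnorm(200, mean = 100, sd = 15)
  g2 <- rnorm(200, mean = 100, sd = 20)
  indepvarBF(g1, g2)
})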

I found this approach to testing the equality of variances interesting and useful. There is (at least) one catch, however: according to the BayesFactor documentation, the conversion from F value to Bayes factor is only expected to work safely for balanced designs, i.e., when all groups have the same size.

Dependent variances

When comparing any measures resulting from paired data, the dependence in the data should be taken into account. For example, we have the great paired t test to test whether dependent means differ. As I learned very recently, there is a simple frequentist test to assess whether variances from paired data differ, the “Morgan-Pitman” test (Wilcox, 2015). The Morgan-Pitman test reduces to testing the “nullity” of a correlation, which I found quite odd at first. What should be correlated when we actually compare variances? If we have paired vectors of data x and y, we test the correlation between \(x + y\) and \(x - y\). That is, for each pair of observations, we compute the sum and the difference, and across all observations, we correlate the sum with the difference. This works because \(\mathrm{Cov}(x + y, x - y) = \mathrm{Var}(x) - \mathrm{Var}(y)\), so the correlation is zero exactly when the two variances are equal. If the correlation significantly differs from zero, we conclude that the variances of \(x\) and \(y\) differ.
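
In code, the classical Morgan-Pitman test is thus nothing more than a correlation test on the sum and the difference. A minimal sketch with made-up paired data:

# Morgan-Pitman test: correlate the sum and the difference of paired scores
x <- rnorm(150, mean = 100, sd = 15)
y <- 0.5 * x + rnorm(150, mean = 50, sd = 20)  # paired, correlated scores
cor.test(x + y, x - y)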

The Morgan-Pitman approach can be extended to a Bayesian method,5 because Bayes factors assessing the nullity of a correlation exist. My implementation (the function depvarBF()) in the package varBF wraps the correlationBF() function from the BayesFactor package. To illustrate its functionality, I have to generate some correlated paired data. To this end, I define a function that generates bivariate normal paired data by calling the R package MASS:

# Generate bivariate normal data with specified correlation
#
# param n: how many data points
# param m1: the mean of the first variable
# param m2: the mean of the second variable
# param sd1: the standard deviation of the first variable
# param sd2: the standard deviation of the second variable
# param r: the »true« correlation between the two measures
# 
# return: the data set (two columns with random normal data generated 
#     via MASS::mvrnorm)
paired_data <- function(n, m1, m2, sd1, sd2, r) {
  cor_matrix <- matrix(c(1, r, r, 1), ncol = 2)
  sds <- c(sd1, sd2)
  # the covariance matrix is the outer product of the standard deviations,
  # multiplied element-wise by the correlation matrix
  cov_matrix <- (sds %*% t(sds)) * cor_matrix
  MASS::mvrnorm(n, mu = c(m1, m2), Sigma = cov_matrix)
}

Now let’s generate 200 correlated pairs of data points (\(r = .3\)) with different standard deviations for the two measures, and then check the Bayes factor \(BF_{10}\):

pairs <- paired_data(
  n = 200, 
  m1 = 100, 
  m2 = 100, 
  sd1 = 15, 
  sd2 = 20, 
  r = 0.3
)
# check out the observed correlation (true correlation is .3)
cor(pairs[, 1], pairs[, 2])
## [1] 0.3537796
# check out the Bayes factor: 
depvarBF(pairs[, 1], pairs[, 2])
## [1] 49.74107

The data favor the alternative hypothesis over the null hypothesis by a factor of approximately 50. Let’s repeat the procedure, but this time assuming that the true variances are equal; we then check out the inverse Bayes factor \(BF_{01}\):

pairs <- paired_data(
  n = 200, 
  m1 = 100, 
  m2 = 100, 
  sd1 = 15, 
  sd2 = 15, 
  r = 0.3
)
1 / depvarBF(pairs[, 1], pairs[, 2])
## [1] 4.046134

The data favor the null hypothesis over the alternative hypothesis by a factor of approximately 4. It is often more difficult to obtain evidence in favor of a null hypothesis.
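
Because depvarBF() wraps correlationBF(), essentially the same result should be obtainable by applying correlationBF() directly to the sum and the difference of the paired scores. A sketch, assuming that this is indeed what depvarBF() does internally and that the prior settings match:

# Presumably equivalent: a Bayesian correlation test on sum and difference
x <- pairs[, 1]
y <- pairs[, 2]
BayesFactor::correlationBF(x + y, x - y)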

Conclusion

I believe that there is a need for Bayes factors comparing variances, but so far no accessible solution seems to exist. I discussed two “non-standard” Bayes factors for the comparison of variances. As I am not an expert in coming up with Bayes factors, these approaches should be assessed by other people before we all start using them. The code (it is really not that much) for the two functions depvarBF() and indepvarBF() is available from here and here.

Update: As discussed in my later posts, there is now the R package BFpack that enables the computation of Bayes factors for the comparison of independent variances. As far as I know, however, an option for dependent variances is still lacking.

References

Morey, R. D., & Rouder, J. N. (2015). BayesFactor: Computation of Bayes factors for common designs. Retrieved from https://CRAN.R-project.org/package=BayesFactor

Rouder, J. N., Morey, R. D., Speckman, P. L., & Province, J. M. (2012). Default Bayes factors for ANOVA designs. Journal of Mathematical Psychology, 56, 356–374.

Wilcox, R. (2015). Comparing the variances of two dependent variables. Journal of Statistical Distributions and Applications, 2, 1–8.


Last updated: 2020-10-22


  1. According to E.J. Wagenmakers, who commented here earlier this year, the Bayesian flagship software JASP does not yet have an option for testing the equality of variances, but apparently their group is on it.

  2. I did not come up with this myself; I stole the idea from this forum thread.

  3. Note that I do not discuss the exact nature of the competing alternative hypothesis that is being used. See Rouder, Morey, Speckman, and Province (2012) and the R help page for the BayesFactor function anovaBF() for more information. My function uses the default prior option “medium”, which can, however, be adjusted via the argument rscale.

  4. To evaluate the method, a more systematic simulation assessing its performance under varying circumstances (such as sample size, effect size, different distributions, etc.) is needed. I conducted two such simulations here and here.

  5. In this case, I thought of the approach myself, but the “Schöpfungshöhe” (threshold of originality) of the idea is not that great, I guess.