3

I am actually working on an exercise for my students dealing with multiple comparisons. However, I was having trouble trying to simulate some data in R to demonstrate the problem with multiple comparisons. I was hoping someone might have some R code or at least give some guidance on how to generate a dataset in R to demonstrate the multiple comparisons problem. So first I would like to generate some data (rnorm(), runif, etc.). Then I would like to show how if I pairwise compare the parameters for 2 levels of a multiple level predictor, I am more likely to get a significant result than the standard alpha level of 0.5, or 95% confidence.

Thanks.

CC BY-SA 3.0

    2 Answers 2

    Reset to default
    5

    Let us simulate 100realisations of N(0,1)and test the null hypothesis H0:μ=0with the t-test. If this is done many times, H0should be rejected approximately 5%of the times.

    pval <- rep(NA, 1e4)
    for(k in 1:1e4)
    {
      x <- rnorm(100)
      pval[k] <- t.test(x=x, mu=0)$p.value
    }
    
    > mean(pval < 0.05)
    [1] 0.0495
    

    If we now simulate 5random variables and test the null hypothesis that all means are simultaneously 0, then the probability of at least one significant result is larger than 5%; actually it is 1(10.05)5=0.226.

    n <- 5
    pval <- matrix(NA, nrow=1e4, ncol=n)
    for(k in 1:1e4)
    {
      X <- replicate(n=n, expr=rnorm(100))
      pval[k, ] <- apply(X=X, MARGIN=2, FUN=function(x){
        t.test(x=x, mu=0)$p.value
      })  
    }
    
    > mean(sapply(X=1:nrow(pval), FUN=function(i){
    +   any(pval[i, ] < 0.05)
    + }))
    [1] 0.2203
    

    But if you use the Bonferroni correction, then

    > mean(sapply(X=1:nrow(pval), FUN=function(i){
    +   any(pval[i, ] < 0.05 / n)
    + }))
    [1] 0.0487
    
    CC BY-SA 3.0
      2

      You are in Luck I had the same question yesterday myself. I build a function that lets you visualize the issue.

      # Say you have a range of tests and you would want to see the probabilty of multiplicity
      ntests <- (1:50)
      
        multicomp.issue <- function(a,k) { # a is my alpha level, k is how many different tests i am doing
          b <- rep(NA,length(k))
          for (i in k) { 
            b[i] <-1-(1-a)^k[i] # multicomp issue formula
          } 
      
          plot(b, ylim = 0:1, 
               ylab = "Probability",
               main ="Multiplicity error", type = "p", pch = 15)
          abline(h=0.5, col = "red") # 50% 
          abline(h=0.05, col = "green") # what people usually use as significance level 
      
          print("The family wise error ratre is equal to") # this shows you the numbers in the console. 
          print(b)}
      multicomp.issue(0.05, tries)  # Test
      

      When you run that code in your R you should get the following plot.
      Of course I advise you play with different a values and different amounts of tests.
      I hope this is usefull to you.

      enter image description here

      CC BY-SA 4.0

        Your Answer

        By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

        Not the answer you're looking for? Browse other questions taggedor ask your own question.