In the crucible of thesis writing, even the most diligent graduate students can overlook the importance of reporting effect sizes alongside statistical significance. While p-values are indispensable for establishing that your results are statistically significant, effect size, particularly in the context of ANOVA (Analysis of Variance), deserves equal attention. In this blog, we look at ANOVA effect sizes using practical R examples and scenarios commonly encountered in graduate theses, advanced undergraduate projects, and PhD research.
Consider a scenario where you're exploring the impact of different study environments on the test performance of students diagnosed with ADHD. You categorize the environments into three groups: Listening to synthwave music, classical music, and no music (silence) before a test. To assess the effectiveness of these conditions, you conduct an ANOVA with 60 students, randomly assigned to the three groups, and measure their test performance on a scale of 0 to 100. Here’s how you can simulate this study and analyze the data in R:
# Load necessary library
library(ggplot2)

# Set seed for reproducibility
set.seed(123)

# Generate mock data for ANOVA
n_group <- 20 # Number of participants in each group
performance_synth <- rnorm(n_group, mean = 80, sd = 10)     # Synthwave group
performance_classical <- rnorm(n_group, mean = 75, sd = 10) # Classical music group
performance_silence <- rnorm(n_group, mean = 70, sd = 10)   # Silence group

data <- data.frame(
  environment = factor(rep(c("Synthwave", "Classical", "Silence"), each = n_group)),
  performance = c(performance_synth, performance_classical, performance_silence)
)

# Conduct ANOVA to compare group means
anova_result <- aov(performance ~ environment, data = data)

# Display ANOVA summary
anova_summary <- summary(anova_result)
print(anova_summary)

# Calculate Eta-squared (η²) for effect size
model_ss <- anova_summary[[1]]$"Sum Sq"[1]    # Sum of squares for the model (effect)
residual_ss <- anova_summary[[1]]$"Sum Sq"[2] # Sum of squares for residuals (error)
total_ss <- model_ss + residual_ss            # Total sum of squares
eta_squared <- model_ss / total_ss            # Calculate Eta-squared
print(paste("Eta-squared (η²):", eta_squared))

# Visualize data
ggplot(data, aes(x = environment, y = performance, fill = environment)) +
  geom_boxplot() + # Create boxplots for each group
  labs(title = "Test Performance by Study Environment",
       x = "Study Environment",
       y = "Test Performance") +
  theme_minimal() # Use a minimal theme for a clean look
Upon analyzing the data, suppose the ANOVA indicates a statistically significant difference among the study environments in influencing test performance.
Great, your model is significant, but what does that mean in practice? If someone asks about the real-world impact of your findings, how would you respond? It is entirely possible for a result to be statistically significant yet have a negligible practical effect.
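To see why, consider a quick sketch (the groups, means, and sample size here are invented purely for illustration): with a large enough sample, even a trivial half-point difference between group means comes out statistically significant, while the effect size stays far below even the "small" benchmark.

```r
# Two groups whose true means differ by only 0.5 points (sd = 10):
# a trivial difference that a huge sample nevertheless flags as "significant".
set.seed(42)
n_big <- 20000 # 20,000 participants per group
sim <- data.frame(
  group = factor(rep(c("A", "B"), each = n_big)),
  score = c(rnorm(n_big, mean = 70.0, sd = 10),
            rnorm(n_big, mean = 70.5, sd = 10))
)
sim_summary <- summary(aov(score ~ group, data = sim))
p_value <- sim_summary[[1]]$"Pr(>F)"[1]
eta_sq  <- sim_summary[[1]]$"Sum Sq"[1] / sum(sim_summary[[1]]$"Sum Sq")
print(p_value) # well below 0.05 at this sample size
print(eta_sq)  # yet under 0.01: a negligible effect
```

A reviewer who only saw the p-value here would think something important happened; the effect size tells the real story.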
Enter effect size for ANOVA, commonly measured by Eta-squared (η²) or Partial Eta-squared (η²p), which offers a remedy to this predicament. Unlike Cohen’s d for t-tests, which measures the standardized difference between two means, η² in the context of ANOVA quantifies the proportion of the total variance in the dependent variable that is attributable to the factor (the independent variable). In a one-way design like ours, η² and η²p are identical; they differ only when the model contains multiple factors.
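To build intuition for "proportion of variance explained," note that in a one-way ANOVA η² is exactly the R² you would get from fitting the same model with lm(), since both are SS_model / SS_total. A minimal sketch with mock data (the group means here are invented for illustration):

```r
# For a one-way design, Eta-squared equals R-squared:
# both are the model sum of squares divided by the total sum of squares.
set.seed(1)
d <- data.frame(
  g = factor(rep(c("a", "b", "c"), each = 30)),
  y = rnorm(90, mean = rep(c(10, 12, 15), each = 30), sd = 3)
)
tab <- summary(aov(y ~ g, data = d))[[1]]
eta_sq <- tab$"Sum Sq"[1] / sum(tab$"Sum Sq")
r_sq <- summary(lm(y ~ g, data = d))$r.squared
all.equal(eta_sq, r_sq) # TRUE
```

So if you already know how to read R² ("the model explains X% of the variance"), you already know how to read η².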
Interpret η² using Cohen’s conventional benchmarks:
- Small effect: η² = 0.01.
- Medium effect: η² = 0.06.
- Large effect: η² = 0.14.
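These benchmarks are easy to turn into a small labelling helper (a hypothetical convenience function, not part of base R, and no substitute for interpreting η² in the context of your field):

```r
# Map an Eta-squared value onto Cohen's conventional labels.
# Values below the "small" cutoff are labelled "negligible".
interpret_eta_squared <- function(eta_sq) {
  if (eta_sq >= 0.14) "large"
  else if (eta_sq >= 0.06) "medium"
  else if (eta_sq >= 0.01) "small"
  else "negligible"
}
interpret_eta_squared(0.20) # "large"
interpret_eta_squared(0.03) # "small"
```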
Now look at the Eta-squared value printed by the code above. With group means of 80, 75, and 70 and a standard deviation of 10, it should clear Cohen’s 0.14 benchmark, so you can describe the effect of study environment on test performance as large. Isn’t that far more informative than simply reporting that the result was statistically significant?
BridgeText can help you with all of your statistics needs.