Even otherwise conscientious and skilled graduate students can forget the need to provide effect sizes as a complement to statistical significance. Of course, everyone knows that inferential results need an accompanying p value, but what about effect size? In this blog, we’ll walk you through what effect sizes are, using practical examples in R and scenarios that arise in many master’s theses and PhD dissertations (and even in more advanced undergraduate work).
Let’s say you’re interested in the relationship between the diversity of companies and their financial performance. Assume that you have a single diversity variable, called dive, and a single financial performance variable, called fp. You run an ordinary least squares regression model in which diversity is a predictor of financial performance. Let’s assume you did so for 1,000 companies, and let’s create some mock data in R that produce a statistically significant result that can benefit from a presentation and discussion of effect size.
Start with looking at the output:
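One way to set this up is the minimal sketch below. The variable names dive and fp follow the scenario in the text, while the seed, the noise distribution, and the true coefficient of 0.2 are assumptions chosen so that the effect comes out statistically significant but weak:

```r
set.seed(42)                    # reproducibility; the seed value is arbitrary
n    <- 1000                    # 1,000 companies, as in the scenario
dive <- rnorm(n)                # mock diversity scores
fp   <- 0.2 * dive + rnorm(n)   # weak true effect plus random noise

model <- lm(fp ~ dive)          # OLS: diversity predicting financial performance
summary(model)                  # coefficients, p values, R-squared, adjusted R-squared
```

With a sample this large, even a small true coefficient like 0.2 will typically produce a very small p value for dive, while leaving most of the variance in fp unexplained.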
Great, your model is significant, but what does it mean? What if someone asks you about the practical effect of your findings? After all, it’s possible for findings to be statistically significant but too weak to matter much in practice, which is the case with these mock findings.
You want to pay attention to the R-squared (R2) if you have only one predictor and the adjusted R2 if you have more than one predictor. R2 goes up automatically when you add predictors, even if they aren’t significant, and adjusted R2 corrects for this tendency, so that’s the effect size measure you should look at if your regression model has two or more predictors.
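Assuming the fitted model from the mock-data example is stored in an object called model, both quantities can be pulled directly from the model summary:

```r
summary(model)$r.squared       # R-squared
summary(model)$adj.r.squared   # adjusted R-squared, penalized for extra predictors
```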
You can also generate another measure of effect size called Cohen’s f squared:
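Base R has no built-in function for Cohen’s f2, but it follows directly from R2 as f2 = R2 / (1 − R2); the sketch below assumes the fitted model from the mock-data example is stored in model:

```r
r2 <- summary(model)$r.squared
f2 <- r2 / (1 - r2)   # Cohen's f2: explained variance relative to unexplained variance
f2
```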
R2 provides insight into the proportion of variance in the dependent variable that can be predicted from the independent variables. For instance, an R2 value of 0.03 suggests that 3% of the variability in financial performance can be explained by diversity. This measure is a direct reflection of your model's ability to capture the underlying patterns in your data. The higher R2 is, the better your model explains the dependent variable, guiding researchers in evaluating the effectiveness of their model. However, if R2 is 1, you might have picked a predictor that is somehow the same as the outcome!
Cohen’s f2 is a measure of effect size used to quantify the importance of the relationship between predictors and the outcome variable in a regression model. It is calculated as the ratio of the variance explained by the model to the variance unexplained. This metric helps in understanding the impact of adding predictors to your model. Cohen also suggested thresholds for interpreting f2 as small (0.02), medium (0.15), and large (0.35) effects, providing a scale for evaluating the impact of predictors.
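Those thresholds can be encoded in a small helper for quick interpretation; interpret_f2 is a hypothetical function name used here for illustration, not part of any package:

```r
# Classify f2 against Cohen's suggested benchmarks (0.02, 0.15, 0.35)
interpret_f2 <- function(f2) {
  if (f2 >= 0.35) "large"
  else if (f2 >= 0.15) "medium"
  else if (f2 >= 0.02) "small"
  else "negligible"
}

interpret_f2(0.031)  # → "small": above 0.02 but well below the 0.15 medium threshold
```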
Both R2 and Cohen’s f2 offer critical insights but from slightly different perspectives. R2 is particularly useful for assessing how well the entire model fits the data. In contrast, Cohen’s f2 offers a way to evaluate the relative impact of individual predictors or groups of predictors within the model, making it indispensable for refining models and understanding which variables contribute most to explaining the variance in the outcome.
For graduate students, these measures are not just numbers but tools that offer a deeper understanding of the data and the relationships within. They guide the interpretation of statistical models, inform decisions on model adjustment, and ultimately contribute to the robustness of research findings. Mastery of R2 and Cohen’s f2 empowers researchers to present their findings confidently, knowing they have quantitatively assessed the strength and significance of their models’ predictors.
Here’s a handy summary of these two measures of effect size: R2 (or adjusted R2 with two or more predictors) tells you what proportion of the variance in the outcome your model as a whole explains, while Cohen’s f2, computed as R2 / (1 − R2), expresses explained variance relative to unexplained variance, with 0.02, 0.15, and 0.35 serving as Cohen’s benchmarks for small, medium, and large effects.
BridgeText can help you with all of your statistics needs.