Paired T Test in R

Jan 2

Introduction

A paired t test in R compares two sets of measurements taken on the same (or matched) subjects for a continuous variable. In this blog entry, we’ll show you how to run a paired t test in R and pair it with appropriate graphics.

Enter Data

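Let’s use a hypothetical example of 31 people whose number of dates was recorded in 2021 and again in 2022, after they got help from a dating coach. The first 31 values are the 2021 (pre) counts and the last 31 are the 2022 (post) counts, listed in the same person order. Try entering these data into R using: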

# First 31 values: dates in 2021 (pre); last 31 values: dates in 2022 (post)
data <- data.frame(
  dates = c(12, 15, 0, 23, 18, 10, 10, 9, 8, 10,
            25, 21, 13, 18, 3, 21, 22, 12, 12, 11,
            3, 15, 28, 41, 0, 0, 7, 10, 12, 12,
            12,
            15, 18, 2, 23, 19, 11, 11, 7, 10, 12,
            35, 31, 11, 16, 7, 23, 22, 12, 12, 11,
            7, 12, 30, 50, 4, 4, 8, 9, 11, 18,
            22),
  coach = c(rep('pre', 31), rep('post', 31)))
print(data)
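
Before running the test, it can help to confirm that the data were entered as intended. The sketch below rebuilds the same data frame from two helper vectors, pre and post (names introduced here for illustration only; they are not part of the original code), and then checks the group sizes and group means.

```r
# Same 62 values as above, split into two illustrative vectors
pre  <- c(12, 15, 0, 23, 18, 10, 10, 9, 8, 10,
          25, 21, 13, 18, 3, 21, 22, 12, 12, 11,
          3, 15, 28, 41, 0, 0, 7, 10, 12, 12, 12)
post <- c(15, 18, 2, 23, 19, 11, 11, 7, 10, 12,
          35, 31, 11, 16, 7, 23, 22, 12, 12, 11,
          7, 12, 30, 50, 4, 4, 8, 9, 11, 18, 22)
data <- data.frame(dates = c(pre, post),
                   coach = c(rep('pre', 31), rep('post', 31)))

# Each group should contain 31 observations
group_sizes <- table(data$coach)
print(group_sizes)

# Mean number of dates in each group
group_means <- tapply(data$dates, data$coach, mean)
print(group_means)
```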

Run the T Test

You can use the following code for a paired-samples t test, building on the code provided above:

t.test(dates~coach, data=data, paired=TRUE)
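
If the formula call above is rejected, one alternative is to pass the two sets of measurements as separate vectors. To our understanding, newer R releases (4.4.0 onward) stop with an error when paired is passed to the formula method, so the two-vector form is a safer sketch across versions. The vector names pre and post are introduced here for illustration, and the sketch assumes both vectors list the 31 people in the same order.

```r
# Dates before (pre) and after (post) coaching, same person order in both
pre  <- c(12, 15, 0, 23, 18, 10, 10, 9, 8, 10,
          25, 21, 13, 18, 3, 21, 22, 12, 12, 11,
          3, 15, 28, 41, 0, 0, 7, 10, 12, 12, 12)
post <- c(15, 18, 2, 23, 19, 11, 11, 7, 10, 12,
          35, 31, 11, 16, 7, 23, 22, 12, 12, 11,
          7, 12, 30, 50, 4, 4, 8, 9, 11, 18, 22)

# x = post, y = pre, so the reported mean difference is post minus pre
res <- t.test(post, pre, paired = TRUE)
print(res)
```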

Here’s what you get:

With the formula interface, R orders the levels of coach alphabetically, so post (the number of dates after working with the dating coach) is the first group and pre (the number of dates before working with the dating coach) is the second. R then subtracts the second group from the first. Therefore, the mean difference (2.26) means that people in this dataset had an average of 2.26 more dates after working with the dating coach. The statistical significance of this difference is discussed below.
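
The mean difference and the test statistic can also be reproduced by hand, which makes the arithmetic behind the paired test explicit. This is a minimal sketch, assuming the same 62 values as above are split into two illustrative vectors, pre and post (names not used in the original code): the paired t statistic is the mean of the per-person differences divided by its standard error, with n - 1 degrees of freedom.

```r
pre  <- c(12, 15, 0, 23, 18, 10, 10, 9, 8, 10,
          25, 21, 13, 18, 3, 21, 22, 12, 12, 11,
          3, 15, 28, 41, 0, 0, 7, 10, 12, 12, 12)
post <- c(15, 18, 2, 23, 19, 11, 11, 7, 10, 12,
          35, 31, 11, 16, 7, 23, 22, 12, 12, 11,
          7, 12, 30, 50, 4, 4, 8, 9, 11, 18, 22)

d         <- post - pre            # per-person change in number of dates
n         <- length(d)             # number of pairs
mean_diff <- mean(d)               # average change
se_diff   <- sd(d) / sqrt(n)       # standard error of the average change
t_stat    <- mean_diff / se_diff   # paired t statistic
df        <- n - 1                 # degrees of freedom

# Two-tailed p value from the t distribution
p_two_sided <- 2 * pt(-abs(t_stat), df)

print(round(mean_diff, 2))  # 2.26, matching the mean difference in the text
print(t_stat)
print(p_two_sided)
```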

Interpret the Results

In a paired samples t test, there is one p value for a two-tailed hypothesis and another for a one-tailed hypothesis. The results above (including p = .001492) are for a two-tailed test in which your hypotheses are as follows:

H0: The mean number of dates before working with the dating coach is equal to the mean number of dates after working with the dating coach.
HA: The mean number of dates before working with the dating coach is not equal to the mean number of dates after working with the dating coach.

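If you hypothesized in advance that people go on more dates after working with the dating coach, you could instead run a one-tailed test. Because R computes the difference as post minus pre, alternative='greater' tests whether that mean difference is above zero: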

t.test(dates~coach, data=data, paired=TRUE, alternative='greater')
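
The same one-tailed test can be sketched with the two-vector form of t.test, which may be useful if your R version rejects paired in the formula call. The vectors pre and post are illustrative names for the first and last 31 values of the data above. With post passed first, alternative = 'greater' tests whether the mean of post minus pre exceeds zero, and printing both p values lets you compare the one-tailed and two-tailed results side by side.

```r
pre  <- c(12, 15, 0, 23, 18, 10, 10, 9, 8, 10,
          25, 21, 13, 18, 3, 21, 22, 12, 12, 11,
          3, 15, 28, 41, 0, 0, 7, 10, 12, 12, 12)
post <- c(15, 18, 2, 23, 19, 11, 11, 7, 10, 12,
          35, 31, 11, 16, 7, 23, 22, 12, 12, 11,
          7, 12, 30, 50, 4, 4, 8, 9, 11, 18, 22)

# Two-tailed test: is the mean of (post - pre) different from zero?
two_sided <- t.test(post, pre, paired = TRUE)

# One-tailed test: is the mean of (post - pre) greater than zero?
one_sided <- t.test(post, pre, paired = TRUE, alternative = 'greater')

print(two_sided$p.value)
print(one_sided$p.value)
```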

Here, your p value is half of what it was for the two-tailed version of the hypothesis test: