Introduction
Logistic regression is used when the outcome (dependent) variable can take only two values, typically coded 0 and 1. It is common in quantitative academic essays, research papers, and theses that analyze binary outcomes (for example, wins or losses, heads or tails).
Example Scenario and Stata Code
Let’s imagine that, in the context of hospital research, you want to study the relationship between a patient’s age and their odds of experiencing a fall in the hospital. Falling, the dependent variable, can be coded as 0 = did not fall, 1 = fell. Age is recorded in whole years. The following code generates a mock dataset in Stata that lets you work with these data and build your intuition around logistic regression. Because the values are random draws and the code sets no seed, your estimates may differ from those reported below. Note that the very last line of code (ciplot falls) generates a handy 95% confidence interval chart to add visual support; ciplot is a community-contributed command rather than part of official Stata, so it may need to be installed first (for example, with ssc install ciplot).
set obs 400
gen age1 = runiform(73,98)
gen age2 = runiform(7,75)
replace age1 = . in 260/400
replace age2 = . in 1/259
egen age3 = rowmax(age1 age2)
drop age1 age2
gen age = round(age3)
drop age3
label variable age "Patient Age"
gen falls1 = runiform(0,1)
gen falls = round(falls1)
drop falls1
replace falls = 1 in 1/129
label variable falls "Patient Falls"
label define falls 0 "Did Not Fall" 1 "Fell"
label value falls falls
logistic falls age, or
ciplot falls
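The code above draws its values with runiform() and does not set a seed, so the estimates depend on the state of Stata's random-number generator when it runs. A minimal sketch of setup lines that could be run first is shown here; the seed value is an arbitrary assumption, and with any particular seed the odds ratio will generally not match the figures quoted in this article.

```stata
* Hypothetical setup lines, meant to run BEFORE the data-generation code.
clear                    // empty the data in memory so that set obs 400 starts fresh
set seed 20240101        // arbitrary seed (an assumption); any fixed integer works

* ciplot is community-contributed; install it from SSC only if it is missing.
capture which ciplot     // sets _rc to nonzero when ciplot cannot be found
if _rc ssc install ciplot
```

With a fixed seed, rerunning the whole script should reproduce the same mock dataset and the same logistic output on the same version of Stata.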
Logistic Regression Model
The model is statistically significant, p = .004. To interpret the odds ratio (OR) associated with age, begin by recalling that a fall is coded 1, whereas the absence of a fall is coded 0. Given this coding, an OR of exactly 1 would mean that age does not alter the odds of falling. An OR below 1 that was statistically significant would mean that age reduces the odds of falling. Here, the OR is above 1 (p = .004), meaning that age increases the odds of falling.
To express the OR of 1.011943 as a percentage change in the odds, subtract 1 and multiply by 100:
1.011943 - 1.000000 = 0.011943
0.011943 * 100 = 1.1943%
Therefore, every added year of patient age increases the odds of falling by 1.1943%.
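The same arithmetic can be checked in Stata itself. The sketch below simply retypes the odds ratio reported in this article; the scalar names or_age and pct_change are illustrative choices, not anything Stata creates for you.

```stata
* Odds ratio for age, typed in from the output discussed above
scalar or_age = 1.011943

* Percentage change in the odds for each added year of age
scalar pct_change = (or_age - 1) * 100

display "Change in odds per added year of age: " %6.4f pct_change "%"
* prints: Change in odds per added year of age: 1.1943%
```

If the logistic model is still the most recent estimation in memory, exp(_b[age]) could replace the typed-in value, since Stata stores the age coefficient on the log-odds scale.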
95% CI Reporting
You could conclude that every added year of patient age increases the odds of falling by 1.1943%, p = .004, but it would be even better to add the 95% confidence interval (CI). Stata gives you the CI in its readout. Note that the 95% CI for the OR is [1.003821, 1.02013]. Applying the same conversion to each bound (subtract 1, multiply by 100), you could write that every added year of patient age increases the odds of falling by 1.1943%, p = .004, 95% CI [0.3821%, 2.013%].
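Rather than copying the bounds from the readout by hand, the OR, its 95% CI, and the percentage versions can be rebuilt from the stored coefficient and standard error. The sketch below is one way to do that under stated assumptions: the first block creates hypothetical stand-in data (with an arbitrary seed) only so the example runs on its own, so its numbers will not match those in this article. Skip that block if the dataset and model from the code above are already in memory.

```stata
* --- Hypothetical stand-in data so the sketch runs on its own ---
clear
set seed 20240101                     // arbitrary seed (an assumption)
set obs 400
gen age = round(runiform(7, 98))      // whole-year ages
gen falls = runiform() < 0.6          // 0/1 outcome, unrelated to age by design
logistic falls age

* --- Rebuild the OR and its 95% CI from the stored results ---
scalar z      = invnormal(0.975)              // about 1.96 for a 95% interval
scalar or_age = exp(_b[age])                  // _b[age] is on the log-odds scale
scalar or_lb  = exp(_b[age] - z * _se[age])   // lower bound of the OR
scalar or_ub  = exp(_b[age] + z * _se[age])   // upper bound of the OR

* --- Convert each value to a percentage change in the odds ---
scalar pct    = (or_age - 1) * 100
scalar pct_lb = (or_lb - 1) * 100
scalar pct_ub = (or_ub - 1) * 100

display "Change in odds per year: " %6.4f pct "%"
display "95% CI: [" %6.4f pct_lb "%, " %6.4f pct_ub "%]"
```

The interval is built on the log-odds scale and then exponentiated, which is why the bounds around the OR are not symmetric.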