Introduction
The Cox proportional hazards model estimates how the risk of an adverse event (the hazard) depends on a set of predictors. For example, death might be the hazard, and the use of a drug and the age of a patient could be predictors. In survival analysis, the goal is often to understand how much each predictor raises or lowers the hazard. In particular, we might be interested in how effective a treatment is at reducing the hazard.
In the Cox model, a hazard ratio below 1 indicates that a predictor is protective (it lowers the hazard), and a hazard ratio above 1 indicates that it is exacerbating (it raises the hazard). Let’s see the model in action.
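As a sketch of what this means in symbols, the model for the two predictors used in this tutorial can be written as below. The notation is assumed here for illustration and does not come from the Stata output: h_0(t) is the baseline hazard, and beta_1 and beta_2 are the coefficients for drug and age.

```latex
h(t \mid \text{drug}, \text{age}) = h_0(t)\,\exp(\beta_1\,\text{drug} + \beta_2\,\text{age})

\mathrm{HR}_{\text{drug}}
  = \frac{h(t \mid \text{drug}=1, \text{age})}{h(t \mid \text{drug}=0, \text{age})}
  = \exp(\beta_1)
```

Because the baseline hazard cancels in the ratio, a hazard ratio below 1 corresponds to a negative coefficient, and a hazard ratio above 1 corresponds to a positive one.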
Load Data
Try the following code to load your data:
use https://www.stata-press.com/data/r17/drugtr
describe
As you can see, these are the results from a drug trial:
Prepare Data
These data are already declared as survival-time data, but enter the following code first to display how they are set up:
st
Here is what you get:
If you have to prepare your own data for survival analysis, you can contact BridgeText for personalized help.
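For reference, declaring a dataset as survival-time data is done with stset. The sketch below assumes the time and event variables are named studytime and died, as in the drugtr data; with your own data, the variable names would differ.

```stata
use https://www.stata-press.com/data/r17/drugtr, clear

* Declare survival-time data: studytime is the analysis time, and
* nonzero, nonmissing values of died mark a failure (death).
* Assumption: variable names match the drugtr example dataset.
stset studytime, failure(died)

* Display a short summary of the declared survival-time data
stdescribe
```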
Run and Interpret the Model
In these data, the hazard (or failure, in Stata's terminology) is dying. The dataset also records follow-up time of 1 to 39 months (studytime), drug use (drug: 0 = no drug taken, 1 = drug taken), and the age of each patient in the trial (age), which ranges from 47 to 67.
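These ranges can be checked directly. The following is a minimal sketch, assuming the dataset contains the variables studytime, died, drug, and age:

```stata
use https://www.stata-press.com/data/r17/drugtr, clear

* Minimum, maximum, and mean of follow-up time, the drug indicator, and age
summarize studytime drug age

* Cross-tabulate deaths by treatment group
tabulate died drug
```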
To run the Cox proportional hazards model, enter the following code:
stcox drug age
Note that drug and age are the two predictors. Because of the structure of the dataset and the use of st, the model already knows to treat death as the hazard / failure event, and the model is also tracking time elapsed. Here are the results of the model:
Thus, the hazard of dying for people who take the drug is roughly 10.49% of the hazard for those who do not, holding age constant. Put another way, taking the drug reduces the hazard of dying by about 89.51%, which is a powerful argument in favor of taking the drug.
Note that the hazard ratio for drug is adjusted for age, because both predictors are in the model. Age could independently affect the hazard of dying, so it needs to be included. The results show that age exacerbates the hazard: each additional year of age increases the hazard of dying by about 12%, holding drug use constant.
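The percentages in this interpretation come straight from the hazard ratios. As a hedged sketch, the arithmetic can be reproduced in Stata after fitting the model; _b[drug] and _b[age] are the stored log-hazard coefficients, so exponentiating them recovers the hazard ratios.

```stata
use https://www.stata-press.com/data/r17/drugtr, clear
stcox drug age

* Hazard ratio for drug, and the implied percentage reduction in hazard
display "HR (drug): " exp(_b[drug])
display "Reduction in hazard (%): " 100 * (1 - exp(_b[drug]))

* Hazard ratio for age, and the implied percentage increase per year
display "HR (age): " exp(_b[age])
display "Increase in hazard per year (%): " 100 * (exp(_b[age]) - 1)
```

To see the coefficients themselves instead of hazard ratios, the nohr option can be added: stcox drug age, nohr.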
Graphical Support
A natural next step in the analysis is to graph the estimated survival function by drug use, which we can do with Kaplan-Meier estimates and the following code:
sts graph, by(drug) ci
On the resulting graph, you can see that people who took the drug survived longer than people who did not take the drug. Adding ci to the code superimposes 95% confidence intervals on each curve, which shows how precisely each survival curve is estimated:
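To see the numbers behind the curves, the Kaplan-Meier estimates can also be listed by group. This is a small sketch that assumes the same drugtr data:

```stata
use https://www.stata-press.com/data/r17/drugtr, clear

* Tabulate the Kaplan-Meier survivor function separately for each drug group
sts list, by(drug)

* Redraw the curves without confidence bands for comparison
sts graph, by(drug)
```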