Survival analysis is a common task for data scientists and data analysts. Many industries and companies use it to estimate the expected time until an event occurs and the probability that it occurs.
Many large companies and industries (banking, insurance, etc.) use SAS, but with the rise of open source and the popularity of languages such as Python and R, these companies are exploring converting their code to Python.
A commonly used procedure for survival analysis in SAS is PROC PHREG. In this article, you’ll learn the Python equivalent of PROC PHREG.
PROC PHREG Equivalent in Python
In SAS, when we want to do survival analysis on continuous variables, we use PROC PHREG. PROC PHREG performs regression analysis of survival data based on the Cox proportional hazards model.
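As a brief reminder of what is being fitted, the Cox proportional hazards model with a single covariate can be written as follows. The symbols here are the standard textbook notation rather than anything specific to SAS or Python: h_0(t) is the unspecified baseline hazard, x is the covariate, and β is the coefficient that gets estimated.

```latex
h(t \mid x) = h_0(t)\, \exp(\beta x),
\qquad
\frac{h(t \mid x + 1)}{h(t \mid x)} = \exp(\beta)
```

The second expression is the hazard ratio for a one-unit increase in x, which is the quantity usually reported alongside the coefficient.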
Let’s say we have data such as the following:
In SAS, if we wanted to fit a Cox model on this data, we could do something like the following:
The output from running the code above is below:
The ODS OUTPUT PARAMETERESTIMATES statement also writes the parameter estimates to an output dataset, shown here:
To get the PROC PHREG equivalent in Python, we will use the CoxPHFitter class from the lifelines package.
Fitting a Cox model with the CoxPHFitter class takes only a few lines.
The code below gives us output equivalent to the SAS output:
import pandas as pd
import numpy as np
from lifelines import CoxPHFitter

cph = CoxPHFitter()
cox1 = cph.fit(example_data, duration_col="time", event_col="event", formula="weight")
cox1.print_summary()

#output:
#<lifelines.CoxPHFitter: fitted with 40 total observations, 9 right-censored observations>
#             duration col = 'time'
#                event col = 'event'
#      baseline estimation = breslow
#   number of observations = 40
#number of events observed = 31
#   partial log-likelihood = -89.50
#         time fit was run = 2020-12-17 00:26:36 UTC
#
#---
#            coef  exp(coef)  se(coef)  coef lower 95%  coef upper 95%  exp(coef) lower 95%  exp(coef) upper 95%
#covariate
#weight      0.21       1.24      0.08            0.05            0.37                 1.05                 1.45
#
#              z     p  -log2(p)
#covariate
#weight     2.62  0.01      6.82
#---
#Concordance = 0.65
#Partial AIC = 181.00
#log-likelihood ratio test = 6.61 on 1 df
#-log2(p) of ll-ratio test = 6.63
If we want to work with the estimates, as we would with the ODS OUTPUT PARAMETERESTIMATES dataset in SAS, we can use the summary DataFrame of the fitted CoxPHFitter:
print(cox1.summary)

#output:
#               coef  exp(coef)  se(coef)  coef lower 95%  coef upper 95%  exp(coef) lower 95%  exp(coef) upper 95%         z         p  -log2(p)
#covariate
#weight     0.211464   1.235485  0.080773        0.053151        0.369777             1.054589             1.447412  2.617988  0.008845  6.820924

# Select the "coef" column by name (as a string) and take its first row
print(cox1.summary["coef"].iloc[0])

#output:
# 0.211464
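To see how the columns of the summary relate to one another, the sketch below recomputes them from just the coefficient and its standard error, using only the Python standard library. It assumes a normal-approximation (Wald) test and interval, which is consistent with the values printed above; the two input numbers are copied from that output.

```python
import math

# Values copied from the summary row for "weight"
coef = 0.211464
se = 0.080773

# Hazard ratio: exp(coef)
hazard_ratio = math.exp(coef)

# Wald z statistic and two-sided p-value from the standard normal distribution
z = coef / se
p_value = math.erfc(abs(z) / math.sqrt(2))

# 95% confidence interval on the coefficient scale, then on the hazard ratio scale
z_crit = 1.959964  # 97.5th percentile of the standard normal
lower = coef - z_crit * se
upper = coef + z_crit * se
hr_lower, hr_upper = math.exp(lower), math.exp(upper)

print(hazard_ratio, z, p_value, -math.log2(p_value))
print(lower, upper, hr_lower, hr_upper)
```

Because the inputs are rounded to six decimals, the results should agree with the summary row only up to small differences in the last digits.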
I hope that this article has been beneficial for you and helped you learn how to get the PROC PHREG equivalent in Python.