The R packages needed for this chapter are the survival package and car package. We currently use R 2.0.1 patched version. You may want to make sure that packages on your local machine are up to date. You can perform update in R using update.packages() function.
Table 4.2 on page 119 using data set hmohiv.
hmohiv<-read.table("https://stats.idre.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE) attach(hmohiv) library(survival) drug.coxph
n= 100 coef exp(coef) se(coef) z p drug 0.78 2.18 0.242 3.22 0.0013
exp(coef) exp(-coef) lower .95 upper .95 drug 2.18 0.459 1.36 3.50
Rsquare= 0.097 (max possible= 0.997 ) Likelihood ratio test= 10.2 on 1 df, p=0.00141 Wald test = 10.4 on 1 df, p=0.00130 Score (logrank) test = 10.7 on 1 df, p=0.00105
Table 4.3, Table 4.4, Table 4.5 and Table 4.6 from page 121 to page 125. We will use the recode function to create a categorical version of age and define it to be a factor variable. Function recode comes with package car.
library(car) agecat<-recode(age, "20:29='A'; 30:34='B'; 35:39='C';40:54='D'", as.factor=T) agecat.ph
n= 100 coef exp(coef) se(coef) z p agecatB 1.20 3.31 0.451 2.65 8.0e-03 agecatC 1.31 3.72 0.459 2.86 4.2e-03 agecatD 1.86 6.43 0.469 3.96 7.4e-05
exp(coef) exp(-coef) lower .95 upper .95 agecatB 3.31 0.302 1.37 8.01 agecatC 3.72 0.269 1.51 9.14 agecatD 6.43 0.156 2.56 16.12
Rsquare= 0.178 (max possible= 0.997 ) Likelihood ratio test= 19.6 on 3 df, p=0.000209 Wald test = 16.6 on 3 df, p=0.000875 Score (logrank) test = 18.3 on 3 df, p=0.000389 names(agecat.ph) [1] "coefficients" "var" "loglik" [4] "score" "iter" "linear.predictors" [7] "residuals" "means" "method" [10] "n" "terms" "assign" [13] "wald.test" "y" "formula" [16] "call" agecat.ph$var [,1] [,2] [,3] [1,] 0.2034328 0.1636590 0.1704751 [2,] 0.1636590 0.2105861 0.1666165 [3,] 0.1704751 0.1666165 0.2202500
Table 4.7 on page 127 using deviation coding scheme. We recode variable agecat so that the first group will be “D” for the last group as the reference. Then we specify that we use the deviation coding via contrasts function.
agecat<-recode(age, "20:29='D'; 30:34='B'; 35:39='C';40:54='A'", as.factor=T) contrasts(agecat) <- contr.sum(levels(agecat)) agecat.ph <- coxph( Surv(time, censor)~agecat, method="breslow") summary(agecat.ph)
n= 100 coef exp(coef) se(coef) z p agecat1 0.768 2.15 0.209 3.668 0.00024 agecat2 0.104 1.11 0.192 0.543 0.59000 agecat3 0.221 1.25 0.206 1.072 0.28000
exp(coef) exp(-coef) lower .95 upper .95 agecat1 2.15 0.464 1.430 3.25 agecat2 1.11 0.901 0.762 1.62 agecat3 1.25 0.802 0.833 1.87
Rsquare= 0.178 (max possible= 0.997 ) Likelihood ratio test= 19.6 on 3 df, p=0.000209 Wald test = 16.6 on 3 df, p=0.000875 Score (logrank) test = 18.3 on 3 df, p=0.000389
Table 4.8 on page 129 using age as a continuous predictor.
age.ph <- coxph( Surv(time, censor)~age, method="breslow") summary(age.ph)
n= 100 coef exp(coef) se(coef) z p age 0.0814 1.08 0.0174 4.67 3e-06
exp(coef) exp(-coef) lower .95 upper .95 age 1.08 0.922 1.05 1.12
Rsquare= 0.192 (max possible= 0.997 ) Likelihood ratio test= 21.4 on 1 df, p=3.82e-06 Wald test = 21.8 on 1 df, p=3.03e-06 Score (logrank) test = 22 on 1 df, p=2.72e-06
Table 4.9 on page 133 using data set uis. We will have to create a variable drug.
rm(list=ls()) #cleaning up uis<-read.table("https://stats.idre.ucla.edu/stat/r/examples/asa/uis.csv", sep=",", header = TRUE) attach(uis) drug<-recode(ivhx, "1=0; 2:3=1") table(drug) drug 0 1 233 377 crude.ph
n=610 (18 observations deleted due to missing) coef exp(coef) se(coef) z p drug 0.326 1.39 0.0946 3.44 0.00057 exp(coef) exp(-coef) lower .95 upper .95 drug 1.39 0.722 1.15 1.67 Rsquare= 0.02 (max possible= 1 ) Likelihood ratio test= 12.2 on 1 df, p=0.000476 Wald test = 11.9 on 1 df, p=0.000571 Score (logrank) test = 12.0 on 1 df, p=0.00054
adjust.ph <- coxph( Surv(time, censor)~drug+age, method="breslow") summary(adjust.ph)
n=605 (23 observations deleted due to missing) coef exp(coef) se(coef) z p drug 0.4394 1.552 0.10072 4.36 1.3e-05 age -0.0264 0.974 0.00784 -3.37 7.7e-04 exp(coef) exp(-coef) lower .95 upper .95 drug 1.552 0.644 1.27 1.89 age 0.974 1.027 0.96 0.99 Rsquare= 0.038 (max possible= 1 ) Likelihood ratio test= 23.3 on 2 df, p=8.65e-06 Wald test = 23.1 on 2 df, p=9.72e-06 Score (logrank) test = 23.2 on 2 df, p=9.21e-06
Table 4.10 on page 135. We first have to make sure that all the models are run on the same observations. To this end, we use function complete.cases to subset the data set.
detach(uis) touse<-uis[complete.cases(time, censor, treat, age), ] attach(touse) crude.ph n= 623 coef exp(coef) se(coef) z p treat -0.220 0.803 0.0893 -2.46 0.014 exp(coef) exp(-coef) lower .95 upper .95 treat 0.803 1.25 0.674 0.956 Rsquare= 0.01 (max possible= 1 ) Likelihood ratio test= 6.05 on 1 df, p=0.0139 Wald test = 6.05 on 1 df, p=0.0139 Score (logrank) test = 6.07 on 1 df, p=0.0138 adjust.ph n= 623 coef exp(coef) se(coef) z p treat -0.2230 0.800 0.08933 -2.50 0.013 age -0.0133 0.987 0.00721 -1.84 0.066 exp(coef) exp(-coef) lower .95 upper .95 treat 0.800 1.25 0.672 0.953 age 0.987 1.01 0.973 1.001 Rsquare= 0.015 (max possible= 1 ) Likelihood ratio test= 9.48 on 2 df, p=0.00876 Wald test = 9.42 on 2 df, p=0.00903 Score (logrank) test = 9.44 on 2 df, p=0.00892 inter.ph n= 623 coef exp(coef) se(coef) z p treat 0.52272 1.687 0.4745 1.102 0.27 age -0.00177 0.998 0.0101 -0.175 0.86 treat:age -0.02319 0.977 0.0145 -1.600 0.11 exp(coef) exp(-coef) lower .95 upper .95 treat 1.687 0.593 0.665 4.27 age 0.998 1.002 0.979 1.02 treat:age 0.977 1.023 0.950 1.01 Rsquare= 0.019 (max possible= 1 ) Likelihood ratio test= 12.0 on 3 df, p=0.00724 Wald test = 11.2 on 3 df, p=0.0106 Score (logrank) test = 11.3 on 3 df, p=0.0101
Table 4.11 on page 136 based on the model with interaction of treat and age from previous example. One way of producing Table 4.11 is to simply center age and rerun the model as shown below.
inter.ph <- coxph( Surv(time, censor)~treat+I(age-25)+treat:I(age-25), method="breslow") summary(inter.ph)
n= 623 coef exp(coef) se(coef) z p treat -0.05714 0.944 0.1369 -0.417 0.68 I(age - 25) -0.00177 0.998 0.0101 -0.175 0.86 treat:I(age - 25) -0.02319 0.977 0.0145 -1.600 0.11
exp(coef) exp(-coef) lower .95 upper .95 treat 0.944 1.06 0.722 1.24 I(age - 25) 0.998 1.00 0.979 1.02 treat:I(age - 25) 0.977 1.02 0.950 1.01
Rsquare= 0.019 (max possible= 1 ) Likelihood ratio test= 12.0 on 3 df, p=0.00724 Wald test = 11.2 on 3 df, p=0.0106 Score (logrank) test = 11.3 on 3 df, p=0.0101
inter.ph <- coxph( Surv(time, censor)~treat+I(age-30)+treat:I(age-30), method="breslow") summary(inter.ph)
n= 623 coef exp(coef) se(coef) z p treat -0.17311 0.841 0.0949 -1.825 0.068 I(age - 30) -0.00177 0.998 0.0101 -0.175 0.860 treat:I(age - 30) -0.02319 0.977 0.0145 -1.600 0.110
exp(coef) exp(-coef) lower .95 upper .95 treat 0.841 1.19 0.698 1.01 I(age - 30) 0.998 1.00 0.979 1.02 treat:I(age - 30) 0.977 1.02 0.950 1.01
Rsquare= 0.019 (max possible= 1 ) Likelihood ratio test= 12.0 on 3 df, p=0.00724 Wald test = 11.2 on 3 df, p=0.0106 Score (logrank) test = 11.3 on 3 df, p=0.0101
inter.ph <- coxph( Surv(time, censor)~treat+I(age-35)+treat:I(age-35), method="breslow") summary(inter.ph)
n= 623 coef exp(coef) se(coef) z p treat -0.28908 0.749 0.0989 -2.924 0.0035 I(age - 35) -0.00177 0.998 0.0101 -0.175 0.8600 treat:I(age - 35) -0.02319 0.977 0.0145 -1.600 0.1100
exp(coef) exp(-coef) lower .95 upper .95 treat 0.749 1.34 0.617 0.909 I(age - 35) 0.998 1.00 0.979 1.018 treat:I(age - 35) 0.977 1.02 0.950 1.005
Rsquare= 0.019 (max possible= 1 ) Likelihood ratio test= 12.0 on 3 df, p=0.00724 Wald test = 11.2 on 3 df, p=0.0106 Score (logrank) test = 11.3 on 3 df, p=0.0101
inter.ph <- coxph( Surv(time, censor)~treat+I(age-40)+treat:I(age-40), method="breslow") summary(inter.ph)
n= 623 coef exp(coef) se(coef) z p treat -0.40505 0.667 0.1451 -2.791 0.0053 I(age - 40) -0.00177 0.998 0.0101 -0.175 0.8600 treat:I(age - 40) -0.02319 0.977 0.0145 -1.600 0.1100
exp(coef) exp(-coef) lower .95 upper .95 treat 0.667 1.50 0.502 0.886 I(age - 40) 0.998 1.00 0.979 1.018 treat:I(age - 40) 0.977 1.02 0.950 1.005
Rsquare= 0.019 (max possible= 1 ) Likelihood ratio test= 12.0 on 3 df, p=0.00724 Wald test = 11.2 on 3 df, p=0.0106 Score (logrank) test = 11.3 on 3 df, p=0.0101
Figure 4.2 on page 139 using data set hmohiv. After running the model, we create a new small data set for prediction purpose.
rm(list=ls()) #cleaning up library(survival) hmohiv<-read.table("https://stats.idre.ucla.edu/stat/r/examples/asa/hmohiv.csv", sep=",", header = TRUE) attach(hmohiv) fig4_2.ph
Figure 4.3 on page 141.
Table 4.12 on page 143 with centered age variable.
table4_12.ph <- coxph( Surv(time, censor)~drug+I(age-35), method="breslow") summary(table4_12.ph)
n= 100 coef exp(coef) se(coef) z p drug 0.9414 2.56 0.2555 3.68 2.3e-04 I(age - 35) 0.0915 1.10 0.0185 4.95 7.4e-07
exp(coef) exp(-coef) lower .95 upper .95 drug 2.56 0.390 1.55 4.23 I(age - 35) 1.10 0.913 1.06 1.14
Rsquare= 0.295 (max possible= 0.997 ) Likelihood ratio test= 35 on 2 df, p=2.53e-08 Wald test = 32.5 on 2 df, p=8.76e-08 Score (logrank) test = 34.3 on 2 df, p=3.56e-08
Figure 4.4 on page 144 based on the model in the previous example.
Figure 4.5 on page 146 based on the age-adjusted models at four different age points. We will skip model (b) and (c) since the R code is the same for all of the four panels.
Panel (a) Variable age is centered around 30.
drug.new<-data.frame(drug=c(0,1), age=c(30,30)) age30.ph <- coxph( Surv(time, censor)~drug+I(age-30), method="breslow") plot(survfit(age30.ph, newdata=drug.new),xlab="Survival Time (Months)", ylab="Survival Probability")
points(survfit(age30.ph, newdata=drug.new),pch=c(1,2)) legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))
Panel (d) Variable age is centered around 45.
drug.new<-data.frame(drug=c(0,1), age=c(45,45)) age45.ph <- coxph( Surv(time, censor)~drug+I(age-45), method="breslow") plot(survfit(age45.ph, newdata=drug.new),xlab="Survival Time (Months)", ylab="Survival Probability")
points(survfit(age45.ph, newdata=drug.new),pch=c(1,2)) legend(40, .8, c("Drug Absent", "Drug Present"), pch=c(1,2))
Table 4.13 on page 148 using data set uis.
rm(list=ls()) #cleaning up library(survival) uis<-read.table("https://stats.idre.ucla.edu/stat/r/examples/asa/uis.csv", sep=",", header = TRUE) attach(uis) drug<-(ivhx==1) agec<-age-30 ndrugtxc<-ndrugtx-3
full.model <- coxph( Surv(time, censor)~treat+agec+drug+ndrugtxc, method="breslow") summary(full.model)
n=593 (35 observations deleted due to missing) coef exp(coef) se(coef) z p treat -0.2271 0.797 0.09158 -2.48 0.01300 agec -0.0307 0.970 0.00794 -3.87 0.00011 drugTRUE -0.3426 0.710 0.10426 -3.29 0.00100 ndrugtxc 0.0309 1.031 0.00799 3.87 0.00011
exp(coef) exp(-coef) lower .95 upper .95 treat 0.797 1.25 0.666 0.954 agec 0.970 1.03 0.955 0.985 drugTRUE 0.710 1.41 0.579 0.871 ndrugtxc 1.031 0.97 1.015 1.048
Rsquare= 0.067 (max possible= 1 ) Likelihood ratio test= 41.1 on 4 df, p=2.57e-08 Wald test = 43.2 on 4 df, p=9.26e-09 Score (logrank) test = 43.4 on 4 df, p=8.38e-09
Figure 4.6 on page 148 based on the model in the model above.
fit <- survfit(full.model, conf.type="none") fit$surv2 <- fit$surv^exp(-0.2271) plot( fit$time, fit$surv, xlab="Time to Drug Use From Admission (Days)", ylab="Survival Probability", ylim=c(0, 1.0), xlim=c(0, 1000), type="s") points(fit$time, fit$surv2, type="s")