class: center, middle, inverse, title-slide # MULTIVARIJATNE STATISTIČKE METODE ## Predavanje 9: Survival analiza ### dr.sc. Luka Šikić ### Fakultet hrvatskih studija |
Github MV
--- <style type="text/css"> @media print { .has-continuation { display: block !important; } } remark-slide-content { font-size: 22px; padding: 20px 80px 20px 80px; } .remark-code, .remark-inline-code { background: #f0f0f0; } .remark-code { font-size: 16px; } .mid. remark-code { /*Change made here*/ font-size: 60% !important; } .tiny .remark-code { /*Change made here*/ font-size: 40% !important; } /* custom.css */ .left-code { color: #777; width: 38%; height: 92%; float: left; } .right-plot { width: 60%; float: right; padding-left: 1%; } .plot-callout { height: 225px; width: 450px; bottom: 5%; right: 5%; position: absolute; padding: 0px; z-index: 100; } .plot-callout img { width: 100%; border: 4px solid #23373B; } </style> # Pregled predavanja <br> <br> <br> 1. [Karakteristike survival analize](#kar) 2. [Procjena survival krivulja](#proc) 3. [Weibull model](#wei) 4. [Cox model](#cox) --- class: inverse, center, middle name: kar # KARAKTERISTIKE SURVIVAL ANALIZE <html><div style='float:left'></div><hr color='#EB811B' size=1px width=796px></html> (O čemu se tu radi!?) --- # Koncept <br> <br> <img src="../Foto/surv_1.png" width="500px" style="display: block; margin: auto;" /> --- # Primjer 1 <br> <br> <img src="../Foto/surv2.png" width="500px" style="display: block; margin: auto;" /> --- # Primjer 2 <br> <br> <img src="../Foto/surv3.png" width="500px" style="display: block; margin: auto;" /> --- # Primjer 3 <br> <br> <img src="../Foto/surv4.png" width="500px" style="display: block; margin: auto;" /> --- # Podatci <br> <br> ```r #install.packages("TH.data") #install.packages("Ecdat") data(GBSG2, package = "TH.data") data(UnempDur, package = "Ecdat") ``` --- # Zašto posebna vrsta analize? - Periodi (vrijeme) je uvijek **pozitivno** - Različite **mjere** nas zanimaju - **Cenzori** predstavljaju problem <img src="../Foto/surv5.png" width="350px" style="display: block; margin: auto;" /> --- # Napravi **Surv** objekt <img src="../Foto/surv6.png" width="400px" style="display: block; margin: auto;" /> ```r time <- c(5, 6, 2, 4, 4) event <- c(1, 0, 0, 1, 1) library(survival) Surv(time, event) ``` ``` ## [1] 5 6+ 2+ 4 4 ``` --- # Paketi <br> <br> ```r # Za analizu #install.packages("survival") #install.packages("survminer") # Za vizualizacije #library("survminer") ``` .footnote[Postoji i CRAN Task View za Survival analizu](https://cran.r-project.org/web/views/Survival.html)]] --- # Pitanja za survival analizu <br> <br> - Koja je vjerojatnost da će pacijent preživjeti više od XY godina? - Koje je tipično vrijeme čekanja taksija? - Od 100 nezaposlenih, koji broj možemo čekivati da će ponovno naći posao nakon 2 mjeseca? --- # Survival funkcija <img src="../Foto/surv7.png" width="400px" style="display: block; margin: auto;" /> - Vjerojatnost da je trajanje dulje od *t*. - npr. vjerojatnost preživljavanja nakon *t* - npr. vjerojatnost da će taksi trebati više od *t* minuta do dolaska --- # Survival funkcija <img src="../Foto/surv8.png" width="400px" style="display: block; margin: auto;" /> - Medijansko vrijeme je *t*. - npr. medijansko vrijeme preživljavanja je 3.7 godina - npr. medijansko vrijeme do dolaska taksija je 3.7 minuta --- # Survival funkcija <img src="../Foto/surv9.png" width="400px" style="display: block; margin: auto;" /> - 100*S(t) posto duracija je dulje od vremena t - 37% svih pacijenata preživi dulje od 4 godine, a 63% umre unutar prve 4 godine - 37 od 100 taksija treba više od 4 minute do dolaska --- class: inverse, center, middle name: proc # PROCJENA SURVIVAL KRIVULJA <html><div style='float:left'></div><hr color='#EB811B' size=1px width=796px></html> (Načini procjene) --- # Survival krivulja <br> <br> <img src="../Foto/surv10.png" width="600px" style="display: block; margin: auto;" /> --- # Procjena survival krivulje <br> <br> <img src="../Foto/surv11.png" width="600px" style="display: block; margin: auto;" /> --- # Izračun *korak po korak* <br> <br> <img src="../Foto/surv12.png" width="600px" style="display: block; margin: auto;" /> --- # Izračun pomoću funkcije <br> <br> ```r # Kaplan-Meier km <- survfit(Surv(time, event) ~ 1) ggsurvplot(km, conf.int = FALSE, risk.table = "nrisk_cumevents", legend = "none") # [1] 5 6+ 2+ 4 4 ``` --- # Vizualizacija <br> <br> ```r ggsurvplot( fit = km, palette = "blue", linetype = 1, surv.median.line = "hv", risk.table = TRUE, cumevents = TRUE, cumcensor = TRUE, tables.height = 0.1 ) ``` --- class: inverse, center, middle name: wei # WEIBULL MODEL <html><div style='float:left'></div><hr color='#EB811B' size=1px width=796px></html> --- # Zašto koristimo Weibull model <br> <br> <img src="../Foto/surv13.png" width="600px" style="display: block; margin: auto;" /> --- # Primjeni Weibull model u R <br> <br> <br> ```r # Webull model wb <- survreg(Surv(time, event)~1,data) # Kaplan-Meier procjena km <- survfit(Surv(time, event)~1,data) ``` --- # Izračun ```r wb <- survreg(Surv(time, cens) ~ 1, data = GBSG2) ``` ```r # Predviđanje ## 90% pacijenata prežive dulje od: predict(wb, type = "quantile", p = 1 - 0.9, newdata = data.frame(1)) ``` ``` ## 1 ## 384.9947 ``` ```r surv <- seq(.99, .01, by = -.01) t <- predict(wb, type = "quantile", p = 1 - surv, newdata = data.frame(1)) head(data.frame(time = t, surv = surv)) ``` ``` ## time surv ## 1 60.6560 0.99 ## 2 105.0392 0.98 ## 3 145.0723 0.97 ## 4 182.6430 0.96 ## 5 218.5715 0.95 ## 6 253.3125 0.94 ``` --- # Vizualiziraj Weibull model <br> <br> ```r # Weibull wb <- survreg(Surv(time, cens) ~ 1) # Survival krivulja surv <- seq(.99, .01, by = -.01) t <- predict(wb, type = "quantile", p = 1 - surv, newdata = data.frame(1)) surv_wb <- data.frame(time = t, surv = surv, upper = NA, lower = NA, std.err = NA) # Prikaz ggsurvplot_df(fit = surv_wb, surv.geom = geom_line) ``` --- # Vizualiziraj Weibull model <br> <br> <img src="../Foto/surv14.png" width="500px" style="display: block; margin: auto;" /> --- # Dodatne kontrolne varijable <br> <br> <img src="../Foto/surv15.png" width="500px" style="display: block; margin: auto;" /> --- # Provedi model <br> <br> ```r wbmod <- survreg(Surv(time, cens) ~ horTh + tsize, data = GBSG2) coef(wbmod) ``` ``` ## (Intercept) horThyes tsize ## 7.96069769 0.31175602 -0.01218073 ``` --- # Procedura za vizualizaciju ```r # Korak 1 wbmod <- survreg(Surv(time, cens) ~ horTh + tsize, data = GBSG2) # Provedi model ## Odredi kombinacije kontrolnih varijabli newdat <- expand.grid( horTh = levels(GBSG2$horTh), tsize = quantile(GBSG2$tsize, probs = c(0.25, 0.5, 0.75)) ) newdat ``` ``` ## horTh tsize ## 1 no 20 ## 2 yes 20 ## 3 no 25 ## 4 yes 25 ## 5 no 35 ## 6 yes 35 ``` --- # Procedura za vizualizaciju ```r # Korak 2 ## Izračunaj survival krivulje surv <- seq(.99, .01, by = -.01) t <- predict(wbmod, type = "quantile", p = 1 - surv, newdata = newdat) dim(t) ``` ``` ## [1] 6 99 ``` ```r t[, 1:7] ``` ``` ## [,1] [,2] [,3] [,4] [,5] [,6] [,7] ## [1,] 65.86524 112.54061 154.2116 193.0603 230.0268 265.6298 300.1952 ## [2,] 89.96016 153.71037 210.6256 263.6858 314.1755 362.8029 410.0131 ## [3,] 61.97352 105.89102 145.0999 181.6531 216.4354 249.9348 282.4579 ## [4,] 84.64477 144.62823 198.1805 248.1057 295.6121 341.3663 385.7870 ## [5,] 54.86634 93.74733 128.4597 160.8209 191.6144 221.2720 250.0653 ## [6,] 74.93762 128.04211 175.4530 219.6526 261.7110 302.2180 341.5445 ``` --- # Procedura za vizualizaciju <br> <br> ```r # Korak 3 ## Napravi data.frame sa survival krivuljom surv_wbmod_wide <- cbind(newdat, t) # Promijeni oblik library("reshape2") surv_wbmod <- melt(surv_wbmod_wide, id.vars = c("horTh", "tsize"), variable.name = "surv_id", value.name = "time") surv_wbmod$surv <- surv[as.numeric(surv_wbmod$surv_id)] surv_wbmod[, c("upper", "lower", "std.err", "strata")] <- NA ``` --- # Procedura za vizualizaciju ```r # Korak 3 str(surv_wbmod) ``` ``` ## 'data.frame': 594 obs. of 9 variables: ## $ horTh : Factor w/ 2 levels "no","yes": 1 2 1 2 1 2 1 2 1 2 ... ## $ tsize : num 20 20 25 25 35 35 20 20 25 25 ... ## $ surv_id: Factor w/ 99 levels "1","2","3","4",..: 1 1 1 1 1 1 2 2 2 2 ... ## $ time : num 65.9 90 62 84.6 54.9 ... ## $ surv : num 0.99 0.99 0.99 0.99 0.99 0.99 0.98 0.98 0.98 0.98 ... ## $ upper : logi NA NA NA NA NA NA ... ## $ lower : logi NA NA NA NA NA NA ... ## $ std.err: logi NA NA NA NA NA NA ... ## $ strata : logi NA NA NA NA NA NA ... ``` --- # Procedura za vizualizaciju ```r # Korak 4 ## Prikaži grafički ggsurvplot_df(surv_wbmod, surv.geom = geom_line, linetype = "horTh", color = "tsize", legend.title = NULL) ``` <img src="09_SURV_files/figure-html/unnamed-chunk-32-1.png" style="display: block; margin: auto;" /> --- # Postoje i druge distribucije <br> <br> <img src="../Foto/surv16.png" width="600px" style="display: block; margin: auto;" /> --- # Postoje i druge distribucije <br> <br> ```r survreg(Surv(time, cens) ~ horTh, data = GBSG2) survreg(Surv(time, cens) ~ horTh, data = GBSG2, dist = "exponential") survreg(Surv(time, cens) ~ horTh, data = GBSG2, dist = "lognormal") ``` --- class: inverse, center, middle name: cox # Cox MODEL <html><div style='float:left'></div><hr color='#EB811B' size=1px width=796px></html> (Semiprametrijski model sa manje strogim pretpostavkama) --- # Zašto Cox model <br> <br> <img src="../Foto/surv17.png" width="600px" style="display: block; margin: auto;" /> --- # Provedi Cox model ```r # Cox model cxmod <- coxph(Surv(time, cens) ~ horTh, data = GBSG2) coef(cxmod) ``` ``` ## horThyes ## -0.3640099 ``` ```r # Weibull model wbmod <- survreg(Surv(time, cens) ~ horTh, data = GBSG2) coef(wbmod) ``` ``` ## (Intercept) horThyes ## 7.6084486 0.3059506 ``` --- # Vizualizacija Cox modela ```r # Korak 1 cxmod <- coxph(Surv(time, cens) ~ horTh + tsize, data = GBSG2) # Provedi model # Odredi kombinacije regresora newdat <- expand.grid( horTh = levels(GBSG2$horTh), tsize = quantile(GBSG2$tsize, probs = c(0.25, 0.5, 0.75)) ) rownames(newdat) <- letters[1:6] newdat ``` ``` ## horTh tsize ## a no 20 ## b yes 20 ## c no 25 ## d yes 25 ## e no 35 ## f yes 35 ``` --- # Vizualizacija Cox modela ```r # Korak 2 # Izračunaj survival krivulje cxsf <- survfit(cxmod, data = GBSG2, newdata = newdat, conf.type = "none") str(cxsf) ``` ``` ## List of 10 ## $ n : int 686 ## $ time : num [1:574] 8 15 16 17 18 29 42 46 57 63 ... ## $ n.risk : num [1:574] 686 685 684 683 681 680 679 678 677 676 ... ## $ n.event : num [1:574] 0 0 0 0 0 0 0 0 0 0 ... ## $ n.censor: num [1:574] 1 1 1 2 1 1 1 1 1 1 ... ## $ surv : num [1:574, 1:6] 1 1 1 1 1 1 1 1 1 1 ... ## ..- attr(*, "dimnames")=List of 2 ## .. ..$ : NULL ## .. ..$ : chr [1:6] "a" "b" "c" "d" ... ## $ cumhaz : num [1:574, 1:6] 0 0 0 0 0 0 0 0 0 0 ... ## $ std.err : num [1:574, 1:6] 0 0 0 0 0 0 0 0 0 0 ... ## $ logse : logi TRUE ## $ call : language survfit(formula = cxmod, newdata = newdat, conf.type = "none", data = GBSG2) ## - attr(*, "class")= chr [1:2] "survfitcox" "survfit" ``` --- # Vizualizacija Cox modela ```r # Korak 3 surv_cxmod0 <- surv_summary(cxsf) head(surv_cxmod0) ``` ``` ## time n.risk n.event n.censor surv std.err upper lower strata ## 1 8 686 0 1 1 0 NA NA a ## 2 15 685 0 1 1 0 NA NA a ## 3 16 684 0 1 1 0 NA NA a ## 4 17 683 0 2 1 0 NA NA a ## 5 18 681 0 1 1 0 NA NA a ## 6 29 680 0 1 1 0 NA NA a ``` ```r ## Napravi data.frame surv_cxmod <- cbind(surv_cxmod0, newdat[as.character(surv_cxmod0$strata), ]) ``` --- # Vizualizacija Cox modela ```r ggsurvplot_df(surv_cxmod, linetype = "horTh", color = "tsize", legend.title = NULL, censor = FALSE) ``` <img src="09_SURV_files/figure-html/unnamed-chunk-40-1.png" style="display: block; margin: auto;" /> --- class: inverse, center, middle # Hvala na pažnji <html><div style='float:left'></div><hr color='#EB811B' size=1px width=796px></html> (Sljedeće predavanje: Network analiza)