STEM_compatibility.Rmd
This vignette describes the use of the convSTEM function which enables backwards compatibility with STEM economic models which use an alternative formulation of parametric survival models. For most users this vignette can be safely ignored. The following packages are required to run this example:
To perform survival analyses, patient level data is required for the survival endpoints.
This example uses a standard simulated data set (adtte). There is no standard naming that is needed for this package however, there are some set variables that are needed:
In this example, we use only progression-free survival (PFS).
# simulate data with a medium correlation between PFS & OS on the patient level
adtte <- sim_adtte(seed = 2020, # for reproducibility
rho = 0.6 # defines a correlation on underlying survival times
)
# subset PFS data and rename
PFS_data <- adtte %>%
filter(PARAMCD=="PFS") %>%
transmute(USUBJID, ARMCD,
AVAL,
event = 1 - CNSR #needs to be coded as 1 for event
)
This section uses runPSM as described in other vignettes to estimate survival models. As described in other vignettes flexsurvPlus primarily uses bootstrapping to handle uncertainty with the bootPSM function in conjunction with boot.
# For speed only 4 models are considered for this example but
# all models could be considered
psm_PFS_expweib <- runPSM(data = PFS_data,
time_var = "AVAL",
event_var ="event",
model.type = c("Common shape",
"Separate"),
distr = c('exp',
'weibull'),
strata_var = "ARMCD",
int_name = "A",
ref_name = "B")
# Apply bootstrapping for illustration 100 samples are taken
# In practice a larger number maybe preferable.
psm_boot_PFS_expweib <- do.call(boot, args = c(psm_PFS_expweib$config, statistic = bootPSM, R = 100))
This is done using convSTEM. Depending on if a runPSM object or a bootPSM object or both are provided different quantities will be estimated.
convSTEM returns a list of four data frames containing parameters converted to match those reported by an existing SAS macro.
stem_param contains the parameter estimates and is calculated if either x or samples is provided.
names(PFS_conv_x$stem_param)
#> [1] "Model" "ModelF" "Dist" "DistF" "Param" "ParamF" "Estimate"
PFS_conv_x$stem_param %>%
transmute(Model, Dist, Param, Estimate)
#> Model Dist Param Estimate
#> 1 Common shape Exponential INTERCEPT 6.1145985
#> 2 Common shape Exponential TX(Intervention) -0.7675910
#> 3 Common shape Weibull INTERCEPT 6.0800704
#> 4 Common shape Weibull TX(Intervention) -0.6623183
#> 5 Common shape Weibull SCALE 0.7204249
#> 6 Separate - Reference Exponential INTERCEPT 6.1145985
#> 7 Separate - Reference Weibull INTERCEPT 6.0798340
#> 8 Separate - Reference Weibull SCALE 0.7171585
#> 9 Separate - Intervention Exponential INTERCEPT 5.3470075
#> 10 Separate - Intervention Weibull INTERCEPT 5.4170457
#> 11 Separate - Intervention Weibull SCALE 0.7224962
PFS_conv_samples$stem_param %>%
transmute(Model, Dist, Param, Estimate)
#> Model Dist Param Estimate
#> 1 Common shape Exponential INTERCEPT 6.1145985
#> 2 Common shape Exponential TX(Intervention) -0.7675910
#> 3 Common shape Weibull INTERCEPT 6.0800704
#> 4 Common shape Weibull TX(Intervention) -0.6623183
#> 5 Common shape Weibull SCALE 0.7204249
#> 6 Separate - Reference Exponential INTERCEPT 6.1145985
#> 7 Separate - Reference Weibull INTERCEPT 6.0798340
#> 8 Separate - Reference Weibull SCALE 0.7171585
#> 9 Separate - Intervention Exponential INTERCEPT 5.3470075
#> 10 Separate - Intervention Weibull INTERCEPT 5.4170457
#> 11 Separate - Intervention Weibull SCALE 0.7224962
stem_cov contains estimates for a covariance matrix for the parameters and is only derived if samples is provided.
names(PFS_conv_x$stem_cov)
#> [1] "Model" "ModelF" "Dist" "DistF" "rowParam" "colParam"
#> [7] "rowParamF" "colParamF" "rowNum" "colNum" "CovEst"
# note is empty here as samples were not provided
nrow(PFS_conv_x$stem_cov)
#> [1] 0
PFS_conv_samples$stem_cov %>%
transmute(Model, Dist, rowParam, colParam, rowNum, colNum, CovEst)
#> # A tibble: 23 × 7
#> Model Dist rowParam colParam rowNum colNum CovEst
#> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 Common shape Exponential INTERCEPT INTERCEPT 1 1 4.54e-3
#> 2 Common shape Exponential INTERCEPT TX(Interven… 1 2 -4.86e-3
#> 3 Common shape Exponential TX(Intervention) INTERCEPT 2 1 -4.86e-3
#> 4 Common shape Exponential TX(Intervention) TX(Interven… 2 2 7.93e-3
#> 5 Common shape Weibull INTERCEPT INTERCEPT 1 1 2.98e-3
#> 6 Common shape Weibull INTERCEPT TX(Interven… 1 2 -3.18e-3
#> 7 Common shape Weibull INTERCEPT SCALE 1 3 8.86e-5
#> 8 Common shape Weibull TX(Intervention) INTERCEPT 2 1 -3.18e-3
#> 9 Common shape Weibull TX(Intervention) TX(Interven… 2 2 5.80e-3
#> 10 Common shape Weibull TX(Intervention) SCALE 2 3 -7.51e-5
#> # … with 13 more rows
#> # ℹ Use `print(n = ...)` to see more rows
The matrix is described by row and column numbers so can be back converted to a matrix if desired.
# select a single model from the data frame
weibull_common_data <- PFS_conv_samples$stem_cov %>%
filter(Model == "Common shape", Dist == "Weibull") %>%
transmute(rowParam, colParam, rowNum, colNum, CovEst)
weibull_common_data
#> # A tibble: 9 × 5
#> rowParam colParam rowNum colNum CovEst
#> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 INTERCEPT INTERCEPT 1 1 0.00298
#> 2 INTERCEPT TX(Intervention) 1 2 -0.00318
#> 3 INTERCEPT SCALE 1 3 0.0000886
#> 4 TX(Intervention) INTERCEPT 2 1 -0.00318
#> 5 TX(Intervention) TX(Intervention) 2 2 0.00580
#> 6 TX(Intervention) SCALE 2 3 -0.0000751
#> 7 SCALE INTERCEPT 3 1 0.0000886
#> 8 SCALE TX(Intervention) 3 2 -0.0000751
#> 9 SCALE SCALE 3 3 0.000750
# can see is ordered by row so can convert to a matrix
weibull_common_matrix <- matrix(data = weibull_common_data$CovEst,
nrow = 3, ncol = 3, byrow = TRUE)
# applying names
rownames(weibull_common_matrix) <- colnames(weibull_common_matrix) <- weibull_common_data$colParam[1:3]
weibull_common_matrix
#> INTERCEPT TX(Intervention) SCALE
#> INTERCEPT 2.976351e-03 -3.184138e-03 8.859245e-05
#> TX(Intervention) -3.184138e-03 5.803708e-03 -7.510364e-05
#> SCALE 8.859245e-05 -7.510364e-05 7.501754e-04
stem_modsum contains a summary of the model fit statistics and is only produced if x is provided.
names(PFS_conv_x$stem_modsum)
#> [1] "Model" "ModelF" "Dist" "DistF" "Status" "AIC" "AIC_SAS"
#> [8] "BIC" "BIC_SAS"
# note is empty here as x was not provided
nrow(PFS_conv_samples$stem_modsum)
#> [1] 0
PFS_conv_x$stem_modsum %>%
transmute(Model, Dist, Status, AIC, AIC_SAS, BIC, BIC_SAS)
#> Model Dist Status AIC AIC_SAS BIC
#> 1 Common shape Exponential Converged 5600.660 1326.3479 5609.089
#> 2 Common shape Weibull Converged 5545.222 1270.9094 5557.866
#> 3 Separate - Intervention Exponential Converged 3023.176 671.5601 3026.697
#> 4 Separate - Intervention Weibull Converged 2990.943 639.3272 2997.986
#> 5 Separate - Reference Exponential Converged 2577.485 654.7878 2581.006
#> 6 Separate - Reference Weibull Converged 2556.271 633.5741 2563.314
#> BIC_SAS
#> 1 1334.7771
#> 2 1283.5533
#> 3 675.0816
#> 4 646.3702
#> 5 658.3092
#> 6 640.6170
stem_boot contains a data frame of converted boot strap estimates and is only derived if samples is provided.
names(PFS_conv_samples$stem_boot)
#> [1] "Model" "ModelF" "Dist" "DistF" "BootSample"
#> [6] "Param" "ParamF" "Estimate"
# note is empty here as samples were not provided
nrow(PFS_conv_x$stem_boot)
#> [1] 0
PFS_conv_samples$stem_boot %>%
transmute(Model, Dist, BootSample, Param, Estimate) %>%
head()
#> Model Dist BootSample Param Estimate
#> 1 Common shape Exponential 1 INTERCEPT 6.1289156
#> 2 Common shape Exponential 1 TX(Intervention) -0.7908096
#> 3 Common shape Exponential 2 INTERCEPT 6.0663108
#> 4 Common shape Exponential 2 TX(Intervention) -0.6928566
#> 5 Common shape Exponential 3 INTERCEPT 6.0120960
#> 6 Common shape Exponential 3 TX(Intervention) -0.6119670
To illustrate the conversions will fit a full set of models to the data.
psm_PFS_all <- runPSM(data=PFS_data,
time_var="AVAL",
event_var="event",
model.type= c("Common shape",
"Independent shape",
"Separate"),
distr = c('exp',
'weibull',
'gompertz',
'lnorm',
'llogis',
'gengamma',
'gamma',
'genf'),
strata_var = "ARMCD",
int_name="A",
ref_name = "B")
# only supplying x
PFS_conv_all <- convSTEM(x = psm_PFS_all)
# To see what the converted parameters are can look at stem_param
PFS_conv_all$stem_param %>%
group_by(Model, Dist) %>%
summarise(Params = paste(Param, collapse = ", ")) %>%
knitr::kable()
Model | Dist | Params |
---|---|---|
Common shape | Exponential | INTERCEPT, TX(Intervention) |
Common shape | Gamma | INTERCEPT, TX(Intervention), SCALE |
Common shape | Generalized Gamma | INTERCEPT, TX(Intervention), SCALE, SHAPE |
Common shape | Gompertz | INTERCEPT, TX(Intervention), SCALE |
Common shape | Log Logistic | INTERCEPT, TX(Intervention), SCALE |
Common shape | Log Normal | INTERCEPT, TX(Intervention), SCALE |
Common shape | Weibull | INTERCEPT, TX(Intervention), SCALE |
Independent shape - Intervention | Exponential | INTERCEPT |
Independent shape - Intervention | Gamma | INTERCEPT, SCALE |
Independent shape - Intervention | Generalized Gamma | INTERCEPT, SCALE, SHAPE |
Independent shape - Intervention | Gompertz | INTERCEPT, SCALE |
Independent shape - Intervention | Log Logistic | INTERCEPT, SCALE |
Independent shape - Intervention | Log Normal | INTERCEPT, SCALE |
Independent shape - Intervention | Weibull | INTERCEPT, SCALE |
Independent shape - Reference | Exponential | INTERCEPT |
Independent shape - Reference | Gamma | INTERCEPT, SCALE |
Independent shape - Reference | Generalized Gamma | INTERCEPT, SCALE, SHAPE |
Independent shape - Reference | Gompertz | INTERCEPT, SCALE |
Independent shape - Reference | Log Logistic | INTERCEPT, SCALE |
Independent shape - Reference | Log Normal | INTERCEPT, SCALE |
Independent shape - Reference | Weibull | INTERCEPT, SCALE |
Separate - Intervention | Exponential | INTERCEPT |
Separate - Intervention | Gamma | INTERCEPT, SCALE |
Separate - Intervention | Generalized Gamma | INTERCEPT, SCALE, SHAPE |
Separate - Intervention | Gompertz | INTERCEPT, SCALE |
Separate - Intervention | Log Logistic | INTERCEPT, SCALE |
Separate - Intervention | Log Normal | INTERCEPT, SCALE |
Separate - Intervention | Weibull | INTERCEPT, SCALE |
Separate - Reference | Exponential | INTERCEPT |
Separate - Reference | Gamma | INTERCEPT, SCALE |
Separate - Reference | Generalized Gamma | INTERCEPT, SCALE, SHAPE |
Separate - Reference | Gompertz | INTERCEPT, SCALE |
Separate - Reference | Log Logistic | INTERCEPT, SCALE |
Separate - Reference | Log Normal | INTERCEPT, SCALE |
Separate - Reference | Weibull | INTERCEPT, SCALE |
Note in general independent shape models were not estimated by STEM macro so are converted for backwards compatibility as if if they were estimated by separate models. This has implications for how AIC and BIC should be interpreted in these cases.
flexsurvPlus uses the parameter rate (\(b\)). For common shape, independent shape and separate models these can be denoted as \(b_R\) for reference and \(b_I\) for intervention.
The SAS STEM macro reports INTERCEPT and TX(INTERVENTION) which are converted as shown below.
flexsurvPlus uses the parameters scale (\(b\)) and shape (\(a\)). For common shape, independent shape and separate models these can be denoted as \(b_R, a_R\) for reference and \(b_I, a_I\) for intervention. For common shape models \(a_R=a_I\).
The SAS STEM macro reports INTERCEPT, TX(INTERVENTION) and SCALE which are converted as shown below.
flexsurvPlus uses the parameters rate (\(b\)) and shape (\(a\)). For common shape, independent shape and separate models these can be denoted as \(b_R, a_R\) for reference and \(b_I, a_I\) for intervention. For common shape models \(a_R=a_I\).
The SAS STEM macro reports INTERCEPT, TX(INTERVENTION) and SCALE which are converted as shown below.
flexsurvPlus uses the parameters scale (\(b\)) and shape (\(a\)). For common shape, independent shape and separate models these can be denoted as \(b_R, a_R\) for reference and \(b_I, a_I\) for intervention. For common shape models \(a_R=a_I\).
The SAS STEM macro reports INTERCEPT, TX(INTERVENTION) and SCALE which are converted as shown below.
flexsurvPlus uses the parameters meanlog (\(\mu\)) and sdlog (\(\sigma\)). For common shape, independent shape and separate models these can be denoted as \(\mu_R, \sigma_R\) for reference and \(\mu_I, \sigma_I\) for intervention. For common shape models \(\sigma_R=\sigma_I\).
The SAS STEM macro reports INTERCEPT, TX(INTERVENTION) and SCALE which are converted as shown below.
flexsurvPlus uses the parameters mu (\(\mu\)), sigma (\(\sigma\)) and \(Q\). For common shape, independent shape and separate models these can be denoted as \(\mu_R, \sigma_R, Q_R\) for reference and \(\mu_I, \sigma_I, Q_I\) for intervention. For common shape models \(\sigma_R=\sigma_I\) and \(Q_R=\)Q_I$.
The SAS STEM macro reports INTERCEPT, TX(INTERVENTION), SCALE and SHAPE which are converted as shown below.
To derive survival in Excel the following formula are used:
lambda = (EXP(-INTERCEPT)^(SHAPE/SCALE))/(SHAPE^2)
and/orlambda = (EXP(-(INTERCEPT+TX(INTERVENTION))^(SHAPE/SCALE))/(SHAPE^2)
gamma = 1/(SHAPE^2)
delta = SHAPE/SCALE
Survival = IF(gamma>0, 1-GAMMADIST(lambda*time^(delta),gamma,1,TRUE), 1-( 1-GAMMADIST(lambda*time^(delta),gamma,1,TRUE)))
flexsurvPlus uses the parameters Shape (\(a\)) and Rate (\(b\)). For common shape, independent shape and separate models these can be denoted as \(a_R, b_R\) for reference and \(a_I, b_I\) for intervention. For common shape models \(a_R=a_I\).
The SAS STEM macro reports INTERCEPT, TX(INTERVENTION) and SCALE which are converted as shown below. This is equivalent to the parameters used for the Generalized Gamma when the constraint that \(\sigma = Q\) or identically that SCALE = SHAPE is applied.
As the existing SAS STEM macro estimated models with log(time) as a response variable while flexsurv and flexsurvPlus estimate models with time as the response variable there are differences in the log-likelihood between the two software packages. These differences do not affect the parameter estimates or ranking of fit but do affect the AIC and BIC reported by a constant.
This constant is the sum of the events log(time). As such can convert by adding this constant to the maximum log-likelihood and by extension double the sum of the events log(time) to AIC or BIC.
# This constant is estimated from the sum of the log of times for patients who have an event
PFS_data %>%
filter(event == 1) %>% # select events only
transmute(log_event_time = log(AVAL)) %>% # difference in log time
summarise(2 * sum(log_event_time))
#> 2 * sum(log_event_time)
#> 1 4274.312
# So for common shape models (using all data) the difference is 4274.312
# Can be seen in results from convSTEM here
modsum <- PFS_conv_all$stem_modsum %>%
transmute(Model, Dist, AIC, AIC_SAS, BIC, BIC_SAS)
# note as the separate and common shape models use different data these are not comparable
modsum %>%
mutate(delta_AIC = AIC - AIC_SAS,
delta_BIC = AIC - AIC_SAS)
#> Model Dist AIC AIC_SAS BIC
#> 1 Common shape Exponential 5600.660 1326.3479 5609.089
#> 2 Common shape Weibull 5545.222 1270.9094 5557.866
#> 3 Common shape Gompertz 5564.003 1289.6905 5576.647
#> 4 Common shape Log Normal 5606.451 1332.1382 5619.094
#> 5 Common shape Log Logistic 5566.512 1292.1995 5579.156
#> 6 Common shape Generalized Gamma 5546.841 1272.5287 5563.699
#> 7 Common shape Gamma 5546.203 1271.8911 5558.847
#> 8 Separate - Intervention Exponential 3023.176 671.5601 3026.697
#> 9 Separate - Intervention Weibull 2990.943 639.3272 2997.986
#> 10 Separate - Intervention Gompertz 3005.220 653.6042 3012.263
#> 11 Separate - Intervention Log Normal 3021.443 669.8278 3028.486
#> 12 Separate - Intervention Log Logistic 3002.733 651.1172 3009.776
#> 13 Separate - Intervention Generalized Gamma 2991.950 640.3347 3002.514
#> 14 Separate - Intervention Gamma 2990.152 638.5363 2997.195
#> 15 Separate - Reference Exponential 2577.485 654.7878 2581.006
#> 16 Separate - Reference Weibull 2556.271 633.5741 2563.314
#> 17 Separate - Reference Gompertz 2560.755 638.0579 2567.798
#> 18 Separate - Reference Log Normal 2582.939 660.2420 2589.982
#> 19 Separate - Reference Log Logistic 2563.469 640.7722 2570.512
#> 20 Separate - Reference Generalized Gamma 2558.162 635.4655 2568.727
#> 21 Separate - Reference Gamma 2557.768 635.0706 2564.810
#> 22 Independent shape Exponential 5600.660 1326.3479 5609.089
#> 23 Independent shape Weibull 5547.214 1272.9013 5564.072
#> 24 Independent shape Gompertz 5565.974 1291.6621 5582.833
#> 25 Independent shape Log Logistic 5566.202 1291.8894 5583.060
#> 26 Independent shape Gamma 5547.919 1273.6069 5564.778
#> 27 Independent shape Log Normal 5604.382 1330.0698 5621.241
#> 28 Independent shape Generalized Gamma 5550.112 1275.8001 5575.400
#> BIC_SAS delta_AIC delta_BIC
#> 1 1334.7771 4274.312 4274.312
#> 2 1283.5533 4274.312 4274.312
#> 3 1302.3343 4274.312 4274.312
#> 4 1344.7821 4274.312 4274.312
#> 5 1304.8433 4274.312 4274.312
#> 6 1289.3872 4274.312 4274.312
#> 7 1284.5349 4274.312 4274.312
#> 8 675.0816 2351.615 2351.615
#> 9 646.3702 2351.615 2351.615
#> 10 660.6471 2351.615 2351.615
#> 11 676.8707 2351.615 2351.615
#> 12 658.1602 2351.615 2351.615
#> 13 650.8991 2351.615 2351.615
#> 14 645.5792 2351.615 2351.615
#> 15 658.3092 1922.697 1922.697
#> 16 640.6170 1922.697 1922.697
#> 17 645.1009 1922.697 1922.697
#> 18 667.2849 1922.697 1922.697
#> 19 647.8151 1922.697 1922.697
#> 20 646.0299 1922.697 1922.697
#> 21 642.1136 1922.697 1922.697
#> 22 1334.7771 4274.312 4274.312
#> 23 1289.7598 4274.312 4274.312
#> 24 1308.5206 4274.312 4274.312
#> 25 1308.7478 4274.312 4274.312
#> 26 1290.4653 4274.312 4274.312
#> 27 1346.9282 4274.312 4274.312
#> 28 1301.0878 4274.312 4274.312