This vignette describes the different functions provided in ‘SynergyLMM’ package to analyze drug combination effects in preclinical in vivo studies.
We will follow the workflow for the analysis of a longitudinal tumor growth experiments for the evaluation of the effect of a drug combination and determine whether synergy (or antagonism) is present.
The following figure represent schematically ‘SynergyLMM’ workflow:
We will start by loading ‘SynergyLMM’ package:
For this tutorial, we will use the example data set that is included in the package:
Let’s have a look to these data:
> head(grwth_data)
#> subject Time Treatment TumorVolume
#> 1 1 0 Control 177.5810
#> 2 1 3 Control 242.5453
#> 3 1 6 Control 423.9744
#> 4 1 9 Control 416.6808
#> 5 1 12 Control 535.8457
#> 6 1 15 Control 891.7920
This is a typical example of long data format, in which each row is one time point per subject. So each subject (mouse) will have data in multiple rows. Any variables that don’t change across time will have the same value in all the rows.
In this case, column subject
indicates the sample ID for
each mouse, column Time
indicates the time points (days)
for each measurement, column Treatment
indicates the
treatment group to which the mouse belongs, and TumorVolume
indicates the tumor volume measured for that mouse at that specific time
point.
It is important to note that the TumorVolume
column
contains the raw measurements of tumor volume, in the sense
that they have not been relativized or transformed.
In this experiment there are 4 groups: the control group, each mono-therapy group, and the combination group. Each group is identified with the following names:
> unique(grwth_data$Treatment)
#> [1] Control DrugA DrugB Combination
#> Levels: Control DrugA DrugB Combination
The output shows the different names for each treatment. We can also
see that this column is stored as a factor
. This is
convenient, but not necessary. There are the four columns that the data
set has to have to be able to perform the analysis. The input data could
have additional columns, but they will be ignored.
Let’s start with the analysis by fitting the model. This is done with
lmmModel()
function:
> lmm_ex <- lmmModel(data = grwth_data, sample_id = "subject", time = "Time",
+ treatment = "Treatment", tumor_vol = "TumorVolume",
+ trt_control = "Control", drug_a = "DrugA",
+ drug_b = "DrugB", combination = "Combination")
This is the most basic use of the function to fit the model. The user needs to specify the names of the columns containing the information for the subject IDs, time points, treatment groups, tumor measurements, and the names used to refer the different groups.
The function fits the model, which we have stored in the
lmm_ex
object, and which we will use in the next steps. A
plot with the tumor growth data for the different groups is also
produced. The bold lines in the plots represent the fitted regression
lines for the fixed effects of each treatment estimated by the linear
mixed-effects model. The x-axis identifies the time since the start of
treatment, being 0 the time of treatment initiation. The y-axis
represent the \(\log\) of the relative
tumor volume (\(RTV\)).
There are several additional arguments in lmmModel()
function:
time_start
: can be used to change the day that should
be considered as the initial time point, which, by default, is the
minimum value in time
column.time_end
: can be used to indicate the last time point
to be included in the analysis, which, by default, is the maximum value
in time
column.min_observations
: can be used to define the minimum
number of observations (data points) that a subject should have to be
considered in the analysis.tum_vol_0
: this arguments allows to control the
behavior regarding measurements in which the tumor measurement is 0, and
therefore the logarithmic transformation is not possible. The user can
choose to to ignore these measurements, or transform them, by adding 1
unit to all measurements before the \(\log\) transformation.We can obtain the values of the model estimates
using the function lmmModel_estimates()
. This function
retrieves the information about the estimated coefficients (tumor growth
rates) for the different experimental groups, as well as the standard
deviation of the random effects (between-subject variance) and residuals
(within-subject variance).
Before performing the synergy analysis, it is important to evaluate the model diagnostics to check the main assumptions of the model: the normality of the random effects and the residuals.
It is important to verify that the model is appropriate for the data. Reliable results from statistical tests require that model assumptions—such as normality and homoscedasticity of random effects and residuals—are met. Therefore, users are recommended to thoroughly evaluate the estimated model, as certain experimental data sets may result in highly imprecise estimates.
To perform the diagnostics of the random effects we can use
ranefDiagnostics()
. This function provides plots and
results from statistical tests to address the normality of the random
effects. The random effects in our model correspond to a random slope
for each subject, which takes into account the longitudinal nature of
the data. Additional plots and tests related to the normality and
homoscedasticity of the residuals for the subjects are also provided,
which can help to identify subjects with extreme behaviors, and which
could potentially being responsible for departure of the normality of
random effects.
#>
#> Title:
#> Shapiro - Wilk Normality Test of random effects
#>
#> Test Results:
#> STATISTIC:
#> W: 0.9672
#> P VALUE:
#> 0.4269
#>
#>
#> Title:
#> D'Agostino Normality Test of random effects
#>
#> Test Results:
#> STATISTIC:
#> Chi2 | Omnibus: 2.2635
#> Z3 | Skewness: 0.6316
#> Z4 | Kurtosis: -1.3655
#> P VALUE:
#> Omnibus Test: 0.3225
#> Skewness Test: 0.5276
#> Kurtosis Test: 0.1721
#>
#>
#> Title:
#> Anderson - Darling Normality Test of random effects
#>
#> Test Results:
#> STATISTIC:
#> A: 0.3783
#> P VALUE:
#> 0.3867
#>
#>
#> Normalized Residuals Levene Homoscedasticity Test by Sample
#> Levene's Test for Homogeneity of Variance (center = median)
#> Df F value Pr(>F)
#> group 31 0.874 0.6633
#> 288
#>
#> Normalized Residuals Fligner-Killeen Homoscedasticity Test by Sample
#>
#> Fligner-Killeen test of homogeneity of variances
#>
#> data: normalized_resid by SampleID
#> Fligner-Killeen:med chi-squared = 28.521, df = 31, p-value = 0.5942
The most important plot is the Q-Q plot at the top left, which shows the normal Q-Q plot of the random effects. The console output shows the results of Shapiro - Wilk, D’Agostino, and Anderson - Darling normality tests for the random effects. The normality of the random effects cannot be rejected if the p-value of the normality tests is not significant, and/or the Q-Q plot shows that the data points approximately lie in the diagonal line. If some points lie significantly outside the line, they could be potential outliers that could be identify based on the residuals or in the Influential Diagnostics.
The other plots address the normalized residuals by subject. The top-right plot shows Q-Q plots of the normalized residuals by sample, the bottom-left boxplot show the distribution of the normalized residuals, to analyze if they are homoscedastic, and the bottom-right plot shows the normalized residuals versus fitted values by subject, to help to identify potential outlier observations, which are identified in the scatter plots labelled with the time point corresponding to the observation.
The console output also shows the results of Levene and Fligner-Killeen homoscedasticity test of the normalized residuals by sample. The homoscedasticity of the residuals among subjects cannot be rejected if the p-value of these tests is not significant.
In this example, it can be observed how the normality of the random effects cannot be rejected, although there are some outlier observations, that may deserve further examination.
To perform the diagnostics of the random effects we can use
residDiagnostics()
function. residDiagnostics
provides several plots as well as statistical test for the examination
of the normality and homoscedasticity of the normalized residuals of the
input model.
One of the assumption of the model fit by lmmModel()
is
that the residuals are normally distributed. For the evaluation of this
assumption, residDiagnostics()
provides Q-Q plots of the
normalized residuals together with statistical assessment of their
normality using Shapiro-Wilk, D’Agostini and Anderson-Darling normality
tests. Additionally, Q-Q plots of the normalized residuals by time point
and treatment group are provided to be able to detect time points or
treatment groups which could be notably different from the others and be
affecting the adequacy of the model.
Scatter plots of the normalized residuals versus fitted values and normalized residuals per time and per treatment are also provided to give information about variability of the residuals and possible outlier observations. These plots are accompanied by Levene and Fligner-Killend homogeneity of variance test results.
Finally, potential outlier observations are also returned.
#>
#> Title:
#> Shapiro - Wilk Normality Test
#>
#> Test Results:
#> STATISTIC:
#> W: 0.9895
#> P VALUE:
#> 0.02154
#>
#> Description:
#> Shapiro - Wilk Normality Test of normalized residuals
#>
#> Title:
#> D'Agostino Normality Test
#>
#> Test Results:
#> STATISTIC:
#> Chi2 | Omnibus: 9.4195
#> Z3 | Skewness: -2.704
#> Z4 | Kurtosis: 1.4519
#> P VALUE:
#> Omnibus Test: 0.009007
#> Skewness Test: 0.006851
#> Kurtosis Test: 0.1465
#>
#> Description:
#> D'Agostino Normality Test of normalized residuals
#>
#> Title:
#> Anderson - Darling Normality Test
#>
#> Test Results:
#> STATISTIC:
#> A: 0.6813
#> P VALUE:
#> 0.07456
#>
#> Description:
#> Anderson - Darling Normality Test of normalized residuals
#>
#> Normalized Residuals Levene Homoscedasticity Test by Time
#> Levene's Test for Homogeneity of Variance (center = median)
#> Df F value Pr(>F)
#> group 9 0.4714 0.8934
#> 310
#>
#> Normalized Residuals Fligner-Killeen Homoscedasticity Test by Time
#>
#> Fligner-Killeen test of homogeneity of variances
#>
#> data: normalized_resid by as.factor(Time)
#> Fligner-Killeen:med chi-squared = 3.8143, df = 9, p-value = 0.9232
#>
#>
#> Normalized Residuals Levene Homoscedasticity Test by Treatment
#> Levene's Test for Homogeneity of Variance (center = median)
#> Df F value Pr(>F)
#> group 3 0.5772 0.6304
#> 316
#>
#> Normalized Residuals Fligner-Killeen Homoscedasticity Test by Treatment
#>
#> Fligner-Killeen test of homogeneity of variances
#>
#> data: normalized_resid by Treatment
#> Fligner-Killeen:med chi-squared = 1.5405, df = 3, p-value = 0.673
#>
#>
#> Outlier observations
#> SampleID Time Treatment TV RTV logRTV TV0
#> 16 2 18 Control 512.1195 2.3886998 0.8707492 214.3926
#> 41 5 3 Control 197.1398 0.7938990 -0.2307990 248.3185
#> 51 6 3 Control 175.4958 0.6732778 -0.3955972 260.6588
#> 65 7 15 Control 357.3550 1.6397816 0.4945630 217.9284
#> 113 12 9 DrugA 514.0043 3.1708889 1.1540120 162.1010
#> 122 13 6 DrugA 179.3938 0.8896325 -0.1169469 201.6493
#> 135 14 15 DrugA 811.5854 5.8214346 1.7615467 139.4133
#> 149 15 27 DrugA 2182.0193 11.1771236 2.4138692 195.2219
#> 182 19 6 DrugB 425.3002 2.4226262 0.8848522 175.5534
#> 221 23 3 DrugB 187.7751 0.7226499 -0.3248305 259.8424
#> 243 25 9 Combination 185.9586 0.6374488 -0.4502813 291.7232
#> 272 28 6 Combination 317.8079 2.1773052 0.7780880 145.9639
#> 284 29 12 Combination 211.5939 0.8481839 -0.1646578 249.4670
#> 293 30 9 Combination 209.9097 0.8877339 -0.1190832 236.4557
#> 301 31 3 Combination 171.9011 0.7141648 -0.3366415 240.7023
#> 305 31 15 Combination 214.8480 0.8925879 -0.1136302 240.7023
#> 314 32 12 Combination 209.8804 0.8805177 -0.1272452 238.3602
#> normalized_resid
#> 16 -2.175456
#> 41 -2.107815
#> 51 -2.796859
#> 65 -2.948355
#> 113 2.013734
#> 122 -2.439570
#> 135 2.306236
#> 149 2.147614
#> 182 2.178362
#> 221 -2.256830
#> 243 -3.153712
#> 272 2.352650
#> 284 -2.467379
#> 293 -2.037821
#> 301 -2.012043
#> 305 -2.670916
#> 314 -2.113725
As we can see, there is some departure from the normality of the residuals, as indicated by the normality tests and the top-left Q-Q plot. Additionally, there are several potential outlier observations according to their normalized residuals.
When diagnostic plots and tests show evident violations of the model assumptions, there are several solutions that may help improving the model:
We will try to improved the model by defining unequal variances for
the errors. We can do this providing an additional argument,
weights
, when fitting the model, which will be passed to
nlme::lme()
function. For example, we can define the model
adding a variance function to represent a different variance per subject
using nlme::varIdent()
function:
> lmm_ex_var <- lmmModel(data = grwth_data, sample_id = "subject", time = "Time",
+ treatment = "Treatment", tumor_vol = "TumorVolume",
+ trt_control = "Control", drug_a = "DrugA",
+ drug_b = "DrugB", combination = "Combination",
+ weights = nlme::varIdent(form = ~1|SampleID))
If we obtain the model estimates from this new model:
> lmmModel_estimates(lmm_ex_var)
#> Control DrugA DrugB Combination sd_ranef sd_resid
#> estimate 0.07888867 0.07496401 0.06308966 0.03535055 0.0386823 0.2148223
We can observe that the values are slightly different from the
previous lmm_ex
model.
You can also try defining a different variance per time using
nlme::varIdent(form = ~1|Time)
, a different variance per
treatment group, using nlme::varIdent(form = ~1|Treatment)
,
or combinations of them using nlme::varComb()
function.
We can now evaluate the diagnostics of the new model. We will use the
option verbose = FALSE
to avoid printing all the results in
the console:
Random Effects
>
> # We can access to individual results of the diagnostics:
> ranefD$Normality
#> $Shapiro.test
#>
#> Title:
#> Shapiro - Wilk Normality Test of random effects
#>
#> Test Results:
#> STATISTIC:
#> W: 0.9671
#> P VALUE:
#> 0.423
#>
#>
#> $DAgostino.test
#>
#> Title:
#> D'Agostino Normality Test of random effects
#>
#> Test Results:
#> STATISTIC:
#> Chi2 | Omnibus: 1.9311
#> Z3 | Skewness: 0.8258
#> Z4 | Kurtosis: -1.1176
#> P VALUE:
#> Omnibus Test: 0.3808
#> Skewness Test: 0.4089
#> Kurtosis Test: 0.2637
#>
#>
#> $Anderson.Darling.test
#>
#> Title:
#> Anderson - Darling Normality Test of random effects
#>
#> Test Results:
#> STATISTIC:
#> A: 0.4127
#> P VALUE:
#> 0.3196
Residuals
> residD$Normality
#> $Shapiro.test
#>
#> Title:
#> Shapiro - Wilk Normality Test
#>
#> Test Results:
#> STATISTIC:
#> W: 0.9945
#> P VALUE:
#> 0.31
#>
#> Description:
#> Shapiro - Wilk Normality Test of normalized residuals
#>
#> $DAgostino.test
#>
#> Title:
#> D'Agostino Normality Test
#>
#> Test Results:
#> STATISTIC:
#> Chi2 | Omnibus: 1.4999
#> Z3 | Skewness: -1.1748
#> Z4 | Kurtosis: -0.346
#> P VALUE:
#> Omnibus Test: 0.4724
#> Skewness Test: 0.2401
#> Kurtosis Test: 0.7293
#>
#> Description:
#> D'Agostino Normality Test of normalized residuals
#>
#> $Anderson.Darling.test
#>
#> Title:
#> Anderson - Darling Normality Test
#>
#> Test Results:
#> STATISTIC:
#> A: 0.3193
#> P VALUE:
#> 0.5327
#>
#> Description:
#> Anderson - Darling Normality Test of normalized residuals
It can be observed there is no evidence of violation of the assumptions in the new model.
Another useful function to evaluate the model is
ObsvsPred()
. ObsvsPred
allows the user to have
a straight forward idea about how the model is fitting the data,
providing plots of the predicted regression lines versus the actual data
points.
#> # Indices of model performance
#>
#> AIC | AICc | BIC | R2 (cond.) | R2 (marg.) | RMSE | Sigma
#> -------------------------------------------------------------------
#> 57.844 | 67.816 | 197.272 | 0.896 | 0.896 | 0.203 | 0.215
This function provides visual and quantitative information about the performance of the model:
A layout of the observed and predicted values of \(\log\)(relative tumor volume) vs Time for each SampleID (i.e. subject). The actual measurements are shown as blue dots, the dashed orange line indicates the regression line for each subject, while the continuous blue line indicates the marginal, treatment-specific, regression line for each treatment group.
Performance metrics of the model calculated using
performance::model_performance()
. The maximum
likelihood-based Akaike’s Information Criterion (AIC), small sample AIC
(AICc), and Bayesian Information Criterion, and the Nakagawa’s r-squared
root mean squared error (RMSE) of the residuals, and the standard
deviation of the residuals (sigma) are provided.
Smaller values of AIC, AICc, BIC, RMSE, and sigma of the residuals indicate a better-fitting model. \(R^2\) values range from 0 to 1. An \(R^2\) value closer to 1 suggests that the model explains a larger proportion of the variance in the data, indicating a better fit.
As we can see in the plots and the metrics, the model fits quite well the data, obtaining a high \(R^2\) value.
Influential diagnostics allow the user to identify highly influential
subjects (animals) in the experiment. These options include detecting
those subjects with a significant influence on the estimation of control
and treatment group coefficients for growth rates, as evaluated using
Cook’s distances using CookDistances()
function, as well as
identifying subjects with a substantial impact on the overall model, as
assessed with log-likelihood displacements using
logLikSubjectDisplacements()
function.
In our data we don’t have subjects that seem to be outliers with a high influence in the model, but we can easily check the results of the influential diagnostics with these two functions:
> CookDistance(lmm_ex_var)
#> [1] "Subjects with Cook's distance greater than: 0.081"
#> 3 6 8 23
#> 0.08973034 0.10756403 0.18177839 0.08188931
It seems that there are 4 subjects with a higher influence in the group coefficients, being subject 8 the one with the most remarkable influence.
We can check the log-likelihood displacements. Since we have defined
a variance structure in the model, we need to specify an additional
argument in the function, var_name
:
> logLikSubjectDisplacements(lmm_ex_var, var_name = "SampleID")
#> [1] "Outliers with Log Likelihood displacement greater than: 0.435"
#> 3 6 8 23
#> 0.5012589 0.6465294 1.5824884 0.4416480
Again, subject 8 seems to be having a higher influence in the model.
In a real data analysis, this results suggest that maybe the researcher should review the data regarding subject 8, to try to identify any reasons for the different behavior of this subject.
After the careful examination of the model diagnostics and being sure
that our model is correctly specified, we can proceed to the synergy
analysis, using lmmSynergy()
function.
lmmSynergy()
allows for the calculation of synergy using
3 different references models: Bliss independence, highest single agent
and response additivity. The calculation of synergy is based on
hypothesis testing on the coefficient estimates from the model fitted by
lmmModel()
.
We will perform the analysis for the Bliss independence model
> bliss <- lmmSynergy(lmm_ex_var, method = "Bliss")
#> Warning in lmmSynergy(lmm_ex_var, method = "Bliss"): p-values below p<1e-18 are
#> approximated to 0. If you used method = 'RA' consider increasing ra_nsim value
#> for more precise p-values.
We can see that some p-values are extremely low, indicating synergy for all time points.
When a variance structure has been specified, it is recommended to
set the argument robust = TRUE
, to use cluster-robust
variance estimation using clubSandwich::vcovCR.lme()
.
> bliss <- lmmSynergy(lmm_ex_var, method = "Bliss", robust = TRUE)
#> Registered S3 method overwritten by 'clubSandwich':
#> method from
#> bread.mlm sandwich
#> Error in big_mat[ind, ind] <- small_mat[[i]] + big_mat[ind, ind]: replacement has length zero
However, as it can be seen, sometimes this can result in some errors,
due to the impossibility to fit the model at the earlier time points
(due to insufficient data). In these cases, an easy solution is to
simply increase the value of the min_time
argument, to
increase the minimum time for which to start calculating the
synergy:
Now we can observe the results, which indicate that the drug combination effect is synergistic through all the time points.
The synergy analysis results can be obtained from the object as:
> bliss$Synergy
#> Model Metric Estimate lwr upr pval Time
#> 1 Bliss CI 0.7668269 0.6754590 0.8705540 4.103824e-05 6
#> 2 Bliss CI 0.6714793 0.5083305 0.8869906 5.041554e-03 9
#> 3 Bliss CI 0.6711078 0.4856233 0.9274384 1.567663e-02 12
#> 4 Bliss CI 0.5701287 0.4040469 0.8044779 1.382189e-03 15
#> 5 Bliss CI 0.5420798 0.3818478 0.7695488 6.142903e-04 18
#> 6 Bliss CI 0.5451376 0.3879895 0.7659356 4.707607e-04 21
#> 7 Bliss CI 0.5229272 0.3662528 0.7466234 3.595666e-04 24
#> 8 Bliss CI 0.4820073 0.3346676 0.6942145 8.827451e-05 27
#> 9 Bliss CI 0.4894694 0.3357710 0.7135230 2.029750e-04 30
#> 10 Bliss SS 4.1015553 2.1415913 6.0615193 4.103824e-05 6
#> 11 Bliss SS 2.8043667 0.8444027 4.7643306 5.041554e-03 9
#> 12 Bliss SS 2.4163583 0.4563943 4.3763223 1.567663e-02 12
#> 13 Bliss SS 3.1983449 1.2383809 5.1583089 1.382189e-03 15
#> 14 Bliss SS 3.4252251 1.4652611 5.3851891 6.142903e-04 18
#> 15 Bliss SS 3.4968646 1.5369006 5.4568286 4.707607e-04 21
#> 16 Bliss SS 3.5681092 1.6081452 5.5280732 3.595666e-04 24
#> 17 Bliss SS 3.9207481 1.9607841 5.8807121 8.827451e-05 27
#> 18 Bliss SS 3.7152848 1.7553208 5.6752488 2.029750e-04 30
This data frame contains the synergy results, indicating the model of synergy (“Bliss”, “HSA” or “RA”), the metric (combination index and synergy score), the value of the metric estimate (with upper and lower confidence interval bounds) and the p-value, for each time.
We can repeat the process for the highest single agent (HSA) model:
As expected the results are similar, indicating synergy, but with smaller p-values compared to the Bliss model.
Finally, we can test the response additivity (RA)
model. The assessment of drug combination effect using this model is
based on simulations. We can choose the number of simulations to perform
with the ra_nsim
parameter. For this example, we will set
it to 1000:
> ra <- lmmSynergy(lmm_ex_var, method = "RA", robust = TRUE, min_time = 6, ra_nsim = 1000)
#> Warning in lmmSynergy(lmm_ex_var, method = "RA", robust = TRUE, min_time = 6, :
#> p-values below p<0.001 are approximated to 0. If you used method = 'RA'
#> consider increasing ra_nsim value for more precise p-values.
Since we performed 1000 simulations, the minimum p-value is 0.001. If we would like to obtain more precise p-values, we should run the analysis increasing the number of simulations.
‘SynergyLMM’ implements statistical power analysis to improve experimental designs.
The post hoc (retrospective) power analysis allows to evaluate the statistical power for the provided data set and fitted model. Traditionally, a threshold of 0.8 (80%) is used as an acceptable level of power (Serdar CC et al., 2021).
‘SynergyLMM’ allows for the post hoc power analysis of the
synergy hypothesis testing for Bliss and HSA reference models for a
given tumor growth data with PostHocPwr()
function.
Since the drug combination effect is time-dependent, the statistical
power vary depending on the time point being analyzed. We can choose
which time point to analyse using the time
argument.
The post hoc power analysis is based on simulations. We can specify
the number of simulations to run using the nsim
argument.
For this example, we will run only 100 simulations:
If we don’t specify any day, the last time point is used by default (in this case, day 30):
We can see that the analysis provides a good statistical power.
Prospective, a priori, statistical power analysis allows to assess how the statistical power of the analysis varies with variations of some parameters such as the sample size, or drug combination effect size, while keeping the other parameters constant. The a priori power analysis can help to implement improved experimental designs for in vivo experiments with minimal number of animals and measurements, yet still having enough statistical power to detect true synergistic effects.
PwrSampleSize()
allows to calculate the a
priori statistical power with different sample sizes. We can modify
any of the parameters, such as the time points of the experiment, the
growth rate of the different groups, or the variance in the model (given
by the standard deviation of the random effects and residuals).
However, usually the researcher will be interested in analyzing the
a priori power for an experiment, to check, for example, which
sample size should be used in order to reach an appropriate statistical
power. We can do this using the lmmModel_estimates()
function, to retrieve the model estimates from the experiment, and then
play with the different parameters:
First, lets extract the information from our experiment:
> # Vector with the time points
> days <- unique(grwth_data$Time)
>
> # Model estimates
> estimates <- lmmModel_estimates(lmm_ex_var)
Now, we can evaluate the a priori power for this experiment, varying the sample size per group. For example, let’s evaluate how the sample size varies from 1 to 10 subjects per group:
> PwrSampleSize(npg = 1:10,
+ time = days,
+ grwrControl = round(estimates$Control,3),
+ grwrA = round(estimates$DrugA,3),
+ grwrB = round(estimates$DrugB, 3),
+ grwrComb = round(estimates$Combination, 3),
+ sd_ranef = round(estimates$sd_ranef, 3),
+ sgma = round(estimates$sd_resid, 3),
+ method = "Bliss")
#> N Power
#> 1 1 0.2479771
#> 2 2 0.4491883
#> 3 3 0.6155323
#> 4 4 0.7414961
#> 5 5 0.8313399
#> 6 6 0.8926587
#> 7 7 0.9331019
#> 8 8 0.9590513
#> 9 9 0.9753250
#> 10 10 0.9853353
The plot on the left represents the hypothetical data that we have
provided, with the regression lines for each treatment group according
to grwrControl
, grwrA
, grwrB
, and
grwrComb
values. The values assigned to
sd_ranef
and sgma
are also shown.
The plot on the left shows the values of the power calculation
depending on the values assigned to npg
. We can observed
that, for this experiment, a sample size of 5 would be enough to reach
0.8 statistical power.
The number of measurement points per subject is another experimental
factor that can influence the experimental design and statistical power.
This depends on the duration of tumor growth follow-up and the frequency
of the measurements during that period. PwrTime()
function
allows for the a priori power calculation of a a hypothetical
two-drugs combination study of synergy depending on the time of
follow-up or the frequency of measurements.
For example, the researcher may be interested in studying the
statistical power of the experiment depending on the time of follow-up,
to decide the appropriate end point. If we set the argument
type = "max"
in PwrTime()
function, we can
evaluate how the power varies at different end times. For this, we need
to provide the function with a list of vectors with the times at which
the tumor measurements are performed. As we are interested in addressing
how the power varies for different study duration, the measurement
should be taken at the same intervals.
Let’s see an example. Using the estimates from the model we have fitted previously, we can explore how the power varies if the study ends in a range from 9 to 30 days, with measurements performed every 3 days.
The first step is to define the list of vectors tumor measurements time points:
> max_time <- list(seq(0,9,3), seq(0,12,3), seq(0,15,3),
+ seq(0,18,3), seq(0,21,3), seq(0,24,3),
+ seq(0,27,3), seq(0,30,3))
Another argument for the function is npg
, which
indicates the sample size per group. As we are interested in calculate
the a priori power for different study duration in our
experiment and model, we will use the mean sample size per group in the
experiment:
> # We can calculate the average sample size dividing the number of subjects
> # by the number of groups, in this case, 4 groups
> (npg <- round(length(unique(grwth_data$subject))/4,0))
#> [1] 8
Now, the a priori power for the Bliss independence model at different end points of the study is:
> PwrTime(npg = npg,
+ time = max_time,
+ type = "max",
+ grwrControl = round(estimates$Control,3),
+ grwrA = round(estimates$DrugA,3),
+ grwrB = round(estimates$DrugB, 3),
+ grwrComb = round(estimates$Combination, 3),
+ sd_ranef = round(estimates$sd_ranef, 3),
+ sgma = round(estimates$sd_resid, 3),
+ method = "Bliss")
#> Time Power
#> 1 9 0.3619997
#> 2 12 0.5820609
#> 3 15 0.7506471
#> 4 18 0.8506838
#> 5 21 0.9043890
#> 6 24 0.9332181
#> 7 27 0.9494116
#> 8 30 0.9590513
We can observe that a study with a follow-up time of 18 days would be enough to achieve a statistical power > 0.8, if all the other variables do not vary.
Now, we may also wonder how frequently the measurements should be taken. Is it necessary to take measurements every 3 days, or can we reduce the measurements to every 6 days?
We can do this by setting the type = "freq"
argument in
PwrTime()
function. In this case, we need to provide a list
of vectors with the tumor measurements with the same maximum time of
follow-up, but changing the intervals at which the measurements have
been taken. Let’s first define this list. We will evaluate how many
measurements are needed in an experiment with a follow-up period of 18
days.
Now, let’s analyze the a priori power:
> PwrTime(npg = npg,
+ time = freq_time,
+ type = "freq",
+ grwrControl = round(estimates$Control,3),
+ grwrA = round(estimates$DrugA,3),
+ grwrB = round(estimates$DrugB, 3),
+ grwrComb = round(estimates$Combination, 3),
+ sd_ranef = round(estimates$sd_ranef, 3),
+ sgma = round(estimates$sd_resid, 3),
+ method = "Bliss")
#> Time Power
#> 1 19 0.9416015
#> 2 7 0.8506838
#> 3 4 0.7512980
#> 4 3 0.6911254
#> 5 2 0.6123858
The plot at the right shows the power for different number of (evenly) distributed measurements. 2 measurements correspond to an experiment in which the tumor measurements were taken just at the initial time point (day 0), and at the final time point (day 18), while 19 measurements correspond to an experiment in which the measurements were taken every single day, including day 0.
The results show that with 4 measurements, which corresponds to take measurements every 6 days, the statistical power is close to 0.8.
A key determinant of statistical power is the effect size of the drug combination effect, given by the magnitude of the estimated coefficient for the growth rate after combination treatment, relative to that observed in the monotherapy and control groups. Treatments with larger effect sizes are easier to detect, even with smaller sample sizes, resulting in higher statistical power. Another critical factor is the variance components of the model, including residual variance (within-subject variance) and random effects variance (between-subject variance). High estimated coefficients for these variances reduce precision, thereby diminishing statistical power.
APrioriPwr()
function allows for total customization of
an hypothetical drug combination study and allows the user to define
several experimental parameters, such as the sample size, time of
measurements, or drug effect, for the power evaluation of synergy for
Bliss and HSA reference models.
Usually, we will define a experiment with the estimated parameters from the input data and fitted model, and then play with different values of the effect size of the drug combination effect, and/or values of residual and random effects variance.
From our model, the estimated parameters are:
> estimates
#> Control DrugA DrugB Combination sd_ranef sd_resid
#> estimate 0.07888867 0.07496401 0.06308966 0.03535055 0.0386823 0.2148223
Then, we can for example, evaluate how the statistical power varies if the drug combination group estimate ranges from -0.03 to 0.06, or if the standard deviation of the random effects ranges from 0.01 to 0.1 and the standard deviation of the residuals ranges from 0.01 to 1, using the Bliss independence model:
> APrioriPwr(npg = npg, # Sample size per group, calculated above
+ time = days, # Time points of measurements, calculated above
+ # Model estimates:
+ grwrControl = round(estimates$Control,3),
+ grwrA = round(estimates$DrugA,3),
+ grwrB = round(estimates$DrugB, 3),
+ grwrComb = round(estimates$Combination, 3),
+ sd_ranef = round(estimates$sd_ranef, 3),
+ sgma = round(estimates$sd_resid, 3),
+ sd_eval = seq(0.01, 0.1, 0.01),
+ sgma_eval = seq(0.01, 1, 0.01),
+ grwrComb_eval = seq(-0.05, 0.1, 0.001)
+ )
#> numDF denDF F-value nc Power
#> 1 1 316 13.77187 13.77187 0.9590513
The red dot and the vertical line in the two plots on the right
indicate the power result corresponding to the values assigned to
sd_ranef
, sgma
, and grwrComb
,
respectively.
The results demonstrate that the power increases as the residual and random effects variances decrease. Furthermore, the plot on the right reveals a U-shaped power profile as a function of the combination group growth rate. This indicates that with lower growth rates (i.e., higher effect size), the power is increased for detecting synergy, but it decreases as growth rates approach the expected additive effect under the Bliss independence model until it reaches its minimum. As growth rates exceed the expected additive effect, the power starts to increase again, reflecting the model’s ability to detect negative deviations from additivity (indicating antagonism).
All these functions constitute an interesting toolbox for the researcher. One may perform a first pilot experiment, and then optimize the study design based on the data from that experiment.