Simulation of Patient Enrollment and Survival Data

Yuan Zhong

2024-08-01

Introduction

We present several examples of data generation with the Simulate_Enroll function. Survival times are generated from an exponential distribution with hazard rate lambda, and the occurrence of an event (such as death, progression, or relapse) is modelled by an independent binomial distribution with rate event.
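The generative description above can be sketched in base R. This is only an illustration under stated assumptions, not the internal code of Simulate_Enroll: the seed, the variable names, and the way the two draws are combined into one data frame are hypothetical.

```r
# Minimal sketch of the two independent draws described above.
# Assumption: this is NOT the package's source code; names are illustrative.
set.seed(1)                 # hypothetical seed, only for reproducibility

n      <- 100               # number of patients
lambda <- 0.03              # hazard rate of the exponential distribution
event  <- 0.1               # event rate used in the binomial draw
maxt   <- 5                 # maximum duration of the trial (years)

surv_time <- rexp(n, rate = lambda)              # exponential survival times
had_event <- rbinom(n, size = 1, prob = event)   # independent 0/1 event indicator
time_obs  <- pmin(surv_time, maxt)               # truncate times at the maximum

sketch <- data.frame(Time = time_obs, Censor = had_event)
head(sketch)
```

With a hazard rate this small the mean survival time is 1 / lambda, roughly 33 years, so most simulated times in the sketch are truncated at maxt.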

Patient enrollment is structured across multiple groups. Users must specify the duration of patient accrual, the follow-up period, and the maximum duration of the trial. The number of groups can also be specified, and patients can be distributed across the groups either evenly or unevenly. The scenarios below illustrate how the function is used.

Simulation Examples

Equal numbers of patients enroll

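Consider a clinical trial that enrolls 100 patients over a three-year accrual period, followed by two years of follow-up, for a maximum trial duration of five years. Patients are enrolled in five groups of 20, and each group's enrollment window spans 0.6 years.

Within each group, survival times are simulated with hazard rate lambda = 0.03 and events occur at the rate event = 0.1.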

The enrollment times for patients in each group are drawn from a uniform distribution between the start of that group's enrollment period and the start of the next group's enrollment period. For instance, the enrollment times for the first group are simulated using runif(n = 20, min = 0, max = 0.6). Because the groups start sequentially, the maximum survival time shrinks for each later group: the first group's survival times are capped at 5 years, while the second group's are capped at 4.4 years.
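The enrollment windows and caps described above can be worked out with a few lines of base R. This is a sketch under the assumption that each cap equals the maximum trial duration minus the group's start time, which matches the 5 and 4.4 year values quoted in the text; the seed and the variable names are illustrative and do not come from the package.

```r
# Sketch: per-group enrollment windows and survival-time caps.
# Assumption: cap = maxt - group start time (consistent with 5 and 4.4 years).
n_total <- 100   # total number of patients
n_group <- 5     # number of enrollment groups
accrual <- 3     # accrual period (years)
maxt    <- 5     # maximum trial duration (years)

width     <- accrual / n_group                # length of each window: 0.6
starts    <- (seq_len(n_group) - 1) * width   # 0, 0.6, 1.2, 1.8, 2.4
caps      <- maxt - starts                    # 5, 4.4, 3.8, 3.2, 2.6
per_group <- n_total / n_group                # 20 patients per group

set.seed(2024)   # hypothetical seed
# Draw each group's enrollment times uniformly within its own window.
enroll <- unlist(lapply(starts, function(s) {
  runif(per_group, min = s, max = s + width)
}))

group_id <- rep(seq_len(n_group), each = per_group)
schedule <- data.frame(Group   = group_id,
                       Enroll  = enroll,
                       MaxTime = caps[group_id])
head(schedule)
```

Under this assumption the last group's cap is 2.6 years, which agrees with the truncated Time values that appear for the latest enrollees in the output of the example.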

data <- Simulate_Enroll(n = 100,
                        lambda = 0.03,
                        event = 0.1,
                        M = 1,
                        group = 5,
                        maxt = 5,
                        accrual = 3,
                        censor = 0.9,
                        followup = 2,
                        partition = "Even")

head(data)
##          Time Censor     Enroll
## 1 0.003622395      1 0.38119674
## 2 0.931375698      1 0.02143771
## 3 3.341531660      0 0.16854679
## 4 5.000000000      0 0.45861611
## 5 5.000000000      0 0.08504683
## 6 5.000000000      0 0.49671817
tail(data)
##         Time Censor   Enroll
## 95  2.600000      0 2.911322
## 96  2.600000      0 2.770435
## 97  2.600000      0 2.830726
## 98  2.600000      0 2.650107
## 99  2.545639      0 2.650065
## 100 1.908765      0 2.989230

The first column, Time, gives the survival times, which are truncated at their respective maximum values. The second column, Censor, indicates the occurrence of events, with 1 representing an event and 0 indicating none. The final column, Enroll, records the enrollment time point of each patient. The first two columns are used for survival analysis, while the last column is intended for group sequential interim analysis.
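As a small illustration of how the three columns might be used, the sketch below re-enters the six rows printed by head(data) above and computes a simple exponential hazard estimate (number of events divided by total observed time). The interim time of 0.4 years and the rule of selecting patients by enrollment time are assumptions made for illustration only; they are not prescribed by the function.

```r
# Sketch: using Time/Censor for a simple survival summary and Enroll for
# an interim subset. Rows are copied from the head(data) output above.
first_rows <- data.frame(
  Time   = c(0.003622395, 0.931375698, 3.341531660, 5, 5, 5),
  Censor = c(1, 1, 0, 0, 0, 0),
  Enroll = c(0.38119674, 0.02143771, 0.16854679,
             0.45861611, 0.08504683, 0.49671817)
)

# Exponential maximum-likelihood hazard estimate: events / total time at risk.
n_events   <- sum(first_rows$Censor)
total_time <- sum(first_rows$Time)
hazard_hat <- n_events / total_time

# Hypothetical interim look: keep patients enrolled by calendar time 0.4.
interim_time        <- 0.4
enrolled_by_interim <- first_rows[first_rows$Enroll <= interim_time, ]
nrow(enrolled_by_interim)
```

Six rows are far too few for a meaningful estimate; the point is only to show which columns feed which step.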

Unequal numbers of patients enroll

In many trials, the number of patients enrolled differs from one time period to the next. Here the function partitions the enrollment period into 6 equal time intervals and enrolls a different number of patients in each interval.

data <- Simulate_Enroll(n = c(30,20,20,15,10,5),
                        lambda = 0.05,
                        event = 0.2,
                        M = 1,
                        group = 6,
                        maxt = 4,
                        accrual = 3,
                        censor = 0.9,
                        followup = 1,
                        partition = "Uneven")

head(data)
##        Time Censor    Enroll
## 1 0.5587254      1 0.2196622
## 2 0.8747189      1 0.4217051
## 3 4.0000000      1 0.3404610
## 4 4.0000000      1 0.1403269
## 5 4.0000000      1 0.2692308
## 6 4.0000000      1 0.2867221
tail(data)
##          Time Censor   Enroll
## 95  1.8457387      0 2.417707
## 96  1.5000000      1 2.644800
## 97  1.5000000      0 2.934640
## 98  1.5000000      0 2.894264
## 99  1.5000000      0 2.890860
## 100 0.6652445      0 2.851738

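This example represents a clinical trial that recruits different numbers of patients in successive time intervals over a three-year period: 30, 20, 20, 15, 10, and 5 patients in six intervals of 0.5 years each.

Within each group, survival times are simulated with hazard rate lambda = 0.05 and events occur at the rate event = 0.2, with a maximum trial duration of four years.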

The enrollment time points and maximum survival times in each group are generated by the same process as in the example above.
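The layout implied by the uneven call can be tabulated as follows. This is a sketch that assumes, as in the even case, that the accrual period is split into equal windows and that each group's cap is the maximum trial duration minus its start time; the data frame and its column names are illustrative.

```r
# Sketch: enrollment layout for the uneven example.
# Assumption: equal-width windows and cap = maxt - window start.
sizes   <- c(30, 20, 20, 15, 10, 5)   # patients per group, as in the call
accrual <- 3                          # accrual period (years)
maxt    <- 4                          # maximum trial duration (years)

width  <- accrual / length(sizes)           # 0.5 years per window
starts <- (seq_along(sizes) - 1) * width    # 0, 0.5, 1, 1.5, 2, 2.5

layout <- data.frame(Group    = seq_along(sizes),
                     Patients = sizes,
                     Start    = starts,
                     End      = starts + width,
                     MaxTime  = maxt - starts)
print(layout)
```

The cap of 1.5 years for the sixth group matches the truncated Time values shown by tail(data) for patients enrolled after 2.5 years.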

Generating multiple streams of data

The function can generate multiple sets of survival outcomes from the same parameter settings. The argument M is the number of streams for MCMC. For example, setting M = 4 generates four independent simulation outputs, which are accessed as data[[1]] through data[[4]].

data <- Simulate_Enroll(n = c(30,20,20,15,10,5),
                        lambda = 0.05,
                        event = 0.2,
                        M = 4,
                        group = 6,
                        maxt = 4,
                        accrual = 3,
                        censor = 0.9,
                        followup = 1,
                        partition = "Uneven")
head(data[[1]])
##        Time Censor      Enroll
## 1 0.1795778      1 0.459058912
## 2 0.7359768      1 0.107893938
## 3 1.5836122      1 0.004774509
## 4 2.3611913      1 0.245645842
## 5 3.6206806      1 0.473009109
## 6 3.6635945      1 0.233908861
head(data[[2]])
##        Time Censor     Enroll
## 1 0.5067142      1 0.38304947
## 2 2.7168873      1 0.05129808
## 3 3.1192227      1 0.18227654
## 4 4.0000000      1 0.33962788
## 5 4.0000000      1 0.36616630
## 6 4.0000000      1 0.08973113
head(data[[3]])
##        Time Censor    Enroll
## 1 0.4165583      1 0.1597006
## 2 2.4336370      1 0.0526736
## 3 3.5837114      1 0.3664232
## 4 4.0000000      1 0.1469325
## 5 4.0000000      1 0.4332290
## 6 4.0000000      1 0.3971429
head(data[[4]])
##        Time Censor     Enroll
## 1 0.2572423      1 0.16950544
## 2 3.2765855      1 0.04883483
## 3 3.3919331      1 0.46298125
## 4 4.0000000      1 0.06537026
## 5 4.0000000      1 0.04817034
## 6 4.0000000      1 0.15746224
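When several streams are available, each one can be summarized with the same code. The sketch below does not call Simulate_Enroll; it uses a hypothetical stand-in, toy_stream, that only mimics the Time and Censor columns, so that the list handling can be shown in a self-contained way.

```r
# Sketch: handling a list of M simulated data sets stream by stream.
# Assumption: toy_stream() is a hypothetical stand-in, not the package function.
set.seed(42)   # hypothetical seed

toy_stream <- function(n = 100, lambda = 0.05, event = 0.2, maxt = 4) {
  data.frame(Time   = pmin(rexp(n, rate = lambda), maxt),   # truncated times
             Censor = rbinom(n, size = 1, prob = event))    # 0/1 indicator
}

M       <- 4
streams <- replicate(M, toy_stream(), simplify = FALSE)     # list of length M

# One summary value per stream: the observed share of events.
event_share <- vapply(streams, function(d) mean(d$Censor), numeric(1))
event_share
```

The same vapply pattern should apply to the list indexed as data[[1]] through data[[4]] in the example, provided each element has a Censor column as shown in the printed output.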