Aim

The aim of this document is to clarify the modelling strategy that will be used to analyze the data accumulated in the ProBio platform study.
Similar analyses have also been implemented in the simulations used to define the operating characteristics of ProBio. I first describe the assumptions behind the chosen parametric model and then use fictitious data to illustrate the strategy.

Modelling strategy: the Weibull parametric model

We will use Bayesian methods for survival analysis. In a Bayesian framework, a parametric distribution is often selected to model a time-to-event variable, in our case progression-free survival (PFS).

The Weibull distribution is widely adopted in biomedical contexts, given its flexibility in describing different shapes and phenomena. A Weibull distribution can be parameterized in terms of a scale parameter (\(\lambda\)) and a shape parameter (\(k\)), in such a way that its density function takes the following form:
\[T \sim \text{Weibull}(\lambda, k)\]
\[f(t;\lambda,k) = \lambda kt^{k-1}\exp(-\lambda t^k)\]

Under the previous parametrization, the mean PFS is defined as \(\lambda^{-\frac{1}{k}}\Gamma(1 + \frac{1}{k})\).

From a Bayesian perspective, inference concerns our belief about the parameters of interest rather than fixed parameter values. This belief (which may be informed by historical data) is represented by a probability distribution.
In particular, we assume a distribution for the scale parameter while fixing the shape parameter at 1.05, based on previous data from the BESENE study. Because the scale parameter is strictly positive, a gamma distribution is typically used for it; with the shape parameter fixed, the gamma distribution is also conjugate to the Weibull model:
\[\lambda \sim \text{Gamma}(a, b)\]
\[f(\lambda;a,b) = \frac{b^a\lambda^{a-1}\exp(-b\lambda)}{\Gamma(a)}\]
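
The conjugacy can be checked with a short derivation. As a sketch, assume non-informative right censoring, and let \(t_i\) be the observed time of patient \(i\), \(\delta_i\) an event indicator (1 for an observed progression, 0 for a censored time), and \(d = \sum_{i=1}^{n} \delta_i\) the number of progressions among \(n\) patients; this notation is introduced here for illustration only. The survival function implied by the Weibull density above is \(S(t) = \exp(-\lambda t^k)\), so with \(k\) fixed the likelihood is
\[L(\lambda) = \prod_{i=1}^{n} f(t_i;\lambda,k)^{\delta_i} S(t_i)^{1-\delta_i} = \prod_{i=1}^{n} (\lambda k t_i^{k-1})^{\delta_i}\exp(-\lambda t_i^k) \propto \lambda^{d}\exp\Big(-\lambda\sum_{i=1}^{n} t_i^k\Big)\]
Multiplying by the gamma prior density gives
\[f(\lambda \mid \text{data}) \propto \lambda^{a+d-1}\exp\Big(-\Big(b + \sum_{i=1}^{n} t_i^k\Big)\lambda\Big)\]
which is again a gamma distribution, \(\text{Gamma}(a + d,\; b + \sum_{i=1}^{n} t_i^k)\).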

We use \(a = 10\) and \(b = 80\) as prior hyperparameters, which correspond approximately to the information from 10 patients with a prior mean (rate) \(E[\lambda] = \frac{a}{b} = 1/8 = 0.125\). Evaluated at this value of \(\lambda\), the mean PFS time is \(E[T] = 0.125^{-1/1.05}\Gamma(1 + \frac{1}{1.05}) = 7.1\).

Let’s compare how the distribution of PFS times changes as the \(b\) hyperparameter increases from 80 to 180:

As the \(b\) hyperparameter increases, the distribution shifts to the right, towards greater PFS times.
Alternatively, we can let the other hyperparameter, \(a\), vary while fixing \(b = 80\):

The behaviour is the opposite: as \(a\) increases, the distribution of PFS times shifts towards smaller values. We can compare the distributions through their respective mean PFS times (mean time in the tables below) in the two settings (note that mean gamma is the mean of the scale parameter of the Weibull distribution).

 a     b    mean gamma    mean time
10    80     0.1250000     7.106618
10   100     0.1000000     8.789380
10   120     0.0833333    10.456081
10   140     0.0714286    12.109545
10   160     0.0625000    13.751758
10   180     0.0555556    15.384200

 a     b    mean gamma    mean time
10    80     0.125         7.106618
12    80     0.150         5.973822
14    80     0.175         5.158144
16    80     0.200         4.542166
18    80     0.225         4.060190
20    80     0.250         3.672550
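
As a minimal sketch of how the mean gamma and mean time columns can be recomputed, the Python snippet below applies the formulas given earlier: \(E[\lambda] = a/b\) and the mean PFS \(\lambda^{-1/k}\Gamma(1 + 1/k)\) evaluated at \(E[\lambda]\) with \(k = 1.05\). The function names `mean_pfs` and `prior_summary` are illustrative and not part of the original analysis.

```python
import math

K = 1.05  # Weibull shape parameter, fixed as in the text


def mean_pfs(lam, k=K):
    """Mean of a Weibull with density lam * k * t**(k - 1) * exp(-lam * t**k)."""
    return lam ** (-1.0 / k) * math.gamma(1.0 + 1.0 / k)


def prior_summary(a, b, k=K):
    """Return (E[lambda], mean PFS evaluated at E[lambda]) for a Gamma(a, b) prior."""
    mean_lam = a / b
    return mean_lam, mean_pfs(mean_lam, k)


# Same (a, b) grids as in the two settings described above.
grid_b = [(10, b) for b in range(80, 181, 20)]  # b varies, a = 10
grid_a = [(a, 80) for a in range(10, 21, 2)]    # a varies, b = 80

for a, b in grid_b + grid_a:
    mean_lam, mean_time = prior_summary(a, b)
    print(f"{a:>3} {b:>4} {mean_lam:10.7f} {mean_time:10.6f}")
```

Note that the mean time is computed at the prior mean of \(\lambda\), which is how the tabulated values are obtained; it is not the mean of the PFS time averaged over the whole gamma prior.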

 

In addition, we can quantify the extent to which two distributions with different hyperparameter values differ from each other. For example, what is the probability that the mean PFS modelled with a Weibull distribution whose \(\lambda\) parameter has a gamma distribution with \(a = 10\) and \(b = 140\) is greater than the mean PFS in a similar distribution but with \(b = 80\)? This probability of superiority can be computed using Monte Carlo simulations, with \(a = 10\) and \(b = 80\) as the reference in every row of the tables below.

 a     b    mean gamma    mean time    prob of superiority
10    80     0.1250000     7.106618    0.459
10   100     0.1000000     8.789380    0.659
10   120     0.0833333    10.456081    0.794
10   140     0.0714286    12.109545    0.863
10   160     0.0625000    13.751758    0.927
10   180     0.0555556    15.384200    0.954

 a     b    mean gamma    mean time    prob of superiority
10    80     0.125         7.106618    0.481
12    80     0.150         5.973822    0.322
14    80     0.175         5.158144    0.196
16    80     0.200         4.542166    0.119
18    80     0.225         4.060190    0.072
20    80     0.250         3.672550    0.028
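
The Monte Carlo computation can be sketched as follows, under stated assumptions: independent draws of \(\lambda\) are taken from the two gamma distributions, each draw is converted to a mean PFS with \(k = 1.05\), and the probability of superiority is estimated as the share of draws in which the first mean exceeds the reference mean. The function name `prob_superiority`, the number of simulations and the seed are illustrative choices; the settings of the original analysis are not given, so the estimates will differ from the tabulated values by Monte Carlo error.

```python
import math
import random

K = 1.05  # Weibull shape parameter, fixed as in the text


def mean_pfs(lam, k=K):
    """Mean PFS for a Weibull with density lam * k * t**(k - 1) * exp(-lam * t**k)."""
    return lam ** (-1.0 / k) * math.gamma(1.0 + 1.0 / k)


def prob_superiority(a_new, b_new, a_ref=10, b_ref=80, n_sims=20000, seed=1):
    """Estimate P(mean PFS under Gamma(a_new, b_new) > mean PFS under Gamma(a_ref, b_ref))."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        # random.gammavariate takes (shape, scale); b is a rate, so scale = 1 / b.
        lam_new = rng.gammavariate(a_new, 1.0 / b_new)
        lam_ref = rng.gammavariate(a_ref, 1.0 / b_ref)
        if mean_pfs(lam_new) > mean_pfs(lam_ref):
            wins += 1
    return wins / n_sims


for b in range(80, 181, 20):
    print(10, b, round(prob_superiority(10, b), 3))
```

Because the mean PFS decreases in \(\lambda\) when \(k\) is fixed, the comparison is equivalent to checking whether the new draw of \(\lambda\) is smaller than the reference draw. When both distributions are identical the probability is 0.5 in expectation, so any departure from 0.5 in that case reflects simulation noise.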

 

Exemplification of a fictitious clinical trial

Let’s use a fictitious data set to illustrate how the hyperparameters change throughout the trial as data accumulate, and how we can compute the quantities that let us decide whether to stop the trial early or continue patient enrollment.

For the sake of clarity, we consider one active treatment compared with a control group. Each group consists of 25 patients, whose PFS times have been recorded over the first 20 months. The PFS times of those men still alive and progression-free at the end of follow-up are marked with a “+” in the table below.

 

Control: 2.55, 6.43, 2.87, 6.68, 11.91, 6.95, 3.08, 7.43, 10.29, 6.34, 7.99, 19.93, 1.15, 20.00+, 7.43, 5.49, 8.69, 0.93, 2.63, 10.88, 16.88, 5.81, 1.42, 3.97, 20.00+

Treatment: 8.49, 20.00+, 3.18, 19.61, 20.00+, 15.35, 17.06, 20.00+, 10.51, 3.89, 20.00+, 8.12, 6.09, 20.00+, 2.66, 8.46, 0.48, 4.81, 5.26, 6.78, 5.62, 1.19, 20.00+, 0.31, 5.23

 

The hyperparameters of the gamma distribution are updated monthly. In particular, the hyperparameter \(a\) is updated from month \(t-1\) to the next month \(t\) with the number of progressions observed during the month (\(d_{(t)}\)): \(a_{(t)} = a_{(t-1)} + d_{(t)}\). Intuitively, the distribution of PFS in the treatment arm shifts towards smaller times as the number of progressions in that arm increases. The other hyperparameter, instead, is updated with the amount of time the patients stayed in the trial during that month (\(\sum_{i = 1}^{n_{(t)}} PT_{i_{(t)}}^{k}\)): \(b_{(t)} = b_{(t-1)} + \sum_{i = 1}^{n_{(t)}} PT_{i_{(t)}}^{k}\).

For example, in the first month there were 1 and 2 progressions in the control and treatment groups, respectively. After the first month, \(a = 10 + 1 = 11\) in the control group, while \(a = 10 + 2 = 12\) in the treatment group. Similarly, the sums of the observed person-times (raised to the power of 1.05) in the first month were 24.93 and 23.76, so that \(b = 80 + 24.93\) and \(b = 80 + 23.76\) in the control and treatment groups, respectively.
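
The first-month figures can be reproduced with the short Python sketch below. It uses a cumulative form of the update, with \(a\) equal to the prior value plus the progressions observed up to a given month and \(b\) equal to the prior value plus \(\sum_i \min(t_i, \text{month})^{k}\). At month 1 this matches the numbers quoted above; that the month-by-month increments described in the text add up to exactly this cumulative form in later months is an assumption made here. The function name `posterior_at` is illustrative.

```python
K = 1.05          # Weibull shape parameter, fixed as in the text
A0, B0 = 10, 80   # prior hyperparameters
FOLLOW_UP = 20.0  # months; every "20.00+" entry in the data is censored here

control = [2.55, 6.43, 2.87, 6.68, 11.91, 6.95, 3.08, 7.43, 10.29, 6.34,
           7.99, 19.93, 1.15, 20.00, 7.43, 5.49, 8.69, 0.93, 2.63, 10.88,
           16.88, 5.81, 1.42, 3.97, 20.00]
treatment = [8.49, 20.00, 3.18, 19.61, 20.00, 15.35, 17.06, 20.00, 10.51, 3.89,
             20.00, 8.12, 6.09, 20.00, 2.66, 8.46, 0.48, 4.81, 5.26, 6.78,
             5.62, 1.19, 20.00, 0.31, 5.23]


def posterior_at(times, month, a0=A0, b0=B0, k=K):
    """Gamma hyperparameters (a, b) using the data observed up to `month`.

    A patient counts as progressed if the recorded time is <= month and
    below FOLLOW_UP (in this data set all entries at 20.00 are censored).
    """
    events = sum(1 for t in times if t <= month and t < FOLLOW_UP)
    # Person-time accrued up to `month`, raised to the power k, per patient.
    exposure = sum(min(t, month) ** k for t in times)
    return a0 + events, b0 + exposure


a_c, b_c = posterior_at(control, 1)
a_t, b_t = posterior_at(treatment, 1)
print(f"control:   a = {a_c}, b = {b_c:.2f}")
print(f"treatment: a = {a_t}, b = {b_t:.2f}")
```

In the first month, 24 control patients contribute \(1^{1.05} = 1\) each and the patient who progressed at 0.93 months contributes \(0.93^{1.05}\), which gives the 24.93 reported in the text.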

Given the hyperparameters, it is possible to assess whether the treatment is superior to control using the quantity described in the previous section, i.e. the probability of superiority. This can be done monthly in the fictitious trial: