Table 4 Case illustration (standard 3-tiered approach)

From: Determining sample size for progression criteria for pragmatic pilot RCTs: the hypothesis test strikes back!

The case is a two-arm parallel design (1:1 allocation to intervention and control arms) with three key feasibility objectives: to assess (i) recruitment uptake (percentage of screened patients recruited), (ii) treatment fidelity and (iii) participant retention (follow-up). Hypothesis testing uses a one-sided α = 5% and power = 90%, with the normal approximation method.
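As a rough illustration of the testing set-up described above, the sketch below computes the standard normal quantiles implied by a one-sided α of 5% and power of 90%, and a one-sided p-value for an observed proportion against a null value. The function name `one_sided_p_value` and the example counts are hypothetical, and the test statistic shown is one common continuity-corrected form of the normal approximation; the source does not spell out its exact formula, so this should be read as an assumption rather than the authors' method.

```python
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05  # one-sided significance level, as stated in the case
POWER = 0.90  # target power, as stated in the case

# Standard normal quantiles implied by the design parameters.
z_alpha = NormalDist().inv_cdf(1 - ALPHA)  # about 1.645
z_beta = NormalDist().inv_cdf(POWER)       # about 1.282


def one_sided_p_value(successes: int, n: int, null_prop: float) -> float:
    """P-value for H0: true proportion <= null_prop (normal approximation).

    Assumption: a continuity correction of 0.5 is deducted from the
    observed count before the z statistic is formed.
    """
    adjusted = successes - 0.5
    standard_error = sqrt(n * null_prop * (1 - null_prop))
    z = (adjusted - n * null_prop) / standard_error
    return 1 - NormalDist().cdf(z)


# Hypothetical data: 26 successes out of 34, tested against a null of 50%.
p_value = one_sided_p_value(26, 34, 0.5)
print(f"z_alpha={z_alpha:.3f}, z_beta={z_beta:.3f}, p={p_value:.4f}")
```

A p-value below 0.05 here would count as evidence against the null proportion at the one-sided 5% level.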

Assume the progression criteria (and associated sample size requirements) for each objective are as follows:

(i) Recruitment uptake ≤ 20% (RED zone) and ≥ 35% (GREEN zone) {RUL = 20%, GLL = 35%}

→ Required sample size n = 78 [total screened patients]

(ii) Treatment fidelity ≤ 50% (RED zone) and ≥ 75% (GREEN zone) {RUL = 50%, GLL = 75%}

→ Required sample size n = 34 [intervention arm only]

(iii) Follow-up ≤ 65% (RED zone) and ≥ 85% (GREEN zone) {RUL = 65%, GLL = 85%}

→ Required sample size n = 44 [total randomised participants, 22 per arm]

The sample sizes across criteria (i)–(iii) are at different levels: (i) is at the level of screened patients, whereas (ii)–(iii) are at the level of randomised patients. To meet criterion (i), we need ns ≥ 78, although we expect to screen ns = 200 (i.e. (1/0.35) × nr = 194.3, rounded up to 200, where 0.35 is the expected proportion uptake of the total number screened). For (ii)–(iii), we need nr = 68: criterion (ii) requires 34 in the intervention arm, which under 1:1 allocation gives 34 per arm and exceeds the 44 in total required for (iii).
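The arithmetic behind these combined sample sizes can be sketched as follows. The variable names are illustrative only, and proportions are held as whole percentages to avoid floating-point surprises. The figure computed at the RED-zone uptake of 20% is a worst-case count included for comparison.

```python
from math import ceil

# Per-criterion requirements taken from the case illustration.
N_FIDELITY_INTERVENTION_ARM = 34  # criterion (ii), intervention arm only
N_FOLLOW_UP_TOTAL = 44            # criterion (iii), both arms combined
GLL_UPTAKE_PCT = 35               # expected uptake (GREEN lower limit)
RUL_UPTAKE_PCT = 20               # worst-case uptake (RED upper limit)

# With 1:1 allocation, 34 in the intervention arm means 68 randomised in total;
# the larger of the two randomised-level requirements applies.
n_randomised = max(2 * N_FIDELITY_INTERVENTION_ARM, N_FOLLOW_UP_TOTAL)

# Number screened needed to yield n_randomised at a given uptake.
expected_screened = ceil(n_randomised * 100 / GLL_UPTAKE_PCT)
max_screened = ceil(n_randomised * 100 / RUL_UPTAKE_PCT)

print(n_randomised, expected_screened, max_screened)
```

The expected screening figure comes out at 195, which the case illustration rounds up further to a convenient 200.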

Taking each of the objectives in turn (and the updated sample sizes to meet the multi-criteria objectives), we express progression criteria for the three objectives as follows:

(i) Recruitment uptake [required ns ≥ 78; expected ns = 200; maximum ns = 340 (i.e. (1/0.2) × nr)]

• E ≤ 0.2 [P ≥ 0.05] -> RED (STOP)

• 0.2 < E < 0.35 -> AMBER (AMEND)

• E ≥ 0.35 [P < 0.05] -> GREEN (GO)

Signals for expected ns = 200:

0 to 40 (RED), > 40 to < 70 (AMBER) and 70 to 200 (GREEN) {i.e. 0.2 × 200 = 40; 0.35 × 200 = 70}

(ii) Treatment fidelity [ni = 34 (intervention arm only)]

• E ≤ 0.5 [P ≥ 0.05] -> RED (STOP)

• 0.5 < E < 0.75 -> AMBER (AMEND)

• E ≥ 0.75 [P < 0.05] -> GREEN (GO)

Signals for ni = 34:

0 to 17 (RED), > 17 to < 25.5 (AMBER) and 25.5 to 34 (GREEN) {i.e. 0.5 × 34 = 17; 0.75 × 34 = 25.5}

(iii) Follow up [nr = 68 (intervention and control arms)]

• E ≤ 0.65 [P ≥ 0.05] -> RED (STOP)

• 0.65 < E < 0.85 -> AMBER (AMEND)

• E ≥ 0.85 [P < 0.05] -> GREEN (GO)

Signals for nr = 68:

0 to 44.2 (RED), > 44.2 to < 57.8 (AMBER) and 57.8 to 68 (GREEN) {i.e. 0.65 × 68 = 44.2; 0.85 × 68 = 57.8}

[Note: The continuity correction (a deduction of 0.5) must be applied to the observed count for each criterion before assessing which signal band it falls into]
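The signal bands and the continuity correction can be combined in a small helper. This is a minimal sketch: the function name `signal_band` and the observed counts in the usage lines are hypothetical, and the band limits are computed as n × RUL and n × GLL exactly as in the worked figures above.

```python
def signal_band(observed_count: int, n: int, rul_pct: float, gll_pct: float) -> str:
    """Classify an observed count as RED, AMBER or GREEN.

    rul_pct and gll_pct are the RED upper limit and GREEN lower limit,
    given as percentages of n.
    """
    # Continuity correction: deduct 0.5 from the observed count first.
    adjusted = observed_count - 0.5
    red_upper = n * rul_pct / 100
    green_lower = n * gll_pct / 100
    if adjusted <= red_upper:
        return "RED"
    if adjusted >= green_lower:
        return "GREEN"
    return "AMBER"


# Hypothetical observed counts for the three criteria.
print(signal_band(71, 200, 20, 35))  # recruitment uptake, ns = 200
print(signal_band(26, 34, 50, 75))   # treatment fidelity, ni = 34
print(signal_band(58, 68, 65, 85))   # follow-up, nr = 68
```

Because of the 0.5 deduction, a count that sits exactly on a GREEN limit before correction (for example 70 of 200 screened) falls into AMBER after it.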

In accordance with the multi-criteria aim, the decision to proceed would be based on the worst signal across the three criteria:

➢ If signal = RED for (i) or (ii) or (iii) -> overall signal is RED

➢ Else, if no signal is RED but signal = AMBER for (i) or (ii) or (iii) -> overall signal is AMBER

➢ Else, if signals = GREEN for (i) and (ii) and (iii) -> overall signal is GREEN
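The worst-signal rule above amounts to taking the most severe of the three per-criterion signals. A minimal sketch, with the hypothetical names `SEVERITY` and `overall_signal`:

```python
# Severity ranking: a higher number means a worse signal.
SEVERITY = {"GREEN": 0, "AMBER": 1, "RED": 2}


def overall_signal(signals: list[str]) -> str:
    """Return the worst signal among the per-criterion signals."""
    return max(signals, key=SEVERITY.__getitem__)


# Hypothetical signals for criteria (i), (ii) and (iii).
print(overall_signal(["GREEN", "AMBER", "GREEN"]))  # AMBER
print(overall_signal(["GREEN", "RED", "AMBER"]))    # RED
print(overall_signal(["GREEN", "GREEN", "GREEN"]))  # GREEN
```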

  1. RUL: upper limit of RED zone; GLL: lower limit of GREEN zone; ns: number of screened patients who are eligible to be randomised; nr: number of eligible patients randomised; ni: number of patients randomised to the intervention arm