Empirical progression criteria thresholds for feasibility outcomes in HIV clinical trials: a methodological study
Pilot and Feasibility Studies volume 9, Article number: 96 (2023)
Pilot and feasibility trials use predetermined thresholds for feasibility outcomes to decide if a larger trial is feasible. These thresholds may be derived from the literature, observational data, or clinical experience. The aim of this study was to determine empirical estimates for feasibility outcomes to inform future HIV pilot randomized trials.
We conducted a methodological study of HIV clinical trials indexed in the past 5 years (2017–2021) in the PubMed database. We included trials of people living with HIV individually randomized to any type of intervention and excluded pilot trials and cluster randomized trials. Screening and data extraction were conducted in duplicate. We computed estimates for recruitment, randomization, non-compliance, lost to follow-up, discontinuation, and the proportion analyzed using a random effects meta-analysis of proportions and reported these estimates according to the following subgroups: use of medication, intervention type, trial design, income level, WHO region, participant type, comorbidities, and source of funding. We report estimates with 95% confidence intervals.
We identified 2122 studies in our search, of which 701 full texts were deemed relevant, but only 394 met our inclusion criteria. We found the following estimates: recruitment (64.1%; 95% CI 57.7 to 70.3; 156 trials); randomization (97.1%; 95% CI 95.8 to 98.3; 187 trials); non-compliance (3.8%; 95% CI 2.8 to 4.9; 216 trials); lost to follow-up (5.8%; 95% CI 4.9 to 6.8; 251 trials); discontinuation (6.5%; 95% CI 5.5 to 7.5; 215 trials); analyzed (94.2%; 95% CI 92.9 to 95.3; 367 trials). There were differences in estimates across most subgroups.
These estimates may be used to inform the design of HIV pilot randomized trials with careful consideration of variations due to some of the subgroups investigated.
Key messages regarding feasibility
1) What uncertainties existed regarding the feasibility?
2) What are the key feasibility findings?
3) What are the implications of the feasibility findings for the design of the main study?
Preliminary studies are commonly used to inform the design of clinical trials. In the past decade, there has been an increasing emphasis on the importance of conducting preliminary trials prior to a definitive large-scale trial in order to increase efficiency and reduce research waste. These studies are often called “pilot studies” or “feasibility studies” and have been found to be very effective in reducing research waste such as over-spending. Although the terms “pilot studies” and “feasibility studies” are often used interchangeably, there are some key differences. A study or trial can be labeled as “pilot” when it is a small-scale study conducted prior to the large-scale study, mimicking the design of the main study, and designed to test and refine a protocol (e.g., ensure recruitment protocols are efficient, provide training and experience in running randomization, treatments, and follow-up assessments). In contrast, feasibility studies are designed to evaluate whether a larger scale study could be performed and are used to estimate important parameters required to design the main study (e.g., willingness of participants to be randomized, number of people eligible, response rates, and follow-up rates) [3, 4].
Pilot and feasibility studies may use progression criteria to determine if a larger study is feasible. Progression criteria are one or more feasibility outcomes that must meet a pre-defined threshold for feasibility to be declared. They inform the decision to move forward to a larger trial, make modifications to the larger trial, or abandon it altogether. For example, investigators could determine that a larger trial is feasible if they are able to recruit 75% of the people that are approached. Progression criteria are insufficiently used in pilot studies [6, 7], despite the requirement to declare progression criteria in the CONSORT extension for pilot randomized controlled trials (RCTs). This creates challenges with how pilot studies are interpreted and how decisions are made with regard to a larger trial.
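As an illustration of how pre-defined thresholds translate into a decision, the following minimal Python sketch checks observed pilot results against a set of progression criteria. The 75% recruitment threshold comes from the example above; the 10% lost to follow-up threshold, the observed counts, and all names (`Criterion`, `evaluate`) are hypothetical and serve only to show the mechanics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    threshold: float  # proportion between 0 and 1
    at_least: bool    # True: observed must be >= threshold; False: <= threshold


def evaluate(criteria, observed):
    """Return, for each criterion, whether the observed proportion meets it."""
    results = {}
    for c in criteria:
        value = observed[c.name]
        results[c.name] = value >= c.threshold if c.at_least else value <= c.threshold
    return results


# Hypothetical criteria: only the 75% recruitment figure appears in the text.
criteria = [
    Criterion("recruitment", 0.75, at_least=True),
    Criterion("lost_to_follow_up", 0.10, at_least=False),
]

# Hypothetical pilot results: 60 of 90 approached were recruited, 4 of 60 were lost.
observed = {"recruitment": 60 / 90, "lost_to_follow_up": 4 / 60}

results = evaluate(criteria, observed)
feasible = all(results.values())  # every criterion must be met
```

In this hypothetical pilot the recruitment criterion is not met, so the investigators would need to consider modifying or abandoning the larger trial rather than proceeding unchanged.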
Pilot studies are particularly useful in HIV research due to the numerous challenges with recruiting and retaining participants, who may be experiencing social stigma and discrimination. Moreover, people living with HIV (PLWH) may belong to other minority groups associated with discrimination (e.g., Black people, people who inject drugs [PWID] and men who have sex with men [MSM]). Considering the over-representation of intersectional discrimination in HIV studies, pilot studies would provide an invaluable service in determining potential recruitment challenges in these specific population groups. In a sample of 248 pilot studies in HIV research, the authors noted that pilot studies are increasingly being used. However, several design, analysis, and reporting issues exist, including limited use of progression criteria and lacking justifications for trial sample sizes.
Researchers may face challenges in defining feasibility outcomes and developing progression criteria due to the lack of empirical data on credible and reasonable thresholds for frequently used outcomes such as recruitment, compliance, and dropouts. There is limited guidance on how to set these thresholds. In a methodological study, only 28% of publications provided a rationale for their progression criteria. Existing guidance cites the use of prevalence or incidence rates and pre-existing observational data for recruitment rates. However, observational data may not always be available and, even if they are, they may not necessarily reflect estimates that would be true for a randomized trial. A potential solution to this issue is to summarize the estimates from completed full scale trials.
The purpose of this study is to inform the design of HIV clinical trials by providing credible evidence-based estimates to use in determining progression criteria thresholds when planning feasibility outcomes in HIV randomized clinical trials.
We conducted a methodological study of HIV clinical trials indexed in the past 5 years (2017–2021) in the PubMed database using the following search strategy (LM):
((((randomized controlled trial [pt]) OR (controlled clinical trial [pt]) OR (randomized [tiab]) OR (placebo [tiab]) OR (clinical trials as topic [mesh: noexp]) OR (randomly [tiab]) OR (trial [ti])) NOT (animals [mh] NOT humans [mh])) AND ((HIV) OR (human-immunodeficiency-virus) OR (human immunodeficiency virus)) NOT ((pilot [ti]) OR (feasibility [ti]) OR (protocol [ti])))
The results of our search were collected in EndNote reference manager. Reviewers working independently screened all the titles and abstracts for eligibility (LC, EA, MSU, ACJE, MCG, LS, TAJ, NR). To be eligible, a trial must include only people living with HIV individually randomized to any type of intervention. We excluded pilot or feasibility RCTs, trials with cluster randomization, trials in which participants were enrolled as couples (dyads) and trials published only as abstracts.
Full text articles were retrieved for potentially eligible articles and screened in duplicate. Data were extracted by one reviewer and verified by a second independent reviewer for quality control (LC, EA, MSU, ACJE, MCG, LS, TAJ, NR). The following data were extracted: basic bibliometric information (author name, author contact information, year of publication, and journal), country of origin, country’s income level (based on the World Bank Classification as high, upper middle, lower middle and low), World Health Organization (WHO) region (Africa, Americas, Eastern Mediterranean, Europe, South East Asia, Western Pacific), source of funding (industry, non-industry), trial duration in months, trial design (crossover, multi arm, factorial), follow-up duration, number of trial sites, use of medication (pharmacological versus non pharmacological), intervention type (Educational, Mobile health, Counselling, Electronic, Change in healthcare delivery, Incentives, Peer support, Psychotherapy, Outreach), population type known to be at higher risk of HIV infection and morbidity (Black people, MSM, women, youth, PWID, people in prisons, transgender people and children) [11, 12], and comorbidities (tuberculosis, mental health, substance use, cancer). The following metrics were extracted from the CONSORT flow diagram, tables, or manuscript text: number of participants who were assessed for eligibility, recruited, randomized, who did not receive the intervention as planned, lost to follow-up, who discontinued intervention, and the number analyzed. Data extraction was conducted using DistillerSR (Evidence Partners, Ottawa, Canada).
We computed the following metrics as percentages:
Recruitment: number enrolled divided by the number approached or assessed
Randomization: number randomized divided by the number enrolled
Non-compliance: number who did not receive the intervention as planned divided by the number randomized
Lost to follow-up: number lost to follow-up divided by the number randomized
Discontinuation: number who discontinued the intervention divided by the number randomized
Proportion analyzed: number analyzed divided by the number randomized
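The six definitions above can be sketched in Python as follows. This is an illustrative helper, not the authors' extraction tool; the class name, field names, and the example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FlowCounts:
    """Counts as they might be read from a CONSORT flow diagram (hypothetical names)."""
    assessed: int        # approached or assessed for eligibility
    enrolled: int
    randomized: int
    not_as_planned: int  # did not receive the intervention as planned
    lost: int            # lost to follow-up
    discontinued: int    # discontinued the intervention
    analyzed: int


def feasibility_metrics(c: FlowCounts) -> dict[str, float]:
    """Compute the six feasibility metrics as percentages."""
    def pct(numerator: int, denominator: int) -> float:
        return 100.0 * numerator / denominator

    return {
        "recruitment": pct(c.enrolled, c.assessed),
        "randomization": pct(c.randomized, c.enrolled),
        "non_compliance": pct(c.not_as_planned, c.randomized),
        "lost_to_follow_up": pct(c.lost, c.randomized),
        "discontinuation": pct(c.discontinued, c.randomized),
        "proportion_analyzed": pct(c.analyzed, c.randomized),
    }


# Made-up counts for a single trial, for illustration only.
example = FlowCounts(assessed=200, enrolled=120, randomized=116,
                     not_as_planned=4, lost=7, discontinued=8, analyzed=109)
metrics = feasibility_metrics(example)
```

Note that recruitment and randomization use different denominators (number assessed and number enrolled, respectively), while the remaining four metrics all use the number randomized.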
The analysis was performed in Stata 17 (StataCorp LLC, College Station, TX). These proportions were pooled using random effects models. We used the Freeman-Tukey double arcsine transformation to stabilize the variances. Using the transformed values and their variances, the weighted pooled estimates were computed with the inverse variance method and then back-transformed to proportions. Based on the binomial distribution, the exact 95% confidence intervals (CI) were calculated using the Clopper-Pearson approach. We conducted subgroup analyses based on the use of medication, intervention type, study design, country income level, WHO region, participant type, participant co-morbidities, and source of funding. These data are meant to be descriptive and therefore no interaction analyses were conducted. We also conducted a sensitivity analysis for the studies that reported on all the metrics. The number of studies, pooled estimates, and 95% CIs are reported. Inferences for subgroups are made only when there are at least two studies.
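The analysis itself was run in Stata. Purely to illustrate the steps named above, the following Python sketch pools proportions on the Freeman-Tukey double arcsine scale with inverse variance weights and computes an exact Clopper-Pearson interval for a single proportion. Several choices are assumptions that the text does not specify: the DerSimonian-Laird estimator for the between-study variance, the simple back-transformation sin²(t/2), and all function names.

```python
import math
from statistics import NormalDist


def ft_transform(x, n):
    """Freeman-Tukey double arcsine of x events in n, with its approximate variance."""
    t = math.asin(math.sqrt(x / (n + 1))) + math.asin(math.sqrt((x + 1) / (n + 1)))
    return t, 1.0 / (n + 0.5)


def pool_random_effects(events, totals, level=0.95):
    """Inverse variance random effects pooling on the transformed scale.

    Assumptions: DerSimonian-Laird tau^2 and the simple inverse sin^2(t/2).
    Returns (pooled proportion, lower limit, upper limit, tau^2).
    """
    pairs = [ft_transform(x, n) for x, n in zip(events, totals)]
    t = [p[0] for p in pairs]
    v = [p[1] for p in pairs]
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * ti for wi, ti in zip(w, t)) / sum(w)
    q = sum(wi * (ti - fixed) ** 2 for wi, ti in zip(w, t))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(t) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * ti for wi, ti in zip(w_re, t)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    z = NormalDist().inv_cdf(0.5 + level / 2)

    def back(u):
        # Clamp to [0, pi], the range of the transformation, before inverting.
        return math.sin(min(max(u, 0.0), math.pi) / 2) ** 2

    return back(pooled), back(pooled - z * se), back(pooled + z * se), tau2


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x, n, level=0.95):
    """Exact binomial confidence interval, found by bisection on the binomial CDF."""
    alpha = 1 - level

    def solve(k, target):
        # binom_cdf(k, n, p) decreases in p; find the p where it equals target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if binom_cdf(k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(x - 1, 1 - alpha / 2)
    upper = 1.0 if x == n else solve(x, alpha / 2)
    return lower, upper
```

For example, `pool_random_effects([10, 40, 80], [100, 100, 100])` returns a pooled proportion, its confidence limits, and a positive between-study variance, while `clopper_pearson(0, 10)` gives the exact interval for zero events in ten participants. Dedicated meta-analysis software typically applies a more refined back-transformation, so results from this sketch may differ slightly from such software.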
Our search retrieved 2122 articles of which 83 were duplicates. Of these articles, 701 were deemed relevant after title and abstract screening. After full text screening, we included 394 articles. The flow of study selection is shown in Fig. 1.
About half of the included trials were of pharmaceutical interventions (212; 53.8%). The largest group of trials involved changes in healthcare delivery, such as changes in the number of pills, home-based care, and task-shifting (182; 46.2%). Seventy-nine (20.1%) were multi-arm trials. The largest share were conducted in high income countries (164; 42.6%) and in the Africa region (127; 32.2%). The largest group of people studied were women (65; 16.5%) followed by Black people (39; 10.0%). The most common comorbidity studied was substance use (34; 8.6%). Most trials were non-industry funded (300; 76.1%). Two thirds (66.3%) were multicenter trials with a median number of sites of 3 (quartile 1 to quartile 3: 1 to 7). The mean (standard deviation) duration of follow-up was 11.7 (9.2) months.
These results are summarized in Table 1.
One hundred and fifty-six studies (156) had sufficient data to compute recruitment. The overall recruitment rate was 64.1% (95% CI 57.7 to 70.3). The lowest recruitment rate was in the trials of participants with mental health comorbidities (42.9%; 95% CI 22.9 to 64.3; 8 trials) and the highest in trials conducted in more than one WHO region (80.2%; 95% CI 73.1 to 86.4; 20 trials).
One hundred and eighty-seven studies (187) had sufficient data to compute randomization. The overall randomization rate was 97.1% (95% CI 95.8 to 98.3). The lowest randomization rate was observed in the trials that used incentives as the intervention (86.8%; 95% CI 54.5 to 100.0; 8 trials), and the highest was in the trials conducted in more than one WHO region (99.9%; 95% CI 99.7 to 100.0; 27 trials).
Two hundred and sixteen studies (216) had sufficient data to compute non-compliance. The overall non-compliance was 3.8% (95% CI 2.8 to 4.9). The lowest non-compliance was in factorial trials (0.5%; 95% CI 0.0 to 1.6; 7 trials), and the highest non-compliance was in trials with a psychotherapy intervention (16.1%; 95% CI 5.9 to 30.0; 16 trials).
Lost to follow-up
Two hundred and fifty-one studies (251) had sufficient data to compute lost to follow-up. The overall lost to follow-up was 5.8% (95% CI 4.9 to 6.8). The lowest lost to follow-up was in the trials conducted with industry funding (1.8%; 95% CI 1.1 to 2.7; 34 trials), and the highest lost to follow-up was in the trials with an educational intervention (15.0%; 95% CI 10.9 to 19.6; 29 trials).
Two hundred and fifteen (215) trials had sufficient data to compute discontinuation. The overall discontinuation was 6.5% (95% CI 5.5 to 7.5). The lowest discontinuation was in the trials conducted in the South East Asia region (0.6%; 95% CI 0.0 to 2.5; 8 trials), and the highest discontinuation was in the trials with patients who had cancer (16.1%; 95% CI 13.2 to 19.2; 2 trials).
Three hundred and sixty-seven (367) trials had sufficient data to estimate the proportion analyzed. The overall proportion analyzed was 94.2% (95% CI 92.9 to 95.3). The lowest proportion analyzed was in the studies with an electronic intervention (89.0%; 95% CI 81.9 to 94.6; 15 trials), and the highest proportion analyzed was in studies with transgender people (99.6%; 95% CI 98.8 to 100.0; 2 trials).
All the results are summarized in Table 2.
In our sensitivity analyses, 62 studies reported data on all the outcomes with the following estimates for recruitment (66.9%; 95% CI 58.5 to 74.8), randomization (97.3%; 95% CI 95.1 to 98.9), non-compliance (3.2%; 95% CI 1.3 to 5.6), lost to follow-up (4.9%; 95% CI 3.3 to 6.7), discontinuation (5.0%; 95% CI 3.4 to 6.9), and proportion analyzed (95.8%; 95% CI 93.7 to 97.5).
In this methodological study, we have provided empirical data to use in determining progression criteria thresholds when planning feasibility outcomes in HIV pilot randomized trials. We have also demonstrated that these estimates may vary based on the use of medication in the trials, the type of intervention, study design, income level of the countries in which the trial is conducted, region of the world, type of participants included, the comorbidities they may have, and the source of funding.
This is the first study of its kind to provide estimates intended to inform the design of pilot and feasibility trials in HIV. The estimates and their confidence intervals can be used for sample size calculations for feasibility outcomes and to set thresholds for feasibility. For example, in a study of an electronic intervention, the investigators can expect a loss to follow-up of 11.0%, which may be as low as 4.4% or as high as 20.1%. Likewise, for a non-pharmacological intervention, an investigator could calculate the sample size required to estimate a recruitment rate of 59.3% with a margin of error about 16.5% wide.
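To make the sample size example concrete, the following Python sketch uses the standard normal approximation for estimating a single proportion, n = z² p(1 − p) / d², where d is the half-width of the confidence interval. Two assumptions are made here that the text does not state: the normal approximation itself, and reading "about 16.5% wide" as the full width of the interval, so that d = 0.165 / 2. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, half_width, level=0.95):
    """Number of people to approach so that the confidence interval around an
    expected proportion p has the given half-width (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


# Expected recruitment of 59.3%, interval assumed to be 16.5 percentage points wide.
n_recruit = sample_size_for_proportion(0.593, 0.165 / 2)  # 137
```

Under these assumptions, a pilot would need to approach about 137 people. If 16.5% were instead read as the half-width, the required number would be roughly a quarter of that, so the interpretation should be fixed before the calculation is used.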
Many of our findings are not surprising. It is reasonable to expect challenges in recruiting people with mental health issues. Other studies have highlighted these concerns and proposed solutions in the broader population and for specific co-existing conditions.
In principle, if the study is carefully explained to participants, few enrolled participants would withdraw from the study before randomization. While randomization was generally high, one could speculate that in studies that used incentives, participants may have viewed a 50% chance of receiving the intervention unfavorably and chosen to withdraw. Run-in periods might be an effective strategy to identify participants who are likely to drop out if they are used appropriately. Alternatively, investigators could identify the factors linked to pre-randomization withdrawals in the pilot trials and take measures to address them in the design of the larger trial.
Lower non-compliance in factorial trials, as we found, is unexpected given that participants in factorial trials experience a higher burden, especially if they are randomized to more than one active treatment. It is possible that the 7 trials included in these analyses had other characteristics that may have enhanced compliance. Other studies have reported low compliance with psychotherapy interventions, albeit in fragmented population groups. For example, in one systematic review, the authors report on adherence to online psychological interventions. In another, compliance is investigated only in group interventions in patients with psychosis. In another systematic review, compliance is explored from the therapists' perspective for children and adolescents.
With regard to loss to follow-up, other studies have found that industry-funded studies may be methodologically different from others [21, 22]. This may be linked to the level of resources available, which may be deployed in this case to enhance follow-up. Educational interventions may require more engagement from participants and therefore be more inconvenient and challenging to accommodate in their broader lives, leading to higher rates of loss to follow-up.
Discontinuation was low in trials from certain regions. This may have to do with local factors such as proximity to the health facility or rural dwelling, which have been shown to be linked with discontinuation [24, 25]. Discontinuation may also be high in people with HIV who also have cancer owing to the higher burden of disease, burden of treatment, and risk of death before the trial end date.
Studies using electronic interventions had the lowest proportion of people analyzed. This may be because of challenges in ascertaining participants' status (given the virtual nature of the interventions) and difficulties in determining the causes of missing data. In this context, discontinuation may be related to non-usage of the electronic device precluding further meaningful participation in the trial. In skin cancer prevention research, dropout rates are higher for digital interventions than others.
Region-specific differences in outcomes are not uncommon in methodological research but are sometimes challenging to explain. We found high recruitment and randomization rates in studies conducted in more than one (mixed) WHO region. Larger multicenter and multi-country studies are likely to have more resources including access to methodologists and the means to ensure higher participation in trials. If conducting a trial across multiple sites or countries is indicative of study size, the literature suggests that larger studies are reported more clearly [21, 22, 28] and may have additional methodological strengths. We also found the lowest discontinuation in the South East Asia region. The implications of this finding are unclear.
There are several caveats to the use of these data. First, the availability of data was not uniform across studies and therefore not all studies contributed to all the estimates. However, we conducted a sensitivity analysis pooling data from the 62 studies that contributed data to all the outcomes and found consistent results. The second caveat is that outcomes may have been defined differently, especially in the studies that did not display a CONSORT flow diagram, and in some instances, adjudication was required to determine if participants were lost to follow-up or had discontinued. Third, our measure of non-compliance does not capture the reason for non-compliance (it could be because the intervention was not delivered appropriately, the participants did not adhere to the intervention, or there were technical and logistic issues that precluded compliance). While the result is the same, the reasons may be of value to investigators of pilot and feasibility trials. Fourth, the numbers analyzed were extracted as reported by the authors and may reflect additional approaches used to ensure complete data, including imputation techniques. Fifth, some outcomes may be influenced by time. For example, it is possible that participants are more likely to drop out from or discontinue longer studies. We invite investigators to consider this as they use these data.
We have presented a large body of evidence on credible estimates for feasibility outcomes in HIV clinical trials and shown that key study characteristics may influence these estimates. These data should be used to inform the choice of thresholds for feasibility outcomes and the development of progression criteria in HIV pilot randomized trials.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
CONSORT: Consolidated Standards of Reporting Trials
HIV: Human immunodeficiency virus
MSM: Men who have sex with men
PLWH: People living with HIV
PWID: People who inject drugs
RCT: Randomized controlled trial
WHO: World Health Organization
1. Ioannidis JP, Greenland S, Hlatky MA, et al. Increasing value and reducing waste in research design, conduct, and analysis. Lancet. 2014;383(9912):166–75.
2. El-Khechen HA, Khan MIU, Leenus S, et al. Design, analysis, and reporting of pilot studies in HIV: a systematic review and methodological study. Pilot and Feasibility Studies. 2021;7(1):211.
3. Whitehead AL, Sully BG, Campbell MJ. Pilot and feasibility studies: is there a difference from each other and from a randomised controlled trial? Contemp Clin Trials. 2014;38(1):130–3.
4. Eldridge SM, Lancaster GA, Campbell MJ, et al. Defining feasibility and pilot studies in preparation for randomised controlled trials: development of a conceptual framework. PLoS ONE. 2016;11(3):e0150205.
5. Thabane L, Ma J, Chu R, et al. A tutorial on pilot studies: the what, why and how. BMC Med Res Methodol. 2010;10:1.
6. Mbuagbaw L, Kosa SD, Lawson DO, et al. The reporting of progression criteria in protocols of pilot trials designed to assess the feasibility of main trials is insufficient: a meta-epidemiological study. Pilot and Feasibility Studies. 2019;5(1):1–8.
7. Mellor K, Eddy S, Peckham N, et al. Progression from external pilot to definitive randomised controlled trial: a methodological review of progression criteria reporting. BMJ Open. 2021;11(6):e048178.
8. Avery KN, Williamson PR, Gamble C, et al. Informing efficient randomised controlled trials: exploration of challenges in developing progression criteria for internal pilot studies. BMJ Open. 2017;7(2):e013537.
9. The World Bank. World Bank Open Data. 2019. https://data.worldbank.org/ (accessed 21 June 2019).
10. WHO. Definition of regional groupings. 2020. http://www.who.int/healthinfo/global_burden_disease/definition_regions/en/ (accessed 25 April 2020).
11. NIDA. Who is at risk for HIV infection and which populations are most affected? 2020. https://nida.nih.gov/publications/research-reports/hivaids/who-risk-hiv-infection-which-populations-are-most-affected (accessed 5 June 2023).
12. HIV.gov. Who Is at Risk for HIV? 2022. https://www.hiv.gov/hiv-basics/overview/about-hiv-and-aids/who-is-at-risk-for-hiv/ (accessed 5 June 2023).
13. Liu Y, Pencheon E, Hunter RM, Moncrieff J, Freemantle N. Recruitment and retention strategies in mental health trials - a systematic review. PLoS ONE. 2018;13(8):e0203127.
14. Kanuch SW, Cassidy KA, Dawson NV, Athey M, Fuentes-Casiano E, Sajatovic M. Recruiting and retaining individuals with serious mental illness and diabetes in clinical research: lessons learned from a randomized, controlled trial. J Health Dispar Res Pract. 2016;9(3):115–26.
15. Laursen DRT, Paludan-Müller AS, Hróbjartsson A. Randomized clinical trials with run-in periods: frequency, characteristics and reporting. Clin Epidemiol. 2019;11:169–84.
16. Fukuoka Y, Gay C, Haskell W, Arai S, Vittinghoff E. Identifying factors associated with dropout during prerandomization run-in period from an mhealth physical activity education study: the mPED trial. JMIR Mhealth Uhealth. 2015;3(2):e34.
17. Merrill PD, McClure LA. Dichotomizing partial compliance and increased participant burden in factorial designs: the performance of four noncompliance methods. Trials. 2015;16(1):523.
18. Beatty L, Binnion C. A systematic review of predictors of, and reasons for, adherence to online psychological interventions. Int J Behav Med. 2016;23(6):776–94.
19. Sedgwick O, Hardy A, Newbery K, Cella M. A systematic review of adherence to group interventions in psychosis: do people attend? Psychol Med. 2021;51(5):707–15.
20. Collyer H, Eisler I, Woolgar M. Systematic literature review and meta-analysis of the relationship between adherence, competence and outcome in psychotherapy for children and adolescents. Eur Child Adolesc Psychiatry. 2020;29(4):417–31.
21. Balasubramanian SP, Wiener M, Alshameeri Z, Tiruvoipati R, Elbourne D, Reed MW. Standards of reporting of randomized controlled trials in general surgery: can we do better? Ann Surg. 2006;244(5):663–7.
22. de Vries TW, van Roon EN. Low quality of reporting adverse drug reactions in paediatric randomised controlled trials. Arch Dis Child. 2010;95(12):1023–6.
23. Zoë CS, Rumana N, Katie G. Exploring non-retention in clinical trials: a meta-ethnographic synthesis of studies reporting participant reasons for drop out. BMJ Open. 2019;9(6):e021959.
24. Gesesew HA, Ward P, Hajito KW, Feyissa GT, Mohammadi L, Mwanri L. Discontinuation from antiretroviral therapy: a continuing challenge among adults in HIV care in Ethiopia: a systematic review and meta-analysis. PLoS ONE. 2017;12(1):e0169651.
25. Kebede HK, Mwanri L, Ward P, Gesesew HA. Predictors of lost to follow up from antiretroviral therapy among adults in sub-Saharan Africa: a systematic review and meta-analysis. Infect Dis Poverty. 2021;10(1):33.
26. Casper C, Crane H, Menon M, Money D. HIV/AIDS comorbidities: impact on cancer, noncommunicable diseases, and reproductive health. 2018.
27. Hernández-Rodríguez J-C, García-Muñoz C, Ortiz-Álvarez J, Saigí-Rubió F, Conejo-Mir J, Pereyra-Rodriguez J-J. Dropout rate in digital health interventions for the prevention of skin cancer: systematic review, meta-analysis, and metaregression. J Med Internet Res. 2022;24(12):e42397.
28. Farrokhyar F, Chu R, Whitlock R, Thabane L. A systematic review of the quality of publications reporting coronary artery bypass grafting trials. Can J Surg. 2007;50(4):266–77.
No funding was used for this study.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Cite this article
Mbuagbaw, L., Chen, L.H., Aluko, E. et al. Empirical progression criteria thresholds for feasibility outcomes in HIV clinical trials: a methodological study. Pilot Feasibility Stud 9, 96 (2023). https://doi.org/10.1186/s40814-023-01342-x
- Pilot trial
- Feasibility trial
- Progression criteria
- Clinical trials