
Developing an implementation fidelity checklist for a vocational rehabilitation intervention



Despite growing numbers of studies reporting the efficacy of complex interventions and their implementation, many studies fail to report information on implementation fidelity or describe how fidelity measures used within the study were developed. This study aimed to develop a fidelity checklist for measuring the implementation fidelity of an early, stroke-specialist vocational rehabilitation intervention (ESSVR) in the RETAKE trial.


To develop the fidelity measure, previous checklists were reviewed to inform the assessment structure, and core intervention components were extracted from intervention descriptions into a checklist, which was ratified by eight experts in fidelity measurement and complex interventions. Guidance notes were generated to assist with checklist completion. To test the measure, two researchers independently applied the checklist to fifteen stroke survivor intervention case notes using retrospective observational case review. The scoring was assessed for interrater reliability.


A fidelity checklist containing 21 core components and 6 desirable components across 4 stages of intervention delivery was developed with corresponding guidance notes. Interrater reliability of each checklist item ranged from moderate to perfect (Cohen’s kappa 0.69–1).


The resulting checklist to assess implementation fidelity is fit for assessing the delivery of vocational rehabilitation for stroke survivors using retrospective observational case review. The checklist proved its utility as a measure of fidelity and may be used to inform the design of future implementation strategies.

Trial registration

ISRCTN, ISRCTN12464275. Registered on 13 March 2018.



Poorly implemented interventions threaten participant and trial outcomes and undermine confidence in research findings. In intervention studies, it can be difficult to know whether interventions have been delivered as intended; that is, with fidelity [1, 2]. However, despite the body of literature supporting the importance of fidelity, it is largely under-reported in studies of rehabilitation interventions [3,4,5]. Without information regarding the extent to which an intervention has been delivered with fidelity, it is difficult to know whether the treatment effect outcomes are masked by poor implementation of the intervention [6]. Fidelity data are necessary to interpret intervention outcomes [1]. This is especially true of interventions with many interacting parts that are influenced by different contexts and factors, also called ‘complex’ interventions [7].

Complex interventions usually contain several ‘core components’ that are essential for the intervention to have an effect and to be considered as delivered with fidelity [8, 9]. The higher the level of complexity and individual tailoring of the intervention and its components, the more difficult it may be to measure fidelity [3, 10, 11], thus requiring a more sophisticated method of measurement to avoid drawing inappropriate conclusions about an intervention that might have been improperly implemented and making type III errors [12]. Measuring fidelity provides insight into which components of an intervention are essential for a positive participant outcome [13] by establishing what key components were or were not delivered in cases of improved outcomes [14].

Fidelity measurement is underpinned by theoretical concepts that emanate from behaviour change theories [1, 15, 16]. Various frameworks have been developed to describe and define which aspects of intervention implementation should be considered and the methods to use when evaluating fidelity [8, 15, 17, 18]. One such framework is the Conceptual Framework for Implementation Fidelity (CFIF) [17], which describes two key concepts to understanding implementation fidelity: (1) adherence (whether the recipient has received the intervention as intended) and (2) moderating factors (factors affecting faithful intervention implementation). Due to the comprehensiveness of this framework and its demonstrated usefulness in other complex intervention studies [18, 19], CFIF was used to define and describe fidelity in this study.

Some studies use quantitative data collection methods to measure elements of fidelity, such as fidelity checklists, that assess therapist adherence to core processes and determine which core intervention components have been delivered [5, 20]. Others use qualitative data collection methods, such as interviews, capturing acceptability, or engagement with the intervention experienced by participants [11]. These studies often do not include sufficient information regarding either the development of the fidelity measure used or the psychometric properties of the measure, which invites scepticism [5, 21,22,23]. The lack of published studies detailing the development of fidelity measures emphasises the need for future research to make clear the processes used to assure good psychometric properties of the measure prior to its application.

There is a need for high-quality, psychometrically robust measures of fidelity, yet there is little agreement on how best to develop these measures [5, 23]. Recent guidance suggests that for a measure to be considered high quality, the psychometric (e.g., reliability and validity) and implementation properties (practicality) of the measure should be reported [23]. Evaluation of a measure’s psychometric properties can determine whether the scores consistently measure the intended constructs [3, 24]. The practicality of a fidelity measure, such as ease of use and time taken to complete, is also valuable for researchers to report as these are factors that other researchers and clinicians consider when choosing a measure [3, 25].

Fidelity checklists are developed by using instructional information (i.e., intervention manuals), which is then distilled into a shortened list of intervention components and used to assess the presence of the components during delivery [23, 26]. Checklists have the advantage of being simple and quick to administer by those without specialist training in the intervention itself, and in instances where study participants cannot be, or do not wish to be, recorded or interviewed [5]. Assessment of fidelity through video or audio recordings of intervention delivery is currently considered the gold standard of assessment [27], but is resource intensive, especially in studies with many participants receiving intervention over an extended time period [23]. The application of a fidelity checklist using a retrospective review of intervention records might be a way to reduce resource use. Fidelity checklists have been generated in occupational therapy [3, 28]; however, they are specific to the components of the various interventions they assess and inappropriate for use across studies of other interventions without adaptation [3, 13].

Vocational rehabilitation (VR) is an example of a complex intervention that helps someone with a health problem return to or remain in work [29]. VR involves helping people find work, helping those who are in work experiencing difficulties, and supporting career progression in spite of illness or disability [30]. VR is complex because it requires tailoring of the intervention to the individual receiving it, is sensitive to the behaviours of different stakeholders, and can produce a variety of different outcomes [13]. VR crosses organisational boundaries, involves interactions between multiple stakeholders, is highly individually tailored, and requires behavioural change by the patient, their family, and employer [31, 32]. Stroke is an example of a particularly complex condition because it often occurs with multiple comorbidities and results in numerous, unpredictable biopsychosocial impacts [33]. Delivering a particularly complex intervention (such as VR) in a complex patient group (such as stroke survivors) presents some challenges for intervention delivery and measurement of fidelity (such as tailoring and individualization) to meet the specific needs of the recipients [34,35,36]. A small number of studies describe VR for stroke survivors [37], but very few of these studies report whether VR was delivered with fidelity, which makes it difficult to draw firm conclusions about the effectiveness of VR after stroke [38] despite the existence of intervention non-specific fidelity measures [8, 39].

This study describes the development and testing of an intervention fidelity checklist for an early, stroke-specialist vocational rehabilitation intervention (ESSVR) to support stroke survivors to return to work after stroke in the REturn To work After stroKE (RETAKE) trial [40] (ISRCTN12464275). ESSVR combines conventional VR with case management (see Fig. 1). It is delivered by a stroke-specialist occupational therapist (OT) trained to assess the impact of the stroke on the participant and their job; coordinate appropriate support from the UK National Health Service (NHS), employers and other stakeholders; negotiate workplace adjustments; monitor return to work; and explore alternatives where current work is not feasible. A more detailed description of the intervention can be found elsewhere [41]. ESSVR is delivered in four stages (early recovery, graded return to work, job retention, and discharge), each comprising several core and desirable components.

Fig. 1 A brief description of ESSVR

Aims and objectives

This study aims to develop and test a checklist for measuring implementation fidelity of ESSVR delivery in RETAKE.


  1. To identify and extract core ESSVR intervention components and generate guidance notes to assess the fidelity of their delivery within the RETAKE trial.

  2. To ratify the checklist components and guidance notes against expert opinion, supporting the measure’s content validity.

  3. To test the utility of the checklist for assessing fidelity of ESSVR delivery using a retrospective observational review of stroke survivors’ intervention records.

  4. To assess interrater reliability in fidelity checklist completion.

Materials and methods

Ethical approval for the RETAKE trial and the studies within the trial was obtained through the East Midlands—Nottingham 2 Research Ethics Committee (REC) (Ref: 18/EM/0019).

Development of the fidelity checklist

The development of the fidelity checklist and its associated guidance notes was informed by Walton et al. [23] and distilled into five steps: (1) review previous measures of fidelity, (2) analyse and develop a framework for the content of the intervention, (3) develop a fidelity checklist and associated guidance for checklist completion, (4) obtain feedback regarding content and wording, and (5) pilot and refine the checklist.

The initial structure of the ESSVR fidelity checklist was based on a checklist developed for an earlier VR study [42] for people with traumatic brain injury. The logic model and intervention descriptions [43] provided the initial content for the development of the ESSVR fidelity checklist.

The fidelity assessment in RETAKE used an observational retrospective review of stroke survivor ESSVR intervention records that included session content case-report forms (CRFs), OT clinical notes, and correspondence between the OT, stroke survivor, and other key stakeholders to assess intervention fidelity (see Table 1). ESSVR was delivered to community-dwelling stroke survivors, their families, and their employers over a period of up to 12 months following randomisation.

Table 1 Detailed descriptions of the components of the participant ESSVR intervention records

Version 1

Version 1 of the checklist used the same format as a fidelity checklist created to assess a similar VR intervention delivered to people with traumatic brain injury [42], which was designed to be completed through observation of individual sessions. Both the VR intervention designed for people with traumatic brain injury and the VR intervention for stroke survivors require complex, highly individualised intervention that considers the patient’s individual, family, and work contexts. The VR in TBI checklist was developed to be completed through direct observation of a therapy session whereby the assessor recorded the extent of delivery (‘always’, ‘sometimes’, ‘seldom’, or ‘never’ delivered) for each of the 18 components of the intervention in the session. The assessor was also prompted to record moderating factors impacting intervention delivery or receipt, such as participant responsiveness and political, economic, and organisational context [42].

The checklist used for VR following traumatic brain injury was adapted for use in this study by modifying existing components and adding additional components identified in the ESSVR logic model. The process to complete the checklist was adapted to use observation of stroke survivor intervention records to assess the delivery of intervention components across the entire intervention delivery period (up to 12 months). It required the fidelity assessors (KP, RC), who were research assistants with a background in psychology and no training in OT or VR, to determine the frequency with which a component was delivered by the RETAKE OT (‘always’, ‘sometimes’, ‘seldom’, or ‘never’ delivered) and included a space for the assessor to record moderating factors that may have facilitated or prevented faithful delivery or receipt of each component. There were no accompanying guidance notes to aid interpretation or completion.

Piloting of version 1 and proposed changes

The research assistants (KP, RC) applied version 1 of the fidelity checklist to 8 sets of participant intervention records collected from the ESSVR feasibility study [41]. The intervention records were first read for familiarisation before data were extracted against the checklist components. Following piloting, changes were made to increase clarity and facilitate administration (see Table 2).

Table 2 Description of changes from previous versions

These proposed changes were discussed by members of the research team comprising an experienced stroke and OT researcher (KR), research OTs with experience in designing and implementing fidelity checklists (JH, JP), and research assistants with no clinical background who developed and implemented the fidelity checklist in this study (KP, RC, SC). Agreed changes were incorporated into a new version of the fidelity checklist (version 2).

Piloting of version 2 and production of guidance notes

The revised checklist was independently piloted against a further 10 sets of participant intervention records from the feasibility trial by two research assistants (KP, RC) who met to discuss discrepancies in administration and data extraction. Two clinical-academic OTs familiar with the intervention and responsible for training therapists in its delivery were consulted where there were discrepancies or questions regarding the intervention components. The ESSVR manual was also consulted for clarification. The piloting and consultation led to the development of version 3 of the checklist. Guidance notes for checklist administration were developed with reference to the intervention training manual and with input from the RETAKE OT training team.

The guidance notes explain each component of the intervention in detail, providing definitions of key phases and concepts to assist the person administering the checklist. The guidance notes also give examples of where to find the evidence to support each component.

Expert panel

An expert panel was then formed to gather opinion from researchers with a clinical background and/or fidelity measurement expertise in relation to complex rehabilitation trials.

The expert panel consisted of eight researchers with both expertise in fidelity measurement and experience in measuring fidelity in complex rehabilitation trials. The purpose of the expert panel was to assist in (1) distinguishing between the ‘core’ and ‘desirable’ components of the intervention, (2) defining keywords and phrases within the fidelity checklist and guidance notes, and (3) assessing the suitability of the fidelity checklist and accompanying guidance notes.

Version 3 of the fidelity checklist and version 1 of the guidance notes were emailed to the expert panel members prior to the meeting. During the meeting, KP presented an anonymized participant intervention record from the feasibility study to the expert panel. The participant’s case was used to illustrate the application of the fidelity checklist and promote discussion of the components.

The panel discussed the core and desirable components of the intervention, practical application of the fidelity checklist, and the potential limitations of the methodology (e.g., method relies on OT record keeping), providing feedback and suggestions for amendments.

The feedback resulted in version 4 of the fidelity checklist and version 2 of the guidance notes.

Piloting of versions 4 and 5 of the fidelity checklist and version 2 of the guidance notes

Version 4 was independently piloted by two research assistants (KP, SC) on a further two cases from the RETAKE trial and the discrepancies discussed. No changes were made to the fidelity checklist and only minor changes were made to the guidance notes where further clarification was needed.

A digitised version of the checklist was created in Microsoft Excel and piloted by a third researcher, who had no clinical background and no prior involvement in the development of the fidelity checklist, to test the functionality of the digitised checklist. No further changes were made.

Interrater reliability

Participant intervention records for 15 ESSVR recipients were selected at random to assess interrater reliability. Treating OTs were asked to redact identifiable information and upload the anonymized intervention records to a secure file transfer service. Two independent researchers (KP and JP), one with no background in OT or VR (KP), and one expert in VR and OT who was instrumental in the development of the intervention (JP), independently applied the fidelity checklist assisted by the guidance notes.

A Cohen’s kappa statistic was calculated to assess interrater reliability. Based on guidelines for the interpretation of Kappa values, a value between 0 and 0.20 indicates no to slight agreement, 0.21 and 0.39 minimal agreement, 0.40 and 0.59 weak agreement, 0.60 and 0.79 moderate agreement, 0.80 and 0.90 strong agreement and 0.90 and above almost perfect agreement [44].
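The calculation and interpretation described above can be illustrated with a minimal sketch. The function names, the example ratings, and the handling of band boundaries (the bands quoted above overlap at 0.90, and leave small gaps such as 0.20 to 0.21) are assumptions made for illustration; this is not the software used in the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items (nominal categories)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: proportion of items given the same rating by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over categories of the product of marginal proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when both raters use one identical category throughout;
        # returning 1.0 here is a convention assumed for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


def interpret_kappa(kappa):
    """Map kappa to the agreement bands quoted in the text (boundaries assumed)."""
    if kappa <= 0.20:
        return "none to slight"
    if kappa < 0.40:
        return "minimal"
    if kappa < 0.60:
        return "weak"
    if kappa < 0.80:
        return "moderate"
    if kappa <= 0.90:
        return "strong"
    return "almost perfect"


# Hypothetical ratings of one checklist item across four intervention records.
assessor_1 = ["Yes", "Yes", "No", "No"]
assessor_2 = ["Yes", "Yes", "No", "Yes"]
kappa = cohens_kappa(assessor_1, assessor_2)
print(kappa, interpret_kappa(kappa))  # 0.5 weak
```

In the hypothetical example the raters agree on three of four records (75%), but half of that agreement would be expected by chance, which is why kappa is lower than raw percentage agreement.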


Development of fidelity checklist and guidance notes

Two materials were produced to aid in the assessment of fidelity in RETAKE: the fidelity checklist and its accompanying guidance notes (see Supplementary Files). The fidelity checklist was structured into the four stages of the intervention as described in the OTs’ intervention manual: early recovery, graded return to work, job retention, and discharge process.

To implement the checklist, the fidelity assessor was asked to review each participant’s intervention record. For each component, the assessor was asked whether there was sufficient evidence of the component’s delivery, where the assessor could select ‘Yes’, ‘No’, or ‘Not deliverable’ from a drop-down menu. The checklist provided space for the assessor to record details verbatim from the intervention record that would either evidence where the component had been delivered or provide evidence for why the component was not deliverable (moderating factors; e.g., where the OT did not have consent to contact an employer).

Piloting of version 1 and proposed changes

Across Versions 1–3 of the fidelity checklist, changes were made to the structure and content to best capture the core components of the intervention, increase clarity, and facilitate the administration of the checklist. Version 1 listed 10 core components. Proposed changes related to the evaluation of component delivery where ‘frequency’ was replaced with ‘no evidence’, ‘some evidence’, and ‘extensive evidence’, and a box was created to extract the supporting evidence verbatim into the checklist.

For full description of changes made to each version of the checklist, see Table 2.

Piloting of version 2 and production of guidance notes

During the piloting of version 2, the OT training manual was consulted. This prompted the largest structural change to the checklist: intervention components were classified into four phases (1, early recovery; 2, graded return to work; 3, job retention; and 4, discharge process) to mirror the information provided to the RETAKE OTs. Additional components specific to work monitoring and discharge processes were extracted from the RETAKE OT training manual. These components were highlighted as being essential to intervention delivery but were not explicitly listed in the logic model.

Expert panel

Version 3 of the checklist and version 1 of the guidance notes were taken to the expert panel. The expert panel facilitated discussion regarding the core components and their status as ‘core’ or ‘desirable’ to the intervention delivery. Based on feedback from the expert panel, the components and other key concepts and phrases were more clearly defined in the guidance notes. Jargon was minimised to improve the clarity and accessibility of the guidance notes.

The expert panel agreed that in addition to evidencing each component verbatim from the intervention records, the assessors should record the source of the evidence (e.g., correspondence, therapy notes). The expert panel also agreed that the assessor should record how long it takes to complete each fidelity assessment to evaluate the speed of checklist completion and compare it to other methods of fidelity assessment.

Versions 4 and 5

Version 4 of the fidelity checklist and version 2 of the guidance notes were produced which incorporated the recommendations from the expert panel. Following the application of the checklist to two further sets of ESSVR participant intervention records, the fidelity checklist was digitised into a Microsoft Excel spreadsheet to increase its utility. The spreadsheet contained a drop-down menu for the assessor to select whether there was sufficient evidence of the component or if the component was not deliverable. The assessor was then directed to provide evidence verbatim from the intervention record where possible in the next box where the assessor was also asked to select the source (CRF, clinical case notes, correspondence, etc.) from another drop-down menu.
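As an illustration only, one row of the digitised checklist described above might be represented as follows. The actual spreadsheet layout is given in the Supplementary Files; the class name, field names, and example component wording here are hypothetical, and only the three rating options and the four stage names are taken from the text.

```python
from dataclasses import dataclass

# Rating options offered in the drop-down menu, as described in the text.
RATINGS = ("Yes", "No", "Not deliverable")
# The four stages of intervention delivery used to structure the checklist.
STAGES = (
    "early recovery",
    "graded return to work",
    "job retention",
    "discharge process",
)


@dataclass
class ChecklistEntry:
    """One checklist row (hypothetical structure, for illustration)."""

    stage: str
    component: str
    is_core: bool        # True for 'core', False for 'desirable'
    rating: str          # one of RATINGS
    evidence: str = ""   # verbatim extract, or reason the component was not deliverable
    source: str = ""     # e.g. CRF, clinical case notes, correspondence

    def __post_init__(self):
        # Mimic the drop-down menus by rejecting values outside the allowed options.
        if self.stage not in STAGES:
            raise ValueError(f"unknown stage: {self.stage!r}")
        if self.rating not in RATINGS:
            raise ValueError(f"unknown rating: {self.rating!r}")


# Placeholder values only; not taken from any participant record.
entry = ChecklistEntry(
    stage="early recovery",
    component="(hypothetical component wording)",
    is_core=True,
    rating="Not deliverable",
    evidence="(verbatim reason recorded by the assessor)",
    source="correspondence",
)
```

Validating the stage and rating on construction plays the same role as the spreadsheet's drop-down menus: an assessor cannot enter a value outside the agreed options.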

Scoring of the fidelity checklist was written into a calculation which was automatically populated via the drop-down menu selection of ‘Yes’, ‘No’, and ‘Not deliverable’. The total overall fidelity score was calculated as the number of delivered components divided by the number of components that were deliverable. Components classified as ‘desirable’ were included in the calculation only where they were delivered, and were thus weighted differently from those classified as ‘core’:

$$\left(\frac{n\ \textrm{core components delivered}+n\ \textrm{desirable components delivered}}{N\ \textrm{core components}-n\ \textrm{undeliverable core components}+n\ \textrm{desirable components delivered}}\right)\times 100=\%\ \textrm{fidelity}$$
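A minimal sketch of this scoring rule is given below. The function name and the example ratings are hypothetical; the arithmetic follows the formula above, in which undeliverable core components are removed from the denominator and desirable components count only when delivered.

```python
RATINGS = ("Yes", "No", "Not deliverable")


def fidelity_percent(core_ratings, desirable_ratings):
    """Percentage fidelity from lists of 'Yes' / 'No' / 'Not deliverable' ratings."""
    for rating in list(core_ratings) + list(desirable_ratings):
        if rating not in RATINGS:
            raise ValueError(f"unknown rating: {rating!r}")
    core_delivered = core_ratings.count("Yes")
    core_undeliverable = core_ratings.count("Not deliverable")
    desirable_delivered = desirable_ratings.count("Yes")
    # Desirable components enter both numerator and denominator only when delivered,
    # so an undelivered desirable component never lowers the score.
    denominator = len(core_ratings) - core_undeliverable + desirable_delivered
    if denominator == 0:
        raise ValueError("no deliverable components to score")
    return 100 * (core_delivered + desirable_delivered) / denominator


# Hypothetical record: 21 core components (15 delivered, 4 not delivered,
# 2 not deliverable) and 6 desirable components (3 delivered).
core = ["Yes"] * 15 + ["No"] * 4 + ["Not deliverable"] * 2
desirable = ["Yes"] * 3 + ["No"] * 2 + ["Not deliverable"]
score = fidelity_percent(core, desirable)
print(round(score, 1))  # 81.8
```

In this hypothetical case the score is (15 + 3) / (21 − 2 + 3) = 18/22, or about 81.8%. Had none of the desirable components been delivered, the score would be 15/19, or about 78.9%, showing that delivering desirable components can raise the score while omitting them carries no penalty.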

Interrater reliability

Assessment of 15 participant intervention records was completed by two independent assessors. The stroke survivors whose records were used to assess interrater reliability included six females (40%) and ages ranged from 33 to 61 years old (mean: 48.3 years, SD: 7.7). Cohen’s kappa ranged from 0.69 to 1 (See Table 3). Eleven items achieved 100% agreement, eight items achieved 90% agreement, and eight items achieved 80% agreement.

Table 3 Assessment of interrater reliability per checklist item

Time taken to complete

The time taken to complete the fidelity checklist ranged from 30 to 100 min (average 62 min). The average time taken to complete per assessor was 63.5 min (KP) and 57 min (JP).


An ESSVR-specific fidelity checklist that has adequate interrater reliability and is relatively quick to apply, together with guidance notes to aid checklist completion, was developed and piloted using the observational retrospective review of ESSVR participant intervention records. The checklist is adaptable to the specific contexts of the stroke survivors and other stakeholders and captures factors affecting the delivery of each component, facilitating the identification and categorisation of implementation considerations. A future study will evaluate and report the fidelity of ESSVR delivery and factors affecting the delivery of individual components in RETAKE.

Application of the fidelity checklist to assess interrater reliability produced a Cohen’s kappa score ranging from 0.69 to 1, which indicates moderate to perfect interrater reliability [44]. Previous studies of fidelity checklist development report difficulties in obtaining high levels of agreement [2]. It is possible that this study achieved higher agreement through the information provided to the fidelity assessors through the guidance notes. It is also possible that this could be due to the involvement of the assessors with the ESSVR training team, which may have influenced the interpretation of the data in the ESSVR participant intervention records. Further research should explore whether other assessors with differing backgrounds would obtain the same high level of agreement. This study assessed interrater reliability using 15 stroke survivors’ intervention case notes, which is a small sample, but the results lend valuable information regarding how to improve the guidance notes to aid further understanding and agreement.

Of the eleven items within the checklist that yielded ‘moderate’ agreement, six of the items were core components and five were desirable components. The desirable components that produced ‘moderate’ agreement related to the OT’s delivery of an ESSVR component to the stroke survivor’s family or the delivery of emotional support to the employer. In exploring this further, the researchers completing the checklist disagreed on whether these components were deliverable or not as opposed to the presence of sufficient evidence. An example of where these components would not be deliverable is if a stroke survivor expressed, explicitly or implicitly, that they would prefer their family not be involved in their intervention. Future applications of the checklist should take this into account and guidance notes should be altered to provide further clarity. Of the six core items, three items asked the researchers to determine the delivery of a component to relevant ‘stakeholders’ or ‘sectors’. It is possible that the disagreement on these items was related to a lack of sufficient clarity in the guidance notes around the range of specific relevant stakeholders this might refer to. The other three core components that produced ‘moderate’ agreement all involved OT communication with the participant’s employer. The delivery of these components was impacted by factors outside of the OT’s control (e.g., employer engagement with the OT), which may explain discrepancies in the raters’ marking. Updates to the guidance notes to reflect this and support future applications are warranted.

Consultation with the expert panel provided a way to evaluate and establish the checklist’s content validity. Expert ratification of a measure’s components and scoring is a common way to evaluate content validity and confirm that the measure is assessing what it intends to assess [45, 46]. Recommendations for what constitutes a suitable expert panel to establish content validity suggest that the members should be professionals with experience in the subject matter or clinical/research experience in the field [47]. This study’s expert panel comprised eight researchers with expertise in fidelity measurement within studies of complex interventions. Two of the researchers also had extensive knowledge and clinical experience of occupational therapy, vocational rehabilitation and ESSVR itself. By adopting the recommendations of the expert panel and adapting the checklist and its guidance notes, content validity was established. This study did not use a measure to quantify content validity, but this should be considered in future research to strengthen the measure [48]. Additionally, the expert panel did not include a representative from the trial’s Patient and Public Involvement group, which would have provided added benefit in understanding which intervention components were of greater importance to those receiving the intervention.

The time taken to apply the checklist ranged from 30 to 100 min. For context, a typical ESSVR session with a stroke survivor might be expected to last 30 to 60 min, and in some cases a stroke survivor might expect to have over a dozen sessions over the course of 12 months. The variation in time taken to complete the checklist was most likely due to the variation in the amount of information included within each ESSVR participant intervention record. Fidelity measurement research highlights the practicality of the measure (i.e., quick and easy use) as helpful for conserving resources [49] and reducing the burden within a study [5, 50]. Completing the measure using observational retrospective case review, as in this study, provides a considerably quicker method of assessing an entire period of intervention delivery when compared with studies using more direct observational methods [5, 51].

The associated guidance notes facilitated the checklist’s use and provided a way to support the application of the checklist without having to provide additional training for future assessors. In the earlier stages of the checklist development, the research assistants initially applying the checklist frequently met with the research OTs responsible for training the RETAKE OTs to discuss discrepancies in interpretation and what counted as adequate demonstration of component delivery, which aided the development of the guidance notes. The notes were then refined to support understanding and practicality, which might further explain the adequate level of agreement and interrater reliability between the raters (KP & JP). The thorough process used to create and refine the guidance notes facilitated ease of checklist administration, which is another important aspect of measure implementation that studies of fidelity measures often fail to report [5, 21]. With clear guidance notes facilitating sufficient levels of agreement, even where the person applying the checklist does not have a clinical background or experience in the intervention delivery, valuable study resources (e.g., clinical staff capacity and costings) may be conserved and bias may be reduced. However, the results of the interrater reliability assessment are limited by the lack of a sensitivity analysis to determine what factors might have further influenced interrater reliability.

The intention of this study was to develop a checklist that could be applied by a research assistant in a trial, thereby reducing the risk of bias. Arguably, if the checklist is robust and guidance notes clear and the OTs adequately document the intervention, then a non-clinician should be able to extract the data and apply the checklist, saving valuable clinical and research study resources, particularly given the high costs and capacity issues associated with the use of clinical staff. This approach is in no way intended to devalue clinical experience or expertise in the delivery of this or any other complex interventions, but rather aims to provide an efficient way of measuring fidelity during a clinical trial. Experienced clinical mentors overseeing the clinical implementation of the intervention [52] could be informed of deviations from the process and address these in real time during the trial, further facilitating faithful delivery of the intervention.

There are some limitations to this method of assessing fidelity. Using observational retrospective review of ESSVR participant intervention records meant that fidelity checklist completion depended on detailed record keeping by the RETAKE OTs. This limited the conclusions to whether there was sufficient evidence of a component’s delivery. Where there was not sufficient evidence of a component’s delivery, we could not confidently conclude that it had not been delivered. Direct observation of intervention delivery, either in person or via audio/video-recorded sessions, is an effective way of confidently determining whether or not an intervention component has been delivered [23, 27]. However, whilst observation of intervention delivery is an established, rigorous approach to assessing fidelity, it is not always possible or feasible, as participants might not always give consent for session recording [5], and this approach also requires considerable staff and time resources [5, 53]. Direct observation of intervention delivery might also cause the person delivering the intervention to behave differently to when they are not being observed [26, 54, 55].

To help draw more confident conclusions about component delivery using observational retrospective case review, future research might include a method to further assist or encourage therapists in detailed record keeping, for example through an electronic database. Future implementation studies might also use this approach to assessing fidelity to support the faithful delivery of interventions, with intervention records reviewed on an ongoing basis from the beginning of intervention delivery. This approach would enable researchers to identify intervention components that are not being consistently delivered and to support those delivering the intervention to deliver these components in the future. Future research may also look to expand upon this method and the checklist, making it more robust by defining parameters for the amount of evidence present and assigning levels of sufficiency beyond ‘Yes’, ‘No’, and ‘Not deliverable’ with reference to the components. Lastly, future research should seek to involve clinicians in further development and testing of the checklist, so that rates of interrater reliability could be compared with those of non-clinicians.
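As an illustration of how ongoing record review might surface components that are not consistently evidenced, the sketch below tallies checklist scores per component across reviewed case records and flags those falling below a chosen level. This is a hypothetical sketch rather than a procedure used in this study: the component labels, the function names, and the 0.8 threshold are all assumptions. In keeping with the limitation noted above, a low rate indicates insufficient evidence of delivery in the records, not proof of non-delivery.

```python
def delivery_rates(records):
    """Proportion of 'Yes' scores per component, excluding 'Not deliverable' scores."""
    rates = {}
    components = sorted({name for record in records for name in record})
    for component in components:
        scores = [record[component] for record in records if component in record]
        applicable = [s for s in scores if s != "Not deliverable"]
        # None marks a component that was never deliverable in the reviewed records.
        rates[component] = (
            sum(s == "Yes" for s in applicable) / len(applicable) if applicable else None
        )
    return rates


def flag_components(rates, threshold=0.8):
    """Components whose evidenced delivery rate is below the (assumed) threshold."""
    return [name for name, rate in rates.items() if rate is not None and rate < threshold]


# Hypothetical checklist scores from three reviewed case records.
records = [
    {"component_1": "Yes", "component_2": "No", "component_3": "Not deliverable"},
    {"component_1": "Yes", "component_2": "Yes", "component_3": "Not deliverable"},
    {"component_1": "Yes", "component_2": "No", "component_3": "Yes"},
]
rates = delivery_rates(records)
flagged = flag_components(rates)
print(flagged)  # prints ['component_2']
```

Excluding ‘Not deliverable’ scores from the denominator is a design choice made here so that components that could not be delivered for a given participant do not count against fidelity.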

This approach to measuring fidelity allowed us to observe intervention delivery over long periods of time and an unprescribed number of sessions, across multiple study centres and multiple therapists. Whilst the checklist components are specific to the ESSVR intervention, the process followed to develop and apply the checklist is replicable and generalisable to studies of complex interventions. This approach may inform the design of implementation strategies in future studies of complex interventions.


The checklist and guidance notes developed in this study are fit for assessing the delivery of ESSVR components in the RETAKE trial, and their application will be essential in providing context for the interpretation of the results of the trial with regard to the effectiveness of the intervention. The process followed to create the fidelity checklist in this study will inform the design of future implementation strategies for complex rehabilitation interventions.

This study also considered the feasibility of using a retrospective review of intervention records to assess fidelity, which may facilitate robust longitudinal fidelity assessment procedures in future complex intervention studies. Establishing robust methods of assessing fidelity in complex rehabilitation interventions, such as ESSVR, will help researchers more confidently draw conclusions about the effectiveness of the interventions they seek to evaluate.

Availability of data and materials

On study completion, the final trial dataset will be archived at the University of Nottingham. Following completion of the RETAKE trial and publication of its effectiveness outcomes, any party may apply to the corresponding author for access to the dataset. Access will be governed by an information governance committee formed between the University of Nottingham and the University of Leeds.



Abbreviations

CFIF: Conceptual Framework for Implementation Fidelity

CRF: Case-report form

ESSVR: Early, stroke-specialist vocational rehabilitation

NHS: National Health Service

OT: Occupational therapist

RETAKE: REturn To work After stroKE trial

VR: Vocational rehabilitation


  1. Bellg AJ, Resnick B, Minicucci DS, Ogedegbe G, Ernst D, Borrelli B, et al. Enhancing treatment fidelity in health behavior change studies: best practices and recommendations from the NIH Behavior Change Consortium. Health Psychol. 2004;23:443–51.

  2. Walton H, Tombor I, Burgess J, Groarke H, Swinson T, Wenborn J, et al. Measuring fidelity of delivery of the Community Occupational Therapy in Dementia-UK intervention. BMC Geriatr. 2019;19(1):364.

  3. Hand BN, Darragh AR, Persch AC. Thoroughness and psychometrics of fidelity measures in occupational and physical therapy: a systematic review. Am J Occup Ther. 2018;72:7205205050p1. Available from: /pmc/articles/PMC6114192/ [cited 8 Mar 2021].

  4. Lockett H, Waghorn G, Kydd R. A framework for improving the effectiveness of evidence-based practices in vocational rehabilitation. J Vocat Rehabil. 2018;49(1):15–31.

  5. Walton H, Spector A, Tombor I, Michie S. Measures of fidelity of delivery of, and engagement with, complex, face-to-face health behaviour change interventions: a systematic review of measure quality. Br J Health Psychol. 2017;22(4):872–903.

  6. Borrelli B. The assessment, monitoring, and enhancement of treatment fidelity in public health clinical trials. J Public Health Dent. 2011;71(Suppl 1):S52–63.

  7. Hart T. Treatment definition in complex rehabilitation interventions. Neuropsychol Rehabil. 2009;19(6):824–40.

  8. Hasson H. Systematic evaluation of implementation fidelity of complex interventions in health and social care. Implement Sci. 2010;5(1):67.

  9. Lipsey MW, Cordray DS. Evaluation methods for social intervention. Annu Rev Psychol. 2000;51:345–75.

  10. Hildebrand MW, Host HH, Binder EF, Carpenter B, Freedland KE, Morrow-Howell N, et al. Measuring treatment fidelity in a rehabilitation intervention study. Am J Phys Med Rehabil. 2012;91(8):715–24. Available from: /pmc/articles/PMC3967862/?report=abstract [cited 2 Feb 2021].

  11. Toglia J, Lee A, Steinberg C, Waldman-Levi A. Establishing and measuring treatment fidelity of a complex cognitive rehabilitation intervention: the multicontext approach. Br J Occup Ther. 2020;83(6):363–74.

  12. Dusenbury L. A review of research on fidelity of implementation: implications for drug abuse prevention in school settings. Health Educ Res. 2003;18(2):237–56.

  13. Craig P, Dieppe P, Macintyre S, Michie S, Nazareth I, Petticrew M. Developing and evaluating complex interventions: the new Medical Research Council guidance. BMJ. 2008;337:979–83.

  14. Dane AV, Schneider BH. Program integrity in primary and early secondary prevention: are implementation effects out of control? Clin Psychol Rev. 1998;18(1):23–45.

  15. Damschroder L, Aron D, Keith R, Kirsh S, Alexander J, Lowery J. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implement Sci. 2009;4:50.

  16. Cane J, O’Connor D, Michie S. Validation of the theoretical domains framework for use in behaviour change and implementation research. Implement Sci. 2012;7(1):37.

  17. Carroll C, Patterson M, Wood S, Booth A, Rick J, Balain S. A conceptual framework for implementation fidelity. Implement Sci. 2007;2(1):1–9.

  18. Masterson-Algar P, Burton CR, Rycroft-Malone J, Sackley CM, Walker MF. Towards a programme theory for fidelity in the evaluation of complex interventions. J Eval Clin Pract. 2014;20(4):445–52.

  19. Augustsson H, von Thiele Schwarz U, Stenfors-Hayes T, Hasson H. Investigating variations in implementation fidelity of an organizational-level occupational health intervention. Int J Behav Med. 2015;22(3):345–55.

  20. Lincoln NB, Bradshaw LE, Constantinescu CS, Day F, Drummond AE, Fitzsimmons D, et al. Assessing intervention fidelity in the CRAMMS trial. 2020.

  21. Rixon L, Baron J, McGale N, Lorencatto F, Francis J, Davies A. Methods used to address fidelity of receipt in health intervention research: a citation analysis and systematic review. BMC Health Serv Res. 2016;16(1):1–24.

  22. Schoenwald SK, Garland AF, Chapman JE, Frazier SL, Sheidow AJ, Southam-Gerow MA. Toward the effective and efficient measurement of implementation fidelity. Adm Policy Ment Health Ment Health Serv Res. 2011;38(1):32–43.

  23. Walton H, Spector A, Williamson M, Tombor I, Michie S. Developing quality fidelity and engagement measures for complex health interventions. Br J Health Psychol. 2020;25(1):39–60.

  24. Moncher FJ, Prinz RJ. Treatment fidelity in outcome studies. Clin Psychol Rev. 1991;11(3):247–66.

  25. Smart A. A multi-dimensional model of clinical utility. Int J Qual Health Care. 2006;18(5):377–82.

  26. Breitenstein SM, Gross D, Garvey CA, Hill C, Fogg L, Resnick B. Implementation fidelity in community-based interventions. Res Nurs Health. 2010;33(2):164–73.

  27. Lorencatto F, West R, Christopherson C, Michie S. Assessing fidelity of delivery of smoking cessation behavioural support in practice. Implement Sci. 2013;8(1):1–10.

  28. Parvaneh S, Cocks E, Buchanan A, Ghahari S. Development of a fidelity measure for community integration programmes for people with acquired brain injury. Brain Inj. 2015;29(3):320–8.

  29. Waddell G, Burton AK, Kendall NA. Vocational rehabilitation–what works, for whom, and when? (Report for the Vocational Rehabilitation Task Group). TSO; 2008.

  30. Frank AO, Thurgood J. Vocational rehabilitation in the UK: opportunities for health-care professionals. Int J Ther Rehabil. 2006;13(3):126–34.

  31. Loisel P, Buchbinder R, Hazard R, Keller R, Scheel I, Van Tulder M, et al. Prevention of work disability due to musculoskeletal disorders: the challenge of implementing evidence. J Occup Rehabil. 2005;15(4):507–24.

  32. Cancelliere C, Cassidy JD, Colantonio A. Specific disorder-linked determinants: traumatic brain injury. In: Handbook of work disability. New York: Springer; 2013. p. 303–14.

  33. Nelson MLA, McKellar KA, Yi J, Kelloway L, Munce S, Cott C, et al. Stroke rehabilitation evidence and comorbidity: a systematic scoping review of randomized controlled trials. Top Stroke Rehabil. 2017;24(5):374–80.

  34. Bragstad LK, Bronken BA, Sveen U, Hjelle EG, Kitzmüller G, Martinsen R, et al. Implementation fidelity in a complex intervention promoting psychosocial well-being following stroke: an explanatory sequential mixed methods study. BMC Med Res Methodol. 2019;19(1):1–18.

  35. Ntsiea MV, Van Aswegen H, Lord S, Olorunju S. The effect of a workplace intervention programme on return to work after stroke: a randomised controlled trial. Clin Rehabil. 2015;29(7):663–73.

  36. Jones F, Gage H, Drummond A, Bhalla A, Grant R, Lennon S, et al. Feasibility study of an integrated stroke self-management programme: a cluster-randomised controlled trial. BMJ Open. 2016;6(1):e008900.

  37. Baldwin C, Brusco NK. The effect of vocational rehabilitation on return-to-work rates post stroke: a systematic review. Top Stroke Rehabil. 2011;18(5):562–72.

  38. Walker MF, Hoffmann TC, Brady MC, Dean CM, Eng JJ, Farrin AJ, et al. Improving the development, monitoring and reporting of stroke rehabilitation research: consensus-based core recommendations from the Stroke Recovery and Rehabilitation Roundtable. Int J Stroke. 2017;12(5):472–9.

  39. Hoffmann TC, Glasziou PP, Boutron I, Milne R, Perera R, Moher D, et al. Better reporting of interventions: template for intervention description and replication (TIDieR) checklist and guide. BMJ. 2014;348:g1687.

  40. Radford KA, Craven K, McLellan V, Sach TH, Brindle R, Holloway I, et al. An individually randomised controlled multi-centre pragmatic trial with embedded economic and process evaluations of early vocational rehabilitation compared with usual care for stroke survivors: study protocol for the RETurn to work After stroKE (RETAKE). Trials. 2020;21(1):1–17.

  41. Grant M. Developing, delivering and evaluating stroke specific vocational rehabilitation: a feasibility randomised controlled trial. PQDT - UK Irel; 2016.

  42. Holmes JA, Fletcher-Smith JC, Merchán-Baeza JA, Phillips J, Radford K. Can a complex vocational rehabilitation intervention be delivered with fidelity? Fidelity assessment in the FRESH feasibility trial; 2020.

  43. Radford KA, McKevitt C, Clarke S, Powers K, Phillips J, Craven K, et al. RETurn to work After stroKE (RETAKE) Trial: protocol for a mixed-methods process evaluation using normalisation process theory. BMJ Open. 2022;12(3):e053111.

  44. McHugh ML. Interrater reliability: the kappa statistic. Biochem Med. 2012;22(3):276–82. Available from: /pmc/articles/PMC3900052/?report=abstract [cited 15 Nov 2020].

  45. Safikhani S, Sundaram M, Bao Y, Mulani P, Revicki DA. Qualitative assessment of the content validity of the Dermatology Life Quality Index in patients with moderate to severe psoriasis. J Dermatolog Treat. 2013;24:50–9.

  46. Zamanzadeh V, Ghahramanian A, Rassouli M, Abbaszadeh A, Alavi-Majd H, Nikanfar A-R. Design and implementation content validity study: development of an instrument for measuring patient-centered communication. J Caring Sci. 2015;4(2):165–78. Available from: /pmc/articles/PMC4484991/ [cited 9 Mar 2021].

  47. Davis LL. Instrument review: getting the most from a panel of experts. Appl Nurs Res. 1992;5(4):194–7.

  48. Froman RD, Schmitt MH. Thinking both inside and outside the box on measurement articles. Res Nurs Health. 2003;26:335–6.

  49. Bowen DJ, Kreuter M, Spring B, Cofta-Woerpel L, Linnan L, Weiner D, et al. How we design feasibility studies. Am J Prev Med. 2009;36:452–7. Available from: /pmc/articles/PMC2859314/ [cited 9 Mar 2021].

  50. Glasgow RE, Ory MG, Klesges LM, Cifuentes M, Fernald DH, Green LA. Practical and relevant self-report measures of patient health behaviors for primary care research. Ann Fam Med. 2005;3:73–81.

  51. Harting J, Van Assema P, Van Der Molen HT, Ambergen T, De Vries NK. Quality assessment of health counseling: performance of health advisors in cardiovascular prevention. Patient Educ Couns. 2004;54(1):107–18.

  52. Craven K, Holmes J, Powers K, Clarke S, Cripps RL, Lindley R, et al. Embedding mentoring to support trial processes and implementation fidelity in a randomised controlled trial of vocational rehabilitation for stroke survivors. BMC Med Res Methodol. 2021;21(1):203.

  53. Cochrane WS, Laux JM. A survey investigating school psychologists’ measurement of treatment integrity in school-based interventions and their beliefs about its importance. Psychol Sch. 2008;45(6):499–507.

  54. Dumas JE, Lynch AM, Laughlin JE, Smith EP, Prinz RJ. Promoting intervention fidelity: conceptual issues, methods, and preliminary results from the EARLY ALLIANCE prevention trial. Am J Prev Med. 2001;20(1 Suppl):38–47.

  55. Eames C, Daley D, Hutchings J, Hughes JC, Jones K, Martin P, et al. The leader observation tool: a process skills treatment fidelity measure for the Incredible Years parenting programme. Child Care Health Dev. 2008;34(3):391–400.


The authors wish to thank Vicki McLellan, Marissa Arfan, Jade Kettlewell, Nicholas Behn, Jacqueline Mhizha-Murira, Vicky Booth, and the RETAKE Occupational Therapists for their support and contributions to the development of the fidelity evaluation process.


This article includes research funded by the National Institute for Health Research Health Technology Assessment programme (NIHR-HTA; ref: 15/130/11). The views expressed in this article are those of the authors, not necessarily the NIHR, the Department of Health and Social Care, or the NHS. Katie Powers' PhD is funded by the Ossie Newell Foundation and the views expressed in this article are those of the authors, not necessarily the Ossie Newell Foundation.

Author information

Authors and Affiliations



KR is the chief investigator of the RETAKE study. JH, JP, and KR designed the intervention. KP, RC, and SC conducted the fidelity assessments. KP, KR, RN, AF, JH, and JP contributed to the development of this study’s design and analysis plan. KP conducted the analysis. KP drafted the manuscript. The authors read and approved the final version.

Corresponding author

Correspondence to Katie Powers.

Ethics declarations

Ethics approval and consent to participate

Ethical approval has been obtained through the East Midlands—Nottingham 2 Research Ethics Committee (REC) (Ref: 18/EM/0019).

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

ESSVR Fidelity Checklist. Digitised version of the ESSVR fidelity checklist developed in this study. ESSVR Fidelity Checklist Completion Guidance Notes. Guidance notes developed in this study to aid in checklist completion.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Powers, K., Clarke, S., Phillips, J. et al. Developing an implementation fidelity checklist for a vocational rehabilitation intervention. Pilot Feasibility Stud 8, 234 (2022).
