Christina Talley, MS, RAC, CCRP, CCRC
Houston Methodist Research Institute, Office of Strategic Research Initiatives
Abstract: Detailed protocol analysis and feasibility assessment, objectively translated into a protocol grade or quantitative score, is an effective way to manage overall workload distribution, personnel resource allocation, and financial management before the clinical trial is implemented. This article provides an overview of protocol areas for evaluation, scoring, and current effort tracking models used in clinical research. One model, the Protocol Acuity Rating Scale (PARS), is described in detail with examples of its application to different types of clinical research studies.
Due to the increasing complexity of clinical research protocols and the need to justify staffing, it is critical for clinical research sites to have methods to evaluate, quantify, and document the amount of effort that will be required to effectively execute a clinical trial. It is also necessary to estimate the financial support needed to justify and to properly carry out the work. Determining equitable workload allocation among staff at a site is crucial. Large sites need accurate projections of total staff (research coordinators, clinical trial managers, data managers, etc.) and a balance of work between staff members. Smaller sites need to provide justification to hire additional personnel or provide decision points for accepting or rejecting a research project.
Protocol scoring, according to the author, is a detailed review of a clinical research protocol and activities needed to carry out that protocol, including moving the subject through the protocol during project execution. All of the activities are categorized and graded so that they can be assigned a point value. The points for the project are then totaled, and this determines the project’s overall difficulty “grade” or score. The most important benefit of protocol scoring is that it can then be translated into work effort projections or used to appropriately assign staff to various projects.
The balance between effectively utilizing personnel (full-time equivalents or FTEs) to support the work and being overstaffed with idle staff is very delicate. If the workload is overwhelming or a project is beyond the scope of the personnel, the participants, the project, and the data may all be at risk and the quality of the unit will diminish. If staff members are not fully utilized or workload is distributed unequally, financial resources may be deemed to be mismanaged. High error rates, deviations, and overspending on underperforming staff never go unnoticed.
When evaluating deliverables and scoring a protocol, it is important to keep in mind the abilities of the personnel and the role each person will play in the research study. Questions to be answered include:
- Does the FTE directly manage participants?
- Are any procedures, assessments, or referrals to specialized services required?
- What is the FTE’s skill and education level?
- What is the FTE’s current standing workload?
How Analysis Benefits a Clinical Research Site: An Example
The Houston Methodist Research Institute conducts many first-in-human and early phase studies. The author needed to evaluate one of the research sections to determine the feasibility for the section to take on a high-profile, federally-funded, first-in-human trial that required effective management and aggressive recruitment to see the investigational product into the pivotal phase.
The evaluation consisted of a review of the number of FTEs, the number of treatment studies open within the past five years, and annual accrual for each study as reported in the continuing reviews. Over the review period, the research section had a total of 6 to 8 FTEs and 48 treatment trials, most of which were open 3 years or longer.
Average enrollment was low, ranging from 1.3 to 2.6 subjects per study per year for the research section. This indicated that something was amiss. Perhaps FTEs were overworked or only enrolling subjects to some trials. Other possible problems were that some investigators were very prolific and others were not. A review of the studies by specialty did not identify any prolific investigators and also showed low average enrollment per study per year. The specialty clinic has a targeted orphan population; however, a review of this population did not improve the results.
Thus, the review showed that the research section had issues with feasibility. Many trials remained open year after year with zero enrollments, which demonstrated that feasibility had not been assessed prior to commitment and initiation. The research section had allocated FTEs to executing these trials; however, it was unable to meet the minimum burden of enrollment, demonstrating an ineffective allocation of personnel.
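The accrual review described in this example can be sketched in a few lines of code. The following is a minimal illustration, not the author's actual analysis: the study records, the field layout, and the function name `annual_rate` are all hypothetical, and the numbers are invented for demonstration only.

```python
from statistics import mean

# Hypothetical accrual records: (study_id, years_open, total_enrolled).
# These values are illustrative only, not data from the review.
studies = [
    ("A", 3, 6),
    ("B", 4, 0),
    ("C", 5, 10),
    ("D", 3, 0),
]

def annual_rate(years_open, total_enrolled):
    """Subjects enrolled per year for one study."""
    return total_enrolled / years_open

# Per-study accrual rate, then the section-wide average across studies.
rates = {sid: annual_rate(years, n) for sid, years, n in studies}
section_average = mean(rates.values())

# Studies that have never enrolled a subject are candidates for a
# feasibility review.
zero_enrollers = [sid for sid, rate in rates.items() if rate == 0]

print(f"Average subjects per study per year: {section_average:.2f}")
print(f"Studies with zero enrollment: {zero_enrollers}")
```

A site would substitute the accrual figures reported in its own continuing reviews. Listing the zero-enrollment studies separately matters because a low average alone does not show whether a few studies or many are failing to enroll.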
Review of Protocol Scoring Tools
There are several available, well-published protocol scoring and workload tracking tools:
- National Cancer Institute (NCI) Trial Complexity and Elements Scoring Model
- University of Michigan Research Effort Tracking Application
- Ontario Protocol Assessment Level
- Wichita Community Clinical Oncology Program Protocol Acuity Tool.
Table 1 provides an overview of each model, most of which were developed in the field of oncology.
The NCI Trial Complexity and Elements Scoring Model was one of the earliest models for scoring protocols. It rates 10 elements, such as the length of the study and the complexity of the informed consent process. Each element is rated Level 0 (standard), Level 1 (moderate), or Level 2 (high). The model also provides an overall score for the protocol.
The University of Michigan Research Effort Tracking Application is a detailed effort tracking clinical trial management system. It is a Web-based service that tracks effort allocated to all clinical research activity and can be used to compare projections with actual personnel expenditures as well as to project feasibility or the FTEs needed to carry out future studies. This is a very effective tool.
Two more recent tools are the Ontario Protocol Assessment Level and the Wichita Community Clinical Oncology Program Protocol Acuity Tool. The widely used Ontario Protocol Assessment Level uses a pyramid rating scale ranked from levels 1 through 8. The base of the pyramid is Phase 1 highly interventional studies. Each increment represents increasing complexity. The score can be increased based on the number of subjects and contacts per subject. The Wichita Community Clinical Oncology Program Protocol Acuity Tool ranks protocols on six workload-related determinants and scores protocols according to their estimated workload using a range of one to four. These are good tools that have a great deal of utility.
Development of the Protocol Acuity Rating Scale Tool
When the author was working in pediatric clinical research in a previous position, she found that the oncology-based tools did not facilitate comprehensive analysis of non-oncology research. Additionally, although many of the scoring models produced quantitative values, these values were rarely translated back into the estimated amount of personnel effort or cost. The author and her team wanted to evaluate different portions of studies. Since the clinical research site conducted more industry-sponsored trials than most other academic clinical research sites, she wanted to be able to analyze the heavy data requirements of these studies.
The Protocol Acuity Rating Scale (PARS) was developed based on post-initiation operational aspects of clinical trials and all participant management at the pediatric clinical research site. Table 2 highlights the criteria used to review protocols in PARS, which is expressed in a grid. Criteria include the phase and type of the study, the participant setting (inpatient or outpatient), and data reporting requirements (paper case report forms, electronic data capture, and whether it is a consortium study). Oversight and monitoring is another key criterion used in PARS. The amount of time required to prepare for oversight and monitoring by an external sponsor can sometimes be significantly more than expected. The encounter procedure and frequency as well as the duration of the study are also part of PARS.
Because the rate of enrollment and total number of subjects placed on study can sharply increase or decrease total clinical trial workload, the rate of accrual is deemed the “X” factor in any study. All previous criteria are evaluated and totaled, then multiplied by the rate of accrual score. If the rate of accrual is moderate or slow, study staff members can pace themselves, or the hours per week of effort may be lower. If, however, the rate of accrual is very fast, the study needs a higher FTE allocation.
It’s difficult for one method of protocol scoring to effectively evaluate everything required to conduct a clinical trial. Thus, PARS does not evaluate the regulatory effort required to draft, gain approval for, and process amendments. Although initial approval may be projected based upon protocol evaluation, changes throughout the study lifecycle can be unpredictable. The number of amendments can be highly variable. The author and her team reviewed more than 50 treatment protocols and found a mean of four amendments; however, the range was two to six amendments over an average active study period of three years. Additionally, the complexity of the required regulatory effort does not necessarily correlate with the trial phase or operational protocol complexity, nor does it relate to the scope and depth of the amendments. Cooperative grant or limited funding studies, for example, commonly arrive with amendments containing “tiny” changes that require 10-fold additional hours of FTE work.
PARS also does not evaluate:
- Finance and budget negotiation, including hospital pre-award procedures
- Study-specific billing review and compliance assessments
- Contract negotiation, processing, and amendments, and
- Additional committee reviews and approvals (such as a General Clinical Research Center review committee).
Other metrics could be used to evaluate these activities.
The PARS grid lists the reference categories in the first column: low (1 point), medium (2 points), and high (3 points). Across the top, the items are: phase, type of study, participant setting, data requirements, monitoring oversight, encounter procedure, lab/samples, encounter frequency, study duration, and rate of accrual. This is similar in some ways to the NCI Trial Complexity and Elements Scoring Model.
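The scoring arithmetic described above (rate each criterion, total the points, then multiply by the rate of accrual score) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the author's implementation: the identifier names are hypothetical, and the code assumes that rate of accrual is scored on the same 1-to-3 scale as the other grid columns.

```python
# Point values for the three reference categories in the PARS grid.
POINTS = {"low": 1, "medium": 2, "high": 3}

# The nine criteria that are summed before the accrual multiplier is applied.
# Identifier names are hypothetical labels for the grid columns.
CRITERIA = (
    "phase",
    "type_of_study",
    "participant_setting",
    "data_requirements",
    "monitoring_oversight",
    "encounter_procedure",
    "lab_samples",
    "encounter_frequency",
    "study_duration",
)

def pars_total(ratings, rate_of_accrual):
    """Sum the criterion points, then multiply by the accrual points.

    Assumes accrual uses the same low/medium/high scale as the criteria.
    """
    missing = [c for c in CRITERIA if c not in ratings]
    if missing:
        raise ValueError(f"Missing ratings for: {missing}")
    subtotal = sum(POINTS[ratings[c]] for c in CRITERIA)
    return subtotal * POINTS[rate_of_accrual]

# Hypothetical protocol: every criterion rated "medium".
ratings = {criterion: "medium" for criterion in CRITERIA}
print(pars_total(ratings, "low"))   # 18
print(pars_total(ratings, "high"))  # 54
```

Under these assumptions the total can range from 9 (all criteria low, slow accrual) to 81 (all criteria high, fast accrual), and the accrual rating alone can triple the score of an otherwise identical protocol.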
PARS Example #1
In Example #1, PARS is used on a non-interventional, cooperative group protocol with the objective of establishing and maintaining a disease history and outcomes database to evaluate this disease’s impact on health-related quality of life. The exploratory objective is to learn about genetic modifiers of clinical phenotypes of the disease (an optional sub-study).
There is a baseline exam assessment, and then subjects come in every six months during the two-year study. The study includes questionnaires, basic exams, measurement of vital signs, medical history, and the same disease-specific clinical evaluation that the subjects would be getting in standard care. All exam results and questionnaires must be entered into the central database. The monitoring burden is very low, with monitoring to be done about once a year.
The rate of accrual can dramatically change the protocol score for this study, as shown by this example of two different study populations. If the study population is a low-prevalence, rare disease such as Duchenne muscular dystrophy, cystic fibrosis, or Huntington’s disease, overall patient recruitment and rate of recruitment will be much lower. Data requirements are medium, based on the number of hours per week expected to be required to acquire the data, translate them, review the medical notes, and enter questionnaire and other data. Laboratory samples for the sub-study will require drawing the samples, and then prepping, packing, and shipping them. A two-year study requires some scheduling and follow up.
The rate of accrual will be low, even in a specialty clinic, and is projected to be about one or two subjects per month. The total study score for this study with a low-prevalence, rare disease study population is 14.
In a diabetes, atrial fibrillation, or obesity clinic, however, everything is different. Subjects are recruited into the study at a faster rate, resulting in more overall encounters and a higher data burden over the work week, which demands more of an FTE. The total study score with a high-prevalence study population is 42. Carrying out the exact same study requires more personnel effort in a high-prevalence study population than in a low-prevalence, rare disease study population.
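The two reported totals differ by exactly a factor of three, which is what the accrual multiplier would produce if the summed criteria points stayed the same and only the accrual rating moved from low to high. The sketch below illustrates that reading. It is an assumption, not something the article states: the subtotal of 14 and the accrual points of 1 and 3 are chosen here only because they reproduce the published totals, and other combinations of ratings could do so as well.

```python
def total_score(criteria_subtotal, accrual_points):
    """Apply the rate-of-accrual multiplier to the summed criteria points."""
    return criteria_subtotal * accrual_points

# Assumed subtotal, chosen so the products match the totals reported in the text.
ASSUMED_SUBTOTAL = 14

# Same protocol, two study populations: slow accrual versus fast accrual.
rare_disease_total = total_score(ASSUMED_SUBTOTAL, accrual_points=1)
high_prevalence_total = total_score(ASSUMED_SUBTOTAL, accrual_points=3)

print(rare_disease_total, high_prevalence_total)
```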
PARS Example #2
In Example #2, PARS is used on a Phase 3 industry-sponsored pivotal multicenter trial. Data from the study will be used in the New Drug Application. It is a three-arm randomized double-blind placebo-controlled study of the efficacy and safety of two doses of the study drug as adjunctive therapy in patients with genetically-induced catastrophic syndrome. The primary objective is to compare the clinical response to the different drug doses on the primary endpoint: reduction in the catastrophic effects of genetically-induced catastrophic syndrome. The study population is an orphan population, usually very ill at baseline and complicated medically. This increases the potential for adverse events and serious adverse events. This population has no viable treatment alternatives, and this is the first possible disease-modifying investigational product that has been identified.
The study includes pharmacokinetics to compare the efficacy of different trough levels versus placebo, as well as quality of life and developmental assessments. The three-part study consists of baseline/screening, blinded core, and an extension open label phase lasting until approval of the drug. The entire study period is expected to be about three to four years.
Many studies are like this. Sponsors, whether they are industry or academic, are trying to gather as much data as they can from studies, especially in orphan drug development programs where available patients to enroll in research studies are fewer.
This is an interventional, pivotal Phase 3 study. It is outpatient; however, it requires seeing subjects frequently, sometimes with only days between visits. Many laboratory samples are required, and the encounter frequency is high. Like many other studies, this one continues until marketing approval. The rate of accrual will be very high, since this is an orphan population with no standard alternative disease-modifying treatment. The total score for this study is 69.
Protocol Scores and FTE
When developing a protocol scoring tool or using one of the available tools, each clinical research site must compare the scores and the workload to determine the best way to translate the scores for that particular site. This can be done by modifying or changing the areas of protocol operation evaluation or by scaling point values to be appropriate to their staffing structure or job responsibilities.
Houston Methodist Research Institute translates the protocol score into FTEs, reviewing the total number of points and how that translates into carrying capacity in FTEs (Table 3). Carrying capacity, however, varies by knowledge and experience, which is not limited to only clinical research experience. It includes overall education level, prior medical training, professional maturity, social awareness, and the ability to handle stressful high-pressure work. The author has worked with junior employees who are new to research but are very socially and professionally mature and able to handle pressure. She has also worked with research nurses who have extensive disease and treatment area experience who can easily extrapolate their medical training and expertise to working with complicated research populations, allowing them to carry immense workloads.
Concurrent, overall workload is the other key factor in translating the protocol score into FTEs. This varies by the size of the overall research department and the segregation of work tasks. A one-man show will be very different than a multi-disciplinary department where work tasks are highly segregated. In a single-coordinator model, the research coordinator handles budgeting, regulatory work, interface with the U.S. Food and Drug Administration, and so forth, as well as coordinating the studies. This research coordinator cannot do as much operationally with subject movement as a research coordinator in a compartmentalized department, where she/he only coordinates the studies and the department has a regulatory coordinator, a finance manager, a clinical trial manager, and so forth. That changes the relevancy of those point values.
Generally speaking, in a compartmentalized department which has research coordinators (responsible for coordination and data management), a regulatory coordinator, a finance manager, and a clinical trial manager, the FTE “carrying capacity” is as follows:
- Junior/entry level research coordinator:
  - 50-100 points (this can vary a little due to scientific or previous medical experience)
- Mid-level research coordinator:
  - 100-150 points
- Senior level research coordinator or study section research coordinator/project manager:
  - Usually 200+ points.
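One way to apply these bands is to total the scores of the protocols a coordinator already carries and compare the result with the top of the band. The following is a minimal sketch under stated assumptions: it treats protocol scores as additive across a coordinator's portfolio, the function and variable names are hypothetical, and the example workload is invented.

```python
# Carrying-capacity bands in protocol points, taken from the list above.
# An upper bound of None means no ceiling is stated ("usually 200+").
CAPACITY_BANDS = {
    "junior": (50, 100),
    "mid": (100, 150),
    "senior": (200, None),
}

def remaining_capacity(level, assigned_scores):
    """Return points left before the coordinator reaches the top of the band.

    Returns None when the band has no stated ceiling.
    """
    _, upper = CAPACITY_BANDS[level]
    if upper is None:
        return None
    return upper - sum(assigned_scores)

def can_accept(level, assigned_scores, new_protocol_score):
    """True if adding the new protocol keeps the load within the band.

    Bands without a ceiling always return True, so a site would need to
    set its own limit for senior staff.
    """
    remaining = remaining_capacity(level, assigned_scores)
    return remaining is None or new_protocol_score <= remaining

# Hypothetical workload: a mid-level coordinator already carrying two protocols.
current = [42, 69]
print(remaining_capacity("mid", current))  # 39
print(can_accept("mid", current, 14))      # True
print(can_accept("mid", current, 42))      # False
```

Each site would adjust the band limits to its own staffing structure, as the surrounding text notes.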
The expectations for a given target capacity can be adjusted to suit the complexity of the studies seen in the department or for the particular interventions that are being tested. For example, if the clinical research site is the applicant and needs to interface directly with the U.S. Food and Drug Administration, the burdens of reporting and monitoring will be higher. Other types of studies that increase the protocol score include invasive, high-risk surgery protocols and translational, first-in-man studies where a high incidence of toxicity is expected.
Utilization of a protocol scoring model or tool can help clinical research sites objectively evaluate the requirements of carrying out a protocol. A protocol scoring model or tool also removes bias, which the author has found to be extremely effective in determining protocol acceptance and in leveling workload.
It is critical to evaluate protocol feasibility before implementation of the clinical trial; a protocol scoring model or tool facilitates this. Pre-implementation assessment aids in preventing the acceptance of protocols for which the procedures or subject recruitment are unattainable for the study site. Moreover, it assists in gauging the capacity or “bandwidth” available in the workgroup to take on the additional workload. It prevents the problem of opening many clinical trials in which no subjects are enrolled and patients miss out on participating in research because the feasibility was never worked out.
The protocol score can be extrapolated into an estimate of how many FTE employees are required to support the trial and the overall group workload. This can be used as an objective measure to justify personnel projections and costs per year, per project, per FTE, and anticipated growth needs. This is not a one-time assessment. If amendments cause protocol modifications or extensions, or if staff changes or reassignments occur, protocol scoring should be done again. Protocol scoring does not take long to do and can easily be repeated.
Table 1: Review of Available Protocol Scoring Tools
- NCI Trial Complexity and Elements Scoring Model:
  - Released in 2009
  - Cooperative effort: Trial Complexity Working Group
  - Scores protocols on 10 elements:
    - Level 0 (standard)
    - Level 1 (moderate)
    - Level 2 (high)
- University of Michigan Research Effort Tracking Application:
  - Released in 2011
  - Web-based service that tracks effort allocated to all clinical research activity
  - Projections can be compared to actual personnel expenditures
  - Provides information on effort for various tasks
- Ontario Protocol Assessment Level:
  - Released in 2011
  - Collaborative effort from experienced clinical trial managers from cancer centers across Ontario
  - Pyramid rating scale ranked from levels 1 through 8
  - Each increment represents increasing complexity
- Wichita Community Clinical Oncology Program Protocol Acuity Tool:
  - Released in 2013
  - Protocols ranked on 6 workload-related determinants
  - Trials are assigned a score of 1-4 according to their estimated workload
Table 2: Activities Evaluated in the Protocol Acuity Rating Scale
- Phase and type of study
- Participant setting
- Data reporting requirements
- Monitoring oversight
- Complexity of encounter procedures
- Encounter frequency
- Laboratory or sample collection and processing information
- Total anticipated study duration
- Rate of accrual
Table 3: How Protocol Scores Relate to FTEs at Houston Methodist Research Institute
- Carrying capacity varies by knowledge and experience:
  - Not limited to clinical research experience
  - Institution, section, environment experience
  - Disease/treatment area experience
  - Patient/interpersonal area experience
- Current workload:
  - Size of overall research department and segregation of work tasks
  - The one-man show
  - Multi-disciplinary department
  - Compartmentalization of work