CRF Completion Guidelines

Authors: Kelly Hills (Horizon Pharma, Inc.), Tara Bartlett (Roche Pharmaceuticals, Toronto, CA), Isabelle Leconte (Johnson & Johnson, Allschwil, CH), Dr. Meredith Nahm Zozus (University of Texas Health Sciences Center, San Antonio, TX, US)


Case Report Forms (CRFs) are a common data collection mechanism in clinical studies and are sometimes the original recording of study data. CRF completion is one of the earliest opportunities to assure accurate and complete data and to decrease downstream work associated with identification and resolution of data discrepancies. This chapter covers development, maintenance, and implementation of instructions for CRF completion, also called CRF Completion Guidelines (CCGs). Recommendations in this chapter are based on the International Council for Harmonisation (ICH) E6 addendum,1 the MHRA GXP Data Integrity Guidance and Definitions, review of the literature, and writing group consensus.

Keywords: CRF, Completion, Guidelines

How to Cite:

Hills, K. & Bartlett, T. & Leconte, I. & Zozus, M. N., (2021) “CRF Completion Guidelines”, Journal of the Society for Clinical Data Management 1(1). doi: https://doi.org/10.47912/jscdm.117



1) Learning Objectives

After reading this chapter, the reader should understand

  • The purpose of and regulatory basis for CRF completion guidelines

  • The contents and organization of CRF completion guidelines

  • Creation and maintenance of CRF completion guidelines

  • Training clinical investigational sites and CRAs on CCGs

2) Introduction

Data collection forms, commonly called Case Report Forms (CRFs) in clinical studies, have been used since the earliest studies. The main goal of paper and electronic CRFs alike is the consistent and accurate collection and recording of data. CRF Completion Guidelines (CCGs) support this by detailing the activities involved in CRF completion, correction, signing, and data handling.2 Problems in data collection may result in inaccurate, unusable, or lost data. Further, in some cases, after the time of data collection has passed, so may the opportunity to retrieve lost data or to correct inaccurate data.3,4 As such, CRFs and associated instructions are a critical tool in preserving and maintaining the quality and integrity of data.2 Activities to assure data quality should be implemented as early in the data collection process as possible.4 Form completion instruction and controls are one of these early opportunities for assuring human subject protection and data quality.

Lack of adequate instruction on data collection forms has been cited as a common problem in clinical studies.5,6,7 CRF Completion Guidelines (CCGs) provide field-specific instructions in support of correcting this problem and are ubiquitously recommended in the literature.2,3,5,6,7,8,9,10 Where a site manual of operations (also called manual of procedures) does not exist for a study, the CCGs are the most detailed specification of the procedures by which data from observations and measurements are to be obtained and recorded. The purpose of well-written and comprehensive CCGs is to increase data accuracy and consistency, provide traceability for decisions made during data collection, and decrease downstream work including data queries, monitoring questions, and audit findings. CCGs accomplish this by elaborating where needed on observation and measurement procedures defined in the study protocol as well as specifying and constraining decisions made during data collection and recording on study forms. Although “Creation of a data collection form is often mistakenly viewed as a clerical rather than a scientific task,”3 data observation, measurement, and collection are among the most scientifically important activities in a study. Anything short of scientifically rigorous treatment of these activities is ill advised.

Form completion instructions may include diagnostic criteria, definitions of terms used on the form, specifications of time points for observations, measurement methods and equipment, units, precision, and significant figures for continuous data elements, as well as guidelines for handling variability, uncertainty, inconsistency, and error found in source documents or encountered in measurement. When study conduct necessitates decisions such as coding, calculations, or classification of data by sites during data collection, these are specified in CCGs. As such, the CCGs establish traceability for data origination and collection activities.

3) Scope

This chapter describes creation and maintenance of CCGs, their format, content, and implementation toward the precise, accurate, and consistent capture of clinical study data. CRF completion guidelines may cover observation and measurement procedures, important relationships between data elements, instructions as to where data values are likely to be found in the medical record, and which data values to choose as well as how to record the data on collection forms.

4) Minimum Standards

CCGs specify operations performed on data during observation, measurement, abstraction from source documents, and form completion. Regulation and guidance also address these processes. The ICH E6(R2) Good Clinical Practice: Integrated Addendum contains several passages particularly relevant to CCGs.1

Section 2.10 states, “All clinical trial information should be recorded, handled, and stored in a way that allows its accurate reporting, interpretation, and verification.”

Section 4.9 covers site responsibilities with respect to records and reports.

Section 4.9.0 states that the investigator should maintain, “adequate and accurate source documents and trial records” and goes on to specify, “Source data should be attributable, legible, contemporaneous, original, accurate, and complete” and that “changes to source data should be traceable, should not obscure the original entry, and should be explained if necessary (e.g., via an audit trail).”

Section 4.9.1 places responsibility on the investigator for ensuring the “accuracy, completeness, legibility, and timeliness of the data reported to the sponsor in the CRFs and in all required reports.”

Section 4.9.2 states that data, “reported on the CRF, that are derived from source documents, should be consistent with the source documents or the discrepancies should be explained.”

Section 4.9.3 further emphasizes good documentation practices, stating, “any change or correction to a CRF should be dated, initialed, and explained (if necessary) and should not obscure the original entry (i.e., an audit trail should be maintained).”

Section 4.9.3 goes on to state, “Sponsors should provide guidance to investigators and/or the investigators’ designated representatives on making such corrections.” and that “Sponsors should have written procedures to assure that changes or corrections in CRFs made by sponsor’s designated representatives are documented, are necessary, and are endorsed by the investigator. The investigator should retain records of the changes and corrections.”
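The correction requirements quoted above (dated, attributed, explained, original entry not obscured) can be illustrated with a minimal sketch of an append-only change history. The class and field names here are hypothetical and are not drawn from ICH E6(R2) or from any particular EDC system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Correction:
    """One change to a CRF field; earlier values are never overwritten."""
    old_value: str
    new_value: str
    changed_by: str       # initials or user ID of the person making the change
    changed_at: datetime  # date and time of the change
    reason: str           # explanation, where one is needed


@dataclass
class CrfField:
    name: str
    value: str
    history: list = field(default_factory=list)

    def correct(self, new_value, changed_by, reason, changed_at=None):
        # Record the prior value before replacing it, so the original
        # entry remains visible in the history.
        when = changed_at or datetime.now(timezone.utc)
        self.history.append(
            Correction(self.value, new_value, changed_by, when, reason)
        )
        self.value = new_value


# Illustrative data only.
weight = CrfField("weight_kg", "87.5")
weight.correct("78.5", changed_by="AB", reason="Transcription error")
print(weight.value)                 # 78.5
print(weight.history[0].old_value)  # 87.5
```

On paper, the equivalent convention is a single line through the original entry with initials, date, and explanation written alongside.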

Section 5.0 states, “The sponsor should implement a system to manage quality throughout all stages of the trial process.” and goes on to specify that

  1. “Sponsors should focus on trial activities essential to ensuring human subject protection and the reliability of trial results” and that

  2. “The methods used to assure and control the quality of the trial should be proportionate to the risks inherent in the trial and the importance of the information collected.” Identification of “processes and data that are critical to ensure human subject protection and the reliability of trial results” is specifically stated, as is risk management focused on the processes and data deemed critical.

Section 5.0 further states, “Protocols, case report forms, and other operational documents should be clear, concise, and consistent.”

Section 5.1.1 states that “The sponsor is responsible for implementing and maintaining quality assurance and quality control systems with written SOPs to ensure that trials are conducted and data are generated, documented (recorded), and reported in compliance with the protocol, GCP, and the applicable regulatory requirement(s).” In particular, Section 5.1.3 states that “Quality control should be applied to each stage of data handling to ensure that all data are reliable and have been processed correctly.”

Section 5.5.1 states, “The sponsor should utilize appropriately qualified individuals to supervise the overall conduct of the trial, to handle the data, to verify the data, to conduct the statistical analyses, and to prepare the trial reports.” Section 5.5.3 (e) further states that the sponsor should “Maintain a list of the individuals who are authorized to make data changes.”

Section 5.5.4 under Trial Management, Data Handling and Recordkeeping, states, “If data are transformed during processing, it should always be possible to compare the original data and observations with the processed data.”

The Medicines and Healthcare products Regulatory Agency (MHRA) GXP Data Integrity Guidance and Definitions addresses principles of data integrity, establishing data criticality and inherent risk, designing systems and processes to assure data integrity, and also covers the following topics particularly relevant to CCGs:

Similar to ICH E6(R2), MHRA Section 2.6 states that “Users of this guidance need to understand their data processes (as a lifecycle) to identify data with the greatest GXP impact. From that, the identification of the most effective and efficient risk-based control and review of the data can be determined and implemented.”11

Section 6.2, Raw Data states, “Raw data must permit full reconstruction of the activities.”

Section 6.4 states, “Data integrity is the degree to which data are complete, consistent, accurate, trustworthy, and reliable and that these characteristics of the data are maintained throughout the data life cycle. The data should be collected and maintained in a secure manner, so that they are attributable, legible, contemporaneously recorded, original (or a true copy) and accurate”

Section 6.7 Recording and Collection of Data states, “Organisations should have an appropriate level of process understanding and technical knowledge of systems used for data collection and recording, including their capabilities, limitations and vulnerabilities.” and that “The selected method [of data collection and recording] should ensure that data of appropriate accuracy, completeness, content and meaning are collected and retained for their intended use.” Section 6.7 further states, “When used, blank forms … should be controlled. … [to] allow detection of unofficial notebooks and any gaps in notebook pages.”

Section 6.9 Data Processing states, “There should be adequate traceability of any user-defined parameters used within data processing activities to the raw data, including attribution to who performed the activity.” and that “Audit trails and retained records should allow reconstruction of all data processing activities…”

The FDA guidance, Use of Electronic Health Record Data in Clinical Investigations, emphasizes that data sources should be documented and that source data and documents be retained in compliance with 21 CFR 312.62(c) and 812.140(d).12

Section V.I states that “Clinical investigators must retain all paper and electronic source documents (e.g., originals or certified copies) and records as required to be maintained in compliance with 21 CFR 312.62(c) and 812.140(d).”

Similarly, the FDA’s guidance on electronic source data used in clinical investigations recommends that all data sources at each site be identified.13

Section III.A states, “A list of all authorized data originators (i.e., persons, systems, devices, and instruments) should be developed and maintained by the sponsor and made available at each clinical site. In the case of electronic, patient-reported outcome measures, the subject (e.g., unique subject identifier) should be listed as the originator.”

As such, we state minimum standards for the creation, maintenance, and implementation of CCGs in Table 1.

Table 1

Minimum Standards.

1 CCGs specify procedures for observation, measurement, abstraction from source documents, and form completion. As such, they support the evaluation of study conduct and the quality of the data produced. CCGs should exist for every study.
2 CCGs should specify procedures for assuring that data are Attributable, Legible, Contemporaneous, Original, Accurate, Complete, Consistent, Enduring, and Available (ALCOA+) and Traceable.
3 CCGs should exist within a quality management system focused on “ensuring human subject protection and the reliability of trial results”1 and, in particular, decisions affecting which data are used and their transformation during data origination, collection, and recording.
4 CCGs should be considered essential documents and managed as such. A standard operating procedure(s) covering the process by which CCGs or equivalent documentation are created, versioned, reviewed, approved, updated, and distributed should exist.
5 CCGs are developed for the use of study personnel, usually site coordinators and monitors.
6 CCGs should be concise, current, easy to understand, and available to those performing relevant study operations.
7 Training on CCGs should be provided and documented for individuals with responsibility in observation, measurement, abstraction, and form completion processes. Such training should occur prior to study enrollment and should be revisited upon significant updates to CCGs.
8 The quality management system in which the CCGs exist should provide for ongoing oversight and control of observation, measurement, abstraction, and form completion processes.

5) Best Practices

Best practices were identified by both the literature review and the writing group and are presented in Table 2. Best practices do not have a strong requirement based in regulation or a recommended approach based in guidance, but do have supporting evidence either from the literature or from consensus of the writing group. As such, best practices, like all assertions in GCDMP chapters, have a literature citation where available and are always tagged with a roman numeral indicating the strength of evidence supporting the recommendation. GCDMP Levels of Evidence are outlined in Table 3.

Table 2

Best Practices.

1 Creation and Maintenance of CCGs
Develop guidelines in collaboration with the same roles that designed the CRF. These include protocol authors, form designers, investigators, practicing physicians, statisticians, site and medical monitors, site-based study coordinators, those familiar with the study database system, data entry, and data processing.2,3 [V], [VII]
Develop standard CCGs that accompany standard CRF modules that can be used across studies if external standards do not exist.3 [V]
Where external standards exist for data element definition and collection instructions, use them if appropriate for the study.5 [V]
Allow sufficient time for development and testing of forms and instructions.3 [V]
CRF and CCGs cannot be finalized prior to finalization of protocol.3 [V]
Design CRFs and associated CCGs simultaneously with protocol development.3 [V]
Hold dedicated meetings for timelier review and finalization of the CCGs.10 [V]
2 Format of the CCGs
Ensure that the format and content of the CRF/eCRF and the CCGs that provide instructions to the form completer are “self-contained,” i.e., with all needed instruction or context available on the CRF/eCRF.3 [V]
Ensure that standard CRF modules are accompanied by associated CCGs and QA guidelines.10 [V]
3 Content of the CCGs
Include detailed instructions on proper CRF completion where needed; i.e., where proper completion is not obvious based on form context.2 [VII]
Do not ask leading questions or otherwise suggest answers to users completing the forms.3 [V]
Ensure forms are clear, provide necessary instructions, and are easy for the investigator to complete.3 [V]
Place instructions and graphics that guide form flow on the form so that it is clear where to stop procedures or form completion or where to skip to.2, 3 [V], [VII]
Clearly state on the form the circumstances under which an item should be skipped.3 [V]
Provide instructions for recording missing data. For example, include instructions to leave an item blank or to provide more information such as “asked but not answered” or “not done.” [VI]
Provide necessary definitions and instructions on the form, next to the item to which they apply.3 [V]
Accommodate linguistic and cultural differences within the CCGs.10 [V]
Include on the form all of the information needed to understand an item on a form. In addition to the prompt or question, it may be necessary to include a basis of comparison; e.g., over the last 24 hours, since the last visit, the assessment, time points and units of measure, precision and number of significant figures, measurement method.3 [V]
Provide explicit guidance as to order of day and month and to clarify noon versus midnight on a twelve-hour clock.3 [V]
Define important diagnoses with clear criteria.3 [V] Often there are multiple criteria sets in use for a given diagnosis. Specificity avoids confusion.
If calculations are required to inform immediate site action and these cannot be automated, instructions (e.g., a worksheet) on how to complete and check the calculations should be provided.3 [V]
Clearly state within the form’s instruction the role of the individual(s) completing the form, e.g., physician, research staff, patient, or proxy. [VI]
4 Implementation of the CCGs
Use innovative technology when possible to improve the usability, accessibility, and availability of CCGs. For example, CCGs may be included in electronic help and be available on the screen. [VI]
Provide training on CRF completion.10 [V] Such training may be conducted in person at an investigators’ meeting (or similar forum), on site initiation visits, or remotely.
Use appropriate techniques such as analysis of data trends or review of monitoring reports to identify undesirable events and trends in data collection and recording to prompt improvement of CCGs. [VI]
Re-educate site personnel as needed and revise CRF completion guidelines as necessary, particularly for long-term studies or if a protocol amendment affects the completion of the CRF. [VI]
Provide data management, biostatistics, medical writing, and other clinical research team members with finalized CRF completion guidelines so these groups are aware of how data are collected and recorded. [VI]
Establish metrics through which site performance in CRF completion will be assessed at a frequency commensurate with the study length. [VI]
CCGs should be tested by study staff.3, 7 [V], [III] Testing minimizes changes.
The wording of questions or prompts can influence the answer. All questions and prompts should be reviewed for their potential to bias data collection or recording.3 [V]

Table 3

GCDMP Evidence grading criteria.

Evidence Level Criteria
I Large controlled experiments, meta-analysis or pooled analysis of controlled experiments, regulation or regulatory guidance
II Small controlled experiments with unclear results
III Reviews or synthesis of the empirical literature
IV Observational studies with a comparison group
V Observational studies including demonstration projects and case studies with no control
VI Consensus of the writing group including GCDMP Executive Committee and public comment process
VII Opinion papers

Processes for Creation and Maintenance of CCGs

Because the CCGs document the process by which data are collected or recorded, they should be considered essential documents.1 [I] As such, the procedures for creation, approval, and change control should be documented in organizational procedures.1,2 [I], [VII]

Increasingly, standards exist for data elements used in clinical studies and instructions for observing, measuring, or otherwise obtaining the corresponding data. Examples of these include the Brighton Collaboration guidelines for collection, analysis, and presentation of vaccine safety data5 [V] and The Joint Commission Core Measures abstraction guidelines.14 [V] Where such standards exist and are appropriate, using them increases the ability to pool data and compare results with other studies. Specialty, discipline, or organizational standards capture this knowledge through ongoing improvement of forms and associated instructions.3 [V] Such specialty or discipline level standards are not yet available in many areas. Where such standards do not exist, investigators develop data collection forms “from scratch,” often without the benefit of experiential knowledge gained from earlier studies.3,7 Use of organizational standard forms and associated instructions, while providing the aforementioned advantage only for studies within the organization, still provides advantages in terms of consistency and efficiency in study start-up and data collection. Lack of standardization of data definition and collection has been associated with the inability to compare trial results across different studies and settings,5,7,15 [V], [III], [III] as well as with difficulties drawing conclusions from groups of studies.5 [V] Thus, use of standard forms and associated completion instructions is recommended with priority given to specialty or discipline level standards.2,3,5 [V], [V], [VII]

CCGs accomplish their goal of increasing consistency in data collection and recording by serving as a job aid to those collecting and recording data. As such, they should be written in plain and precise language and simple sentences.1,3 [I], [V] Unnecessary words and double negatives should be avoided.3 [V]

Timing of CCG creation

While some recommend starting CRF design after a finalized protocol, ostensibly to reduce rework in form design as the protocol evolves toward finalization, others recommend simultaneous work on the protocol and CRF.2,3 Because the process of designing a CRF and completion guidelines may identify areas where additional clarity is needed in the protocol or where data required for the protocol are not available or feasible to collect, we recommend the latter.3 [V] Further, those who develop forms and completion instructions should be intimately involved in protocol development or work closely with those who are.3 [V] This involvement assures that those designing forms and completion instructions understand the study objectives and rationale behind the collection of each data point.3 [V]

Authorship of the CCGs

The author initiates the creation of the CRF completion guidelines document during or following CRF design. The person drafting the CCGs must be familiar with the protocol and corresponding CRF.3 [V] In addition, and to the extent that the medical record is the intended source, the CCG author must understand the data collected including how relevant data are documented in routine care and where they are commonly found in medical records. [VI] The CCG author should also understand the quality requirements for the data and how the data will later be used for the analyses. [I]

A data manager or anyone with the appropriate knowledge of the protocol and relevant data can serve as the author of CCGs. The CCGs are developed in close collaboration with the following members of the study team.2,3 [V], [VII]

  • a protocol author, clinical scientist, or a clinical study physician familiar with the study objectives and therapeutic area

  • a biostatistician with knowledge of the statistical analysis plan for the study

  • a drug safety physician or the study medical monitor

  • team members responsible for site initiation and study monitoring or others having regular contact with site staff

  • those familiar with the study database system, data entry, and data processing

Review, approval, and revision of CCGs

The study team members outlined in the previous section should review the draft CCGs. [VI] The review should focus on ensuring that the CCGs are complete, correspond to the protocol, and provide adequate specification to the investigators, site staff, and monitors who will be using the guidelines. [VI]

The CCGs impact data collection and should be managed as a controlled document.1 [I] As such, they are usually referenced by a study Data Management Plan. Please see the Data Management Plan chapter for more information, including recommendations, minimum standards, and best practices. Study, document, and version identification should be visible on each page of printed CCGs and otherwise associated with CCGs in electronic formats.1 [I] As a controlled document, changes to approved CCGs should lead to a new version of the CCGs and should be reviewed and approved.1 [I]

The CCGs should be revised when any of the following occur: [VI]

  • a protocol amendment is issued that has an impact on the CCGs

  • changes to the database affect the eCRF completion guidelines

  • a trend in queries is identified that shows that the CCGs are not adequately guiding the site staff on CRF completion

  • an error in the CCGs has been identified that has an impact on CRF completion

The changes made to the CCGs should be highlighted or summarized, e.g., in a revision history section in the new version, in order to help study personnel to identify the changes. [VI]
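One way a trend in queries might be surfaced, as a trigger for the revision criteria listed above, is to compare the number of queries raised against each field with the number of entries made for that field. The sketch below is illustrative only: the field names, data layout, and the 10% threshold are assumptions, not values recommended by this chapter.

```python
from collections import Counter


def fields_to_review(queries, entries_per_field, threshold=0.10):
    """Return {field: query_rate} for fields whose rate exceeds the threshold.

    The default threshold is an arbitrary illustration, not a recommendation.
    """
    query_counts = Counter(q["field"] for q in queries)
    flagged = {}
    for field_name, n_entries in entries_per_field.items():
        if n_entries == 0:
            continue  # no entries yet, so no rate can be computed
        rate = query_counts[field_name] / n_entries
        if rate > threshold:
            flagged[field_name] = rate
    return flagged


# Illustrative data only.
queries = [{"field": "ae_onset_date"}] * 3 + [{"field": "weight_kg"}]
entries = {"ae_onset_date": 20, "weight_kg": 50}
print(fields_to_review(queries, entries))
```

In this example only `ae_onset_date` is flagged (3 queries on 20 entries), which would prompt a look at whether the instructions for that field need to be revised.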

Distribution of CCGs

Enough time must be allocated to create, review, and approve the CCGs. [V] Approved CCGs should be made available to the site staff before they enroll any subjects in the study.1 [I] For example, the site initiation visit can be used to familiarize the site staff and monitors with the CCGs.

Where CCGs include medical record abstraction guidelines, they should be reviewed and tested by several sites prior to use.3 [V] It is often not possible to reflect intricacies of every site’s medical record; things like chart order, where things are documented in the chart, and clinical documentation conventions and practices differ by facility. Therefore, what may be specific and accurate direction for one site may not match the record of another.

If CCGs are not electronically available through the EDC system and a separate document is being used for the CCGs, the distributed copy should be made available to personnel involved in data collection and recording.1 [I] A copy should also be filed in the Investigator site file.1 [I] The CCGs should also be distributed to central study team members and be filed in the sponsor’s Trial Master File.1 [I] In cases where the CCGs are available through help text in an EDC system, site training should include how to access the guidance. [VI] Please see the EDC chapters for more information, including recommendations, minimum standards, and best practices. Where CCGs are available on the screen (online), a hard copy or printed version should also be available. [VI]

Training sites on form completion

Sites should be trained on form completion prior to enrolling subjects in a study.1 [I] Training should occur on approved versions of the CCGs. [VI] Please see the Presentation at Investigator Meeting chapter for more information, including recommendations, minimum standards, and best practices. Where CCGs include medical record abstraction guidelines, such training should include practice abstracting, independent review of the practice abstraction, and feedback to the trainee to assure the necessary inter-rater reliability prior to enrollment.6 [III]
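The chapter does not prescribe a particular inter-rater reliability statistic. As one common choice for categorical data, Cohen's kappa compares observed agreement between a trainee and an independent reviewer with the agreement expected by chance. The sketch below assumes both raters abstracted the same items in the same order; the example values are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical values on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters recorded the same value.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Illustrative practice abstraction of one yes/no item across ten charts.
trainee = ["Y", "Y", "N", "N", "Y", "N", "Y", "N", "Y", "Y"]
reviewer = ["Y", "Y", "N", "Y", "Y", "N", "Y", "N", "N", "Y"]
print(round(cohens_kappa(trainee, reviewer), 2))  # 0.58
```

What value counts as "necessary" reliability is a study-level decision and is not implied by this example.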

Format of CCGs

CCGs can have multiple formats depending on the needs of the study. The author of the CCGs determines the best medium to use. For studies utilizing paper CRFs, the CCGs are often provided as guidance within the CRF booklet.2 These may be provided to study personnel as a printed hard copy or offered electronically. For EDC studies, the electronic version may be made available within the EDC platform. Alternatively, form-specific instructions may be included as help text directly within each eCRF to aid with the more complicated form/field entry and to help minimize the number of queries. The format chosen must allow for clear and concise instructions that align with the study protocol and other study documents such as the Clinical Monitoring Plan, Data Management Plan, and External Vendor Manuals.2 [VII]

As a best practice, it is useful for organizations to create a CCG template that can be used across studies.3,16 [V], [V] This allows for consistency in the format and look of the guidelines and makes the creation of CCGs more efficient. These templates may consist of CRF modules, associated CCGs, and applicable quality assurance guidelines.10 [V]

CCG format for paper forms

There are several options for placement of form instructions including: on adjoining facing pages, on the top of the page, throughout the page, and on the front page for the visit. Placing instructions on the back of the page to which they refer is not recommended because they cannot be viewed while completing the form.3 [V] Spilker and others recommend placing instructions on adjoining facing pages, i.e., on the back of the preceding page, for long or more complicated instructions, and throughout the page for simple instructions.2,3 [V], [VII]

CCG format for electronic forms

Electronic forms as described in the EDC Chapters provide additional options for making instructional information available during form completion. Such options include mouse-over or click-to-open help on a per question basis. Further, as described in the EDC chapters, electronic forms provide the ability to enforce data element structure such as “Select only one” or code lists for discrete data elements and significant figures and precision for continuous measures. Workflow such as skip patterns, stops, and availability of conditional and additional forms, may also be enforced. Such workflow automation is a form of external representation in that instructions are embedded in the functionality of the system and do not depend on a form completer reading or attending to them.17 Thus, such external representations decrease cognitive load17 and increase data accuracy18 and, as such, are recommended wherever possible. [VI] Please see the EDC chapters for more information, including recommendations, minimum standards, and best practices.
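The kinds of enforcement described above (code lists, decimal precision, and skip patterns) can be sketched as a small declarative form definition plus a validator. The form fields, code lists, and rule names below are hypothetical and do not correspond to any specific EDC product.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical form definition: field names and rules are illustrative.
FORM_SPEC = {
    "smoker": {"type": "code", "codes": {"Y", "N"}},  # "Select only one"
    "cigarettes_per_day": {
        "type": "integer",
        "required_if": ("smoker", "Y"),  # skip pattern: only asked of smokers
    },
    "temperature_c": {"type": "decimal", "decimal_places": 1},
}


def validate(record):
    """Return a list of (field, problem) pairs for one form record."""
    problems = []
    for name, spec in FORM_SPEC.items():
        value = record.get(name, "")
        condition = spec.get("required_if")
        if condition is not None:
            controlling_field, trigger = condition
            if record.get(controlling_field) != trigger:
                # The item does not apply, so it must be left blank.
                if value != "":
                    problems.append((name, "should be skipped"))
                continue
        if value == "":
            problems.append((name, "missing"))
        elif spec["type"] == "code" and value not in spec["codes"]:
            problems.append((name, "not in code list"))
        elif spec["type"] == "integer" and not value.isdigit():
            problems.append((name, "not a whole number"))
        elif spec["type"] == "decimal":
            try:
                exponent = Decimal(value).as_tuple().exponent
            except InvalidOperation:
                exponent = None
            if not isinstance(exponent, int):
                problems.append((name, "not a number"))
            elif -exponent != spec["decimal_places"]:
                problems.append((name, "wrong number of decimal places"))
    return problems


print(validate({"smoker": "N", "temperature_c": "37.2"}))  # []
print(validate({"smoker": "Y", "temperature_c": "37"}))
```

The second call reports `cigarettes_per_day` as missing and `temperature_c` as having the wrong number of decimal places. Because the rules live in the form definition, the form completer does not need to read an instruction for the constraint to be applied.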

Regardless of the format, each question for which instructions exist should indicate where instructions are to be found.3 [V]

Outline of CCGs

The CCGs should be based on the protocol and case report forms. [V] CCGs should provide unambiguous instructions on CRF completion for “all practical scenarios” that one might encounter, such as multiple data values, repeated assessments, data collected outside the study schedule, data corrections, and data resulting from unanticipated events.2 [VII] CCGs often contain the following details (listed below) to ensure that proper resources and instructions are provided to study personnel.

Identification of the data source

Sections 2.10 and 4.9 of E6(R2) indicate that the CCGs or other study documentation should identify the expected source for all study data.1 [I] There may be legitimate differences in data sources across sites; for example, where a parameter is collected as part of routine care at some sites but not at others, the sites documenting the parameter during routine care may use the medical record as the source whereas sites that do not document the parameter as part of routine care may use the CRF or a site worksheet as the source. Procedures should account for site-specific documentation of data sources where facility-to-facility variability is expected. [VI]

General conventions for form completion

The CCGs should include general guidelines on the expected turnaround time for CRF completion, including general timeline expectations and expectations for contemporaneous data recording consistent with E6(R2) section 4.9.0,1 [I] as well as detailed descriptions of the expected data formats. This would include items such as the expected date format (e.g., DD-MMM-YYYY) and how to document partial dates, if acceptable. Structured responses, such as formatted dates and a blank for each character of a continuous measure with the decimal places, are important form completion instructions.2 [VII] As external representations of the expected format of the response, they guide the form completer.
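A date convention such as DD-MMM-YYYY is easier to apply consistently when it is stated precisely enough to be checked. The sketch below is one possible check under stated assumptions: the use of "UN" for an unknown day and "UNK" for an unknown month is a hypothetical partial-date convention, not one defined in this chapter, and each study would document its own.

```python
import re
from datetime import date

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# "UN" (day) and "UNK" (month) are hypothetical placeholders for unknown parts.
DATE_PATTERN = re.compile(r"^(\d{2}|UN)-([A-Z]{3})-(\d{4})$")


def check_crf_date(text, allow_partial=False):
    """Return (is_valid, message) for a DD-MMM-YYYY entry."""
    match = DATE_PATTERN.match(text.strip().upper())
    if not match:
        return False, "Expected format DD-MMM-YYYY"
    day, month, year = match.groups()
    if day == "UN" or month == "UNK":
        if not allow_partial:
            return False, "Partial dates are not accepted for this field"
        if month == "UNK" and day != "UN":
            return False, "Day cannot be known when month is unknown"
        if month != "UNK" and month not in MONTHS:
            return False, f"Unknown month abbreviation: {month}"
        return True, "Partial date accepted"
    if month not in MONTHS:
        return False, f"Unknown month abbreviation: {month}"
    try:
        # Rejects impossible dates such as 31-FEB-2021.
        date(int(year), MONTHS.index(month) + 1, int(day))
    except ValueError:
        return False, "Not a real calendar date"
    return True, "OK"


print(check_crf_date("05-MAR-2021"))                      # (True, 'OK')
print(check_crf_date("31-FEB-2021"))                      # (False, 'Not a real calendar date')
print(check_crf_date("UN-MAR-2020", allow_partial=True))  # (True, 'Partial date accepted')
```

Spelling the month as letters, as in this format, avoids the day/month ordering ambiguity of all-numeric dates such as 03/05/2021.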

Rounding rules, abbreviations, and how to document visits or assessments that were not performed should be clearly detailed. The CCGs should provide field definitions where a field needs more guidance to reduce ambiguity.2 [VII] Screenshots of the CRF can be added where needed to clarify instructions.
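Rounding is one place where an unstated rule can produce silent inconsistency. As an illustration only, the sketch below assumes the CCGs specify "round half up" (the rule most people apply by hand). Python's built-in `round` instead rounds ties to the nearest even digit, so the two can disagree. The helper name `round_half_up` is an assumption made for this example.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, places):
    """Round to `places` decimals, with ties rounded away from zero."""
    quantum = Decimal(1).scaleb(-places)  # places=1 gives Decimal('0.1')
    # Passing through str() avoids binary floating-point artifacts.
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


# The built-in rounds ties to even, so round(2.5) is 2,
# while the half-up rule gives 3.
print(round(2.5), round_half_up("2.5", 0))
```

Stating the rule explicitly in the CCGs, with one worked example per precision, lets sites, monitors, and programmed edit checks arrive at the same value.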

Instructions should also be used to call out linked data; for example, where an adverse event indicates a drug was given, prompting the form completer to enter the drug on the concomitant medication page.2 [VII] Electronic CRFs can go further and enforce such instruction by requiring presence of the linked data. Please see the EDC chapters for more information, including recommendations, minimum standards, and best practices.
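The linked-data instruction above is the kind of rule an electronic CRF can check automatically. The following is a minimal sketch of such a cross-form check. The record layout and the field names `ae_id` and `treated_with_drug` are hypothetical and do not correspond to any particular EDC system.

```python
def missing_linked_medications(adverse_events, concomitant_meds):
    """Return IDs of adverse events treated with a drug but lacking a linked
    concomitant medication record."""
    linked_ae_ids = {med["ae_id"] for med in concomitant_meds if med.get("ae_id")}
    return [
        ae["ae_id"]
        for ae in adverse_events
        if ae.get("treated_with_drug") and ae["ae_id"] not in linked_ae_ids
    ]


# Illustrative data only.
adverse_events = [
    {"ae_id": "AE-01", "term": "Headache", "treated_with_drug": True},
    {"ae_id": "AE-02", "term": "Nausea", "treated_with_drug": True},
    {"ae_id": "AE-03", "term": "Fatigue", "treated_with_drug": False},
]
concomitant_meds = [{"med": "Paracetamol", "ae_id": "AE-01"}]

print(missing_linked_medications(adverse_events, concomitant_meds))
```

In this example only AE-02 is flagged, because it reports drug treatment with no matching concomitant medication entry.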

For paper studies, it is important to outline how to complete the forms so that entries are legible and made in indelible ink. The CCGs should clearly detail how to document any required updates so that the original text remains visible and the initials of the person making the update and the date are added, per ICH E6(R2), section 4.9.3 and Bellary.1,2 [I] [VII] How the paper CRFs are to be delivered to Data Management may also be outlined here.

CCGs written for site investigators and research staff differ from those needed by patients. Where forms designed for one type of form completer will be utilized with a different type of form completer, the language and type of instructions provided in the CCGs should be re-evaluated.7 [III]

Accommodating linguistic and cultural differences within the CCGs

For international studies or studies where participants from different cultures or who speak different languages are expected, the CCGs may need to provide support to sites in accounting for those differences.10 [V] For example, where lay health workers are involved in the study, CCGs may need to be translated into the local language. Language differences aside, the CCGs may need to account for how local differences in data are to be handled, for example by converting units or obtaining source documents from different places.

Description of form structure and workflow

For EDC studies, a section of the CCGs should be devoted to clarifying the field/eCRF dynamics that have been included in the study design, for example, specifying which eCRFs are present once a subject is created in EDC and what entry is required in order for additional forms or visit folders to populate. This will help ensure that site personnel understand how to complete all of the expected entry. Outlining which eCRFs are required based on subject status should also be included in this section. For example, the complete casebook may be expected for a subject who completed the study per protocol; however, only a selected subset of screening eCRFs may be required for a subject who is a screen failure.
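The status-dependent form requirements described above can be expressed as a simple lookup. The sketch below is illustrative only: the status labels and form names are hypothetical, and an actual study would take them from its protocol and time and events table.

```python
# Hypothetical status labels and form names, for illustration only.
REQUIRED_FORMS = {
    "screen_failure": {"Informed Consent", "Demographics", "Eligibility"},
    "completed": {"Informed Consent", "Demographics", "Eligibility",
                  "Vital Signs", "Adverse Events", "End of Study"},
}


def outstanding_forms(status, entered_forms):
    """Return the required forms not yet entered for a subject, sorted by name."""
    required = REQUIRED_FORMS[status]
    return sorted(required - set(entered_forms))


print(outstanding_forms("screen_failure", ["Informed Consent", "Demographics"]))
```

A table of this kind in the CCGs gives site personnel and monitors the same definition of a complete casebook for each subject status.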

Where to locate information in the Medical Record

Where the medical record is the source of the information, the process of reviewing the medical record and identifying the needed data is called Medical Record Abstraction (MRA) or chart review. Form completion instructions should specify where in the chart the data needed for the CRF are to be found.6 [III] Special consideration should be given to the impact of time.8,9 [V] Similarly, special consideration may need to be given to the location in the record from which the information is to be extracted. Examples of such considerations include distinguishing between a five- versus ten-minute APGAR score; ejection fraction from a transesophageal versus a transthoracic echo; a medication order, a medication administration record, and medication reconciliation data; a problem list diagnosis and information in a pathology report; a machine versus physician interpretation of an ECG; and diagnostic test results obtained from a test report versus from a discharge summary. Form completion instructions should also address common variability in clinical settings and resulting imperfections in clinical data.8,9 [V] For example, instructions should state what to do when data within the protocol-specified time window or from a specific location are not present but other data values are, or when multiple data values are present, and whether to seek clinical records from another facility.

Medical evidence was categorized by Feinstein et al. as a description, a designation, or an interpretation.8,9 These are fundamentally different processes. Rebound tenderness in the right lower quadrant, pain, fever, and elevated white blood cell count are descriptions.9 [V] These descriptive items are directly perceived, measured, or asked of research subjects19 [VII] and can be observed systematically and often objectively. Assigning the diagnosis of appendicitis on the other hand is a designation, and infected vermiform appendix or rupture is an interpretation (until observed directly such as on an image or during surgery). Feinstein et al. point out that descriptions can be cited directly whereas designations and interpretations are arbitrary and require criteria.9 [V] Such criteria should (1) be provided in CCGs and, while not often done in practice, (2) be validated a priori to be reliable through measures such as inter-rater reliability or a Kappa statistic or be characterized in terms of sensitivity, specificity, positive predictive value, and negative predictive value against a gold-standard or, at minimum, should be characterized with inter-rater reliability during the study.6 [III] Such criteria-based and objective consistency is necessary in experimental designs such as randomized clinical trials and guidance for developing them can be found in Feinstein et al.9 [V] The issues of subjectivity and irreproducibility in designation and interpretation are the rationale behind the recommendation to (1) collect “raw” or “primary,” i.e., original descriptive data and to (2) process the data in subsequent steps. Types of challenges using medical records as source include the following9:

  • Missing or otherwise imprecise data in descriptive information occurs when the medical record does not contain documentation of the desired observations or test results. In this case, the CCGs can only document applicable “null flavors” and when to use each. (See the instructions for handling missing data section below.) A special case occurs when data expected given a particular medical condition is missing, for example, a white blood cell count in a patient with a fever of unknown origin. Because the lab value is routinely charted in this case, some tend toward considering its absence as a likely indication that it was not done while others tend toward unknown. While the choice between these two labels for missing data does not matter clinically, to assure consistency and prevent later work in the Source Data Verification or data cleaning processes, CCGs should indicate which to choose. [VI]

  • Uncertain information occurs when the medical record contains vague language or vague notation of clinical information. Such language is often a reflection of the uncertainty present in clinical situations and medical decision-making. Examples include measured values stated as a range or limit, such as “blood glucose > 300 mmol/L” in a case where multiple measures were taken and variability was noted but it was clear that the observed values were in the high range. CCGs should indicate how uncertain quantitative information should be recorded and how multiple measures should be handled in the case where more than one value would meet the criteria for the singular field on the form. Clear instruction should be given on which value to choose, such as “the peak (or trough or average) value within the period” or “the first (or last or middle) value of the period.” A similar situation surrounds designations of symptoms and diagnoses. For example, a CRF may require a yes/no response for “Positive fecal occult blood” but the medical record states “dark tarry stool” or “scant bright red blood reported with last bowel movement,” or a patient may report feeling “hot” for the past two nights and “sweating” but did not measure a temperature, yet the CRF requires a yes/no indication of fever within two days of admission. Such cases also occur early in the clinical diagnostic process; for example, the record may state a bipolar diagnosis with possible psychosis, or an emergency department work-up for chest pain might be documented as possible myocardial infarction, in which case later confirmation (or not) would be expected elsewhere in the chart. Such variability and uncertainty can be expected in clinical documentation in many therapeutic areas. The data management goals here are two-fold: (1) accurately reflecting the uncertainty and (2) consistency in how the uncertainty is reflected in the CRF. CCGs should indicate how such foreseeable uncertainty is to be recorded on the CRF because it is reflective of reality. [VI] Uncertainty in attribution of a symptom to a disease, identifying the initial clinical manifestation, and identifying a precipitating event are common and instruction is required to achieve consistency in the abstraction process.9 [V]

  • Inconsistent information occurs in the medical record when two reports from the same or different reporters, measurements, or places in the record fail to agree. Given the extent to which data are pulled forward from one assessment to another, summarized, re-reported, measured by a different method, or documented by a second observer, we should expect medical records to contain many inconsistencies. CCGs should anticipate important data for which such inconsistencies may occur and provide instruction as to which value to choose. [VI]

  • Errors also occur in the medical record. Given common practices indicated above, some data values in the medical record such as information in discharge summaries and clinical notes will have undergone several transformation steps.3,20,21 Some of the inconsistencies may be errors, and error can exist without being inconsistent with other information in the record. CCGs should anticipate important data for which such errors may occur and provide instruction as to which value to choose. [VI]

In all of these cases, study leadership can set any categorization, convention, or decision rule to be followed in abstracting data. Such categorization schemes, conventions, and decision rules are arbitrary and chosen based on the type of data and the purpose of the study. As long as these are set a priori, scientifically valid, bias free, logically consistent, reasonable to implement, reproducible, clearly stated in the CCGs, and applied diligently, they will increase consistency of the abstraction and provide traceability.9 [V] At the same time, caution is wise; such rules to assure consistent abstraction will never account for all possible cases and, as such, constrain an abstractor’s freedom to choose the most clinically relevant value. Because these categorization schemes, conventions, and decision rules represent data transformations and, as such, explain why one value was chosen over another, their documentation is required for traceability and they will be consulted during audits and inspections.
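A decision rule of the kind described above, such as “the peak value within the period” or “the first value of the period,” can be stated precisely enough to be reproducible. The following is a minimal sketch, assuming each observation is a (timestamp, value) pair already restricted to the protocol-specified window. The function name, rule labels, and data are hypothetical.

```python
from datetime import datetime
from statistics import mean


def select_value(observations, rule):
    """Apply a pre-specified rule to (timestamp, value) pairs from one window."""
    if not observations:
        return None  # nothing to abstract; a missing-data reason is needed
    ordered = [value for _, value in sorted(observations, key=lambda obs: obs[0])]
    rules = {
        "peak": max,
        "trough": min,
        "average": mean,
        "first": lambda values: values[0],
        "last": lambda values: values[-1],
    }
    if rule not in rules:
        raise ValueError(f"unknown abstraction rule: {rule!r}")
    return rules[rule](ordered)


# Illustrative readings charted out of chronological order.
readings = [
    (datetime(2021, 3, 5, 12, 0), 350),
    (datetime(2021, 3, 5, 8, 0), 310),
    (datetime(2021, 3, 5, 18, 0), 330),
]
print(select_value(readings, "peak"), select_value(readings, "first"))
```

Sorting by timestamp before applying “first” or “last” matters because values are not always charted in chronological order.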

The examples in this section emphasize the need for practicing clinicians to be involved in development of CRF completion instructions, for data managers to be familiar with the clinical documentation practices in a therapeutic area, and for study staff at multiple centers to test forms and completion instructions. [VI]

Where to locate other data

Include clear and precise instructions on where external data such as bottle numbers, kit numbers, or accession numbers are to be located. A description of the number should be included; for example, “the 10-digit kit number is located in the upper right-hand corner of the kit.” Include a visual example so that the information can be unambiguously identified.

Field specific instructions for form completion

For each CRF field, a field definition (operational definition) should be provided where needed to reduce ambiguity.2 [VII] Medical record abstraction instructions, inclusive of where to find data values in the medical record and which values to use when multiple results are recorded, should be provided for data expected to come from the medical record.6 [III] Definitions for discrete response options should be included where needed for consistency. [VI] Terms such as “low-grade,” “mild,” “moderate,” “high,” “severe,” or “significant” are prone to wide interpretation. Where subjective classification cannot be avoided, each category should be clearly defined with definitions available during form completion.5 [V] Such classification should be validated or characterized by calculation of inter-rater reliability, with testing performed using the same instructions that appear in the form completion instructions.5 [V]
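Inter-rater reliability for a categorical field is commonly summarized with Cohen's kappa, which corrects the observed agreement between two raters for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch for two raters classifying the same records. The ratings are invented for illustration and are not data from any study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who classified the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: proportion of items given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: sum over categories of the raters' marginal proportions.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in counts_a)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Invented severity classifications from two abstractors.
abstractor_1 = ["mild", "mild", "moderate", "severe", "mild",
                "moderate", "moderate", "severe", "mild", "mild"]
abstractor_2 = ["mild", "moderate", "moderate", "severe", "mild",
                "moderate", "mild", "severe", "mild", "mild"]
print(round(cohens_kappa(abstractor_1, abstractor_2), 2))
```

Here the abstractors agree on 8 of 10 records (p_o = 0.80) and chance agreement is p_e = 0.38, giving a kappa of about 0.68.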

Including directions on the CRF

All of the information needed to understand the question should be on the form, including the basis of comparison (e.g., “over the last 24 hours” or “since the last visit”), assessment time points, units of measure, precision and number of significant figures, and measurement method.3 [V]

For emergency medicine and inpatient studies, careful definition and instruction must be given regarding important study patient milestones. Designation of the timing of an index event such as occurrence of cancer, myocardial infarction, stroke, bleeding, or a psychotic episode may seem simple, but there are multiple choices such as symptom onset, first treatment, or hospitalization. However, these may be nuanced in clinical settings; for example, is new onset ischemia or myocardial infarction within 24 hours of a coronary intervention a new event or a complication of treating the initial event?8,9 [V]

Instructions for handling missing data

There are multiple reasons why a datum might be missing. Because one of these reasons is oversight, and because the data are usually important to collect, instances of missing data are usually checked. CCGs should provide clear direction or a mechanism to document the reason for a missing datum.2,3 [V], [VII] The most complete categorization of reasons for a missing datum is in the ISO 21090 standard. The standard calls reasons for missing data “null flavors” and defines a null flavor as an ancillary piece of data providing additional (often explanatory) information when the primary piece of data to which it is related is missing. The ISO 21090 list of null flavors includes familiar values like Unknown, Other, Asked but unknown, Masked, and Not Applicable among its fourteen terms.22 Null flavors for all required data should be enumerated and defined in CCGs. [VI] Some EDC systems may provide special functionality for associating a missing value with a reason why it is missing.
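Pairing each missing value with a documented reason can be sketched as follows. This example lists only the five null flavors named above, not the full ISO 21090 set, using the short codes by which they are commonly abbreviated in the HL7 null flavor vocabulary. The `check_field` function and its messages are hypothetical.

```python
from enum import Enum


class NullFlavor(Enum):
    """Illustrative subset of null flavors; not the complete ISO 21090 list."""
    UNK = "Unknown"
    OTH = "Other"
    ASKU = "Asked but unknown"
    MSK = "Masked"
    NA = "Not applicable"


def check_field(value, null_flavor=None):
    """Return a discrepancy message, or None when the entry is acceptable."""
    has_value = value is not None and value != ""
    if has_value and null_flavor is not None:
        return "value and missing-data reason both supplied"
    if not has_value and null_flavor is None:
        return "value missing without a documented reason"
    return None


print(check_field("", NullFlavor.ASKU))  # acceptable: reason documented
print(check_field(""))                   # flagged: no reason given
```

Enumerating the permitted reasons per field in the CCGs, as the enum does here, keeps sites from inventing free-text explanations that later require querying.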

CCGs should include or reference the time and events table and clearly specify the minimum data required for screen failures, early terminators, and lost to follow-up patients. Any additional special data collection rules for these and similar situations should also be provided. Some EDC systems may provide special functionality for controlling the visibility of pages once a subject is indicated as an early terminator.

CCGs should include instructions on how to mark empty pages and any scenarios that require different handling of empty pages. For paper studies, the CCGs should further specify disposition of empty pages such as sending them to the data center with the headers completed and otherwise marked empty or leaving them in subject binders to be retrieved and reconciled at close-out. Some EDC systems may provide special functionality for associating a missing page with a reason why it is missing or for marking a missing page.

Forms and fields requiring monitoring

For EDC studies, some organizations find it helpful to include instructions on steps required for the monitor to take in order to indicate that source data verification has been completed. Likewise, details for adding, canceling, answering, and closing queries are helpful if the role that is expected to perform monitoring has these rights within the system. While the former is merely informative to sites regarding how monitoring will occur and be documented, the latter includes steps that site personnel are required to take and should be available to sites in CCGs or other documentation. [VI]

Calling special forms to attention, e.g., patient completed questionnaires

Although site personnel are not usually responsible for transcribing or entering data from a patient completed questionnaire, instructions for such may be included in CCGs. [VI] When transcribing, entering or managing patient-reported data, changes should not be made unless agreed procedures and conventions exist and are exhaustively documented.1 [I] Additional directions to sites may include details of instructions to be provided to subjects before completing the questionnaire or procedures for reviewing the responses for completeness prior to the subject leaving the site. [VI] Please see the Patient Reported Outcomes (PRO) chapter for more information including recommendations, minimum standards, and best practices.

Calling to attention data collected by external devices

For EDC studies, if data are integrated from external sources it may be helpful to communicate the expected frequency of the data integration. Clarifying the data points that will not be enterable by site personnel and providing details as to when such data will be available through the EDC system and how to report or respond to reported discrepancies in external data should be included within this section of the guidelines. [VI] Please see the EDC and Integration of External Data chapters for more information, including recommendations, minimum standards, and best practices.

Forms requiring Investigator signature

Forms requiring Investigator signature should be specified in the CCGs. [VI] Although not all CRFs may require signature, the details provided to the investigators should remind them that they are ultimately responsible for all data submitted within the subjects’ casebooks. [VI]

For EDC studies, instructions on steps required for investigators to apply their electronic signatures should also be provided. [VI] Details for removing a signature or how data modifications may necessitate re-signing are also helpful tips to consider including.2 [VII]

Contact information

A contact for questions and clarifications should be identified within the CCGs. [VI]

6) Recommended Standard Operating Procedures

Section 5.0.1 of ICH E6 states that, “During protocol development the Sponsor should identify processes and data that are critical to ensure human subject protection and the reliability of trial results.”1 This implies that organizations should map out the processes involved in study design, start-up, conduct, and closeout and make explicit decisions about which are considered to impact human subject protection and the reliability of trial results. Organizational processes may be partitioned differently leading to different scope and titles for SOPs. We provide the following as a list of processes commonly considered to impact human subject protection and the reliability of trial results. Organizations may differ as to how these processes are covered in SOPs.

  • Creation, approval and change control of CCGs [I]

  • Training investigators, site staff and monitors on CCGs [I]

7) Literature Review details and References

This revision is based on a systematic review of the peer-reviewed literature indexed for retrieval. The goals of literature review were to (1) identify published research results and reports of evaluation of new methods regarding CRF Completion Guidelines and (2) identify, evaluate, and summarize evidence capable of informing the practice of CCG creation, maintenance, and implementation.

The following PubMed query was used:

(“form completion” OR “CRF completion” OR “CRF guidelines” OR “data collection guidelines” OR “medical record abstraction form” OR “chart review form” OR “chart review form”)

The search query was customized for and executed on the following databases: PubMed (78 results), CINAHL (1 result), EMBASE (156 results), Science Citation Index/Web of Science (3 results), PsychSOURCE (0 results), Association for Computing Machinery (ACM) Guide to the Computing Literature (not searched because CCGs do not depend on computers), and the Institute of Electrical and Electronics Engineers (IEEE) (0 results). A total of 238 works were identified through the searches. The searches were conducted in February. Search results were consolidated to obtain a list of 208 distinct articles. Because this was the first review for this chapter, the searches were not restricted to any time range. Literature review and screening details are included in the PRISMA diagram for the chapter, which follows the references.

Figure 1

PRISMA* Diagram for CRF Completion Guideline Chapter.

*PRISMA is the acronym for the Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

Two reviewers used inclusion criteria to screen all abstracts. Disagreements were adjudicated by the writing group. Twenty articles meeting inclusion criteria were selected for review. Two individuals reviewed each of the twenty selected articles and the eight additional sources identified through the review. Each was read for mention of explicit practice recommendations or research results informing practice. Relevant findings have been included in the chapter and graded according to the GCDMP evidence grading criteria described in Table 3. This synthesis of the literature relevant to CRF Completion Guidelines supports transition of this chapter to an evidence-based guideline.

Additional File

The additional file for this article can be found as follows:


Example CRF Completion Guidelines. DOI: https://doi.org/10.47912/jscdm.X.s1

8) Revision History

Date Revision description
September 2002 Initial version of the CRF CCG chapter
May 2007 Revised for style, grammar, and clarity. Substance of chapter content unchanged.
June 2008 Revised for content, style, grammar, and clarity
December 2019 Complete revision based on systematic literature review

Competing Interests

The authors have no competing interests to declare.


Food and Drug Administration. US Department of Health and Human Services. ICH E6(R2) Good Clinical Practice: Integrated Addendum to ICH E6(R1), March 2018. Available at https://www.fda.gov/regulatory-information/search-fda-guidance-documents/e6r2-good-clinical-practice-integrated-addendum-ich-e6r1.

Bellary S, Krishnankutty B, Latha MS. Basics of case report form designing in clinical research. Perspect Clin Res. 2014; 5(4): 159–166. DOI:  http://doi.org/10.4103/2229-3485.140555

Spilker B. Guide to Clinical Trials. New York: Raven Press; 1991: Chapter 36.

Zozus MN. Chapter 5, Fundamental Dynamic Aspects of Data in The Data Book: Collection and Management of Research Data. Boca Raton, FL: Taylor & Francis/CRC Press; 2017a.

Jones CE, Munoz FM, Spiegel HM, et al. Guideline for collection, analysis and presentation of safety data in clinical trials of vaccines in pregnant women. Vaccine. 2016; 34(49): 5998–6006. DOI:  http://doi.org/10.1016/j.vaccine.2016.07.032

Zozus MN, Pieper C, Johnson CM, et al. Factors affecting accuracy of data abstracted from medical records. PLoS One. 2015; 10(10): e0138649. DOI:  http://doi.org/10.1371/journal.pone.0138649

Zozus MN. Chapter 2, Defining Data and Information in The Data Book: Collection and Management of Research Data. Boca Raton, FL: Taylor & Francis/CRC Press; 2017b. DOI:  http://doi.org/10.1201/9781315151694

Boiko A. Part IV: Data collection guidelines for questionnaires to be used in case-control studies of multiple sclerosis. Neurology. 1997; 49(2 Suppl 2): S75–80. DOI:  http://doi.org/10.1212/WNL.49.2_Suppl_2.S75

Feinstein AR, Pritchett RRL, Schimpff CR. The epidemiology of cancer therapy IV the extraction of data from medical records. Arch Intern Med. 1969; 123. DOI:  http://doi.org/10.1001/archinte.1969.00300150089013

Feinstein AR, Pritchett RRL, Schimpff CR. The epidemiology of cancer therapy III the management of imperfect data. Arch Intern Med. 1969; 123. DOI:  http://doi.org/10.1001/archinte.1969.00300140094023

Backhouse ME, Gnanasakthy A, Schulman KA, Akehurst R, Glick H. The development of standard economic datasets for use in the economic evaluation of medicines. Drug Inf J. 2000; 34(4): 1273–1291. DOI:  http://doi.org/10.1177/009286150003400435

Medicines & Healthcare products Regulatory Agency (MHRA). ‘GXP’ Data Integrity Guidance and Definitions. Revision 1: March 2018. Accessed June 2, 2018. Available at https://www.gov.uk/government/publications/guidance-on-gxp-data-integrity

Food and Drug Administration. US Department of Health and Human Services. Guidance for industry: Use of Electronic Health Record Data in Clinical Investigations. July 2018. Accessed August 8, 2018. Available from https://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM501068.pdf

Food and Drug Administration. US Department of Health and Human Services. Guidance for Industry: Electronic Source Data in Clinical Investigations. September 2013. Accessed August 8, 2018. Available from https://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM328691.pdf

The Joint Commission (TJC). Specifications Manual for Joint Commission National Quality Measures. 2017 (v2017B2). Accessed June 16, 2018. Available from https://manual.jointcommission.org/releases/TJC2017B2/.

Nahm M. Knowledge acquisition from and semantic variability in schizophrenia clinical trial data. Proceedings of the International Conference on Information Quality (ICIQ). November 2012. Available from http://mitiq.mit.edu/ICIQ/2012

Kennedy D, Hutchinson D. CRF Design: A Practical Guide to Case Report Form Design and Production. Surrey, UK: Canary Ltd; 2002.

Zhang J, Norman DA. Representations in distributed cognitive tasks. Cogn Sci. 1994; 18: 87–122. DOI:  http://doi.org/10.1207/s15516709cog1801_3

Miller GA. The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychol Rev. 1956; 63(2): 81–97. DOI:  http://doi.org/10.1037/h0043158

Hirschtick RE. A piece of my mind. JAMA. 2006; 20: 2335–60. DOI:  http://doi.org/10.1001/jama.295.20.2335

Burnum JF. The misinformation era: the fall of the medical record. Ann Intern Med. 1989; 110(6): 482–4. DOI:  http://doi.org/10.7326/0003-4819-110-6-482

International Organization for Standardization (ISO). ISO/DIS 21090, Health Informatics — Harmonized data types for information interchange. 2011.