Exploration on Standardization of Clinical Research Data in a Clinical Trial Institute of Traditional Chinese Medicine

Authors: Li Qingna (Xiyuan Hospital of China Academy of Chinese Medical Sciences) , Rui Gao (Xiyuan Hospital of China Academy of Chinese Medical Sciences) , Zhao Yang (Xiyuan Hospital of China Academy of Chinese Medical Sciences) , Fang Lu (Xiyuan Hospital of China Academy of Chinese Medical Sciences) , Xu Hao (Xiyuan Hospital of China Academy of Chinese Medical Sciences)

Traditional Chinese Medicine (TCM) products and practices are widely used worldwide and are increasingly popular. The number of TCM clinical trials has increased significantly in the past two decades. There is an urgent need to standardize TCM data and terminology for use in clinical studies. Standardization will ensure data quality, improve efficiency, and expedite scientific communication in TCM. This paper illustrates our exploration of the application and development of data standards in TCM clinical trials. We hope these explorations will help TCM data exchange, sharing, and aggregation, which will ultimately accelerate the launch of TCM products and bring benefits to people.

Keywords: data standardization, clinical trials, Traditional Chinese Medicine, Therapeutic Area User Guide, Coronary Artery Disease-Angina

Qingna, L. & Gao, R. & Yang, Z. & Lu, F. & Hao, X., (2023) "Exploration on Standardization of Clinical Research Data in a Clinical Trial Institute of Traditional Chinese Medicine", Journal of the Society for Clinical Data Management 4(1). doi: https://doi.org/10.47912/jscdm.207


  Science and Technology Innovation Project, China Academy of Chinese Medical Sciences (grant CI2021A04701)



13 Dec 2023
1. Introduction

Traditional Chinese Medicine (TCM) has a long history and plays an increasingly important role in the global medical system, with increasing numbers of TCM trials conducted in and outside China. The World Federation of Chinese Medicine Societies (WFCMS) has grown to 277 group members and 203 branches in 72 countries and regions since its founding in 2003.1 Currently, a variety of TCM treatments, such as Chinese herbal medicine, acupuncture, and tuina therapy, are widely used in the diagnosis and treatment of diseases. In particular, experience during the COVID-19 outbreak in recent years has shown that the use of TCM based on pattern identification and treatment has documented advantages in treating disease.2,3

The international standardization of TCM data helps to promote data exchange and sharing across TCM trials worldwide. The Clinical Data Interchange Standards Consortium (CDISC) data standards are mature, globally recognized, and heavily used by the pharmaceutical industry for regulatory clinical trial data submissions.4 In 2011, researchers at the Xiyuan Hospital of China Academy of Chinese Medical Sciences began to explore the use of CDISC data standards to accelerate the research process and to improve the quality of research data.5

Pattern identification and treatment is an important basic characteristic of TCM. A TCM pattern is defined as a categorized pattern of symptoms and signs in a patient at a specific stage during the course of a disease, which is classified based on categorization of signs and symptoms under the guidance of TCM theory, while considering the Western medicine disease diagnosis. For example, there are seven common TCM patterns of coronary artery disease-angina (CAD-Angina): ‘Pattern of qi deficiency with blood stasis’, ‘Pattern of heart blood stasis and obstruction’, ‘Pattern of phlegm obstruction in the heart vessels’, ‘Pattern of qi stagnation and blood stasis’, ‘Pattern of yin cold congelation and stagnation’, ‘Pattern of dual deficiency of qi and yin’, and ‘Pattern of yang qi debilitation’. The TCM pattern diagnosis and evaluation data generally collected in TCM studies are requested by the National Medical Products Administration (NMPA).6,7 Taking the ‘Pattern of qi deficiency with blood stasis’ as an example, the case report forms shown in Tables 1 and 2 are used to diagnose the TCM pattern and to evaluate its symptoms and signs in a clinical trial. However, it is difficult to model and standardize TCM pattern diagnosis and evaluation data using the existing CDISC standards and structure. It is also hard to code the diagnoses, medical histories, and adverse events that relate to TCM because of the lack of controlled terminology for TCM patterns, symptoms, and signs in both CDISC controlled terminology and the ICH Medical Dictionary for Regulatory Activities (MedDRA).

Table 1

Diagnosis: TCM Pattern of Qi Deficiency with Blood Stasis.

No. | Symptom Name | Result
1 | Chest pain | □1 Yes □0 No
2 | Tightness in chest | □1 Yes □0 No
3 | Fatigue | □1 Yes □0 No
4 | Shortness of breath | □1 Yes □0 No
5 | Spontaneous sweat | □1 Yes □0 No
6 | Labor or work-induced angina | □1 Yes □0 No
7 | Dark complexion | □1 Yes □0 No
  | Dark lip color | □1 Yes □0 No
8 | Dull red-colored tongue | □1 Yes □0 No
  | Petechiae or ecchymosis of tongue | □1 Yes □0 No
9 | Cyanotic sublingual vessel | □1 Yes □0 No
  | Varicose sublingual vessel | □1 Yes □0 No
10 | Tongue manifestation (unique findings by examining the tongue) | Please describe: _____________________________
11 | Pulse condition (unique findings by taking of the pulse) | □Weak pulse □Rough pulse □Others, please describe: _____________________
Does the patient have the pattern of qi deficiency with blood stasis? □1 Yes □0 No
Table 2

Evaluation: TCM Pattern of Blood Stasis Due to Qi Deficiency

Each symptom is graded as None, Mild, Moderate, or Severe.

Primary symptoms

Chest pain
  □0 None
  □1 Mild: Onset occasionally, mild chest pain lasting 5 minutes or less, pain relief after resting, and will not affect normal daily life and activities.
  Moderate: Onset often, moderate chest pain lasting about 5 to 10 minutes, pain relief after medication, and normal activities may resume.
  Severe: Onset frequently, severe chest pain lasting more than 10 minutes, multiple medications required for pain relief, which severely affects daily life and activities (eg, dressing, defecation).

Tightness in chest
  □0 None
  □1 Mild: Onset occasionally, mild chest tightness lasting 5 minutes or less, chest tightness relief after resting, and will not affect normal daily life and activities.
  Moderate: Onset often, moderate chest tightness lasting about 5 to 10 minutes, chest tightness relief after medication, and normal activities may resume.
  Severe: Onset frequently, severe chest tightness lasting more than 10 minutes, multiple medications required for chest tightness relief, which severely affects daily life and activities (eg, dressing, defecation).

Secondary symptoms

Palpitation
  □0 None
  □1 Mild: Palpitation onset occasionally, experience slight discomfort.
  Moderate: Palpitation onset often and long lasting, experience obvious discomfort.
  Severe: Palpitation onset frequently, affects normal daily life and activities, and difficult to relieve.

Fatigue
  □0 None
  □1 Mild: Lacking strength or vigor, but able to conduct daily activities.
  Moderate: General weakness; able to keep working only with great effort.
  Severe: Too tired and weak to conduct normal daily activities.

Shortness of breath
  □0 None
  □1 Mild: Onset after normal activities.
  Moderate: Onset after simple activity.
  Severe: Continuous onset irrelevant to any activities.

Spontaneous sweat
  □0 None
  □1 Mild: Moist skin daily, turns into humid just after activities.
  Moderate: Humid skin daily and sweat just after activities.
  Severe: Sweat daily, profuse sweat after activities.

Color of lips
  □0 None: Normal lip color.
  Mild: Slightly dark-colored lips.
  Moderate: Dark-colored lips.
  Severe: Dark purple lips.

Color of tongue
  □0 None: Normal tongue color (pale red tongue).
  Mild: Slightly dark tongue color.
  Moderate: Dull red-colored tongue with petechiae or ecchymosis of tongue.
  Severe: Dark purple tongue color.

Varicose sublingual vessel
  □0 None
  □1 Mild: Varicose tongue base vessel.
  Moderate: Varicose sublingual vessel over half sublingual vessel.
  Severe: Entire varicose sublingual vessel.
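As an illustration of how the two forms above might be captured as structured data, the following minimal Python sketch decodes the Yes/No checkboxes of Table 1 and stores the ordinal grades of Table 2. The class and function names are hypothetical, and the numeric codes 2 and 3 for Moderate and Severe are an assumption; the forms as reproduced here only show the codes 0 and 1.

```python
from dataclasses import dataclass
from enum import IntEnum


class Grade(IntEnum):
    """Ordinal severity grade; codes 2 and 3 are assumed, not taken from the form."""
    NONE = 0
    MILD = 1
    MODERATE = 2
    SEVERE = 3


@dataclass(frozen=True)
class SymptomEvaluation:
    symptom: str    # e.g. "Chest pain"
    category: str   # "Primary symptom" or "Secondary symptom"
    grade: Grade


def parse_yes_no(code: int) -> bool:
    """Decode a Table 1 style checkbox (1 = Yes, 0 = No)."""
    if code not in (0, 1):
        raise ValueError(f"unexpected checkbox code: {code}")
    return code == 1


# Hypothetical values for one visit of one patient.
evaluations = [
    SymptomEvaluation("Chest pain", "Primary symptom", Grade.MODERATE),
    SymptomEvaluation("Fatigue", "Secondary symptom", Grade.MILD),
]
graded = {e.symptom: e.grade.name for e in evaluations}
```

Keeping the grade as an ordinal type rather than free text makes it straightforward to validate entries at data capture, which is the kind of check a Data Verification Plan would typically describe.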

Xiyuan Hospital of China Academy of Chinese Medical Sciences established its first clinical trial institute in 1983 in China. As the National Clinical Research Center for Chinese Medicine Cardiology, Xiyuan Hospital has clinical experts in cardiovascular diseases and robust research platforms, including human subject protection, quality control, data management, and statistical analysis, for conducting phase I-IV clinical trials of Chinese herbal medicines and plants. Researchers in Xiyuan Hospital have been exploring the development and use of data standards in TCM clinical trials since 2015.

2. Data Management Based on Data Standardization in Xiyuan Hospital

The electronic data management platform in Xiyuan Hospital has supported more than 100 clinical trials, including Investigational New Drug Applications (INDs) and Investigator Initiated Trials (IITs) since its establishment in 2009. We have been exploring how to accelerate the research process and improve the quality of research data through the use of data standards.

Researchers adopted clinical research data collection standards, including 38 CDISC-compliant domains, and developed internal standard domains where existing standards lacked coverage of TCM concepts. Since 2009, new standard domains have been added to supplement this repository. A general module library in the Electronic Data Capture (EDC) system was then developed using these standards, and common modules for each therapeutic area were established in the EDC system. This was followed by the creation of a Case Report Form (CRF) library based on these standards to support the design of clinical trial CRFs. Lastly, researchers created corresponding Data Verification Plans (DVPs) and DVP online program templates based on these modules.

CRF modules were divided into general modules, such as Demographics (DM) and Adverse Events (AE); Therapeutic Area (TA) modules, such as exercise electrocardiogram stress test and gastroscopy; and study-specific modules. The general modules and therapeutic area modules directly reference standard modules in the CRF library, and the corresponding DVPs and online DVP programs are reusable as well, which greatly improves the efficiency of data management. This system enables data managers to focus on the design of study-specific modules, consistency between CRFs and protocols, and logical data storage, which improves both data management and research data quality.
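The reuse described above can be sketched in a few lines of Python. This is a simplified illustration, not the hospital's EDC implementation: the `CrfModule` class, the `LIBRARY` contents, and the check descriptions are all hypothetical, with module names borrowed from the examples in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CrfModule:
    name: str
    kind: str                 # "general", "therapeutic_area", or "study_specific"
    edit_checks: tuple = ()   # reusable DVP check descriptions


# Hypothetical library contents; real libraries would hold full form definitions.
LIBRARY = {
    "DM": CrfModule("Demographics", "general",
                    ("birth date not in the future",)),
    "AE": CrfModule("Adverse Events", "general",
                    ("end date not before start date",)),
    "EXERCISE_ECG": CrfModule("Exercise electrocardiogram stress test",
                              "therapeutic_area",
                              ("test date within visit window",)),
}


def build_study_crf(library_keys, study_specific):
    """Combine reused library modules with modules designed for one study."""
    missing = [k for k in library_keys if k not in LIBRARY]
    if missing:
        raise KeyError(f"not in CRF library: {missing}")
    modules = [LIBRARY[k] for k in library_keys] + list(study_specific)
    # Checks attached to library modules come along without new programming.
    reused_checks = [c for k in library_keys for c in LIBRARY[k].edit_checks]
    # Only the study-specific modules still need design work.
    to_design = [m.name for m in study_specific]
    return modules, reused_checks, to_design


pattern_form = CrfModule("TCM pattern evaluation", "study_specific")
modules, reused_checks, to_design = build_study_crf(
    ["DM", "AE", "EXERCISE_ECG"], [pattern_form]
)
```

In this sketch the three library modules bring their checks with them, so the remaining design effort is limited to the single study-specific form.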

3. Development of Therapeutic Area User Guide for TCM for Coronary Artery Disease-Angina

In September 2015, Xiyuan Hospital began to collaborate with CDISC to standardize general data collected in IND studies for TCM clinical trials. As this was a pilot program and the first of its kind, it was decided to focus the data standardization effort on a single disease area. Coronary Artery Disease-Angina (CAD-Angina) is one of the diseases with high morbidity and mortality around the world, and the NMPA published a clinical guideline for treating CAD-Angina using TCM in 2011.8 In the early 1980s, the study of Guanxin No. 2 Compound in the treatment of CAD-Angina, led by Academician Chen Keji in Xiyuan Hospital, was recognized as the first randomized controlled trial (RCT) of TCM in China.9 Xiyuan Hospital had carried out numerous cardiovascular clinical trials over the preceding 40 years. As the National Clinical Research Center for Chinese Medicine Cardiology, Xiyuan Hospital was able to provide rich details about clinical end points and real-world data used in studies. This information was used to define the scope of the TCM-CAD-Angina Therapeutic Area User Guide (TAUG) standards development project.

The TCM-CAD-Angina therapeutic area standards development project was launched in January 2016. The development process was carried out according to CDISC standard operating procedures for standards development. CDISC Standards Development Process (COP-001) includes eight stages (Figure 1).10 The development team of the TCM-CAD-Angina TAUG encompassed clinical therapeutic area Subject Matter Experts (SMEs) from different TCM hospitals, biomedical concept engineers, metadata developers, terminology experts and medical writers. The main activities of each development stage are shown in Table 3.

Figure 1

CDISC TA Standards Development Process.

Table 3

Main Activities of the different development stages for the TCM-CAD-Angina TAUG.

Stage 0

  Main activities: Discussion and understanding of the common TCM concepts.
  Goals: To provide concepts of basic terminology for the development of TCM clinical research data standards.
  Outcomes: 19 common TCM concepts are defined both in English and Chinese in reference to the WHO TCM standards.11

  Main activities: Input from SMEs on the diagnosis and evaluation of TCM-CAD-Angina patterns.
  Goals: NMPA guidelines8 describe the clinical diagnosis and evaluation standards for TCM-CAD-Angina patterns in basic and foundational terms. SMEs to provide more detailed interpretations, based on regulatory guidance and on actual clinical practice, to better define the diagnosis and evaluation of the patterns.
  Outcomes: Seven specific TCM-CAD-Angina patterns were developed based on regulatory guidance and SME input toward implementation in clinical trials.

  Main activities: Development of the Chinese/English CRF for TCM-CAD-Angina patterns.
  Goals: Develop CRFs for the diagnosis and evaluation of TCM-CAD-Angina patterns in line with current guidelines, informed by previous CRFs used in this therapeutic area and SME recommendations.
  Outcomes: Detailed CRFs for all seven TCM-CAD-Angina patterns were developed.

  Main activities: Inclusion of routine data translated and annotated with existing CDISC standards.
  Goals: Model the most common data used both in TCM and other clinical trials for CAD-Angina.
  Outcomes: Exercise electrocardiogram stress test and echocardiographic assessments were translated and annotated using the Study Data Tabulation Model (SDTM) domains and variables.

Stage 1: Identification of Biomedical Concepts

  Main activities: In-depth discussions on modeling the seven TCM patterns with multiple stakeholder teams at a Standards Review Council (SGC) meeting, one TA Workshop, and one Workshop with the CDISC Board of Directors for strategic direction on the development of this TAUG.
  Goals: Determine data model(s) that conform to both CDISC principles and the meaning of TCM patterns.
  Outcomes: The TCM tongue observations and pulse manifestation observations, as well as other bodily signs, were considered as conditions that reflect the body as a whole; they are therefore represented in the Whole-Body System (WB) Findings domain. The WB domain is a new custom domain developed for the first TCM TAUG.

Stage 3a: Internal Review

  Main activities: Internal CDISC review of the TAUG, resolution of the comments through the CDISC JIRA system, and update of the TAUG.
  Goals: Internal review of the draft TCM-CAD-Angina TAUG by as many CDISC staff, volunteers, and other stakeholders as possible.
  Outcomes: The internal review cycle generated 31 issues from experts in various TCM institutions, the China CDISC Coordinating Committee (C3C), staff, and volunteers. Suggestions included updates/edits to analysis data, such as updating the ADaM dataset to include the ABLFL and ITTFL variables, links to the draft WB domain, and the use of a consolidated terminology of TCM and normative descriptions, such as spelling out all abbreviations.

Stage 3b: Public Review

  Main activities: In preparation for public review, translation of the TAUG from English to Chinese, making this the first bilingual CDISC TAUG, followed by public review and comment.
  Goals: Promote the draft TCM-CAD-Angina TAUG and CDISC standards to the Chinese user community. Invite worldwide review of the TAUG.
  Outcomes: The public review cycle generated 30 issues from experts in TCM institutes, the China CDISC Coordinating Committee (C3C), the Center for Drug Evaluation (CDE) of the NMPA, and CDISC volunteers. Issues identified included updates to the modeling of echocardiographic assessments, references, translation quality and other sections of the TAUG.12

In TCM, doctors inspect the patient’s tongue and take the pulse as important observations. The concept map of the TCM tongue manifestation (Figure 2) played a crucial role in promoting clear communication between the Chinese and US therapeutic area team members at Stage 1. For example, in the current CDISC model, physical observations of the tongue manifestation and pulse condition were assigned to the Gastrointestinal System Findings (GI) and Vital Signs (VS) domains, respectively. However, in TCM these findings are conceptualized as indicative of the body as a whole rather than as belonging to a single physiology or body system findings domain. Concept maps, such as that in Figure 2, helped the Chinese developers to understand that, even though such observations can be represented as a series of questions that together constitute the diagnosis and evaluation of a TCM pattern, these data should not be represented in the CDISC SDTM Questionnaires (QS) domain, because the questions are not standardized and validated. The factors of TCM patterns involve multiple body systems, and the interrelationships between the factors are complicated,13 which poses serious challenges for standardizing TCM patterns by mapping to existing CDISC SDTM domains. Symptoms and signs that make up the diagnosis or evaluation of a TCM pattern do conceptually match Events and Findings domains based on the SDTM model. As a result, the diagnosis data of the TCM-CAD-Angina patterns are represented in the Findings About Medical History (FAMH) and WB domains. The evaluation data for the TCM pattern are modeled in the Findings About Clinical Events (FACE) and WB domains (Figure 2).

Figure 2

TCM Tongue Manifestation Concept Map.12
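To make the modeling decision concrete, the sketch below routes collected items to the FAMH, FACE, or WB datasets and builds simplified SDTM-style rows. It is an illustration under stated assumptions rather than the TAUG's actual metadata: which items count as whole-body signs, the `WB`-prefixed variable names (formed by the usual SDTM prefix convention), and the `FATEST` labels are all assumed here, and the function names are hypothetical.

```python
# Items treated as whole-body signs in this sketch (an assumption for illustration).
WHOLE_BODY_ITEMS = {"Tongue manifestation", "Pulse condition"}


def route(item, purpose):
    """Pick a dataset: WB for whole-body signs, else FAMH (diagnosis) or FACE (evaluation)."""
    if item in WHOLE_BODY_ITEMS:
        return "WB"
    return {"diagnosis": "FAMH", "evaluation": "FACE"}[purpose]


def to_row(usubjid, seq, item, result, purpose):
    """Build one simplified findings row as a plain dict."""
    dataset = route(item, purpose)
    if dataset == "WB":
        row = {"DOMAIN": "WB", "USUBJID": usubjid, "WBSEQ": seq,
               "WBTEST": item, "WBORRES": result}
    else:
        # FAMH and FACE are both Findings About datasets, so DOMAIN is "FA";
        # the symptom being described goes in FAOBJ.
        measure = "Occurrence" if purpose == "diagnosis" else "Severity"  # illustrative labels
        row = {"DOMAIN": "FA", "USUBJID": usubjid, "FASEQ": seq,
               "FATEST": measure, "FAOBJ": item, "FAORRES": result}
    return dataset, row


# Hypothetical subject identifier and results.
diagnosis_row = to_row("S-001", 1, "Chest pain", "Y", "diagnosis")
pulse_row = to_row("S-001", 1, "Pulse condition", "Weak pulse", "diagnosis")
evaluation_row = to_row("S-001", 2, "Chest pain", "MODERATE", "evaluation")
```

The point of the routing step is that the same symptom can land in different datasets depending on whether it was collected to diagnose the pattern or to evaluate it, while tongue and pulse observations go to WB in both cases.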

The TCM-CAD-Angina TAUG was developed in partnership between CDISC and the Xiyuan Hospital of China Academy of Chinese Medical Sciences and was supported by the United States National Institutes of Health (NIH) National Cancer Institute Enterprise Vocabulary Services (NCI-EVS), the China CDISC Coordinating Committee (C3C), and other TCM institutes in China. The TCM-CAD-Angina TAUG is CDISC’s first standard to be released for TCM as well as the first CDISC standard published in both Chinese and English. Version 1.0 of the Therapeutic Area User Guide for Traditional Chinese Medicine Coronary Artery Disease-Angina (TAUG-TCM-CAD-Angina) was published in September 2019. It describes the most common biomedical concepts relevant to Coronary Artery Disease-Angina in TCM and the metadata necessary to represent such data consistently with Controlled Terminology, SDTM, and ADaM (Figure 3).

Figure 3

Sections in the First Version of the TCM-CAD-Angina TAUG.

4. Benefits Achieved with the Creation of the TAUG-TCM-CAD-Angina

Based on the concept of integrated TCM and Western medicine, this TAUG is the first data standard in which biological concepts of disease and TCM patterns for clinical research are represented together, using the universal language of medical concept maps. The TAUG also includes a Whole-Body (WB) System Findings domain for reflecting whole-body conditions, such as tongue and pulse manifestations. Integrating Western medicine and TCM in the same standard facilitates the testing of TCM interventions, the generation of evidence, and incorporation into evidence-based practice guidelines.

To provide a common terminology as a basis for the development of additional TCM data standards, commonly used and foundational TCM concepts and their definitions, such as TCM Pattern and Four TCM Examinations, were included in this TAUG. Common assessments in CAD-Angina clinical trials, such as Exercise Stress Test and Echocardiographic Assessments, were represented using CDISC SDTM domains. Promoting the standardization of CAD-Angina clinical trial data, including relevant TCM patterns, provides a foundation for accelerating regulatory data review in China. It is important to note that standards are not static; through an iterative process of implementation and feedback, it is expected that each therapeutic-area standard developed will continue to evolve with the science of the therapeutic area and the SDTM.14

5. Discussion

Through the efforts of the TCM-CAD-Angina team and CDISC volunteers, the TAUG was officially released, following three years of development.15 As the first CDISC TCM data standard, many new concepts were encountered. As such, the development process was complex. TCM has a different theoretical system from Western medicine. Data models that represent TCM concepts must accurately represent TCM theory in the same way that data representing Western medicine concepts must faithfully represent biological foundations. At the same time, the development of the TAUG in collaboration with CDISC required adherence to existing CDISC model rules and processes. It required close communication between Chinese and American team members. Communication during TAUG development was facilitated using medical concept mapping as a bridge to help all parties communicate effectively.

The reliability and repeatability of TCM’s diagnosis and treatment system is the basis of TCM practice standardization. A TCM physician collects symptoms and signs from a patient by inspecting, listening, smelling, inquiring, and palpating to determine the etiology, location, and nature of the disease, so as to make a diagnosis of a TCM pattern. Treatment is then given based on the TCM pattern, much like the diagnostic criteria used for mental health disorders in the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV). Patients can get more precise treatments at different stages of disease development through pattern identification and treatment. However, it is difficult to carry out reliability and validity evaluations for the diagnosis and evaluation questionnaire of a TCM pattern. Academic institutions of TCM have been carrying out research on diagnosis and evaluation criteria of TCM patterns,16,17 and it is believed that data of TCM patterns will be standardized as this line of inquiry continues. TCM patterns are diagnostic criteria that are based on the presence (or absence) and the extent of signs and symptoms. As diagnostic criteria, they are different from questionnaires and should not be represented in the QS domain. The WB System Findings domain, created during the development of the TCM-CAD-Angina TAUG, will house holistic assessments of signs that are identified according to TCM criteria. As such, the WB domain is currently limited to TCM data only. Non-TCM use cases are out of scope for the WB domain.

Though based on concepts of Chinese medicine rather than Western medicine, TCM patterns are a disease diagnosis. As such, the diagnosis name of a TCM pattern is represented in the MH domain. However, the variable ‘MHDECOD’ is a dictionary-derived text description from MedDRA. Since MedDRA is primarily used for Western medicine, developing analogous standardized TCM terminology, similar to that of MedDRA, is needed for the normalized coding of TCM data, including the diagnosis of TCM patterns and TCM unique adverse events.18
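The coding gap can be illustrated with a small Python sketch that fills `MHDECOD` from the verbatim term `MHTERM` by dictionary lookup. The dictionary below is a toy stand-in invented for this example, not MedDRA content, and the function name is hypothetical; the point is only that a TCM pattern name has no entry to match against.

```python
# Toy stand-in for a coding dictionary; NOT actual MedDRA content.
TOY_DICTIONARY = {
    "angina": "Angina pectoris",
    "angina pectoris": "Angina pectoris",
    "high blood pressure": "Hypertension",
}


def code_medical_history(records, dictionary=TOY_DICTIONARY):
    """Fill MHDECOD from MHTERM; leave it empty and report the term when no match exists."""
    coded, uncoded = [], []
    for rec in records:
        term = rec["MHTERM"]
        decod = dictionary.get(term.strip().lower(), "")
        coded.append({**rec, "MHDECOD": decod})
        if not decod:
            uncoded.append(term)
    return coded, uncoded


# Hypothetical medical history entries for one subject.
records = [
    {"MHTERM": "Angina"},
    {"MHTERM": "Pattern of qi deficiency with blood stasis"},
]
coded, uncoded = code_medical_history(records)
```

Here the Western diagnosis receives a dictionary-derived term, while the TCM pattern is returned in `uncoded`, which is the situation a standardized TCM terminology would resolve.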

The “Opinions on Promoting the Inheritance, Innovation and Development of Traditional Chinese Medicine”19 issued by the State Council of the People’s Republic of China in 2019 pointed out that: “Promoting the opening up and development of TCM, including promoting the formulation of international standards for Traditional Chinese Medicine and actively participating in the formulation of international rules related to TCM.”19 CDISC standards are currently mandated by the US Food and Drug Administration (FDA) and the Pharmaceuticals and Medical Devices Agency (PMDA) for regulatory submissions. In 2020, the Center for Drug Evaluation (CDE) of NMPA released the “Guideline on the Submission of Clinical Trial Data”, which encouraged sponsors to submit clinical trial data and related submissions in accordance with CDISC standards.20 In September 2022, another TCM-related TAUG, for acupuncture, was released for public comment.21 Data standards are tools that assist researchers in achieving semantic interoperability. The development of international data standards for TCM will facilitate future evaluations of TCM products according to the requirements of international trade, so that effective TCM products will be available in more regions of the world to benefit a greater number of people.


The authors express their appreciation for the expertise and technical support from CDISC, the National Institutes of Health (NIH) National Cancer Institute Enterprise Vocabulary Services (NCI-EVS), the China CDISC Coordinating Committee (C3C), and David Hardison, PhD, former chair of the CDISC Board of Directors. Special thanks go to all the experts from industry, academia, and regulatory authorities who participated in the internal and public review of the draft TAUG. The authors gratefully acknowledge the efforts of the team members from all collaborating organizations, including Jordan Li, Junchao Chen, Rhonda Facile, Jingyuan Mao, Ruiling Peng, and Diane Wold. The authors also gratefully acknowledge Jordan Li, Wenjun Bao, and Alana St. Clair, who kindly provided help with this report. The authors are grateful to the Science and Technology Innovation Project, China Academy of Chinese Medical Sciences (CI2021A04701), for partial financial support.

Competing Interests

The authors have no competing interests to declare.


