Variation in menopausal vasomotor symptoms outcomes in clinical trials: a systematic review

Background There is substantial variation in how menopausal vasomotor symptoms are reported and measured among intervention studies. This has prevented meaningful comparisons between treatments and limited data synthesis.
Objectives To review systematically the outcome reporting and measures used to assess menopausal vasomotor symptoms from randomised controlled trials of treatments.
Search strategy We searched MEDLINE, Embase, and Cochrane Central Register of Controlled Trials from inception to May 2018.
Selection criteria Randomised controlled trials with a primary outcome of menopausal vasomotor symptoms in women and a sample size of at least 20 women per study arm.
Data collection and analysis Data about study characteristics, primary vasomotor-related outcomes, and methods of measuring them were extracted.
Main results The search identified 5591 studies, 214 of which were included. Forty-nine different primary reported outcomes were identified for vasomotor symptoms and 16 different tools had been used to measure these outcomes. The most commonly reported outcomes were frequency (97/214), severity (116/214), and intensity (28/114) of vasomotor symptoms or a composite of these outcomes (68/214). There was little consistency in how the frequency and severity/intensity of vasomotor symptoms were defined.
Conclusions There is substantial variation in how menopausal vasomotor symptoms have been reported and measured in treatment trials. Future studies should include standardised outcome measures which reflect the priorities of patients, clinicians, and researchers. This is most effectively achieved through the development of a Core Outcome Set. This systematic review is the first step towards development of a Core Outcome Set for menopausal vasomotor symptoms.
Tweetable summary Menopausal hot flushes and night sweats have been reported in 49 different ways in clinical research. A core outcome set is urgently required.


Introduction
There is general agreement that vasomotor symptoms (hot flushes and night sweats) are the most common and problematic menopausal symptoms. 1,2 Vasomotor symptoms are also the leading patient priority for treatment. 3 Oestrogen-containing menopausal hormone therapy (MHT) is an effective treatment for menopausal vasomotor symptoms; however, use of MHT has fallen substantially following concerns about safety. 4 There is a growing focus on the development and evaluation of nonpharmacological and nonhormonal treatments for vasomotor symptoms. 5 In addition, MHT is contraindicated in women with a personal history of breast cancer, who may report more severe vasomotor symptoms than women experiencing natural menopause. 6 Enhanced understanding of the central mechanisms regulating vasomotor symptoms is driving the development and evaluation of novel targeted therapies, 7 but the interpretation and implementation of these studies are hampered by a lack of consensus about how vasomotor symptoms should be reported and measured. This limits the potential to compare treatments and to synthesise the evidence, which in turn compromises decision-making by clinicians and patients.
Current National Institute for Health and Care Excellence (NICE) guidelines on the management of menopause 8 highlight the need for greater standardisation of outcome reporting and measures for treatment trials in menopause, and the consequent difficulty in evidence synthesis. There is an urgent need to determine what outcomes are most important to patients, clinicians, and researchers in order to increase the relevance of future intervention studies and facilitate comparisons between treatments. 9,10 The Core Outcome Measures in Effectiveness Trials (COMET) initiative is leading protocols for the development of Core Outcome Sets. These are well-defined, condition-specific, and feasible outcomes which should be included as a minimum set of outcomes in intervention studies. 11 To advance the development of Core Outcome Sets in women's health, 12 80 editors of women's health journals have formed a consortium to support the development, dissemination, and implementation of Core Outcome Sets within the reproductive field (Core Outcomes in Women's and Newborn Health, CROWN; http://www.crown-initiative.org). 13 The Core Outcomes in Menopause (COMMA) initiative is an international consortium of clinicians, researchers, and consumers developing a Core Outcome Set for menopausal symptoms. Following a standardised process, we have first systematically reviewed all randomised controlled trials (RCTs) of interventions for menopausal vasomotor symptoms to determine what outcomes have been reported and how they have been measured. We will then repeat this process for vaginal symptoms at menopause. This information will then be used to inform a Delphi survey of clinicians, researchers, and patients to identify priorities for inclusion in the final Core Outcome Set. 12

Study eligibility
We included all RCTs with a primary outcome of female menopausal vasomotor symptoms and a sample size of at least 20 women per study arm, to minimise the likelihood of including feasibility or pilot studies. 14 We excluded studies that assessed menopausal vasomotor symptoms as a secondary outcome, quasi-randomised studies, secondary analyses of previously published RCTs, conference abstracts of RCTs, observational, analytical, or diagnostic studies, and feasibility/pilot studies. We also excluded studies primarily aiming to assess pharmacokinetics, the mechanism of drug action, or tolerability, and intervention studies with no explicit sample size calculation.
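The eligibility rules above can be sketched as a simple screening function. This is only an illustration: the record fields and reason labels are hypothetical names chosen for this example, not the extraction sheet used in the review.

```python
from dataclasses import dataclass
from typing import List

MIN_ARM_SIZE = 20  # "at least 20 women per study arm"


@dataclass
class StudyRecord:
    """Hypothetical screening record; all field names are assumptions."""
    is_rct: bool
    is_quasi_randomised: bool
    vms_is_primary_outcome: bool
    smallest_arm_size: int
    has_sample_size_calculation: bool
    is_secondary_analysis: bool = False
    is_conference_abstract: bool = False
    is_pilot_or_feasibility: bool = False
    aims_at_pk_mechanism_or_tolerability: bool = False


def exclusion_reasons(study: StudyRecord) -> List[str]:
    """Return every exclusion criterion the study meets (empty if eligible)."""
    reasons = []
    if not study.is_rct or study.is_quasi_randomised:
        reasons.append("not an RCT")
    if not study.vms_is_primary_outcome:
        reasons.append("vasomotor symptoms not the primary outcome")
    if study.smallest_arm_size < MIN_ARM_SIZE:
        reasons.append("fewer than 20 women per study arm")
    if not study.has_sample_size_calculation:
        reasons.append("no explicit sample size calculation")
    if study.is_secondary_analysis:
        reasons.append("secondary analysis")
    if study.is_conference_abstract:
        reasons.append("conference abstract")
    if study.is_pilot_or_feasibility:
        reasons.append("feasibility/pilot study")
    if study.aims_at_pk_mechanism_or_tolerability:
        reasons.append("pharmacokinetic, mechanistic, or tolerability aim")
    return reasons


def is_eligible(study: StudyRecord) -> bool:
    return not exclusion_reasons(study)


eligible = StudyRecord(True, False, True, 45, True)
too_small = StudyRecord(True, False, True, 19, True)
print(is_eligible(eligible), exclusion_reasons(too_small))
```

Returning the list of reasons rather than a single flag makes it straightforward to tabulate why full-text articles were excluded.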

Search strategy
We searched MEDLINE, Embase, and Cochrane Central Register of Controlled Trials (CENTRAL) until May 2018. We hand-searched the reference lists of the included trials or other keynote publications. Search terms included menopause, menopausal, menopausal symptoms, climacteric, hot flush or flash, night sweat, vasomotor, and a search filter for RCTs (Appendix S1). There was no language restriction.

Data extraction
Two reviewers (W.W.L. and S.I.) independently assessed the studies using the predefined criteria described above. Disagreement was resolved by discussion with the steering committee. Full articles were obtained and data were extracted using a prespecified extraction sheet.

Quality assessment
Jadad scoring was used for assessing the methodological quality of the included trials. 15 The 5-point validated scoring system assesses the following: whether the trial (1) was described as randomised (1 point), (2) used an appropriate method of randomisation (1 point), (3) was blinded (1 point), (4) used an appropriate method of blinding (1 point), and (5) accounted for all patients randomised (1 point); ≤2 points was considered low quality and ≥3 was considered medium to high quality.
The quality of describing and reporting the outcome was assessed using the 6-point Management of Otitis Media with Effusion in Cleft Palate (MOMENT) scoring criteria, 16 which have been used previously for quality assessment of studies in the development of a core outcome set, with a cut-off of ≥4 indicating a high-quality trial. The following points were considered: whether the primary outcome was (1) clearly stated (1 point), (2) clearly defined (1 point); whether the secondary outcomes were (3) clearly stated (1 point), (4) clearly defined (1 point); (5) whether the authors explained the use of the outcomes they selected (1 point); (6) whether specific methods were used to enhance the quality of outcome measurement (1 point).
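Both scales are simple additive checklists, one point per criterion met. A minimal sketch of the scoring and the cut-offs described above follows; the item keys are shorthand labels invented for this example.

```python
# Shorthand item labels are assumptions; the criteria themselves follow the text.
JADAD_ITEMS = (
    "described_as_randomised",
    "appropriate_randomisation_method",
    "blinded",
    "appropriate_blinding_method",
    "all_patients_accounted_for",
)

MOMENT_ITEMS = (
    "primary_outcome_stated",
    "primary_outcome_defined",
    "secondary_outcomes_stated",
    "secondary_outcomes_defined",
    "outcome_choice_explained",
    "methods_to_enhance_measurement_quality",
)


def checklist_score(items, answers):
    """Sum one point for each criterion that is met; every item must be answered."""
    missing = [item for item in items if item not in answers]
    if missing:
        raise ValueError(f"unanswered items: {missing}")
    return sum(1 for item in items if answers[item])


def jadad_quality(score):
    # <=2 points: low quality; >=3 points: medium to high quality
    return "low" if score <= 2 else "medium to high"


def moment_high_quality(score):
    # cut-off of >=4 indicates a high-quality trial
    return score >= 4


trial = dict.fromkeys(JADAD_ITEMS, True)
trial["appropriate_blinding_method"] = False
jadad = checklist_score(JADAD_ITEMS, trial)
print(jadad, jadad_quality(jadad))
```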

Core outcomes
Core outcomes do not exist in this research field and were therefore not used in the systematic search. Our aim is to develop and disseminate core outcomes of menopausal symptoms and this systematic review is the first step of the process.

Results
The search identified 5591 studies, of which 2711 duplicates were removed. We screened 2880 titles and abstracts and excluded 2372 records that did not meet the inclusion criteria. In all, 544 studies were read in full. Of these, 330 were excluded: 59 included fewer than 20 women per study arm, 54 were not RCTs, 126 did not clearly state a sample size calculation, 52 did not measure vasomotor symptoms as the primary outcome, and 39 were secondary analyses. Following these exclusions, 214 RCTs were included 17-230 (Figure 1). Table S1 describes the 214 studies with a total of 22 682 participants. The studies were published between 1994 and 2018. More than one-third of the trials (77/214, 36%) were conducted in the USA. The follow-up period ranged from 4 to 52 weeks, with half the included studies following up participants for 12 weeks (108/214, 50%).
Forty-nine primary outcomes were identified from 214 RCTs including 22 682 women. Almost half of the RCTs (94/214, 44%) included only postmenopausal women, 12% (26/214) included only women with a history of breast cancer, and 5% (12/214) included peri- and postmenopausal women. Around a quarter of the RCTs (56/214, 26%) included both surgically and naturally peri- and postmenopausal women. We categorised the primary outcomes into four domains: (1) purely vasomotor-related outcomes (183/214, 86%); (2) quality-of-life-related outcomes (9/214); (3) composite outcomes; and (4) functional impact, specifically how bothersome, interfering, and problematic vasomotor symptoms are; for this review we will refer to the latter category as 'interference' outcomes (5/214, 2.3%). The largest group was purely vasomotor-related outcomes, comprising 33 individual outcomes. The second largest group was composite outcomes, all of which included vasomotor symptoms as one of the parameters.
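The study-flow counts can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported in this section; the variable names are illustrative. Note that 2880 screened minus 2372 excluded leaves 508 records, whereas 544 full texts are reported; the text does not explain the difference (one possibility, which is an assumption here, is that records were added from hand-searching reference lists).

```python
identified = 5591
duplicates = 2711
screened = identified - duplicates  # titles and abstracts screened

excluded_on_title_abstract = 2372
remaining_after_screening = screened - excluded_on_title_abstract
full_text_read = 544  # as reported; differs from remaining_after_screening

# Reasons for exclusion at the full-text stage, as reported
full_text_exclusions = {
    "fewer than 20 women per arm": 59,
    "not an RCT": 54,
    "no clear sample size calculation": 126,
    "vasomotor symptoms not primary outcome": 52,
    "secondary analysis": 39,
}
excluded_full_text = sum(full_text_exclusions.values())
included = full_text_read - excluded_full_text


def pct(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


print(screened, remaining_after_screening, excluded_full_text, included)
print(pct(77, 214), pct(108, 214))
```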

Measurement tools
Seven different measurement tool categories were used to measure purely vasomotor-related outcomes. Most [...] 233,234 have been used to measure interference due to vasomotor symptoms but were not the primary outcome of eligible RCTs, and so were not included in the systematic review. However, they will still be used for the subsequent Delphi process. Other subjective vasomotor symptom measurement tools included a 20-item structured symptom checklist, two items of which asked about the presence of hot flushes and cold/night sweats; 128 a 5-point (from none to very severe) scoring system for the severity of hot flushes and night sweats; 103 an Interactive Voice Response System to record the number and severity of hot flushes; 157 and self-reported surveys. 156 Objective measures of vasomotor symptoms such as skin conductance were used in five trials in addition to subjective measures 33,68,118,138,167 (Table 3).
Diaries were used to record 25 different types of menopausal vasomotor-related outcomes and accounted for 57% (25/44) of all primary outcomes. The number, frequency, severity, and intensity of menopausal vasomotor symptoms were the most commonly reported vasomotor-related outcomes assessed by diaries. However, there was substantial variation in the definitions of each outcome. For example, for frequency, the largest group of included studies (72/214) reported the 'number of vasomotor symptoms per 24 h' (8 retrospectively and 64 prospectively), and 39/214 measured the 'number of vasomotor symptoms per week'. Table S2 shows the complete list of vasomotor-related outcomes recorded by diaries and how often they have been reported in RCTs of interventions for vasomotor symptoms.
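To illustrate why the two frequency definitions are not directly comparable without conversion, the sketch below derives both from the same hypothetical 7-day diary. The diary values and function names are invented for this example and do not come from any included trial.

```python
from statistics import mean


def frequency_per_24h(daily_counts):
    """Mean number of vasomotor symptoms per 24 h over the diary period."""
    return mean(daily_counts)


def frequency_per_week(daily_counts):
    """Total number of vasomotor symptoms over a complete 7-day diary."""
    if len(daily_counts) != 7:
        raise ValueError("a weekly total needs exactly 7 daily entries")
    return sum(daily_counts)


# Hypothetical diary: hot flushes plus night sweats recorded on each of 7 days
diary = [6, 8, 7, 5, 9, 7, 7]
print(frequency_per_24h(diary), frequency_per_week(diary))
```

A weekly total can be converted to a daily mean only when the number of diary days is known, which is one reason inconsistent definitions complicate pooling.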

Variation in the definitions of vasomotor-related outcomes
There was substantial heterogeneity in the definition of vasomotor-related outcomes. Three different definitions were used to measure the frequency of vasomotor symptoms. Most studies (79/97, 81%) defined frequency as the number of hot flushes or night sweats, whereas 18 studies did not define how frequency was measured. The severity of vasomotor symptoms was defined in nine different ways and the intensity in seven different ways (Table S3). The 68 studies reporting composite outcomes for vasomotor symptoms utilised 11 different ways of defining the composite score. The most commonly used approach (27/68, 40%) measured the number of hot flushes and night sweats, and calculated a composite score weighted by severity rating. There was considerable overlap between composite score definitions.
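The most common composite approach, a symptom count weighted by severity rating, can be sketched as follows. The 1/2/3 weights for mild/moderate/severe are an assumption made for illustration; the included trials used 11 different composite definitions, and this is not presented as the definition used by any particular one.

```python
# Assumed severity weights for illustration only
SEVERITY_WEIGHTS = {"mild": 1, "moderate": 2, "severe": 3}


def composite_score(counts_by_severity, weights=SEVERITY_WEIGHTS):
    """Sum of (number of episodes x severity weight) across severity levels."""
    unknown = set(counts_by_severity) - set(weights)
    if unknown:
        raise ValueError(f"no weight defined for: {sorted(unknown)}")
    return sum(weights[level] * n for level, n in counts_by_severity.items())


# Hypothetical day: 3 mild, 2 moderate, and 1 severe episode
day = {"mild": 3, "moderate": 2, "severe": 1}
print(composite_score(day))  # 3*1 + 2*2 + 1*3
```

Two women with the same composite score can have very different symptom profiles (for example, many mild episodes versus a few severe ones), which is part of why composite definitions need to be reported explicitly.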

Quality assessment of trials
Regarding methodological quality, 34% (73/214) of included RCTs scored 5 out of 5 points on the Jadad scale. More than half of the trials (118/214, 55%) scored 6 out of 6 points on the MOMENT scale (Table S1).

Main findings
This is the first systematic review of outcomes used to measure vasomotor symptoms in randomised controlled trials of interventions. Our findings demonstrate major inconsistencies in how treatments for the same symptoms have been evaluated. For example, the severity of hot flushes and night sweats had nine different definitions. Overall, the most commonly used outcomes for vasomotor symptoms (based on 214 RCTs including 22 682 women) were the frequency and intensity of vasomotor symptoms or a compound measure of these (n = 59). A smaller number of studies (n = 5) took a different approach and measured the interference due to vasomotor symptoms. It remains unclear which measures of vasomotor symptoms best reflect the priorities of patients, clinicians, and researchers, and this will be addressed by the development of a Core Outcome Set. Inclusion of the Core Outcome Set in future intervention studies for vasomotor symptoms will enhance the quality and relevance of trials and facilitate decision-making by clinicians and patients.

Strengths and limitations
We conducted a comprehensive search with a robust methodological design to include all large (at least 20 participants per arm) RCTs of interventions for vasomotor symptoms. Two researchers independently evaluated the available evidence to minimise the risk of overlooking relevant studies. To our knowledge, this is the first time that reported vasomotor-related outcomes have been synthesised, a necessary step to inform the Delphi process for key stakeholders to rate the components of the core outcome set. 12 Although most included trials were of medium or high methodological standard, the diverse nature of the outcome measures used diminishes the value of these trials to inform patient choices and clinical decision-making. This study focused on menopausal vasomotor symptoms in the first instance. We recognise that personal, ethnic, cultural, and geographical factors influence the nature and experience of menopause, and that not all women experience vasomotor symptoms at menopause. 235 We comprehensively searched three major databases and it is unlikely we have missed an RCT published elsewhere. We acknowledge we did not search CINAHL, but we doubt that additional RCTs could only be identified in that database. We included randomised trials with vasomotor-related symptoms as a primary outcome and at least 20 participants in each arm, excluding observational studies and pilot studies. However, given the large number of included studies, we do not anticipate that we have missed outcomes not captured in larger trials. We recognise that this systematic review was limited to trials where vasomotor symptoms were the primary outcome. However, given the large number of studies included, we do not consider that we missed important outcomes. The expert panel and Delphi process will highlight any additional important outcomes that may have been overlooked because they were not included in treatment studies or were reported as secondary outcomes.
We also appreciate that our findings may be skewed towards FDA-driven outcomes, as many studies were conducted in the USA. We have only listed the range of outcome measures used and have not applied any qualitative assessment of the value or importance of these measures for women or clinicians. Only a few relatively recent RCTs have measured the impact of interventions on the interference caused by vasomotor symptoms, and it is uncertain whether the frequency/severity or interference of symptoms best reflects women's treatment priorities. These issues will be addressed by the Delphi survey and the subsequent consensus meeting. Most published RCTs of interventions for vasomotor symptoms focused on Caucasian women, who may experience menopause differently from women of other ethnicities. 236 The COMMA consortium includes representation from a wide range of geographical areas and ethnic groups to ensure that the Core Outcome Set reflects variations in stakeholder priorities.

Interpretation
Inconsistency in measures used for the evaluation of treatments for vasomotor symptoms limits comparisons between treatments and the interpretation of findings for [...] 237 Understanding the efficacy of new treatments and how they compare with existing approaches requires the use of standardised outcome measures that are meaningful to patients, and feasible for clinicians and researchers. 238 A Core Outcome Set does not preclude the inclusion of additional outcome measures, but sets a minimum standard of outcomes that should be reported in all interventional trials. This systematic review is the first step towards the development of meaningful consensus by identifying how vasomotor symptoms have been measured to inform consensus through the Delphi process. 239

Conclusion
Most intervention studies for vasomotor symptoms have measured frequency or severity of symptoms, or a combination of both. Some have measured the interference caused in daily life due to symptoms. There is a need for consensus around the optimum outcomes and how these should be measured to facilitate comparisons between interventions and ensure patient-centred clinical practice.

Disclosure of interests
No conflict of interest to disclose. Completed disclosure of interest forms are available to view online as supporting information.

Contribution of authorship
MH and SI conceived the idea and set the protocol. GM, RN, and MAL refined the protocol. SI and WW conducted the systematic search, and SI wrote the first draft of the paper with contribution from WW in writing up the methods. MSH provided useful insight regarding the bothersome aspect of menopausal symptoms. All authors edited and accepted the manuscript prior to submission.

Details of ethics approval
No ethics approval was required as we have summarised already published data.

Supporting Information
Additional supporting information may be found online in the Supporting Information section at the end of the article. Table S1. Study characteristics and quality assessment scoring of the included studies. Table S2. Vasomotor-related outcomes measured by diary.