Diagnostic accuracy of the partograph alert and action lines to predict adverse birth outcomes: a systematic review

Background There are questions about the use of the ‘one-centimetre per hour rule’ as a valid benchmark for assessing the adequacy of labour progress.
Objectives To determine the accuracy of the alert (1-cm/hour) and action lines of the cervicograph in the partograph to predict adverse birth outcomes among women in first stage of labour.
Search strategy PubMed, EMBASE, CINAHL, POPLINE, Global Health Library, and reference lists of eligible studies.
Selection criteria Observational studies and other study designs reporting data on the correlation between the alert line status of women in labour and the occurrence of adverse birth outcomes.
Data collection and analysis Two reviewers at a time independently identified eligible studies and independently abstracted data including population characteristics and maternal and perinatal outcomes.
Main results Thirteen studies in which 20 471 women participated were included in the review. The percentage of women crossing the alert line varied from 8 to 76% for all maternal or perinatal outcomes. No study showed a robust diagnostic test accuracy profile for any of the selected outcomes.
Conclusions This systematic review does not support the use of the cervical dilatation over time (at a threshold of 1 cm/h during active first stage) to identify women at risk of adverse birth outcomes.
Tweetable abstract Alert line of partograph does not identify women at risk of adverse birth outcomes.


Introduction
Although most women and their babies are considered to be at low risk of complications at the onset of labour, 1,2 the time around childbirth is associated with the highest risk of maternal and perinatal mortality and morbidity. [3][4][5] Unfortunately, the task of identifying pregnant women at risk of developing complications through labour, birth, and immediate postpartum is not trivial. 6 Even in high-resource settings, between 20 and 30% of low-risk women present unexpected intrapartum complications such as dystocia, postpartum haemorrhage, infection, fetal distress or neonatal complications that would require specific obstetric or neonatal care. 1,2 For over two decades, the partograph has been the paper tool routinely applied to supporting decision-making during labour with the aim of optimising time of interventions and referral. 7 The central feature of the current partograph design is the cervicograph where cervical dilation is plotted, usually from 4 cm against time with an acceptable rate of dilatation at 1 cm per hour as designated by the alert line and, supposedly, representing the slowest tenth centile of nulliparous women in labour. 8,9 The action line was drawn at intervals of 4 hours after the alert line and was meant to identify abnormally slow labours and trigger review by medical staff with a view to augmentation, termination of labour or supportive therapy for women crossing this line.
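The geometry of the two lines described above can be sketched in a few lines of Python. This is an illustration only and not part of the review: the function name `line_status`, the default start of plotting at 4 cm, and the convention that a point lying exactly on a line has not crossed it are assumptions made here.

```python
def line_status(hours_since_first_plot, dilatation_cm,
                start_cm=4.0, rate_cm_per_h=1.0, action_offset_h=4.0):
    """Return (crossed_alert, crossed_action) for one plotted observation.

    Assumes the first observation is plotted on the alert line at time zero,
    and that the action line runs parallel to the alert line, shifted to the
    right by `action_offset_h` hours.
    """
    # Time at which the alert line reaches the observed dilatation
    alert_time_h = (dilatation_cm - start_cm) / rate_cm_per_h
    # The action line reaches the same dilatation a fixed number of hours later
    action_time_h = alert_time_h + action_offset_h
    # A point is "crossed" when it lies strictly to the right of the line
    crossed_alert = hours_since_first_plot > alert_time_h
    crossed_action = hours_since_first_plot > action_time_h
    return crossed_alert, crossed_action


# Hypothetical observations: (hours since first plot, dilatation in cm)
slow = line_status(2.0, 5.0)       # 1 cm gained in 2 hours
very_slow = line_status(7.0, 6.0)  # 2 cm gained in 7 hours
fast = line_status(3.0, 8.0)       # 4 cm gained in 3 hours
```

Under these assumptions, the first observation lies to the right of the alert line only, the second lies to the right of both lines, and the third lies to the left of both.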
Since the 1990s, WHO has promoted the use of the partograph during active phase of labour with a 4-hour action line for monitoring progress of labour. [10][11][12] However, more recently, several observational studies have raised questions about the use of the 'one-centimetre per hour rule' as a valid benchmark for assessing the adequacy of labour progress. [13][14][15] In this context, there is a need for a systematic assessment of the utility of the cervicograph alert and action lines in identifying women at higher risks of complications due to slow labour progression and need of interventions to reduce their risks of adverse birth outcomes. Therefore, we conducted a systematic review to determine the accuracy of the alert (1-cm/h) and action lines of the cervicograph in the partograph to predict adverse birth outcomes among women in first stage of labour.

Methods
This systematic review was conducted in accordance with the PRISMA guidelines, 16 and followed a protocol, as described below.

Eligibility criteria and search strategies
The review identified any study design where data showing the correlation between the alert line status of women in labour and the occurrence of adverse birth outcomes were reported, regardless of when the alert line was plotted (3 or 4 cm). Published or unpublished randomised controlled trials, diagnostic test accuracy studies, cross-sectional studies, and longitudinal studies (retrospective or prospective) were considered eligible for inclusion if they used the WHO partograph, or any modified version of the WHO partograph, with alert and action lines at 1-cm/h cervical dilatation rate threshold and defined a population of nulliparous and/or parous women with near-term or term singleton pregnancy. Women considered at risk of developing complications during labour and childbirth, including women having presented complications during pregnancy, twin pregnancies or non-cephalic presentations, were not excluded. No restriction based on sample size or number of participants with the outcome of interest was applied.
PubMed, EMBASE, CINAHL, POPLINE, Global Health Library, and reference lists of eligible studies were searched for potentially eligible studies. No restrictions related to publication status, date or language were applied. The literature search in electronic databases was carried out in April 2017. The search was updated in February 2019. The search strategy used a combination of the following terms, expanded and adapted for each database: 'partograph', 'partogram', 'alert line', and 'birth outcomes'. Details of the search strategy are provided in Appendix S1. We contacted authors for ongoing and unpublished studies.

Study selection, data collection, data items, and risk of bias
All citations identified through the electronic search were downloaded into reference management software, and duplicates were removed. All titles and abstracts were screened in duplicate by three independent reviewers (JPS, OTO, MB) considering the eligibility criteria. Full texts of potentially eligible articles were assessed independently by two reviewers at a time (JPS, OTO, MB). Data were extracted using a standardised electronic table developed for this review and based on adapted criteria of the STARD (Standards for Reporting Diagnostic Accuracy Studies). 17 Extracted data were double-checked by a second reviewer (JPS or MB). Discrepancies on inclusions and/or data extraction were resolved through discussion or, if required, by the third reviewer.
Data extracted included the following domains: general information (author, title, publication date, country(ies) where the study took place, sample size); source of data; characteristics of participants (participant eligibility and recruitment method, participant characteristics, interventions during labour, study dates); description of use of partograph (including cervical dilatation to start plotting, time intervals between the alert and the action lines); adverse birth outcomes (definition of outcomes and measurement); and missing data. Study outcomes included fresh stillbirths, maternal (death, uterine rupture, organ dysfunction with dystocia), and neonatal outcomes (Apgar score at 1 and 5 minutes, resuscitation at birth, birth asphyxia/perinatal hypoxic-ischaemic encephalopathy, labour ward deaths), as defined by authors. Data extracted from Diarra 18 correspond to the full publication of the thesis 19 as the journal publication has some data inconsistencies (confirmed with the authors).
For each study, the number of women in each of the following four categories was determined: women who crossed the alert line and had adverse birth outcomes, women who crossed the alert line and did not have adverse birth outcomes, women who did not cross the alert line and had adverse birth outcomes, women who did not cross the alert line and did not have adverse birth outcomes. Similarly, if available, the equivalent data for the action line were collected.
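The four categories correspond to the cells of a standard 2 × 2 diagnostic table. As a minimal sketch, assuming individual records were available as (crossed alert line, adverse outcome) pairs, the tally could look like the following. The function name `tally_two_by_two` and the example records are hypothetical and are not data from any included study.

```python
def tally_two_by_two(records):
    """Count the four categories from (crossed_line, adverse_outcome) pairs.

    TP: crossed the line and had an adverse outcome
    FP: crossed the line and had no adverse outcome
    FN: did not cross the line and had an adverse outcome
    TN: did not cross the line and had no adverse outcome
    """
    table = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for crossed, adverse in records:
        if crossed and adverse:
            table["TP"] += 1
        elif crossed:
            table["FP"] += 1
        elif adverse:
            table["FN"] += 1
        else:
            table["TN"] += 1
    return table


# Hypothetical records, invented for illustration only
example_records = [
    (True, True), (True, False), (True, False),
    (False, True), (False, False), (False, False), (False, False),
]
example_table = tally_two_by_two(example_records)
```

The same tally applies to the action line by replacing the first element of each pair with action line status.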
We developed a risk of bias assessment checklist, based on existing tools. 20,21 The assessment included the following domains: population selection (appropriate sampling and inclusions/exclusions), study attrition, measurement (temporality of the observations, outcomes measurement), and analysis (primary intent of the study). Quality of the studies was assessed by one reviewer (MB) and checked by a second reviewer (JPS). Discrepancies were resolved through discussion until consensus. The studies were assessed to be at low, high or unclear risk of bias based on whether the criterion is adequately fulfilled in the study or the study report does not provide sufficient information to allow for a clear judgement (Figure S1).

Data analysis
The percentage of women crossing the alert line and the prevalence of adverse birth outcomes were determined for each study. The sensitivity, specificity, positive likelihood ratio, negative likelihood ratio, diagnostic odds ratio, and J statistic with their 95% confidence intervals were calculated to estimate the accuracy of the alert and action lines in the identification of women who would develop adverse maternal, fetal or neonatal outcomes. The diagnostic odds ratio is the ratio of the odds of disease in test positives relative to the odds of disease in test negatives: (TP × TN)/(FP × FN). 22 The J statistic summarises the performance of a binary classifier 23 and also expresses the proportion of ideal performance of a diagnostic test. It is calculated by (sensitivity + specificity) − 1, and a score close to '1' indicates higher predictive capability. Diagnostic odds ratio was not computed for studies with zero values in one of the four categories described above. Interpretation of these statistics was performed as described in Table S2.
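The point estimates named in this paragraph follow directly from the four cell counts. The sketch below uses the standard textbook formulas. The function name `accuracy_measures` and the example counts are hypothetical, confidence intervals are omitted because the interval method is not stated here, and returning `None` for undefined ratios is a choice made for this illustration.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Point estimates of diagnostic accuracy from a 2 x 2 table."""
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None

    lr_pos = lr_neg = j_statistic = None
    if sensitivity is not None and specificity is not None:
        # LR+ = sensitivity / (1 - specificity); undefined if specificity is 1
        if specificity < 1:
            lr_pos = sensitivity / (1 - specificity)
        # LR- = (1 - sensitivity) / specificity; undefined if specificity is 0
        if specificity > 0:
            lr_neg = (1 - sensitivity) / specificity
        # J statistic = sensitivity + specificity - 1
        j_statistic = sensitivity + specificity - 1

    # The diagnostic odds ratio is not computed when any cell is zero
    dor = None if 0 in (tp, fp, fn, tn) else (tp * tn) / (fp * fn)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": dor,
        "j": j_statistic,
    }


# Hypothetical counts, invented for illustration only
example = accuracy_measures(tp=30, fp=270, fn=20, tn=680)
```

With these invented counts, sensitivity is 0.60, specificity is about 0.72, the positive likelihood ratio is about 2.1, the negative likelihood ratio is about 0.56, the diagnostic odds ratio is about 3.8, and the J statistic is about 0.32.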
If outcome definitions were similar, the results of the studies were pooled by birth outcome, calculating summary sensitivity and specificity values. Results are also presented by outcome using paired forest plots. A composite outcome including fetal, maternal, and neonatal outcomes was also used if the data allowed differentiation of fresh stillbirths and at least one neonatal outcome. When multiple neonatal outcomes were reported, Apgar at 5 minutes or resuscitation was used to construct the composite outcome. A similar analysis considering the action line was carried out, and results are presented in Tables S4 and S5. The analyses were performed using an electronic spreadsheet (Microsoft Office Professional Plus 2010, Version 14.0) programmed with the standard formulas for diagnostic accuracy measures. Paired forest plots were designed using the scatter plots function in Excel 2010. Core outcome sets and patient involvement are not relevant for this review.
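The pooling method is not specified in this section. One simple possibility, shown here purely as an assumption, is to sum the cell counts across studies and compute sensitivity and specificity from the totals. The function name `pooled_sensitivity_specificity` and the study tables are hypothetical. This naive approach ignores between-study heterogeneity.

```python
def pooled_sensitivity_specificity(tables):
    """Naive pooling (an assumption): sum the 2 x 2 cells across studies."""
    tp = sum(t["TP"] for t in tables)
    fp = sum(t["FP"] for t in tables)
    fn = sum(t["FN"] for t in tables)
    tn = sum(t["TN"] for t in tables)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical study tables, invented for illustration only
study_tables = [
    {"TP": 10, "FP": 90, "FN": 10, "TN": 390},
    {"TP": 20, "FP": 180, "FN": 10, "TN": 290},
]
pooled_sens, pooled_spec = pooled_sensitivity_specificity(study_tables)
```

In this invented example the pooled sensitivity is 0.60 and the pooled specificity is about 0.72, with each study contributing in proportion to its number of events and non-events.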

Results
The search strategies returned a total of 1007 potentially relevant citations (876 in April 2017 and additional 131 in February 2019), and 69 studies had full-text manuscripts assessed for eligibility. A total of 13 studies in which 20 471 women participated were included in the review (Figure 1). The characteristics of included studies are provided in Table S1. Studies were conducted mainly in secondary or tertiary care facilities in Africa (12 106 women from Mali, Nigeria, Senegal, South Africa, and Uganda), the Americas (733 women from Brazil and Ecuador), Asia (7292 women from India, Indonesia, Malaysia, Thailand), and the Middle East (140 women from Iran). Two studies were conducted in the early 1990s and 10 on or after 2005. Three studies reported starting plotting cervical dilatation at 3 cm. [24][25][26] Most of the studies included nulliparous (30-40% of the samples where specified) and parous women, one study included nulliparous women only, 27 and the population was not specified in one study. 26 Most women had no history of medical, surgical or obstetric problems and were included from 4 cm cervical dilatation, in spontaneous labours and vertex presentation. 18,26,[28][29][30][31][32] Two studies included spontaneous or induced labours and any fetal presentation. 25,33 One study 33 also included women with pre-labour complications during pregnancy (12% in total). Women were attended by a range of providers: community health workers, 24 midwives, 25,28,33,34 and obstetricians and midwives. 33 In three studies, midwives were under the supervision of an obstetrician 28 or cases showing an abnormal course of labour were re-evaluated by a senior obstetrician. 27,35 When reported, frequency of interventions during labour ranged from 7.6 to 46% for oxytocin augmentation and from 17 to 100% for artificial rupture of membranes in women in the active phase of labour. Caesarean sections were performed in 2.7-60% of the births.
Table 1 and Figure 2A present the percentages of women who crossed the alert line, the prevalence of adverse fetal outcomes, and diagnostic accuracy measures for the five included studies providing data on occurrence of fresh stillbirths (n = 17 029 women). The frequency of fresh stillbirths varied from 0 to 1.4% and alert line crossing varied from 9.3 to 75.9%. The sensitivity of the 1-cm/h threshold (alert line) ranged from 36.0 to 100%, and the specificity from 24.1 to 91.1% for prediction of stillbirth during labour.
The same measures described above for neonatal outcomes are presented in Table 2 and Figures 2B,C, including Apgar score <7 at 5 minutes, birth asphyxia, and neonatal mortality following failed resuscitation after birth. There is wide variation in the frequency of adverse neonatal outcomes across the included studies (0.6-17.2%), with between 19.2 and 75.9% of women crossing the alert line. Sensitivity and specificity ranges are very wide, and results showed poor accuracy for all neonatal outcomes. No study showed a robust diagnostic test accuracy profile (i.e. positive likelihood ratio >10, negative likelihood ratio <0.20, diagnostic odds ratio >100, J statistic >50-80%). Table S3 presents results for Apgar score <7 at 1 minute and neonatal resuscitation. Heterogeneity of definitions precluded summary estimates for neonatal resuscitation. The diagnostic test accuracy measures for the action line are presented in Tables S4 and S5, respectively, for fresh stillbirths and neonatal outcomes, with similar results. Results of the composite outcome including fetal, maternal, and neonatal outcomes are presented in Table S6.
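The thresholds quoted above for a robust profile can be expressed as a simple check. The sketch below is hypothetical: the function name `is_robust_profile` is invented, requiring all four criteria at once is an assumption, and the J statistic cut-off is taken at the lower bound (0.50) of the quoted 50-80% range.

```python
def is_robust_profile(lr_pos, lr_neg, dor, j_statistic, j_threshold=0.50):
    """Check the quoted thresholds: LR+ > 10, LR- < 0.20, DOR > 100, J > threshold.

    Assumption: all four criteria must hold, and any measure that could not
    be computed (None) counts as not meeting its criterion.
    """
    if None in (lr_pos, lr_neg, dor, j_statistic):
        return False
    return (
        lr_pos > 10
        and lr_neg < 0.20
        and dor > 100
        and j_statistic > j_threshold
    )


# Hypothetical values, invented for illustration only
weak_test = is_robust_profile(lr_pos=2.1, lr_neg=0.56, dor=3.8, j_statistic=0.32)
strong_test = is_robust_profile(lr_pos=12.0, lr_neg=0.10, dor=120.0, j_statistic=0.80)
```

The first invented profile fails every criterion, whereas the second meets all four.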

Main findings
In general, no study showed a robust diagnostic test accuracy profile of the alert and action lines for any of the outcomes studied.

Strengths and limitations
To our knowledge, this is the first review assessing diagnostic accuracy of the partograph alert and action lines for the identification of women at risk of birth complications. Included studies covered populations with diverse obstetric history and characteristics, exposed to a range of healthcare practices and contexts. Most of the included studies provided recent data from the last decade. These characteristics favour generalisability of our results to the current management of women in labour in low- and middle-income countries. No eligible studies were conducted in high-income countries. Large variations in prevalence of adverse outcomes and diagnostic test performance should be interpreted with caution. The interaction between prevalence and sensitivity and specificity should not be overlooked, particularly when the prevalence of the condition is low, and considering that the occurrence of false positives can erode the test performance.
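The effect of low prevalence on false positives can be illustrated with the positive predictive value, obtained from Bayes' theorem. This is a worked example under stated assumptions and not an analysis from the review: the function name `positive_predictive_value`, the sensitivity of 0.60, and the specificity of 0.70 are hypothetical. The two prevalence values are chosen near the lower and upper ends of the range of adverse neonatal outcome frequencies reported in the Results.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Proportion of test positives who truly have the outcome (Bayes' theorem)."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical test characteristics, invented for illustration only
ppv_low_prevalence = positive_predictive_value(0.60, 0.70, prevalence=0.01)
ppv_high_prevalence = positive_predictive_value(0.60, 0.70, prevalence=0.17)
```

With these invented values, about 2% of women crossing the line would have the adverse outcome when prevalence is 1%, compared with about 29% when prevalence is 17%, even though sensitivity and specificity are unchanged.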
Overall quality of the studies was low, mainly in relation to the inherent limitations in the design and conduct of the primary studies. Selection of populations and definitions of outcomes included in the review varied, in particular for resuscitation and low Apgar. We hypothesised that when timing of resuscitation was not reported, resuscitation referred to the period immediately after birth, but cases reported could have referred to any moment during hospital stay after birth and may not be directly related to childbirth. In addition, the review focused on the accuracy of the partograph alert and action lines to identify women and fetuses at risk of birth complications and did not include the same assessment for other variables registered on the partograph. We were unable to assess the usefulness of the alert line in optimising referral of women in labour from rural or primary healthcare facilities to secondary or tertiary units, as only one of the studies identified was conducted in a peripheral hospital. 24 Finally, we did not assess new labour curves and consequent new partographs (i.e. those not abiding by the 1 cm/h dilatation rate rule). 13

Interpretation
The largest study included in this review found a mild increase in the risk of adverse birth outcomes in slow labours compared with fast labours. 33 The authors recognised that fetal and early neonatal outcomes are much more likely to be impacted by events that are not related to cervical dilatation rate. Those authors found that although other partograph variables were associated with mild to moderately increased odds of severe adverse birth outcomes, these also had poor diagnostic performance for the prediction of severe birth outcomes.
In light of the findings of another systematic review on cervical dilatation patterns, 37 it is not surprising that the alert and action lines failed to identify women at higher risk of adverse outcomes. That review showed that it is not uncommon for women to experience long labours and still have good birth outcomes. It also showed that labour progression in women with normal birth outcomes is not linear, and that dilatation rates before 5 or 6 cm may be slower than 1 cm/h, but cervical dilatation rates may be faster after 5-6 cm. These findings may help to explain our results: if it is common for women with good birth outcomes to have labours slower than 1 cm/h, they could have crossed the partograph alert line between 3 and 4 cm and up to 6 cm.
Thus, overall, studies included in this review showed high proportions of women crossing the alert line, in contrast to the expected rate of alert line crossing which was supposed to represent the slowest 10% of labour progress in primigravidas. 9 In this sense, the use of the alert line alone to trigger referrals from peripheral to higher level hospitals may have unnecessarily increased referrals of women who otherwise had labours that were progressing normally. This may have had huge emotional, physical, and cost implications not only for the women, their fetuses, and their families, but also for healthcare providers in referring and receiving facilities and the health system, in particular in places where referral systems are suboptimal or where higher level maternity units are overcrowded or understaffed. 38 Included studies did not systematically report whether protocols were in place for assessment of labour progress, fetal vital status, and birth asphyxia, which could have affected the management of complications and the reported adverse birth outcomes across studies. Differences in protocols used along with the partograph to manage labour, including labour dystocia as depicted by the partograph lines, may have affected progress of labour and outcomes, and increased the risks of iatrogenic adverse outcomes, related to the use of oxytocin, caesarean section or suboptimal referrals, or decreased that risk if interventions to accelerate labour had a positive effect in averting adverse outcomes. The need for intensified monitoring and specialised care, e.g. to monitor an augmented labour or to perform a caesarean section, may have further increased costs and contributed to staff fatigue and burnout in health facilities.
This review does not question the importance of labour monitoring for all women and fetuses. Healthcare professionals should continue to plot other partograph parameters to monitor the well-being of the woman and her baby, and identify risks for adverse birth outcomes until new tools are proven to be more or equally effective. The Cochrane review 7 on the partograph recognises that its use may provide some benefits in terms of quality of care. The alert line continues to be relevant for care of women in healthcare facilities where interventions such as augmentation and caesarean section cannot be performed and where referral-level facilities are difficult to reach.
Recently, some organisations have revised their definitions of active first stage of labour, to start at 5 or 6 cm, 1,39,40 and to accommodate a cervical dilatation rate slower than 1 cm/h as the normal threshold. New tools have been developed to allow for longer labours without intervention. However, assessment of these new definitions and tools is limited: studies have included small samples, 41,42 were conducted in high-income countries or focused on reduction of caesarean section. 36,[41][42][43] Others have focused on evaluation of the impact of new labour definitions on labour outcomes and interventions. Two observational studies conducted in the USA yielded different findings in terms of frequency of maternal and neonatal morbidity, and reduction of caesarean sections. 42,44 There is a need to assess the added value of different designs of labour monitoring tools in the improvement of birth outcomes and reduction of unnecessary interventions during labour. This should also include cost-benefit analysis, considering the potential reduction in unnecessary interventions. 44

Conclusion
The body of evidence compiled in this systematic review does not support the use of a threshold of 1 cm/h of cervical dilatation to identify women at risk of adverse birth outcomes. Women with fast labours (i.e. not crossing the alert, or action, line) are not free of risk of adverse birth outcomes. There is a need to identify optimal benchmarks for assessing progress of labour to guide birth attendants on when best to intervene to reduce adverse birth outcomes for women and infants.

Disclosure of interests
OTO, JPS, and AMG participated in a large study on labour monitoring and action with a component that included assessment of diagnostic accuracy of labour curves in the partograph. MB has no conflicts of interest to

Contribution to authorship
OTO and JPS conceived the review and drafted the protocol of the review, with input from MB and AMG. OTO worked with the WHO information specialists to build the search strategies and undertake the searches. OTO, JPS, and MB performed the initial screening of search outputs, identified eligible studies, and extracted data. JPS and MB performed the data analysis with inputs from the other authors, and MB wrote the first draft of the paper. All authors contributed to revising the final version and approved the manuscript for publication.

Funding
The UNDP/UNFPA/UNICEF/WHO/World Bank Special Programme of Research, Development and Research Training in Human Reproduction (HRP), Department of Reproductive Health and Research, World Health Organization funded the preparation of this systematic review through a grant from the United States Agency for International Development (USAID), as part of the evidence base preparation towards the WHO recommendations on intrapartum care for a positive childbirth experience.

Supporting Information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Figure S1. Risk of bias assessment.
Table S1. Description of included studies.
Table S2. Suggested interpretation of diagnostic accuracy statistics.
Table S3. Diagnostic test accuracy of the alert line for adverse neonatal outcomes.
Table S4. Diagnostic test accuracy of the action line for adverse fetal outcomes.
Table S5. Diagnostic test accuracy of the action line for adverse neonatal outcomes.
Table S6. Diagnostic test accuracy of the alert line for composite adverse birth outcomes.