Cluster analysis identifying clinical phenotypes of preterm birth and related maternal and neonatal outcomes from the Brazilian Multicentre Study on Preterm Birth

To explore a conceptual framework of clinical conditions associated with preterm birth (PTB) by cluster analysis, assessing determinants for different PTB subtypes and related maternal and neonatal outcomes.

Specialists have proposed a new conceptual framework for PTB by selecting conditions that present in the index pregnancy, including maternal, fetal, and placental conditions that are not necessarily risk factors for PTB, but that are reasonably part of its pathways. 2,3 It is possible that there is not just one clinical phenotype related to PTB, and the identification of such phenotypes might shed light on the complex interactions among the underlying conditions related to its occurrence.
A recent cluster analysis from a multi-ethnic international multicenter study showed that 30% of all cases of spontaneous preterm birth (sPTB) had no maternal, fetal, or placental conditions that might be related to its occurrence. 4 On the other hand, some clusters were characterized by severe maternal conditions that potentially share similar underlying pathophysiology, such as pre-eclampsia, third-trimester bleeding, and fetal growth restriction. In addition, it was possible to specify the most frequent clinical conditions related to the occurrence of sPTB. Furthermore, not only did the predisposing causes vary among the clusters, but the maternal and neonatal outcomes were also distinct. 4

A secondary analysis of a database of sPTB cases used a different clustering approach to establish nine clinical phenotypes with three levels of evidence for each phenotype. 5 After a hierarchical cluster analysis, PTB cases were grouped into five clusters characterized by different conditions such as maternal stress, premature rupture of membranes, familial factors, maternal morbidities, and multifactorial conditions. According to the study authors, women from the same cluster were more likely to share common causes and common genetic predispositions. 5

Clustering analysis applied to PTB determinants is thus an innovative approach to identify groups of women who might require special attention, interventions, and surveillance depending on the conditions associated with the different subtypes of PTB and on the maternal and perinatal outcomes. This might help to identify clinical phenotypes related to specific subtypes of PTB, and also facilitate studies of its determinants and associated outcomes, because the maternal clinical conditions can be identified by clinicians and healthcare providers during prenatal care.
The aim of the present study was therefore to perform a secondary analysis of the Brazilian Multicentre Study on Preterm Birth (EMIP) to determine whether clustering of clinical, maternal, and fetal conditions is associated with PTB subtypes, and to describe the maternal and neonatal outcomes related to the final clusters.

| MATERIALS AND METHODS
The present secondary cluster analysis was based on data from EMIP, a multicenter cross-sectional study with a nested case-control component of PTB conducted between April 1, 2011, and September 30, 2012, that collected comprehensive data related to the three subtypes of PTB. The EMIP has been previously described. 6-8 In brief, it was a comprehensive observational study that identified all PTBs occurring in 20 referral facilities with more than 33 000 deliveries, and collected more than 300 variables related to potentially associated factors and maternal and neonatal outcomes. Information about medical history, sociodemographic status, and pregnancy, delivery, and postpartum details was retrospectively collected after delivery through an interview with the participating women and a review of hospital medical records including prenatal charts. Maternal and neonatal data were collected until either discharge or 40 days after delivery.

TABLE 1 Definition of maternal, fetal, and placental conditions potentially associated with preterm birth.

The present analysis used the conceptual framework and the maternal, fetal, and placental conditions of Barros et al., 3,4 which were defined as potential conditions that might be directly or indirectly related to the occurrence of PTB (Table 1). These conditions were used to establish the different preterm phenotypes.
Preterm birth was classified as one of three subtypes: spontaneous preterm birth (sPTB) due to spontaneous onset of labor; preterm premature rupture of membranes leading to preterm birth (pPROM-PTB); or provider-initiated preterm birth (pi-PTB) due to maternal and/or fetal conditions motivating preterm delivery.
The distribution of maternal and neonatal outcomes, including mode of delivery, gestational age category (extreme, moderate, and late preterm), Apgar score <7 at 5 minutes, admission to neonatal intensive care unit (NICU), neonatal near miss (based on birthweight below 1700 g, Apgar score below 7 at 5 minutes of life, and gestational age <33 weeks), and neonatal death before discharge, was determined in the clusters. The distribution of some maternal and pregnancy characteristics in the PTB clusters was also determined. Adequacy of weight gain was categorized as insufficient, adequate, or excessive in accordance with the US Institute of Medicine definition for weekly rate of weight gain. 9

Statistical analysis was conducted using SAS version 9.4 (SAS Institute, Cary, NC, USA). A cluster analysis was conducted to identify clusters based on the predefined maternal, fetal, and placental conditions listed in Table 1. A k-modes model, which is a variation of the k-means model for categorical variables, was applied to identify clusters from the predefined conditions using a fuzzy algorithm. The number of final clusters was determined by automated methods (no predefined number of clusters was set). The χ² test was used to evaluate differences in maternal and neonatal outcomes among the clusters. A P value of less than 0.05 was taken to indicate significance.
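As an illustration of the k-modes idea, the following Python sketch clusters binary condition indicators by simple matching (Hamming) dissimilarity and updates each cluster centre to the column-wise mode. It is not the SAS procedure used in the study: it assigns each case to exactly one cluster (hard rather than fuzzy assignment), it takes the number of clusters k as an input instead of selecting it automatically, and the function names (hamming, column_modes, k_modes) and the toy records are hypothetical.

```python
import random
from collections import Counter


def hamming(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category in each column of a non-empty list of rows."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))


def k_modes(records, k, n_iter=20, seed=0):
    """Hard k-modes on a list of equal-length tuples of categories."""
    rng = random.Random(seed)
    distinct = sorted(set(records))
    if k > len(distinct):
        raise ValueError("k exceeds the number of distinct records")
    # Start from k distinct records so that no two initial modes coincide.
    modes = rng.sample(distinct, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each record goes to its nearest mode.
        new_labels = [
            min(range(k), key=lambda j: hamming(r, modes[j])) for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so the modes would not change
        labels = new_labels
        # Update step: recompute each mode from its current members.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:
                modes[j] = column_modes(members)
    return labels, modes


# Hypothetical toy records (NOT study data). Columns, in order:
# pre-eclampsia, fetal growth restriction, extrauterine infection,
# maternal chronic disease (1 = present, 0 = absent).
toy_records = [(0, 0, 0, 0)] * 4 + [(1, 1, 0, 0)] * 4 + [(0, 0, 1, 1)] * 4

labels, modes = k_modes(toy_records, k=3)
print(Counter(labels))
print(modes)
```

In this toy example the three groups are perfectly separable, so each mode coincides with one of the three condition patterns. With real data, the result depends on the initial modes, which is one reason why fuzzy variants and automated selection of the number of clusters are used in practice.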
Flowchart showing the study population of the Brazilian Multicentre Study on Preterm Birth. The preterm birth subtypes were spontaneous preterm birth (sPTB); preterm birth due to preterm premature rupture of membranes (pPROM-PTB); and provider-initiated preterm birth (pi-PTB).

| RESULTS
The 4150 cases of PTB were clustered into three groups according to the 12 predefined maternal, fetal, and placental conditions (Table 1).
Not having any predefined condition was also considered to be a 'predefined condition'. The prevalence of the main condition and the next most frequent conditions in the three clusters are presented in Table 2.
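Cluster 1 (n=650, 15.7%) was characterized by women who did not have any defined maternal, fetal, or placental conditions. Cluster 2 (n=2319, 55.9%) was characterized by the following set of conditions: 42.5% had extrauterine infection, 34.9% had maternal chronic disease, and approximately 20% had mid-late pregnancy bleeding. Cluster 3 (n=1181, 28.5%) was characterized mainly by pre-eclampsia, eclampsia, or HELLP syndrome, with fetal growth restriction as the second most frequent condition.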
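The reported cluster shares can be checked against the total of 4150 cases. The short Python check below uses the sizes given for clusters 1 and 2 and the size of cluster 3 from the Table 3 column header (n=1181); the variable names are illustrative only.

```python
# Cluster sizes as reported in the text and in the Table 3 column header.
cluster_sizes = {1: 650, 2: 2319, 3: 1181}

total = sum(cluster_sizes.values())
# Share of each cluster as a percentage of all PTB cases, to one decimal.
shares = {c: round(100 * n / total, 1) for c, n in cluster_sizes.items()}

print(total)   # 4150
print(shares)  # {1: 15.7, 2: 55.9, 3: 28.5}
```

The three sizes sum to the 4150 cases analyzed, and the computed shares for clusters 1 and 2 match the reported 15.7% and 55.9%.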
The maternal and neonatal outcomes did not differ among the three clusters (Table 5). Cesarean was the most prevalent mode of delivery, ranging from 52.7% to 55.0% of PTBs in the clusters.
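A comparison of an outcome across clusters with the χ² test can be sketched as follows. This is a minimal illustration in plain Python, not the SAS analysis used in the study. The counts are hypothetical and were invented only to show the calculation, and the function names are assumptions. The closed-form P value used here is valid only for an even number of degrees of freedom, which covers the 3 clusters by 2 outcome categories case (2 degrees of freedom).

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


def chi_square_p_value(stat, dof):
    """Upper-tail probability of the chi-square distribution (even dof only)."""
    if dof <= 0 or dof % 2:
        raise ValueError("closed form implemented for positive even dof only")
    half = stat / 2
    term, total = 1.0, 0.0
    # P(X > stat) = exp(-stat/2) * sum_{i < dof/2} (stat/2)^i / i!
    for i in range(dof // 2):
        total += term
        term *= half / (i + 1)
    return math.exp(-half) * total


# Hypothetical counts (NOT study data): rows are clusters 1 to 3,
# columns are cesarean and vaginal delivery.
toy_table = [[53, 47], [54, 46], [55, 45]]

stat, dof = chi_square_statistic(toy_table)
p_value = chi_square_p_value(stat, dof)
print(stat, dof, p_value)
```

With proportions this similar across rows, the P value is well above 0.05, which corresponds to the situation described in the text where an outcome does not differ among the clusters.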
The distribution of maternal and pregnancy characteristics in the three clusters was determined (Table 6). White race, obesity (body mass index, calculated as weight in kilograms divided by the square of height in meters, >25), excessive weight gain during pregnancy, and previous cesarean delivery were more prevalent in cluster 3 than in cluster 2, and more prevalent in cluster 2 than in cluster 1. None of the other characteristics examined differed among the clusters.

| DISCUSSION
The present analysis found that the 4150 PTBs of the EMIP study were clustered into three groups, which presented with very different clinical conditions (phenotypes). The first cluster had no associated conditions; the second cluster had mixed conditions; and the third cluster was related to pre-eclampsia and fetal growth restriction. No differences in maternal or perinatal outcomes were observed among the clusters; regarding PTB subtype, however, the prevalence of pi-PTB was significantly higher in cluster 3 (P<0.001).

TABLE 3 Distribution of maternal, fetal and placental conditions according to clusters of preterm birth phenotype (table body not recovered; surviving column header: Cluster 3, n=1181).
The study used an unsupervised data-driven cluster analysis, which meant that no clusters were predefined and the number of clusters was not fixed in advance. This approach enables a more genuine clustering of cases according to the predefined clinical conditions. The reproducibility of cluster analysis might depend on the dataset, and also on the availability of the defined clinical conditions. Nevertheless, it was considered that the selected clinical conditions are reproducible and commonly addressed in PTB studies, and are potentially available regardless of the setting or population.
The EMIP study followed standardized data collection protocols and several procedures to assure data quality. 10 Nevertheless, the present analysis has some limitations. First, there were no data on cervical length, a maternal condition that is highly associated with the occurrence of sPTB. 11 Second, it was an observational study with retrospective data collection after delivery for variables related to pregnancy. Therefore, the classification of some conditions was based only on self-report by the participating women or on medical records/prenatal charts, which limits standardization and auditing. Last, the definition of maternal chronic disease was based on different diseases that have potentially distinct effects on maternal and fetal health during pregnancy.
In the present analysis, the conceptual framework used by Barros et al. […] 80% of women in the mixed-conditions cluster (cluster 2) had sPTB or pPROM-PTB, confirming that women with these subtypes of PTB may have multiple conditions, which the present cluster analysis grouped into a single inseparable cluster, in contrast to the findings of Barros et al. 4

Not surprisingly, the cluster of women with pre-eclampsia, eclampsia, or HELLP syndrome also included fetal growth restriction as the second most frequent condition (cluster 3). Both conditions are "great obstetric syndromes" that are directly linked to ischemic placental disease and share common altered placentation mechanisms. 15,16 Hypertensive disorders and fetal growth restriction are the most important indications for pi-PTB due to maternal or fetal conditions, 8,17 which explains the high rates of pi-PTB in cluster 3. The prevalence of obesity and excessive weight gain during pregnancy was higher in cluster 3 than in the other clusters. Obesity and excessive weight gain are considered risk factors for hypertensive disorders, but not for fetal growth restriction. 18 It is estimated that pre-eclampsia and fetal growth restriction account for only approximately 12% of ischemic placental disease in PTBs. 19 Although pre-eclampsia and fetal growth restriction can occur together, and their concurrence is followed by poorer outcomes, the risk factors and conditions associated with each do not invariably overlap. 19,20

Although the present analysis identified three clusters with very distinct clinical phenotypes, we consider that a clearer definition of the predefined conditions would provide better cluster resolution, given that women were grouped into a very small number of clusters, one of which included multiple mixed conditions.
For example, infectious diseases are underlying causes of PTB; however, the lack of details regarding the severity, treatment received, and timing of the infection might have led to underestimation of the association of such conditions with PTB. This condition was grouped in the same cluster as many other conditions (cluster 2). Rather than simply noting that women had an infectious […]

TABLE 6 Maternal and pregnancy characteristics according to preterm birth phenotype clusters.

CONFLICTS OF INTEREST
The authors have no conflicts of interest.