Diagnostic accuracy of postmortem ultrasound vs postmortem 1.5‐T MRI for non‐invasive perinatal autopsy

ABSTRACT
Objectives To determine the diagnostic accuracy of postmortem magnetic resonance imaging (PM‐MRI) and postmortem ultrasound (PM‐US) for perinatal autopsy in the same patient cohort, and to determine whether PM‐US can provide the same anatomical information as PM‐MRI.
Methods In this prospective, 5‐year (July 2014–July 2019) single‐center study, we performed 1.5‐T PM‐MRI and PM‐US in an unselected cohort of perinatal deaths. The diagnostic accuracies of both modalities were calculated, using autopsy as the reference standard. As a secondary objective, the concordance rates between the two imaging modalities for the overall main diagnosis and for five anatomical regions (brain, spine, thorax, heart and abdomen) were calculated.
Results During the study period, 136 cases underwent both PM‐US and PM‐MRI, of which 88 (64.7%) also underwent autopsy. There was no significant difference in the rates of concordance with autopsy between the two modalities for overall diagnosis (PM‐US, 86.4% (95% CI, 77.7–92.0%) vs PM‐MRI, 88.6% (95% CI, 80.3–93.7%)) or in the sensitivities and specificities for individual anatomical regions. There were more non‐diagnostic PM‐US than PM‐MRI examinations for the brain (22.8% vs 3.7%) and heart (14.7% vs 5.1%). If an ‘imaging‐only’ autopsy had been performed, PM‐US would have achieved the same diagnosis as 1.5‐T PM‐MRI in 86.8% (95% CI, 80.0–91.5%) of cases, with the highest rates of agreement being for spine (99.3% (95% CI, 95.9–99.9%)) and cardiac (97.3% (95% CI, 92.4–99.1%)) findings and the lowest being for brain diagnoses (85.2% (95% CI, 76.9–90.8%)).
Conclusion Although there were fewer non‐diagnostic cases using PM‐MRI than for PM‐US, the high concordance rate for overall diagnosis suggests that PM‐US could be used for triaging cases when PM‐MRI access is limited or unavailable.
© 2020 The Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of International Society of Ultrasound in Obstetrics and Gynecology.


INTRODUCTION
A perinatal autopsy can provide useful additional clinical information in approximately 25-36% of cases 1,2, not only allowing parents to understand the circumstances surrounding their child's death, but also helping to refine clinical management for future pregnancies. Nevertheless, more than half of all parents typically decline a conventional autopsy for personal, emotional or religious reasons, many referring to the invasive nature of the procedure [3][4][5]. Non-invasive autopsies, utilizing imaging techniques, have therefore seen an increase in popularity, with postmortem magnetic resonance imaging (PM-MRI) and postmortem ultrasound (PM-US) demonstrating high rates of concordance with autopsy as the reference standard 6,7.
Much of the published literature relating to PM-US and PM-MRI reports diagnostic accuracy rates within different cohorts of patients, making direct comparison of these two imaging modalities difficult. Recent work by Kang et al. 8 showed that when PM-US and PM-MRI (at 3 T) were used in the same cohort of fetuses after perinatal death, their rates of concordance with autopsy were similar when diagnostic-quality imaging was achieved (approximately 81-96% concordance with autopsy). However, access to 3-T MRI may be limited and, while 1.5-T MRI is much more widely available, some centers may have no practical access to MRI at all, as pressure on clinical scanner time may mean that postmortem cases cannot be accommodated. How the diagnostic accuracy of PM-US compares with that of 1.5-T PM-MRI in the same perinatal cohort is currently unknown. This is important, as PM-US could be an easily accessible, inexpensive alternative screening tool when access to MRI is limited, provided that it has sufficient diagnostic accuracy.
In this prospective, single-center cohort study, our objectives were two-fold. First, we aimed to determine the diagnostic accuracy of both 1.5-T PM-MRI and PM-US, using autopsy as the reference standard, in the same cohort of perinatal deaths. Second, we aimed to review the major differences between the two imaging modalities in overall and organ-specific diagnosis, to evaluate the impact that using PM-US rather than 1.5-T PM-MRI would have. These outcomes could provide an evidence basis for setting national non-invasive autopsy imaging protocols.

METHODS
Ethical approval was granted for this single-center, prospective cohort study, conducted at Great Ormond Street Hospital, London, UK (IRAS ID:13195; REC reference: 13/LO/1494). All parents gave written consent allowing postmortem imaging and autopsy (where this was performed) to be conducted. The study was approved by a national research ethics committee (REC 09/H0713/2) and all samples were handled in accordance with the Human Tissue Act (2004).

Patient selection
We included in this study, over a 5-year period (July 2014-July 2019), consecutive unselected perinatal deaths which had both perinatal PM-US and 1.5-T PM-MRI. Cases referred for any type of perinatal autopsy were included in our cohort. The decision to undertake both imaging modalities was not predetermined, being based on the availability of both PM-US and 1.5-T PM-MRI at the time of case referral, and the two examinations were performed in an arbitrary order. No inclusion or exclusion criteria were applied.
Demographic details for each case, including gestational age, gender, mode of fetal death, date of birth/death, postmortem weight and fetal size (i.e. crown-rump and crown-heel lengths) were documented. A maceration score (based on a severity scale of 0 to 3; with 3 assigned to cases with extensive/marked maceration) was also assigned to each case, based on the pathologists' description at external examination.

PM-MRI and PM-US imaging and reporting
PM-MRI was performed on a 1.5-T Avanto (Siemens, Munich, Germany) scanner, according to published local departmental protocols 9,10 . In brief, this included whole-body, isovolumetric T2-weighted and T1-weighted sequences with diffusion-weighted imaging. PM-US examinations were performed using a dedicated ultrasound machine based in the hospital mortuary (UGEO HM70A, Samsung, Munich, Germany, equipped with a 7-16-MHz linear probe). The examinations were performed by a pediatric radiology research fellow (S.C.S., with 4 years' experience in postmortem pediatric imaging and 6 years' general pediatric radiology experience) according to previous publications 11,12 .
The radiology report was issued on the same day as the imaging examination, and entered into our local radiology information system and the pathology database. All radiology reports followed a predefined reporting template, divided according to five anatomical regions: brain, spine, heart, thorax and abdomen. For each area, the radiologist specified either 'normal' or 'abnormal', with further description of the particular abnormality and organ involved within the body area. A final overall diagnosis (normal/abnormal, with further description) was also provided. When the imaging for a particular anatomical region was non-diagnostic, this was also recorded.
The PM-MRI findings were reported by a specialist pediatric radiologist with expertise in postmortem imaging (O.J.A., with 10 years' experience of postmortem pediatric imaging). The PM-US studies were reported by the same radiology research fellow who had performed the examination (S.C.S.). Both radiologists were blinded to the antenatal and maternal history, and to each other's reports; they were informed only of the patient's gestational age and manner of death (i.e. termination of pregnancy, stillbirth, miscarriage).

Histological sampling and autopsy
When parental consent had been provided for a full, conventional autopsy, this was conducted according to the Royal College of Pathologists autopsy guidelines [13][14][15] by one of seven experienced pediatric pathologists at our institution. When consent had been provided for a minimally invasive autopsy (MIA), this was either conducted by a pediatric pathologist via a laparoscopic keyhole technique 16,17 or the radiology research fellow performed ultrasound-guided biopsies 18 of the major organs, which were then assessed histologically by a pediatric pathologist. When consent was given only for external, non-invasive autopsy, no incisions were made and no tissue sampling was performed. As part of routine practice, dissection of the spine is not performed for any of the autopsy types. For MIA, brain dissection is not performed unless there is specific additional consent for this.
In all cases, the pediatric pathologist issuing the final autopsy report was aware of the imaging findings. The results from the autopsy were subsequently entered into the pathology database, with abnormalities being specified according to the same five predefined body areas as in the imaging reports.

Data analysis
Results from the radiology and autopsy reports were extracted from the pathology database and input into a dedicated research database in Microsoft Excel (Microsoft Corp., Redmond, WA, USA). For our primary aim, we calculated the diagnostic accuracy rates for PM-MRI and PM-US, using autopsy as the reference standard. In this calculation, we included only cases in which histological tissue sampling had been performed (i.e. full conventional autopsy or MIA). Descriptive statistics and diagnostic accuracy calculations using exact methods were used to derive sensitivity, specificity, positive (PPV) and negative (NPV) predictive values and positive and negative likelihood ratios, as well as concordance for organ-specific findings and overall diagnoses. Non-diagnostic body parts were also analyzed, but excluded from accuracy rate calculations. For our secondary aim, all cases that had undergone both PM-US and PM-MRI (regardless of autopsy type) were included for analysis, and overall differences between the two modalities with respect to both the main diagnosis and the findings according to organ system were compared using descriptive statistics.
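The accuracy measures named above can be illustrated with a minimal Python sketch that derives them from a single 2x2 table of imaging result against autopsy. The class name `TwoByTwo`, the helper `safe_ratio` and the counts in the example are hypothetical and are not taken from the study data; the sketch also omits the confidence-interval step.

```python
import math
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts for one modality and one region, with autopsy as reference."""
    tp: int  # imaging abnormal, autopsy abnormal
    fp: int  # imaging abnormal, autopsy normal
    fn: int  # imaging normal, autopsy abnormal
    tn: int  # imaging normal, autopsy normal


def safe_ratio(num: float, den: float) -> float:
    """num/den; infinity if only den is zero, NaN if both are zero."""
    if den == 0:
        return math.nan if num == 0 else math.inf
    return num / den


def accuracy_metrics(t: TwoByTwo) -> dict:
    """Point estimates only; non-diagnostic regions are assumed excluded."""
    sens = safe_ratio(t.tp, t.tp + t.fn)
    spec = safe_ratio(t.tn, t.tn + t.fp)
    total = t.tp + t.fp + t.fn + t.tn
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": safe_ratio(t.tp, t.tp + t.fp),
        "npv": safe_ratio(t.tn, t.tn + t.fn),
        "lr_pos": safe_ratio(sens, 1 - spec),   # sensitivity / (1 - specificity)
        "lr_neg": safe_ratio(1 - sens, spec),   # (1 - sensitivity) / specificity
        "concordance": safe_ratio(t.tp + t.tn, total),
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen so that there are no false positives.
    example = TwoByTwo(tp=8, fp=0, fn=2, tn=20)
    for name, value in accuracy_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

With no false-positive results, specificity is 1 and the positive likelihood ratio is infinite, which is why an LR+ of ∞ can appear for a region in which imaging never overcalled an abnormality.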

Statistical power calculation
Prior to the study, we calculated the sample size required to detect a 10% difference in concordance between PM-MRI and PM-US in all five anatomical regions assuming independence between regions, a 5% significance level and 90% power. The sample-size formula for a matched comparison of two diagnostic tests was used 19 . Calculations were based on concordance estimates derived from local PM-MRI data 6 and assumed that PM-US has 10% lower sensitivity in all five anatomical areas. The abdominal region required the largest sample size to maintain 90% power. Assuming an MRI concordance with autopsy of 89% in the abdominal region (and therefore 79% for PM-US), we determined that a cohort of at least 278 cases would be required.

RESULTS
Cases that did not have an autopsy were of a slightly lower average gestational age (24 weeks (non-invasive autopsy) vs 26 weeks (full, conventional autopsy) and 29 weeks (MIA)), and thus had an associated lower postmortem weight and length. Cases that underwent MIA were more likely to be markedly macerated (44.6% of cases) than were those which underwent full autopsy (18.8%) or non-invasive autopsy (14.6%).
In Table S1 we have provided the positive and negative likelihood ratios of both imaging modalities for the different body systems and the overall diagnosis. The highest positive likelihood ratios were for brain PM-MRI and PM-US diagnoses (∞), with those for overall diagnosis on PM-US being 12.23 (95% CI, 4.04-36.99) and on PM-MRI being 7.07 (95% CI, 3.33-15.03). The values for both positive and negative likelihood ratios were not significantly different between PM-US and PM-MRI.
Details of individual false-negative and false-positive diagnoses are provided according to anatomical region in Table S2. For the brain, there were no false-positive diagnoses generated by either imaging modality, although PM-MRI failed to identify one case of cerebellar hypoplasia and both modalities were unable to detect a case of severe hypoxic-ischemic encephalopathy with periventricular necrosis and hindbrain neuronal loss.

[Table footnote] Values in parentheses are 95% CI. There were no statistically significant differences in diagnostic accuracy between the two imaging modalities. *Overall diagnosis refers to major pathology identified as cause of perinatal death. FN, false negative; FP, false positive; ND, non-diagnostic; NPV, negative predictive value; PPV, positive predictive value; TN, true negative; TP, true positive.

Discrepancies in thoracic findings arose mainly from the overcalling or missing of pulmonary hypoplasia (in three cases for PM-US only; in two cases for PM-MRI only; and in three cases for both modalities), suggesting a high level of subjective opinion in this diagnosis. For cardiac anomalies, PM-US missed four diagnoses (one case of dilated cardiomyopathy, one of double-outlet right ventricle (DORV), one of cardiac hypertrophy and dysplastic pulmonary valve, and one, also missed by PM-MRI, of cardiomegaly) and overcalled one case of ventricular septal defect (VSD), while PM-MRI missed two cardiac anomalies (the case of cardiomegaly and one of VSD) and overcalled one case of DORV.
There were no misses on either modality for abdominal pathologies. However, PM-MRI generated more false-positive diagnoses compared with PM-US; two abnormalities were overcalled on both imaging modalities, one case of anal atresia (in which the anus was present and patent) and one case of intra-abdominal gas with suspected sepsis (in which no organisms were identified on microbiology of the abdomen, although the placenta demonstrated chorioamnionitis).
Examples of cases in which both imaging modalities identified correctly the pathological diagnosis are shown in Figures 1 and 2, while Figures 3-5 give examples of when one or both modalities were inaccurate.

Cases with autopsy, but no neuropathology
For the 57/88 (64.8%) cases in which the brain was not examined at autopsy, the breakdown of imaging results was as follows. On both PM-US and PM-MRI, 32 of the 57 (56.1%) were normal and 1/57 (1.8%) was judged to have absent corpus callosum (ACC), ventriculomegaly and periventricular nodular heterotopia. On PM-MRI only (PM-US was normal), 1/57 (1.8%) had isolated ACC, 1/57 (1.8%) had cystic hygroma and 1/57 (1.8%) had cerebellar hypoplasia and mega cisterna magna. There were no abnormalities observed only on PM-US and not on PM-MRI. In 16/57 (28.1%) cases, PM-US was non-diagnostic while PM-MRI was normal, in 3/57 (5.3%) cases the brain was not examined at PM-US due to overlapping sutures (PM-MRI was normal) and in 2/57 (3.5%) both modalities were non-diagnostic.

Agreement between PM-US and 1.5-T PM-MRI
If PM-US imaging were to be performed for all perinatal deaths instead of PM-MRI, the same overall diagnosis would be seen in 86.8% (95% CI, 80.0-91.5%) of cases, with the highest concordance rates being for spine (99.3% (95% CI, 95.9-99.9%)) and cardiac (97.3% (95% CI, 92.4-99.1%)) diagnoses and the lowest concordance rate being for brain diagnoses (85.2% (95% CI, 76.9-90.8%)). PM-US detected one case of hypoplastic cerebellum and one case of pelvic kidney for which PM-MRI was negative or non-diagnostic. PM-US failed to diagnose an anomaly identified by PM-MRI in 16/136 (11.8%) cases, of which 10 (62.5%) were brain-related. Individual overall diagnoses by each imaging modality are detailed in Table S3, with a more detailed breakdown of the findings according to anatomical region being provided in Table S4.
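As an illustration of how an agreement proportion and its 95% CI might be computed, the following Python sketch uses the Wilson score interval. This is an assumption made for illustration: the Methods state only that exact methods were used and do not name the interval. The function name `wilson_interval` is hypothetical, and the count of 118 agreeing cases out of 136 is inferred from the reported 86.8% rather than stated in the text.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.959964):
    """Wilson score interval for a binomial proportion.

    z = 1.959964 corresponds to a two-sided 95% confidence level.
    Returns (lower, upper) as proportions between 0 and 1.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / total + z * z / (4 * total * total)
    ) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Assumed count: 118/136 agreeing cases, inferred from the reported 86.8%.
    agree, n = 118, 136
    lower, upper = wilson_interval(agree, n)
    print(f"agreement {agree / n:.1%} (95% CI, {lower:.1%}-{upper:.1%})")
```

Under these assumptions, the interval lands close to the reported 80.0-91.5%. That is consistent with, but does not confirm, a score-type interval having been used in the original analysis.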

[...] particular emphasis on gestational age at delivery, extent of maceration and postmortem interval (PMI). Cases that were non-diagnostic for the brain at PM-US and PM-MRI (both individually and combined) were more likely to be > 20 weeks' gestation, to have suffered marked maceration-related changes and to have a time interval between delivery and imaging of 8-14 days. Cases that were non-diagnostic for the heart at PM-US and PM-MRI (individually) were more likely to have extensive maceration and to have a PMI of 8-14 days, while those that were non-diagnostic for both PM-US and PM-MRI were more likely to have a PMI > 15 days. There were no non-diagnostic imaging examinations for the thorax or spine. PM-MRI was non-diagnostic for one abdominal study in a fetus < 20 weeks' gestation, without any maceration, imaged > 15 days post-delivery.

Pathological yield in cases with non-diagnostic imaging
There were six cases with non-diagnostic PM-US of the brain for which autopsy data were available; there were no abnormalities in the brain in three (50.0%) of these cases. The three with brain pathology included one case of ACC, one case of ACC with occipital polymicrogyria and one case of vein of Galen malformation. Similarly, autopsy data showed no abnormalities of the heart in 9/12 (75.0%) cases with non-diagnostic PM-US; the cardiac anomalies present in the other three cases included one case of hypoplastic aortic arch, one VSD and one transposition of the great arteries. There were no cases in which there was non-diagnostic PM-MRI of the brain or abdomen that had additional information at autopsy. For the two cases with non-diagnostic PM-MRI of the heart that had autopsy findings, the autopsy was normal in one (50.0%) case and in the other there was a VSD.

DISCUSSION
In this study we found that, when diagnostic images were obtained, there were no significant differences in accuracy between perinatal PM-US and 1.5-T PM-MRI. If PM-US were to be used as a frontline imaging tool instead of 1.5-T PM-MRI, the same overall diagnosis would be reached in the majority (> 85%) of cases. There was a higher rate of non-diagnostic imaging on PM-US compared with PM-MRI, particularly of the brain and heart. Marked maceration was a common contributing factor to this for both modalities. Our results are supported by the only other published work comparing PM-US with PM-MRI 8. In that study, concordance with autopsy for final diagnosis was achieved in 67.8% (95% CI, 54.4-79.4%) of cases using PM-US compared with 78.0% (95% CI, 65.3-87.7%) of cases using 3-T PM-MRI (in our study these rates were 86.4% (95% CI, 77.7-92.0%) for PM-US and 88.6% (95% CI, 80.3-93.7%) for 1.5-T PM-MRI). That study found no statistically significant differences in sensitivity or specificity between the two modalities for any of the five anatomical regions, as we also report. Although not significant, we found the largest differences in sensitivity between PM-US and 1.5-T PM-MRI to be for cardiac (50.0% for PM-US vs 81.8% for PM-MRI) and thoracic (40.0% for PM-US vs 73.3% for PM-MRI) abnormalities. For cardiac anomalies, the difference was mostly due to misses of complex cardiac anomalies for both modalities (but mostly PM-US); for thoracic abnormalities it was due to misdiagnosis of subjective pulmonary hypoplasia. The low sensitivity of PM-US could be explained by the lack of circulating blood, presence of intracardiac thrombus and gas, and the densely consolidated lungs in the postmortem state making diagnosis difficult.
With respect to non-diagnostic studies, our results reflect closely those of Kang et al. 8 , who reported a higher non-diagnostic rate for PM-US than for 3-T PM-MRI, particularly for the brain (26.9% for PM-US vs 4.4% for PM-MRI (compared with 22.8% for PM-US vs 3.7% for 1.5-T PM-MRI in our study)) and heart (30.6% for PM-US vs 3.8% for 3-T PM-MRI (compared with 14.7% for PM-US vs 5.1% for 1.5-T PM-MRI in our study)). In contrast to their results, we did not identify any non-diagnostic abdominal or spinal examinations at PM-US, whilst they reported non-diagnostic rates of 23.7% and 1.9%, respectively. This could be due to differences in ultrasound systems, operator experience or interpretation of 'diagnostic quality'. We found that marked maceration was a common factor in our non-diagnostic cases. This information could be helpful when counseling parents about the potential success of a non-invasive (imaging-based) autopsy.
Our study has several clinical implications for the potential future role of PM-US in perinatal non-invasive autopsy, especially when access to PM-MRI is limited or is unavailable. Given the high concordance for overall diagnosis between PM-MRI and PM-US, it is reasonable to suggest that PM-US could be used as a first-line or alternative imaging tool, particularly in cases in which an abdominal or spinal abnormality is suspected. Given that the lowest sensitivities and specificities were seen for cardiothoracic abnormalities, and that non-diagnostic rates for brain and heart PM-US were also high, PM-MRI should be considered as the first-line imaging tool for suspected cardiac malformations, and as a second-line tool when PM-US of the brain is non-diagnostic. This would help to minimize missed diagnoses, given that autopsy confirmed the presence of intracranial and cardiac abnormalities in 50% and 25% of cases, respectively, when PM-US was non-diagnostic.
Our study has several limitations. The main one relates to our relatively small sample size, in particular the subset which also had autopsy results available, this group being smaller than intended according to our power calculation. This resulted in wide confidence intervals for many of our diagnostic-accuracy rates, and may have precluded detection of any significant differences. Nevertheless, these early results show that the overall sensitivity and specificity rates for body organs and overall diagnoses for both imaging modalities were very similar to those of previously published work 6,11,[20][21][22][23][24][25][26][27][28] , and we included as many cases from our center as possible, spanning a 5-year study period. Second, we acknowledge that ultrasonography is operator-dependent and our PM-US was conducted by a specialist experienced pediatric radiologist at a tertiary center. It may be difficult to replicate this in other centers, and thus our diagnostic quality and accuracy rates may not be widely generalizable. We recommend comprehensive PM-US training as appropriate before considering offering a PM-US service to replace PM-MRI. Finally, given the variation in the timing of imaging after fetal delivery or neonatal death, we cannot exclude the possibility that this may have contributed to the non-diagnostic imaging quality or led to missed diagnoses. Given that our institution does not have an on-site maternity unit, there is usually a delay in the processing and transport of cases. Similarly, the availability of our MRI scanner and radiologist is variable, replicating normal clinical practice. Performing imaging as close as possible to death or delivery may help improve diagnostic rates, although this remains to be established in larger studies.
In conclusion, this study has found high diagnostic accuracy rates for both PM-US and 1.5-T PM-MRI, without significant differences between the two methods. If all cases undergoing 1.5-T PM-MRI were redirected to PM-US imaging, the final diagnosis would be the same for the majority of cases. PM-US could be implemented as a first-line imaging tool in centers wishing to offer an affordable, non-invasive autopsy service, with 1.5-T PM-MRI being most useful for suspected cardiac and brain malformations.

SUPPORTING INFORMATION ON THE INTERNET
The following supporting information may be found in the online version of this article:

Table S1
Postmortem ultrasound (PM-US) and postmortem magnetic resonance imaging (PM-MRI) positive (LR+) and negative (LR-) likelihood ratios for individual body systems, all body systems summated and overall diagnoses, using autopsy as reference standard