Noninvasive prenatal screening for patients with high body mass index: Evaluating the impact of a customized whole genome sequencing workflow on sensitivity and residual risk

Abstract
Objective: Women with high body mass index (BMI) tend to have reduced fetal fraction (FF) during cell‐free DNA‐based noninvasive prenatal screening (NIPS), causing test failure rates up to 24.3% and prompting guidelines that recommend aneuploidy screening other than NIPS for patients with significant obesity. Because alternatives to NIPS are only preferable if they perform better, we compared the respective sensitivities at different BMI levels of traditional aneuploidy screening and a customized whole‐genome sequencing NIPS.
Method: The relationship between FF, aneuploidy, and BMI was quantified from 58 105 patients screened with a customized NIPS that does not fail samples because of low FF alone. Expected analytical sensitivity as a function of aneuploidy and BMI (eg, trisomy 18 sensitivity when BMI = 35) was determined by scaling the BMI‐ and aneuploidy‐specific FF distribution by the FF‐ and aneuploidy‐specific sensitivity calculated from empirically informed simulations.
Results: Across all classes of obesity and assuming zero FF‐related test failures, analytical sensitivity for the investigated NIPS exceeded that of traditional aneuploidy screening for trisomies 13, 18, and 21.
Conclusion: Relative to traditional aneuploidy screening, a customized NIPS with high accuracy at low FF and a low test‐failure rate is a superior screening option for women with high BMI.

pregnancy (typically 15- to 20-week gestation). Therefore, there is clinical utility and patient desire for noninvasive screening modalities to identify pregnancies at increased risk for aneuploidy at an earlier gestational age with high sensitivity and specificity.
Two prenatal screening approaches are widely used today. The first relies upon measurements that do not involve DNA ("non-DNA screening"), including serum marker levels (eg, concentrations of alpha-fetoprotein and pregnancy-associated plasma protein A), and imaging analysis (eg, nuchal translucency) collected in the first and/or second trimester. The second approach is noninvasive prenatal screening (NIPS) via cell-free DNA (cfDNA). 4 Non-DNA screening indirectly tests for trisomy 21 (T21), trisomy 18 (T18), and trisomy 13 (T13) by measuring biomolecule concentrations and ultrasound features that differ in affected and normal pregnancies. There are many permutations of non-DNA screening (eg, combined screening, sequential screening, and integrated screening), with integrated screening showing the highest sensitivity. Though seminal studies (eg, the FaSTER Trial 5 ) have characterized performance of non-DNA screening for T21, herein we use Baer et al. 1 as our reference for non-DNA screening performance because it reports sensitivity results for T13, T18, and T21 and because it involved more than 10 times as many patients as the FaSTER Trial, more than 90% of whom received integrated screening. Non-DNA screening sensitivity for T21 and T18 was 92.9% and 93.2%, respectively, 1 but together with the specificities (96.0% for T21, 99.6% for T18, calculated from Tables 2 and 3 in 1 ), the positive predictive values (PPV) for these trisomies are low: 6.2% for T21 and 14.8% for T18 (calculated from Tables 2 and 3 in 1 ). For confirmed T13 cases, non-DNA screening returned abnormal results in 80.4% of patients, 1 making this number the effective sensitivity; however, because non-DNA screening does not specifically identify T13 as the source of abnormality, specificity and PPV cannot be directly calculated.
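The relationship among sensitivity, specificity, and PPV quoted above follows from Bayes' rule. The sketch below is a minimal illustration: the sensitivity and specificity are the T21 values given in the text, whereas the prevalence of 1 in 350 is a hypothetical value chosen only for illustration and is not taken from the cited tables.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Bayes' rule: probability that a screen-positive pregnancy is truly affected."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Sensitivity (92.9%) and specificity (96.0%) are the non-DNA T21 values from the text.
# The prevalence is a HYPOTHETICAL illustrative value, not a figure from the source.
ppv_t21 = positive_predictive_value(sensitivity=0.929, specificity=0.960, prevalence=1 / 350)
print(f"Illustrative T21 PPV: {ppv_t21:.1%}")
```

Because the condition is rare, even a modest false-positive rate dominates the screen-positive pool, which is why a test with sensitivity above 90% can still have a single-digit PPV.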
NIPS directly interrogates cfDNA extracted from maternal plasma, which consists primarily of maternal-derived DNA but critically also contains a minority of genomic material from the pregnancy. Relative to serum- and imaging-based approaches, NIPS has higher sensitivity, specificity, and PPV (99.7%, 99.96%, and 96.7%, respectively, for T21 6 ). The sensitivity of NIPS is not constant for all pregnancies; rather, the ability to detect aneuploidy scales with the proportional share of fetal-derived cfDNA in the maternal plasma (ie, the "fetal fraction" or "FF"). 7 Many NIPS laboratories fail samples below a FF threshold because of concerns about reporting false negatives as a result of diminished sensitivity. [8][9][10] However, low-FF performance is both platform and laboratory dependent: Modeled versions of the two common NIPS platforms, the whole-genome sequencing (WGS) and single-nucleotide polymorphism (SNP) methods, show that WGS has higher sensitivity for low-FF samples at a fixed specificity level. 7 The many laboratories offering NIPS via WGS have implemented the molecular and computational aspects of the methodology differently, meaning that performance may vary.
Factors that affect FF include gestational age, chromosome abnormalities, and body mass index (BMI). FF rises with gestational age likely because of increased placental size, but the effect is relatively subtle, with FF increasing 0.1% per week on average between weeks 10 and 21 of gestation. 11 Chromosome abnormalities affect the size and structure of the placenta: T13 and T18 pregnancies tend to have compromised placentas and low FF; T21 pregnancies, by contrast, have mildly elevated FF. 12 High BMI is associated with lower FF, potentially because of higher turnover of maternal adipose tissue 13 or because maternal tissue is relatively more abundant than placental tissue.
Patients with high BMI have an elevated test-failure rate (as high as 24.3% in obese women 14 ) on NIPS platforms with a FF threshold. [13][14][15] Reports of this elevated test-failure rate prompted the American College of Medical Genetics and Genomics (ACMG) to recommend against using NIPS in patients with "significant obesity." 16 Despite stating that NIPS is "the most sensitive screening option," ACMG instead recommended that such patients receive "aneuploidy screening other than NIPS," such as non-DNA screening. 16 Other medical societies have not provided specific guidance about patients with high BMI. Because "significant obesity" is not well defined, this recommendation potentially means that many US patients (greater than 25% with at least class I obesity and greater than 10% with at least class II obesity 14,17,18 ) would be treated differently based on their height and weight alone. Further, adherence to this recommendation could create inequity in patient care because of ethnicity-specific differences in the distribution of BMI. 19
Here, we explore NIPS performance in patients with high BMI using an NIPS methodology that does not impose an FF failure threshold. Our guiding premise was that not performing NIPS on women with high BMI is only justified if the expected NIPS sensitivity actually drops below the sensitivity of non-DNA screening. In a cohort of greater than 58 000 NIPS patients, we elucidated the relationship between FF, BMI, and aneuploidy. By combining these data with an empirically informed adaptation of our model of WGS sensitivity, we calculated the expected NIPS sensitivities for T13, T18, and T21 for different BMI classes, and we compared these values to sensitivities previously reported for non-DNA screening.
What's already known about this topic?
• Women with high body mass index (BMI) often receive a test failure on noninvasive prenatal screening (NIPS) because of low fetal fraction (FF).
• The American College of Medical Genetics and Genomics recommends offering traditional aneuploidy screening to patients with "significant obesity."
• NIPS offerings differ in their efficacy at low FF.
What does this study add?
• Irrespective of BMI and without FF-based test failures, it is possible for a customized NIPS to provide all women with accurate prenatal screening.

| Patient cohort
The study included 58 105 patients who underwent WGS-based NIPS over an 8-month period with the Prequel Prenatal Screen (Myriad Women's Health, South San Francisco, California) and whose height, weight, and ethnicity were reported on the test requisition form.
Patients from New York State and patients who opted out of research were excluded from the study. The protocol was reviewed and designated as exempt by Western Institutional Review Board because it involved de-identified patients who had consented to anonymized research, and it complied with the Health Insurance Portability and Accountability Act (HIPAA).

| BMI calculation and NIPS results
BMI for patients in the cohort was calculated from maternal height (m) and weight (kg) as weight/height². 20 For most analyses, BMI was evaluated in steps of five, corresponding to established classes: BMI < 25 is "normal," 25 ≤ BMI < 30 is "overweight," 30 ≤ BMI < 35 is "class I obese," 35 ≤ BMI < 40 is "class II obese," and BMI ≥ 40 is "class III obese." 20 Figure 1 shows the frequency of different BMI levels as a function of ethnicity, revealing multiple ethnicities in which at least one in four women is obese and underscoring the importance of characterizing aneuploidy screening performance in this population.
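The BMI calculation and class boundaries stated above can be written out directly. This is a minimal sketch; the function names are illustrative and not part of the study's software.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def bmi_class(value: float) -> str:
    """Map a BMI value to the classes used in the text (lower bounds are inclusive)."""
    if value < 25:
        return "normal"
    if value < 30:
        return "overweight"
    if value < 35:
        return "class I obese"
    if value < 40:
        return "class II obese"
    return "class III obese"


example_bmi = bmi(weight_kg=95.0, height_m=1.65)
print(f"BMI = {example_bmi:.1f} -> {bmi_class(example_bmi)}")
```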
Aneuploidy was detected via a z score that measures deflections in a chromosome's WGS read-depth relative to a disomic expectation. [21][22][23] For instance, a sample was called positive for T21 if the median depth among equally sized bins tiling chromosome 21 had a sufficiently high z score relative to the corresponding medians of euploid samples. Fetal fraction was inferred using a regression model that calculates a weighted sum of the normalized read depth in bins tiling autosomes. 24
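The z-score call described above can be sketched as follows. This is a simplified illustration, not the laboratory's pipeline: it assumes bin depths are already normalized, that the euploid reference is summarized by the mean and standard deviation of per-sample chromosome medians, and that a cutoff of 3 is used (the value the text reports for its simulations; the clinical threshold is not stated here). All function names and data values are hypothetical.

```python
from statistics import mean, median, stdev


def chromosome_z_score(sample_bin_depths: list[float], euploid_medians: list[float]) -> float:
    """Z score of a sample's median bin depth against euploid reference medians."""
    sample_median = median(sample_bin_depths)
    reference_mean = mean(euploid_medians)
    reference_sd = stdev(euploid_medians)
    return (sample_median - reference_mean) / reference_sd


def call_trisomy(z_score: float, cutoff: float = 3.0) -> bool:
    """Screen positive when the z score exceeds the cutoff (assumed to be 3 here)."""
    return z_score > cutoff


# Hypothetical normalized depths for bins tiling chromosome 21.
euploid_chr21_medians = [1.00, 1.01, 0.99, 1.00, 1.02, 0.98]
sample_chr21_bins = [1.05, 1.06, 1.04, 1.07, 1.05]

z = chromosome_z_score(sample_chr21_bins, euploid_chr21_medians)
print(f"z = {z:.2f}, T21 screen positive: {call_trisomy(z)}")
```

Using the median across bins, rather than the mean, keeps a few outlier bins from driving the call.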

| Calculation of BMI- and aneuploidy-dependent analytical sensitivity
For a given BMI class and aneuploidy, the expected analytical sensitivity (Figure 2, "Total Sensitivity" box) was calculated by weighting the aneuploidy- and FF-specific analytical sensitivity (Figure 2, "Sensitivity Function" box; see "Empirically informed WGS simulation to measure sensitivity as a function of FF" in the Supporting Information Data S1) by the probability of observing a pregnancy with particular FF, aneuploidy, and BMI levels (Figure 2, "FF distribution" box; see "Fetal-fraction distributions as a function of BMI and aneuploidy" in Supporting Information Data S1). Inputs to this analysis are shown in green boxes in Figure 2. This weighted product was evaluated at FF levels between 0% and 4% (the FF probability distribution was normalized over this range to sum to 100%) in increments of 0.1%, and the weighted values were summed to yield an expected analytical sensitivity for the entire low-FF range. To estimate the sensitivity for all samples in a particular BMI class, not just those with low FF, the sum of weighted products was evaluated from 0% to 40% FF (the FF probability distribution was re-normalized to sum to 100% over this larger range). The following equation describes the calculation of analytical sensitivity ("AS") as a function of FF, number of reads, aneuploidy, and BMI.
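One reading of the weighted sum described in this section is AS = Σ_f P(FF = f | aneuploidy, BMI) × S(f, reads, aneuploidy), with the probabilities normalized over the FF range being evaluated. The sketch below implements that reading under stated assumptions: FF is expressed as a fraction (0.04 = 4%), the grid step is 0.1%, and both input functions are toy stand-ins rather than the fitted FF distributions or simulated sensitivity curves from the study.

```python
from typing import Callable


def expected_sensitivity(
    ff_density: Callable[[float], float],
    sensitivity_at_ff: Callable[[float], float],
    ff_max: float,
    step: float = 0.001,
) -> float:
    """Probability-weighted average of sensitivity over FF values in [0, ff_max]."""
    n_steps = round(ff_max / step)
    grid = [i * step for i in range(n_steps + 1)]
    weights = [ff_density(ff) for ff in grid]
    total = sum(weights)  # normalize so the weights sum to 100% over this range
    weighted = sum(w * sensitivity_at_ff(ff) for w, ff in zip(weights, grid))
    return weighted / total


# Toy stand-ins (HYPOTHETICAL; not the curves in Figures 3 and 4).
def toy_ff_density(ff: float) -> float:
    return ff * (0.40 - ff) if 0.0 <= ff <= 0.40 else 0.0


def toy_sensitivity(ff: float) -> float:
    return min(1.0, ff / 0.02)


low_ff = expected_sensitivity(toy_ff_density, toy_sensitivity, ff_max=0.04)
all_ff = expected_sensitivity(toy_ff_density, toy_sensitivity, ff_max=0.40)
print(f"low-FF range: {low_ff:.3f}, full range: {all_ff:.3f}")
```

Calling the same function with `ff_max=0.04` and `ff_max=0.40` mirrors the two evaluations in the text: the first restricts attention to low-FF samples, and the second averages over all samples in the BMI class.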

| Why a modeling approach was needed
Our goal was to measure NIPS sensitivity for T13, T18, and T21 in particular BMI classes and at different FF levels. The ideal, but unfortunately impractical, way to make such a measurement would be to count how frequently NIPS correctly identified aneuploid pregnancies at a given BMI and FF. This approach is untenable because while high BMI is rather common, low FF is quite rare, and aneuploidy itself is very rare, together making observed cases too infrequent to power confident measurements, even with a very large data set. For instance, suppose the population frequencies for class III BMI, FF < 2%, and aneuploidy were 10%, 2%, and 1%, respectively, in order to have Information Data S1).
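The hypothetical frequencies in the example above can be multiplied out to show how rare the joint event would be. The sketch assumes, purely for illustration, that the three factors are independent; the study itself shows that BMI and aneuploidy both influence FF, so the true joint frequency would differ.

```python
# Hypothetical population frequencies quoted in the text.
p_class_iii_bmi = 0.10
p_ff_below_2_percent = 0.02
p_aneuploidy = 0.01

# ASSUMPTION: independence of the three factors (illustration only).
joint_frequency = p_class_iii_bmi * p_ff_below_2_percent * p_aneuploidy

cohort_size = 58_105  # cohort size reported in the study
expected_cases = joint_frequency * cohort_size

print(f"Joint frequency: about 1 in {round(1 / joint_frequency):,}")
print(f"Expected cases in the cohort: {expected_cases:.2f}")
```

Under these assumptions, roughly one such case would be expected in a cohort of this size, far too few to estimate sensitivity directly.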

| Determining how BMI and aneuploidy affect FF distribution
Though it had previously been observed that BMI and aneuploidy affect FF levels, our modeling approach required detailed resolution of a quantitative relationship among these factors. The relative frequencies of different FF levels are well described by a beta distribution (Figure 3A); thus, we sought to determine how the shape of the beta distribution would change for different chromosomal aneuploidies and BMI classes. First, we observed via linear regression analysis of the raw data (Figure 3B) that FF tends to fall as BMI increases (downward slope of the linear fit in Figure 3B and leftward shift of the entire distribution in Figure 3C). Next, we found that T13 and T18 have downward-shifted FF (ie, their FF levels were at low percentile values relative to the distribution of euploid samples; Figure 3B,D), and T21 had upward-shifted FF, with values comparable with the high percentile range of euploid samples. By combining these observations as described in Section 2 and Supporting Information Data S1, we approximated the beta distributions of FF for each aneuploidy and BMI class; sample distributions for a BMI of 35 are shown in Figure 3E (distributions for other BMI levels shown in Figure S1).
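As a rough illustration of describing FF values with a beta distribution, the sketch below estimates the two shape parameters by the method of moments. This fitting technique is an assumption made for illustration; the text does not state how the beta distributions were fit, and the FF values used here are hypothetical.

```python
from statistics import mean, variance


def fit_beta_by_moments(ff_values: list[float]) -> tuple[float, float]:
    """Method-of-moments estimates (alpha, beta) for data on the interval (0, 1)."""
    m = mean(ff_values)
    v = variance(ff_values)
    common = m * (1.0 - m) / v - 1.0
    if common <= 0:
        raise ValueError("Sample variance is too large for a beta distribution.")
    return m * common, (1.0 - m) * common


# Hypothetical FF values expressed as fractions (0.10 = 10%).
ff_sample = [0.05, 0.08, 0.10, 0.12, 0.15, 0.09, 0.11]
alpha, beta = fit_beta_by_moments(ff_sample)
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}, implied mean = {alpha / (alpha + beta):.3f}")
```

A leftward shift of the FF distribution at higher BMI would appear in this parameterization as a lower implied mean, alpha / (alpha + beta).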

| Empirically informed simulation of WGS sensitivity for common aneuploidies
Figure 3 (caption, continued). On average, FF decreases as BMI increases, and T13 and T18 pregnancies tend to have lower FF than euploid and T21 pregnancies. C, FF frequency distribution as a function of BMI and irrespective of fetal ploidy. The number of patients in each BMI class is indicated in the legend. Each trace is a beta distribution fit to the empirical data. D, Each trace depicts the probability density of the percentile of screen-positive aneuploid samples relative to euploid samples in the same BMI class (see Section 2). Because they have higher probability at low percentiles, T13 and T18 tend to have lower FF than euploid samples; by contrast, T21 positives tend to have higher FF than euploid samples because the trace is elevated at high percentiles. E, The BMI-specific inferred FF distributions for each ploidy state can be deduced from data in panels A to D; the particular traces are shown for a BMI of 35

Previous comparison of the WGS and SNP methodologies modeled the performance of each platform in idealized conditions, 7 but our aim in this study was to describe empirical sensitivity of our customized WGS methodology. By analyzing data from sequential clinical samples processed in our laboratory (see Supporting Information Data S1), we determined the number of reads per sample at which our WGS-based NIPS behaves in a Poisson manner (Figure S2) and then used this reads-per-sample level in our WGS simulations 7 to predict analytical sensitivity for T13, T18, and T21 at low FF (Figure 4). As anticipated, 25 these empirically informed sensitivity estimates were comparable with the idealized levels from our previous analysis. 7 The correspondence between Figure 3D and Figure 4 has two noteworthy features that enable high sensitivity for our implementation of the WGS method of NIPS. First, even though FF levels of T13 and T18 were downward shifted (Figure 3D), sensitivity was relatively high on these chromosomes (as compared with T21) because of their larger size (Figure 4). Second, despite T21 having lower sensitivity relative to T13 and T18 at low FF levels because of its size, T21 pregnancies tend to have upward-shifted FF levels, meaning that WGS-based NIPS is sensitive in the FF regime where it is needed.
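The chromosome-size argument in this section can be illustrated with a simplified analytic model. This is a stand-in for intuition only, not the simulation used in the study: it assumes read counts on a chromosome are purely Poisson, that a trisomy raises the expected count by a factor of (1 + FF/2), and that a sample is called positive when its z score exceeds 3. The read counts in the usage lines are hypothetical.

```python
import math


def normal_cdf(x: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def poisson_sensitivity(ff: float, chrom_reads: float, z_cutoff: float = 3.0) -> float:
    """Approximate probability that a trisomic sample exceeds the z-score cutoff.

    Under Poisson counting, the extra reads (ff / 2 * N) divided by the standard
    deviation sqrt(N) give an expected z-score shift of (ff / 2) * sqrt(N).
    """
    expected_shift = (ff / 2.0) * math.sqrt(chrom_reads)
    return normal_cdf(expected_shift - z_cutoff)


# HYPOTHETICAL reads mapped to a larger and a smaller chromosome at 2% FF.
larger_chromosome = poisson_sensitivity(ff=0.02, chrom_reads=600_000)
smaller_chromosome = poisson_sensitivity(ff=0.02, chrom_reads=250_000)
print(f"larger: {larger_chromosome:.4f}, smaller: {smaller_chromosome:.4f}")
```

In this model, a chromosome that collects more reads yields a larger z-score shift at the same FF, which is consistent with the text's observation that the larger chromosomes 13 and 18 retain sensitivity at lower FF than chromosome 21.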

| Expected WGS-based NIPS analytical sensitivity exceeds that of non-DNA screening
The results of the FF analysis (eg, Figure 3E for BMI = 35) and analytical sensitivity simulations (Figure 4) enabled direct calculation of expected analytical sensitivity for a given aneuploidy and BMI class (Figure 5; see Section 2). For each aneuploidy, even though sensitivity declines as BMI rises, the analytical sensitivity remained above 94%, even for patients with class III obesity (BMI ≥ 40). The estimated analytical sensitivity of our customized NIPS exceeded the clinical sensitivities via non-DNA screening (blue traces remain above gray region in Figure 5A-C; Figure 5D).
As a function of BMI, we compared the expected sensitivity of NIPS offerings that fail low-FF samples to the sensitivity of non-DNA screening. Aneuploidies among failed samples go undetected by the test and therefore lower its actual sensitivity. 26 As such, we set FF-specific sensitivity to 0% for FF values below published cutoffs for other implementations of NIPS (2.8% in Ryan et al 8 and 4% in McCullough et al 9 ) and estimated the impact on expected sensitivity as a function of BMI (Figure 5A-C, orange and brown traces). With either FF threshold, the missed positives among failed samples lowered overall NIPS sensitivity to a level below that of non-DNA screening. Furthermore, we expect this analysis to have overestimated sensitivity for these tests with FF failure thresholds because they likely have reduced sensitivity approaching the failure threshold, whereas our modeling used idealized sensitivity values shown in Figure 4 for all above-threshold FF values. In total, for certain NIPS tests that require an FF threshold, the relative sensitivities at high BMI revealed in these data are consistent with the recommendation for using non-DNA screening instead of NIPS; however, the data also suggest that this recommendation should not be universal because NIPS sensitivity as a function of BMI varies by platform and laboratory.
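Setting FF-specific sensitivity to zero below a failure cutoff, as described above, amounts to wrapping a sensitivity function. The sketch below is a minimal illustration: the cutoffs of 2.8% and 4% come from the text, while the base sensitivity function is a hypothetical placeholder and not the curve in Figure 4.

```python
from typing import Callable


def with_ff_threshold(
    sensitivity_at_ff: Callable[[float], float], ff_cutoff: float
) -> Callable[[float], float]:
    """Return a sensitivity function that is 0 for samples failed for low FF."""

    def thresholded(ff: float) -> float:
        # Samples below the cutoff receive no result, so aneuploidies among them are missed.
        return 0.0 if ff < ff_cutoff else sensitivity_at_ff(ff)

    return thresholded


# HYPOTHETICAL base sensitivity curve (FF expressed as a fraction).
def toy_sensitivity(ff: float) -> float:
    return min(1.0, ff / 0.02)


sensitivity_cutoff_2_8 = with_ff_threshold(toy_sensitivity, ff_cutoff=0.028)
sensitivity_cutoff_4_0 = with_ff_threshold(toy_sensitivity, ff_cutoff=0.040)
print(sensitivity_cutoff_2_8(0.03), sensitivity_cutoff_4_0(0.03))
```

At 3% FF, the test with the 2.8% cutoff still reports a result, whereas the test with the 4% cutoff fails the sample and contributes zero sensitivity at that FF level.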
Though not directly evaluated here, analytical specificity and PPV of WGS-based NIPS are dictated primarily by the z-score threshold: With the z-score cutoff of 3 in our simulations, the NIPS false-positive (FP) rate per chromosome was approximately 1 in 1000 or 0.1%. This specificity greatly exceeds that of non-DNA screening, which was reported to have an overall FP rate of 4.5%. 1 Together, for women with high BMI, our results suggest that non-DNA screening has lower sensitivity, specificity, and PPV than our customized WGS-based NIPS optimized for performance at low FF.
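The link between a z-score cutoff and the per-chromosome false-positive rate can be checked with the normal tail probability. The sketch assumes that z scores of euploid samples follow a standard normal distribution and that calls are one-sided (only high z scores are called positive).

```python
import math


def upper_tail_probability(z_cutoff: float) -> float:
    """P(Z > z_cutoff) for a standard normal variable Z."""
    return 0.5 * math.erfc(z_cutoff / math.sqrt(2.0))


fp_rate = upper_tail_probability(3.0)
print(f"False-positive rate at z > 3: {fp_rate:.3%} (about 1 in {round(1 / fp_rate)})")
```

Under these assumptions the rate is about 0.13%, in line with the approximate 0.1% per chromosome stated in the text.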

| Principal findings and results
The clinical validity and utility of fetal aneuploidy screening are maximized when patients have access to testing with the highest sensitivity, specificity, and PPV. If non-DNA approaches were strictly superior to cfDNA-based NIPS on these performance measures among patients with high BMI, then a case could be made for universal recommendation of non-DNA screening rather than NIPS in that population. However, using a large patient cohort and empirically guided modeling of a customized WGS platform that does not fail samples for having low FF, we have shown that estimated NIPS performance can exceed the clinical performance of non-DNA screening for patients with high BMI (Figure 5).
is true fetal mosaicism, present in less than 1% of pregnancies with T13, T18, and T21. 30 Indeed, multiple studies demonstrate the high clinical sensitivity of NIPS. [31][32][33] Finally, we have assumed non-DNA screening sensitivity is constant across all BMI classes, but it is well known that the ability to obtain an NT measurement decreases with increasing BMI, with some estimates noting failure rates up to 22%. 34 Therefore, we are likely overestimating the performance of non-DNA screening in this patient population.

| CONCLUSION
Many pregnant women in the United States have high BMI and low FF levels that yield elevated test failure rates on most NIPS offerings, highlighting the need for alternatives. Though the alternatives include non-DNA screening modalities, an NIPS customized and demonstrated to be sensitive at low FF should also be among the alternatives and potentially the preferred option because of its superior sensitivity at all BMI levels.