    Expected-value bias in routine third-trimester growth scans.

    OBJECTIVES: Operators performing fetal growth scans are usually aware of the gestational age of the pregnancy, which may lead to expected-value bias when performing biometric measurements. We aimed to evaluate the incidence of expected-value bias in routine fetal growth scans and assess its impact on standard biometric measurements. METHODS: We collected prospectively full-length video recordings of routine ultrasound growth scans coupled with operator eye tracking. Expected value was defined as the gestational age at the time of the scan, based on the estimated due date that was established at the dating scan. Expected-value bias was defined as occurring when the operator looked at the measurement box on the screen during the process of caliper adjustment before saving a measurement. We studied the three standard biometric planes on which measurements of head circumference (HC), abdominal circumference (AC) and femur length (FL) are obtained. We evaluated the incidence of expected-value bias and quantified the impact of biased measurements. RESULTS: We analyzed 272 third-trimester growth scans, performed by 16 operators, during which a total of 1409 measurements (354 HC, 703 AC and 352 FL; including repeat measurements) were obtained. Expected-value bias occurred in 91.4% of the saved standard biometric plane measurements (85.0% for HC, 92.9% for AC and 94.9% for FL). The operators were more likely to adjust the measurements towards the expected value than away from it (47.7% vs 19.7% of measurements; P < 0.001). On average, measurements were corrected by 2.3 ± 5.6, 2.4 ± 10.4 and 3.2 ± 10.4 days of gestation towards the expected gestational age for the HC, AC, and FL measurements, respectively. Additionally, we noted a statistically significant reduction in measurement variance once the operator was biased (P = 0.026). 
Comparing the lowest and highest possible estimated fetal weight (using the smallest and largest biased HC, AC and FL measurements), we noted that the discordance, in percentage terms, was 10.1% ± 6.5%, and that in 17% (95% CI, 12-21%) of the scans, the fetus could be considered as small-for-gestational age or appropriate-for-gestational age if using the smallest or largest possible measurements, respectively. Similarly, in 13% (95% CI, 9-16%) of scans, the fetus could be considered as large-for-gestational age or appropriate-for-gestational age if using the largest or smallest possible measurements, respectively. CONCLUSIONS: During routine third-trimester growth scans, expected-value bias frequently occurs and significantly changes standard biometric measurements obtained. © 2019 the Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of the International Society of Ultrasound in Obstetrics and Gynecology

    Estimation of gestational age in early pregnancy from crown-rump length when gestational age range is truncated: the case study of the INTERGROWTH-21st Project.

    BACKGROUND: Fetal ultrasound scanning is considered vital for routine antenatal care, with first-trimester scans recommended for accurate estimation of gestational age (GA). A reliable estimate of GA underpins clinical care and allows estimation of the expected date of delivery. Fetal crown-rump length (CRL) is recommended over last menstrual period for estimating GA when measured in early pregnancy, i.e. at 9+0 to 13+6 weeks. METHODS: The INTERGROWTH-21st Project is the largest prospective study to collect data on CRL in geographically diverse populations and with a high level of quality control measures in place. We aim to develop a new GA estimation equation based on CRL from women recruited between 9+0 and 13+6 weeks. The main statistical challenge is modelling data when the outcome variable (GA) is truncated at both ends, i.e. at 9 and 14 weeks. We explored three alternative statistical approaches to overcome the truncation of GA. To evaluate these strategies we generated a data set with no truncation of GA that was similar to the INTERGROWTH-21st Project CRL data, which we used to explore the performance of different methods of analysis of these data when we imposed truncation at 9 and 14 weeks of gestation. The three methods were first tested in a simulation-based study using a previously published dating equation by Verburg et al., and we evaluated how well each of them performed in relation to the model from which the data were generated. After evaluating the three approaches using simulated data based on the Verburg equations, the best approach will be applied to the INTERGROWTH-21st Project data to estimate GA from CRL. RESULTS: Results of these rather "ad hoc" statistical methods correspond very closely to the "real data" for Verburg, a data set that is similar to the INTERGROWTH-21st Project CRL data set. 
CONCLUSIONS: We are confident that we can use these approaches to get reliable estimates based on INTERGROWTH-21st Project CRL data. These approaches may be a solution to other truncation problems involving similar data though their application to other settings would need to be evaluated
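
    The truncation problem described in this abstract can be illustrated with a small simulation. The sketch below is not one of the three approaches the authors evaluated (the abstract does not describe them) and it does not use the Verburg equation; the linear CRL-GA relation, the noise level and the sample size are all hypothetical values chosen for illustration. It only shows why a naive regression of GA on CRL may change once GA is restricted to 9 to 14 weeks.

```python
import random


def ols(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: GA in weeks, CRL in mm from an assumed linear relation
# plus Gaussian noise. This is NOT the Verburg dating equation.
ga = [rng.uniform(7.0, 16.0) for _ in range(5000)]
crl = [-40.0 + 8.0 * g + rng.gauss(0.0, 5.0) for g in ga]

# Naive regression of GA on CRL using the untruncated data.
_, slope_full = ols(crl, ga)

# Impose truncation of the outcome (GA) at 9 and 14 weeks, then refit.
kept = [(c, g) for c, g in zip(crl, ga) if 9.0 <= g < 14.0]
_, slope_trunc = ols([c for c, _ in kept], [g for _, g in kept])

print(f"slope without truncation: {slope_full:.4f} weeks per mm")
print(f"slope with truncation:    {slope_trunc:.4f} weeks per mm")
```

    Under these assumptions the fitted slope is flatter after truncation, which is the kind of distortion that a dedicated method for truncated outcomes would need to correct.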

    Multimodal Convolutional Neural Networks to Detect Fetal Compromise During Labor and Delivery

    The gold standard for assessing whether a baby is at risk of oxygen deprivation during childbirth is continuous monitoring of the fetal heart rate with cardiotocography (CTG). The aim is to identify babies that could benefit from an emergency operative delivery (e.g., Cesarean section), in order to prevent death or permanent brain injury. The long, dynamic and complex CTG patterns are poorly understood, and their interpretation is known to have high false positive and false negative rates. Visual interpretation by clinicians is challenging, and reliable, accurate fetal monitoring in labor remains an enormous unmet medical need. In this work, we applied deep learning methods to achieve data-driven automated CTG evaluation. Multimodal Convolutional Neural Network (MCNN) and Stacked MCNN models were used to analyze the largest available database of routinely collected CTG and linked clinical data (comprising more than 35000 births). We also assessed in detail the impact of the signal quality on the MCNN performance. On a large hold-out testing set from Oxford (n = 4429 births), MCNN improved the prediction of cord acidemia at birth when compared with Clinical Practice and previous computerized approaches. On two external datasets, MCNN demonstrated better performance compared to current feature extraction-based methods. Our group is the first to apply deep learning for the analysis of CTG. We conclude that MCNN models hold potential for the prediction of cord acidemia at birth and further work is warranted. Despite the advances, our deep learning models are currently not suitable for the detection of severe fetal injury in the absence of cord acidemia - a heterogeneous, small, and poorly understood group. We suggest that the most promising way forward is hybrid approaches to CTG interpretation in labor, in which different diagnostic models can estimate the risk for different types of fetal compromise, incorporating clinical knowledge with data-driven analyses

    Reduction in twin stillbirth following implementation of NICE guidance.

    OBJECTIVE: There has been an unprecedented fall in the rate of stillbirth in twin pregnancy in the UK. It is contested whether implementation of the National Institute for Health and Care Excellence (NICE) guidance on the antenatal management of uncomplicated twin pregnancies has contributed to this change. The aim of this study was to investigate whether the implementation of NICE guidance was associated with a reduction in the rate of stillbirth in twin pregnancies delivered in a large UK hospital. METHODS: This was a retrospective cohort study including all twin pregnancies delivered at St George's Hospital, London, UK, between 2000 and 2018. Data were analyzed according to two time periods: before implementation of the NICE guidance on twins (before June 2013; pre-NICE) and after its implementation (after June 2013; post-NICE). The exclusion criteria were higher-order multiple gestations, pregnancies of unknown chorionicity, pregnancies complicated by miscarriage, those that underwent termination and those diagnosed with vanishing twin. The main outcome was stillbirth. Other outcomes included neonatal death (NND), admission to the neonatal intensive care unit (NICU) and emergency Cesarean section. We planned a priori a sensitivity analysis according to chorionicity. The chi-square test and Mann-Whitney U-test were used to compare outcomes between the study groups. RESULTS: We included in the analysis 1666 twin pregnancies (3332 fetuses), of which 1114 pregnancies (2228 fetuses) were delivered before and 552 pregnancies (1104 fetuses) after June 2013. Of those, 1299 were dichorionic and 354 were monochorionic diamniotic. The incidence of stillbirth was significantly lower in the post-NICE than in the pre-NICE group (3.6 per 1000 births vs 13.5 per 1000 births; P = 0.008). 
The reduction in stillbirth rate was from 8.5 to 3.6 per 1000 births (P = 0.161) in dichorionic and from 33.6 to 3.8 per 1000 births (P = 0.011) in monochorionic diamniotic twin pregnancies. There was no significant difference in the rates of NND (P = 0.625), NICU admission (P = 0.506) or emergency Cesarean section (P = 0.820) between the two groups. The median gestational age at delivery was significantly lower in the post-NICE than in the pre-NICE group (median 36.3 vs 36.9 weeks; P  70% in the stillbirth rate in twin pregnancies was noted after implementation of the NICE guidance. This reduction was statistically significant in monochorionic, but not dichorionic, twin pregnancies. The improvement in twin pregnancy outcome was achieved without a concomitant increase in NND, admission to the NICU or emergency Cesarean section. © 2020 The Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of the International Society of Ultrasound in Obstetrics and Gynecology
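
    The size of the reported fall in stillbirth rate can be checked with simple arithmetic. The sketch below uses only the rates per 1000 births quoted in the abstract; the function name and the dictionary layout are illustrative choices, and relative reduction is assumed to mean (before - after) / before.

```python
def relative_reduction(rate_before: float, rate_after: float) -> float:
    """Return the relative reduction from rate_before to rate_after, in percent."""
    return (rate_before - rate_after) / rate_before * 100.0


# Stillbirth rates per 1000 births as quoted in the abstract: (pre-NICE, post-NICE).
reported_rates = {
    "all twins": (13.5, 3.6),
    "dichorionic": (8.5, 3.6),
    "monochorionic diamniotic": (33.6, 3.8),
}

reductions = {
    group: relative_reduction(pre, post)
    for group, (pre, post) in reported_rates.items()
}

for group, pct in reductions.items():
    print(f"{group}: {pct:.1f}% relative reduction")
```

    For all twin pregnancies this gives a relative reduction of roughly 73%, which agrees with the figure of more than 70% stated in the abstract.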

    Detection of non-cardiac fetal abnormalities on ultrasound at 11–14 weeks: systematic review and meta-analysis

    Objectives To assess the diagnostic accuracy of two-dimensional ultrasound at 11–14 weeks' gestation as a screening test for individual fetal anomalies and to identify factors impacting on screening performance. Methods This was a systematic review and meta-analysis that was developed and registered with PROSPERO (CRD42018111781). MEDLINE, EMBASE, Web of Science Core Collection and the Cochrane Library were searched for studies evaluating the diagnostic accuracy of screening for 16 predefined, non-cardiac, congenital anomalies considered to be of interest to the early anomaly scan. We included prospective and retrospective studies from any healthcare setting conducted in low-risk, mixed-risk and unselected populations. The reference standard was the detection of an anomaly on postnatal or postmortem examination. Data were extracted to populate 2 × 2 tables and a random-effects model was used to determine the diagnostic accuracy of screening for the predefined anomalies (individually and as a composite). Secondary analyses were performed to determine the impact on detection rates of imaging protocol, type of ultrasound modality, publication year and index of sonographer suspicion at the time of scanning. Post-hoc secondary analysis was conducted to assess performance among studies published during or after 2010. Risk of bias assessment and quality assessment were undertaken for included studies using the Quality Assessment of Diagnostic Accuracy Studies-2 tool. Results From 5684 citations, 202 papers underwent full-text review, resulting in the inclusion of 52 studies comprising 527 837 fetuses, of which 2399 were affected by one or more of the 16 predefined anomalies. 
Individual anomalies were not equally amenable to detection on first-trimester ultrasound: a high (> 80%) detection rate was reported for severe conditions, including acrania (98%), gastroschisis (96%), exomphalos (95%) and holoprosencephaly (88%); the detection rate was lower for open spina bifida (69%), lower urinary tract obstruction (66%), lethal skeletal dysplasias (57%) and limb-reduction defects (50%); and the detection rate was below 50% for facial clefts (43%), polydactyly (40%) and congenital diaphragmatic hernia (38%). Conditions with a low ( 99% for all anomalies. Secondary analysis showed that detection improved with advancing publication year, and that the use of imaging protocols had a statistically significant impact on screening performance (P  Conclusions The accurate detection of congenital anomalies using first-trimester ultrasound is feasible, although detection rates and false-positive rates depend on the type of anomaly. The use of a standardized protocol allows for diagnostic performance to be maximized, particularly for the detection of spina bifida, facial clefts and limb-reduction defects. Highlighting the types of anomalies amenable to diagnosis and determining factors enhancing screening performance can support the development of first-trimester anomaly screening programs
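
    The abstract describes extracting data into 2 × 2 tables, from which detection rate (sensitivity) and specificity follow directly. A minimal sketch of that per-study calculation is given below; the class name and the counts are hypothetical and are not taken from any included study, and the random-effects pooling across studies is not shown.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one screening study, compared against the reference standard."""
    true_pos: int   # anomaly present, detected on the scan
    false_neg: int  # anomaly present, missed on the scan
    false_pos: int  # anomaly absent, suspected on the scan
    true_neg: int   # anomaly absent, scan normal

    def sensitivity(self) -> float:
        """Detection rate: detected cases divided by all affected fetuses."""
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        """Correctly cleared fetuses divided by all unaffected fetuses."""
        return self.true_neg / (self.true_neg + self.false_pos)


# Hypothetical counts for illustration only.
table = TwoByTwo(true_pos=45, false_neg=5, false_pos=10, true_neg=9940)

print(f"detection rate: {table.sensitivity():.1%}")
print(f"specificity:    {table.specificity():.2%}")
```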

    Quality of ultrasound biometry obtained by local health workers in a refugee camp on the Thai-Burmese border.

    Objective: In a refugee camp on the Thai–Burmese border, accurate dating of pregnancy relies on ultrasound measurements obtained by locally trained health workers. The aim of this study was to substantiate the accuracy of fetal biometry measurements performed by locally trained health workers by comparing derived reference equations with those published for Asian and European hospitals. Methods: This prospective observational study included 1090 women who had a dating crown–rump length (CRL) scan and one study-appointed ultrasound biometry scan between 16 and 40 weeks of gestation. The average of two measurements of each of biparietal diameter, head circumference, abdominal circumference and femur length was used in a polynomial regression model for the mean and SD against gestational age (GA). The biometry equations obtained were compared with published equations of professional sonographers from Asian and European hospitals by evaluation of the SD and Z-scores of differences between models. Results: Reference equations of biometric parameters were found to fit cubic polynomial models. The observed SD values, for any given GA, of fetal biometric measurements obtained by locally trained health workers were lower than those previously reported by centers with professional sonographers. For nearly the entire GA range considered, the mean values of the Asian and European equations for all four biometric measurements were within the 90% expected range (mean ± 1.645 SD) of our equations. Conclusion: Locally trained health workers in a refugee camp on the Thai–Burmese border can obtain measurements that are associated with low SD values and within the normal limits of published Asian and European equations. The fact that the SD values were lower than in other studies may be explained by the use of the average of two measurements, CRL dating or motivation of the locally trained sonographer
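
    The comparison in this abstract relies on Z-scores and on the 90% expected range, defined as mean ± 1.645 SD. The sketch below shows both calculations; the function names and the numeric values (a mean of 250 mm with an SD of 10 mm) are hypothetical and do not come from the study's reference equations.

```python
def z_score(value: float, mean: float, sd: float) -> float:
    """Number of standard deviations by which value differs from mean."""
    return (value - mean) / sd


def expected_range(mean: float, sd: float, z: float = 1.645) -> tuple[float, float]:
    """Central range mean +/- z * sd; z = 1.645 gives the 90% expected range."""
    return mean - z * sd, mean + z * sd


def within_range(value: float, mean: float, sd: float, z: float = 1.645) -> bool:
    """True if value lies inside the expected range for the given mean and SD."""
    low, high = expected_range(mean, sd, z)
    return low <= value <= high


# Hypothetical values at one gestational age (mm), for illustration only.
local_mean, local_sd = 250.0, 10.0
published_mean = 260.0

low, high = expected_range(local_mean, local_sd)
print(f"90% expected range: {low:.2f} to {high:.2f} mm")
print(f"Z-score of published mean: {z_score(published_mean, local_mean, local_sd):.2f}")
print(f"published mean within range: {within_range(published_mean, local_mean, local_sd)}")
```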

    Quality-improvement program for ultrasound-based fetal anatomy screening using large-scale clinical audit.

    OBJECTIVE: A large-scale audit and peer review of ultrasound images may improve sonographer performance, but is rarely performed consistently as it is time-consuming and expensive. The aim of this study was to perform a large-scale audit of routine fetal anatomy scans to assess if a full clinical audit cycle can improve clinical image-acquisition standards. METHODS: A large-scale, clinical, retrospective audit was conducted of ultrasound images obtained during all routine anomaly scans performed from 18 + 0 to 22 + 6 weeks' gestation at a UK hospital during 2013 (Cycle 1), to build a baseline understanding of the performance of sonographers. Targeted actions were undertaken in response to the findings with the aim of improving departmental performance. A second full-year audit was then performed of fetal anatomy ultrasound images obtained during the following year (Cycle 2). An independent pool of experienced sonographers used an online tool to assess all scans in terms of two parameters: scan completeness (i.e. were all images archived?) and image quality using objective scoring (i.e. were images of high quality?). Both were assessed in each audit at the departmental level and at the individual sonographer level. A random sample of 10% of scans was used to assess interobserver reproducibility. RESULTS: In Cycle 1 of the audit, 103 501 ultrasound images from 6257 anomaly examinations performed by 22 sonographers were assessed; in Cycle 2, 153 557 images from 6406 scans performed by 25 sonographers were evaluated. The analysis was performed including the images obtained by the 20 sonographers who participated in both cycles. Departmental median scan completeness improved from 72% in the first year to 78% at the second assessment (P < 0.001); median image-quality score for all fetal views improved from 0.83 to 0.86 (P < 0.001). 
The improvement was greatest for those sonographers who performed poorest in the first audit; with regard to scan completeness, the poorest performing 15% of sonographers in Cycle 1 improved by more than 30 percentage points, and with regard to image quality, the poorest performing 11% in Cycle 1 showed a more than 10% improvement. Interobserver repeatability of scan completeness and image-quality scores across different fetal views was similar to that in the published literature. CONCLUSIONS: A clinical audit and a set of targeted actions helped improve sonographer scan-acquisition completeness and scan quality. Such adherence to recommended clinical acquisition standards may increase the likelihood of correct measurement and thereby fetal growth assessment, and should allow better detection of abnormalities. As such a large-scale audit is time-consuming, further advantages would be achieved if this process could be automated. © 2018 The Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of the International Society of Ultrasound in Obstetrics and Gynecology

    Spatio-Temporal Partitioning And Description Of Full-Length Routine Fetal Anomaly Ultrasound Scans

    This paper considers automatic clinical workflow description of full-length routine fetal anomaly ultrasound scans using deep learning approaches for spatio-temporal video analysis. Multiple architectures consisting of 2D and 2D + t CNN, LSTM, and convolutional LSTM are investigated and compared. The contributions of short-term and long-term temporal changes are studied, and a multi-stream framework analysis is found to achieve the best top-1 accuracy = 0.77 and top-3 accuracy = 0.94. Automated partitioning and characterisation on unlabelled full-length video scans show high correlation (ρ = 0.95, p = 0.0004) with workflow statistics of manually labelled videos, suggesting practicality of proposed methods
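
    Top-1 and top-3 accuracy, the two metrics quoted in this abstract, count a prediction as correct when the true class is among the k highest-scoring classes. The sketch below is a generic definition of that metric, not the authors' evaluation code; the scores, labels and number of classes are hypothetical.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical per-class scores for four video segments and four classes.
scores = [
    [0.7, 0.1, 0.1, 0.1],
    [0.3, 0.4, 0.2, 0.1],
    [0.1, 0.2, 0.3, 0.4],
    [0.4, 0.3, 0.2, 0.1],
]
labels = [0, 2, 3, 3]  # hypothetical true class index of each segment

top1 = top_k_accuracy(scores, labels, k=1)
top3 = top_k_accuracy(scores, labels, k=3)
print(f"top-1 accuracy = {top1:.2f}, top-3 accuracy = {top3:.2f}")
```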

    Machine learning-based analysis of operator pupillary response to assess cognitive workload in clinical ultrasound imaging.

    INTRODUCTION: Pupillometry, the measurement of eye pupil diameter, is a well-established and objective modality correlated with cognitive workload. In this paper, we analyse the pupillary response of ultrasound imaging operators to assess their cognitive workload, captured while they undertake routine fetal ultrasound examinations. Our experiments and analysis are performed on real-world datasets obtained using remote eye-tracking under natural clinical environmental conditions. METHODS: Our analysis pipeline involves careful temporal sequence (time-series) extraction by retrospectively matching the pupil diameter data with tasks captured in the corresponding ultrasound scan video in a multi-modal data acquisition setup. This is followed by the pupil diameter pre-processing and the calculation of pupillary response sequences. Exploratory statistical analysis of the operator pupillary responses and comparisons of the distributions between ultrasonographic tasks (fetal heart versus fetal brain) and operator expertise (newly-qualified versus experienced operators) are performed. Machine learning is explored to automatically classify the temporal sequences into the corresponding ultrasonographic tasks and operator experience using temporal, spectral, and time-frequency features with classical (shallow) models, and convolutional neural networks as deep learning models. RESULTS: Preliminary statistical analysis of the extracted pupillary response shows a significant variation for different ultrasonographic tasks and operator expertise, suggesting different extents of cognitive workload in each case, as measured by pupillometry. The best-performing machine learning models achieve receiver operating characteristic (ROC) area under curve (AUC) values of 0.98 and 0.80, for ultrasonographic task classification and operator experience classification, respectively. 
CONCLUSION: We conclude that we can successfully assess cognitive workload from pupil diameter changes measured while ultrasound operators perform routine scans. Machine learning allows the discrimination of the undertaken ultrasonographic tasks and scanning expertise using the pupillary response sequences as an index of the operators' cognitive workload. A high cognitive workload can reduce operator efficiency and constrain their decision-making; hence, the ability to objectively assess cognitive workload is a first step towards understanding these effects on operator performance in biomedical applications such as medical imaging
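
    The ROC AUC values reported in this abstract can be read as the probability that a randomly chosen positive sequence receives a higher classifier score than a randomly chosen negative one. The sketch below computes AUC directly from that pairwise definition; it is not the authors' pipeline, and the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Each positive/negative pair scores 1 if the positive is ranked higher,
    0.5 if the two scores are tied, and 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores; label 1 could stand for one task (e.g. fetal heart)
# and label 0 for the other (e.g. fetal brain).
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.1]
labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

    The pairwise form is quadratic in the number of sequences, which is acceptable for a small illustration; a rank-based calculation would be the usual choice for larger data sets.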