515 research outputs found
Kepler Presearch Data Conditioning II - A Bayesian Approach to Systematic Error Correction
With the unprecedented photometric precision of the Kepler Spacecraft,
significant systematic and stochastic errors on transit signal levels are
observable in the Kepler photometric data. These errors, which include
discontinuities, outliers, systematic trends and other instrumental signatures,
obscure astrophysical signals. The Presearch Data Conditioning (PDC) module of
the Kepler data analysis pipeline tries to remove these errors while preserving
planet transits and other astrophysically interesting signals. The completely
new noise and stellar variability regime observed in Kepler data poses a
significant problem to standard cotrending methods such as SYSREM and TFA.
Variable stars are often of particular astrophysical interest so the
preservation of their signals is of significant importance to the astrophysical
community. We present a Bayesian Maximum A Posteriori (MAP) approach where a
subset of highly correlated and quiet stars is used to generate a cotrending
basis vector set which is in turn used to establish a range of "reasonable"
robust fit parameters. These robust fit parameters are then used to generate a
Bayesian Prior and a Bayesian Posterior Probability Distribution Function (PDF)
which, when maximized, finds the best fit that simultaneously removes systematic
effects while reducing the signal distortion and noise injection which commonly
afflicts simple least-squares (LS) fitting. A numerical and empirical approach
is taken where the Bayesian Prior PDFs are generated from fits to the light
curve distributions themselves.
Comment: 43 pages, 21 figures, submitted for publication in PASP. Also see companion paper "Kepler Presearch Data Conditioning I - Architecture and Algorithms for Error Correction in Kepler Light Curves" by Martin C. Stumpe, et al.
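For orientation, in the Gaussian case the MAP machinery the abstract describes reduces to a regularized least-squares fit. Below is a minimal sketch assuming an independent Gaussian prior on each cotrending coefficient; the basis vector, variable names, and toy data are illustrative, not the Kepler pipeline's.

```python
import numpy as np

def map_cotrend(y, B, prior_mean, prior_var, noise_var):
    """MAP fit of cotrending basis vectors B (n_samples x n_vectors) to a
    light curve y, with an independent Gaussian prior on each coefficient.
    With Gaussian noise and a Gaussian prior the posterior maximum has a
    closed, ridge-like form. Illustrative only: the real PDC-MAP prior is
    built empirically from robust fits across the ensemble of stars.
    """
    P_inv = np.diag(1.0 / np.asarray(prior_var))     # prior precision
    A = B.T @ B / noise_var + P_inv                  # posterior precision
    b = B.T @ y / noise_var + P_inv @ np.asarray(prior_mean)
    theta = np.linalg.solve(A, b)                    # MAP coefficients
    return y - B @ theta                             # systematics removed

# Toy example: one shared instrumental drift plus white noise.
rng = np.random.default_rng(0)
drift = np.linspace(0.0, 1.0, 500)
B = drift[:, None]
y = 0.3 * drift + 0.01 * rng.standard_normal(500)
ls_flux = y - B @ np.linalg.lstsq(B, y, rcond=None)[0]    # plain LS fit
map_flux = map_cotrend(y, B, prior_mean=[0.3], prior_var=[0.05**2],
                       noise_var=0.01**2)                 # prior-constrained
```

The prior pulls the fitted coefficient toward the ensemble-derived value, which is what suppresses the signal distortion and noise injection that an unconstrained least-squares fit can introduce.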
Kepler Presearch Data Conditioning I - Architecture and Algorithms for Error Correction in Kepler Light Curves
Kepler provides light curves of 156,000 stars with unprecedented precision.
However, the raw data as they come from the spacecraft contain significant
systematic and stochastic errors. These errors, which include discontinuities,
systematic trends, and outliers, obscure the astrophysical signals in the light
curves. To correct these errors is the task of the Presearch Data Conditioning
(PDC) module of the Kepler data analysis pipeline. The original version of PDC
in Kepler did not meet the extremely high performance requirements for the
detection of minuscule planet transits or highly accurate analysis of stellar
activity and rotation. One particular deficiency was that astrophysical
features were often removed as a side effect of the error removal. In this
paper we introduce the completely new and significantly improved version of PDC
which was implemented in Kepler SOC 8.0. This new PDC version, which utilizes a
Bayesian approach for removal of systematics, reliably corrects errors in the
light curves while at the same time preserving planet transits and other
astrophysically interesting signals. We describe the architecture and the
algorithms of this new PDC module, show typical errors encountered in Kepler
data, and illustrate the corrections using real light curve examples.
Comment: Submitted to PASP. Also see companion paper "Kepler Presearch Data Conditioning II - A Bayesian Approach to Systematic Error Correction" by Jeff C. Smith, et al.
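As a small illustration of one error type listed above, here is a minimal sketch of flagging isolated outliers against a running median; PDC's actual detection of outliers and discontinuities is considerably more sophisticated.

```python
import numpy as np

def flag_outliers(flux, window=25, nsigma=5.0):
    """Flag isolated outliers against a running median, a common first
    step in light-curve cleaning. Illustrative only; not the PDC algorithm.
    """
    pad = window // 2
    padded = np.pad(flux, pad, mode="edge")
    running_med = np.array([np.median(padded[i:i + window])
                            for i in range(len(flux))])
    resid = flux - running_med
    mad = np.median(np.abs(resid - np.median(resid)))
    sigma = 1.4826 * mad                     # robust Gaussian-equivalent scale
    return np.abs(resid) > nsigma * sigma    # boolean mask of outlier cadences
```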
Human-Centered Tools for Coping with Imperfect Algorithms during Medical Decision-Making
Machine learning (ML) is increasingly being used in image retrieval systems
for medical decision making. One application of ML is to retrieve visually
similar medical images from past patients (e.g. tissue from biopsies) to
reference when making a medical decision with a new patient. However, no
algorithm can perfectly capture an expert's ideal notion of similarity for
every case: an image that is algorithmically determined to be similar may not
be medically relevant to a doctor's specific diagnostic needs. In this paper,
we identified the needs of pathologists when searching for similar images
retrieved using a deep learning algorithm, and developed tools that empower
users to cope with the search algorithm on-the-fly, communicating what types of
similarity are most important at different moments in time. In two evaluations
with pathologists, we found that these refinement tools increased the
diagnostic utility of images found and increased user trust in the algorithm.
The tools were preferred over a traditional interface, without a loss in
diagnostic accuracy. We also observed that users adopted new strategies when
using refinement tools, re-purposing them to test and understand the underlying
algorithm and to disambiguate ML errors from their own errors. Taken together,
these findings inform future human-ML collaborative systems for expert
decision-making.
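One way to picture such refinement tools is as a user-adjustable re-weighting of per-attribute similarity scores at query time. A minimal sketch follows; the attribute names and scores are hypothetical, not taken from the paper.

```python
import numpy as np

# Hypothetical per-attribute similarity scores for four retrieved images;
# columns might represent, e.g., color, texture, and structure similarity.
scores = np.array([
    [0.9, 0.2, 0.4],
    [0.3, 0.8, 0.7],
    [0.5, 0.5, 0.9],
    [0.7, 0.6, 0.2],
])

def rerank(scores, weights):
    """Re-rank retrieved images by a user-adjustable weighted combination
    of per-attribute similarities; returns image indices, best match first."""
    w = np.asarray(weights, dtype=float)
    w /= w.sum()
    return np.argsort(scores @ w)[::-1]

print(rerank(scores, [1, 0, 0]))   # user emphasizes the first attribute
print(rerank(scores, [0, 0, 1]))   # user emphasizes the third attribute
```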
Microscope 2.0: An Augmented Reality Microscope with Real-time Artificial Intelligence Integration
The brightfield microscope is instrumental in the visual examination of both
biological and physical samples at sub-millimeter scales. One key clinical
application has been in cancer histopathology, where the microscopic assessment
of the tissue samples is used for the diagnosis and staging of cancer and thus
guides clinical therapy. However, the interpretation of these samples is
inherently subjective, resulting in significant diagnostic variability.
Moreover, in many regions of the world, access to pathologists is severely
limited due to lack of trained personnel. In this regard, Artificial
Intelligence (AI) based tools promise to improve the access and quality of
healthcare. However, despite significant advances in AI research, integration
of these tools into real-world cancer diagnosis workflows remains challenging
because of the costs of image digitization and difficulties in deploying AI
solutions. Here we propose a cost-effective solution to the integration of AI:
the Augmented Reality Microscope (ARM). The ARM overlays AI-based information
onto the current view of the sample through the optical pathway in real-time,
enabling seamless integration of AI into the regular microscopy workflow. We
demonstrate the utility of ARM in the detection of lymph node metastases in
breast cancer and the identification of prostate cancer with a latency that
supports real-time workflows. We anticipate that ARM will remove barriers
towards the use of AI in microscopic analysis and thus improve the accuracy and
efficiency of cancer diagnosis. This approach is applicable to other microscopy
tasks and AI algorithms in the life sciences and beyond.
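Conceptually, the overlay step amounts to blending a model's per-pixel probability map onto the live view. A minimal sketch of that blending, assuming the model output is already registered to the frame; the real ARM projects the result through the microscope's optical pathway rather than compositing in software.

```python
import numpy as np

def overlay_heatmap(frame_rgb, prob_map, alpha=0.4, threshold=0.5):
    """Blend a per-pixel probability map onto a camera frame as a red
    highlight. frame_rgb is an (H, W, 3) uint8 image; prob_map holds
    floats in [0, 1], already registered to the frame.
    """
    blended = frame_rgb.astype(float)
    mask = prob_map > threshold                  # highlight confident pixels only
    red = np.zeros_like(blended)
    red[..., 0] = 255.0 * prob_map               # intensity tracks probability
    blended[mask] = (1 - alpha) * blended[mask] + alpha * red[mask]
    return blended.astype(np.uint8)
```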
Prediction of MET Overexpression in Non-Small Cell Lung Adenocarcinomas from Hematoxylin and Eosin Images
MET protein overexpression is a targetable event in non-small cell lung
cancer (NSCLC) and is the subject of active drug development. Challenges in
identifying patients for these therapies include lack of access to validated
testing, such as standardized immunohistochemistry (IHC) assessment, and
consumption of valuable tissue for a single gene/protein assay. Development of
pre-screening algorithms using routinely available digitized hematoxylin and
eosin (H&E)-stained slides to predict MET overexpression could promote testing
for those who will benefit most. While assessment of MET expression using IHC
is currently not routinely performed in NSCLC, next-generation sequencing is
common and in some cases includes RNA expression panel testing. In this work,
we leveraged a large database of matched H&E slides and RNA expression data to
train a weakly supervised model to predict MET RNA overexpression directly from
H&E images. This model was evaluated on an independent holdout test set of 300
overexpressed and 289 normal patients, demonstrating an ROC-AUC of 0.70 (95th
percentile interval: 0.66-0.74) with stable performance characteristics
across different patient clinical variables and robust to synthetic noise on
the test set. These results suggest that H&E-based predictive models could be
useful to prioritize patients for confirmatory testing of MET protein or MET
gene expression status.
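A percentile interval like the one quoted is typically obtained by bootstrapping the test set; here is a minimal sketch of that procedure, with synthetic scores standing in for model outputs (the paper does not specify its exact resampling scheme).

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_interval(y_true, y_score, n_boot=2000, seed=0):
    """Percentile bootstrap interval for ROC-AUC: resample patients with
    replacement and take empirical percentiles of the resampled AUCs."""
    rng = np.random.default_rng(seed)
    n, aucs = len(y_true), []
    while len(aucs) < n_boot:
        idx = rng.integers(0, n, n)
        if y_true[idx].min() == y_true[idx].max():  # need both classes present
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    return np.percentile(aucs, [2.5, 97.5])

# Synthetic stand-in for a 589-patient test set (300 positive, 289 negative).
rng = np.random.default_rng(1)
y = np.concatenate([np.ones(300, int), np.zeros(289, int)])
s = np.concatenate([rng.normal(0.75, 1.0, 300), rng.normal(0.0, 1.0, 289)])
print(roc_auc_score(y, s), bootstrap_auc_interval(y, s))
```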
Fundamental Properties of Stars using Asteroseismology from Kepler & CoRoT and Interferometry from the CHARA Array
We present results of a long-baseline interferometry campaign using the PAVO
beam combiner at the CHARA Array to measure the angular sizes of five
main-sequence stars, one subgiant and four red giant stars for which solar-like
oscillations have been detected by either Kepler or CoRoT. By combining
interferometric angular diameters, Hipparcos parallaxes, asteroseismic
densities, bolometric fluxes and high-resolution spectroscopy we derive a full
set of near model-independent fundamental properties for the sample. We first
use these properties to test asteroseismic scaling relations for the frequency
of maximum power (nu_max) and the large frequency separation (Delta_nu). We
find excellent agreement within the observational uncertainties, and
empirically show that simple estimates of asteroseismic radii for main-sequence
stars are accurate to <~4%. We furthermore find good agreement of our measured
effective temperatures with spectroscopic and photometric estimates with mean
deviations for stars between T_eff = 4600-6200 K of -22+/-32 K (with a scatter
of 97 K) and -58+/-31 K (with a scatter of 93 K), respectively. Finally we
present a first comparison with evolutionary models, and find differences
between observed and theoretical properties for the metal-rich main-sequence
star HD173701. We conclude that the constraints presented in this study will
have strong potential for testing stellar model physics, in particular when
combined with detailed modelling of individual oscillation frequencies.
Comment: 18 pages, 12 figures, 7 tables; accepted for publication in ApJ
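The radius test draws on the standard asteroseismic scaling relations. A minimal sketch of the radius relation, using commonly adopted solar reference values (nu_max,sun ~ 3090 muHz, Delta_nu,sun ~ 135.1 muHz, T_eff,sun = 5777 K):

```python
NU_MAX_SUN = 3090.0    # muHz, commonly adopted solar reference
DELTA_NU_SUN = 135.1   # muHz
TEFF_SUN = 5777.0      # K

def scaling_radius(nu_max, delta_nu, teff):
    """Asteroseismic radius in solar units from the standard scaling
    relations:
        R/Rsun = (nu_max/nu_max_sun) * (Delta_nu/Delta_nu_sun)**-2
                 * (Teff/Teff_sun)**0.5
    """
    return ((nu_max / NU_MAX_SUN)
            * (delta_nu / DELTA_NU_SUN) ** -2
            * (teff / TEFF_SUN) ** 0.5)

# Solar inputs recover 1 Rsun by construction.
print(scaling_radius(3090.0, 135.1, 5777.0))   # -> 1.0
```

Radii derived this way are what the interferometric angular diameters, combined with Hipparcos parallaxes, test directly; the abstract's <~4% accuracy claim refers to exactly this comparison.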
Reproducibility of 3-dimensional ultrasound readings of volume of carotid atherosclerotic plaque
Background: Non-invasive 3-dimensional (3D) ultrasound (US) has emerged as the predominant approach for evaluating the progression of carotid atherosclerosis and its response to treatment. The aim of this study was to investigate the quality of a central reading procedure concerning plaque volume (PV), measured by 3D US in a multinational US trial.
Methods: Two data sets of 45 and 60 3D US patient images of plaques (mean PV, 71.8 and 39.8 μl, respectively) were used. PV was assessed by means of manual planimetry. The intraclass correlation coefficient (ICC) was applied to determine reader variabilities. The repeatability coefficient (RC) and the coefficient of variation (CV) were used to investigate the effect of the number of slices (S) in manual planimetry and of plaque size on measurement variability.
Results: Intra-reader variability was small, as reflected by ICCs of 0.985, 0.967 and 0.969 for the 3 appointed readers. The ICC value generated between the 3 readers was 0.964, indicating that inter-reader variability was small, too. Subgroup analyses showed that both intra- and inter-reader variabilities were lower for larger than for smaller plaques. Mean CVs were similar for the 5S- and 10S-methods, with an RC of 4.7 μl. The RC between both methods as well as the CVs were comparatively lower for larger plaques.
Conclusion: By implementing standardised central 3D US reading protocols and strict quality control procedures, highly reliable ultrasonic re-readings of plaque images can be achieved in large multicentre trials.
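The variability statistics quoted above have simple closed forms; here is a minimal sketch for paired re-readings, using the Bland-Altman convention for the repeatability coefficient and synthetic plaque volumes (not the study's data).

```python
import numpy as np

def repeatability_coefficient(read1, read2):
    """Bland-Altman repeatability coefficient for paired re-readings:
    1.96 times the standard deviation of the paired differences."""
    d = np.asarray(read1) - np.asarray(read2)
    return 1.96 * d.std(ddof=1)

def coefficient_of_variation(read1, read2):
    """Mean within-pair CV in percent: SD of each pair over its mean."""
    r = np.column_stack([read1, read2])
    return 100.0 * np.mean(r.std(axis=1, ddof=1) / r.mean(axis=1))

# Synthetic plaque volumes (microlitres), two readings per plaque.
rng = np.random.default_rng(0)
true_pv = rng.uniform(20.0, 120.0, 45)
read1 = true_pv + rng.normal(0.0, 1.7, 45)
read2 = true_pv + rng.normal(0.0, 1.7, 45)
print(repeatability_coefficient(read1, read2))
print(coefficient_of_variation(read1, read2))
```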