Quantification of MR spectra by deep learning in an idealized setting: Investigation of forms of input, network architectures, optimization by ensembles of networks, and training bias.
PURPOSE
The aims of this work are (1) to explore deep learning (DL) architectures, spectroscopic input types, and learning designs toward optimal quantification in MR spectroscopy of simulated pathological spectra; and (2) to demonstrate accuracy and precision of DL predictions in view of inherent bias toward the training distribution.
METHODS
Simulated 1D spectra and 2D spectrograms that mimic an extensive range of pathological in vivo conditions are used to train and test 24 different DL architectures. Active learning through altered training and testing data distributions is probed to optimize quantification performance. Ensembles of networks are explored to improve DL robustness and reduce the variance of estimates. A set of scores compares performances of DL predictions and traditional model fitting (MF).
RESULTS
Ensembles of heterogeneous networks that combine 1D frequency-domain spectra and 2D time-frequency-domain spectrograms as input perform best. Dataset augmentation with active learning can improve performance, but gains are limited. MF is more accurate, although DL appears to be more precise at low SNR. However, this apparent gain in precision originates from a strong bias, in cases with high uncertainty, toward the dataset the network has been trained on, with predictions tending toward its average value.
CONCLUSION
MF mostly performs better than the faster DL approach. Potential intrinsic biases toward the training set are dangerous in a clinical context, which requires the algorithm to be unbiased with respect to outliers (i.e., pathological data). Active learning and ensembles of networks are good strategies to improve prediction performance. However, data quality (sufficient SNR) has proven to be a bottleneck for adequate unbiased performance, as it is for MF.
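As a toy illustration of why the ensembling strategy described above reduces the variance of estimates, the sketch below averages the outputs of several noisy "network" estimators of a single metabolite concentration. All values are synthetic and illustrative; this is not the study's pipeline, only the variance-reduction principle it relies on:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for spectral quantification: each "network" acts as a noisy,
# unbiased estimator of a metabolite concentration from a simulated spectrum.
true_conc = 2.0
n_networks = 8

# Hypothetical per-network predictions over 1000 test spectra.
single_preds = true_conc + rng.normal(0.0, 0.5, size=(n_networks, 1000))

# Ensemble prediction: average the member outputs for each test spectrum.
ensemble_pred = single_preds.mean(axis=0)

# Variance of a single network's estimates vs. the ensemble average.
var_single = single_preds[0].var()
var_ensemble = ensemble_pred.var()
print(var_ensemble < var_single)  # averaging reduces prediction variance
```

Averaging N independent, unbiased estimators shrinks the variance by roughly a factor of N, which is the mechanism behind the reduced variance of the ensembles in the text; it does not remove a shared training-set bias.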
Properties of ultra-cool dwarfs with Gaia. An assessment of the accuracy for the temperature determination
We aimed to assess the accuracy of the Gaia Teff and log g estimates as
derived with current models and observations. We assessed the validity of
several inference techniques for deriving the physical parameters of ultra-cool
dwarf stars. We used synthetic spectra derived from ultra-cool dwarf models to
construct (train) the regression models. We derived the intrinsic uncertainties
of the best inference models and assessed their validity by comparing the
estimated parameters with the values derived in the bibliography for a sample
of ultra-cool dwarf stars observed from the ground. We estimated the total
number of ultra-cool dwarfs per spectral subtype, and obtained values that can
be summarised (in orders of magnitude) as 400000 objects in the M5-L0 range,
600 objects between L0 and L5, 30 objects between L5 and T0, and 10 objects
between T0 and T8. A bright ultra-cool dwarf (with Teff = 2500 K and
log g = 3.5) will be detected by Gaia out to approximately 220 pc, while for
Teff = 1500 K (spectral type L5) and the same surface gravity, this maximum
distance reduces
to 10-20 pc. The RMSE of the prediction deduced from ground-based spectra of
ultra-cool dwarfs simulated at the Gaia spectral range and resolution, and for
a Gaia magnitude G=20 is 213 K and 266 K for the models based on k-nearest
neighbours and Gaussian process regression, respectively. These are total
errors in the sense that they include the internal and external errors, with
the latter caused by the inability of the synthetic spectral models (used for
the construction of the regression models) to exactly reproduce the observed
spectra, and by the large uncertainties in the current calibrations of spectral
types and effective temperatures.
Comment: 18 pages, 17 figures, accepted by Astronomy & Astrophysics
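The abstract above reports RMSE figures for k-nearest-neighbours regression models trained on synthetic spectra. The following minimal sketch shows that style of inference on a toy problem: the spectra, the Teff-dependent spectral shape, and all parameter values are invented for illustration and are not the models used in the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical training grid: synthetic spectra (rows) with known Teff labels,
# mimicking regression models built from ultra-cool dwarf model atmospheres.
n_train, n_bins = 200, 40
teff_train = rng.uniform(1000.0, 2500.0, n_train)
wavegrid = np.linspace(0.0, 1.0, n_bins)
# Toy spectra whose shape varies smoothly with Teff, plus observational noise.
spectra_train = np.exp(-np.outer(1000.0 / teff_train, wavegrid)) \
    + rng.normal(0.0, 0.01, (n_train, n_bins))

def knn_teff(spectrum, k=5):
    """Predict Teff as the mean label of the k nearest training spectra."""
    d = np.linalg.norm(spectra_train - spectrum, axis=1)
    return teff_train[np.argsort(d)[:k]].mean()

# Query: a noiseless spectrum generated at Teff = 1800 K.
query = np.exp(-1000.0 / 1800.0 * wavegrid)
print(knn_teff(query))
```

The "total error" notion in the abstract corresponds to the fact that, on real spectra, the training grid (synthetic models) does not exactly reproduce the observations, so the external error adds to the internal regression error sketched here.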
Maximum Entropy for Gravitational Wave Data Analysis: Inferring the Physical Parameters of Core-Collapse Supernovae
The gravitational wave signal arising from the collapsing iron core of a Type
II supernova progenitor star carries with it the imprint of the progenitor's
mass, rotation rate, degree of differential rotation, and the bounce depth.
Here, we show how to infer the gravitational radiation waveform of a core
collapse event from noisy observations in a network of two or more LIGO-like
gravitational wave detectors and, from the recovered signal, constrain these
source properties. Using these techniques, predictions from recent core
collapse modeling efforts, and the LIGO performance during its S4 science run,
we also show that gravitational wave observations by LIGO might have been
sufficient to provide reasonable estimates of the progenitor mass, angular
momentum and differential angular momentum, and depth of the core at bounce,
for a rotating core collapse event at a distance of a few kpc.
Comment: 44 pages, 12 figures; accepted version scheduled to appear in ApJ,
1 April 200
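The abstract describes recovering a burst waveform from noisy data in a network of two or more detectors. The toy sketch below illustrates only the simplest ingredient of that idea, coherent combination of detector streams; the actual method in the paper is a maximum-entropy regularized inversion, which is not implemented here, and all waveform and noise parameters are invented:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy two-detector network: the same burst waveform is buried in independent
# detector noise; coherently combining the streams recovers the signal better
# than any single detector can.
t = np.linspace(0.0, 1.0, 512)
signal = np.sin(2 * np.pi * 30 * t) * np.exp(-((t - 0.5) ** 2) / 0.002)

h1 = signal + rng.normal(0.0, 1.0, t.size)  # detector 1 stream
h2 = signal + rng.normal(0.0, 1.0, t.size)  # detector 2 stream

recovered = 0.5 * (h1 + h2)  # coherent average halves the noise variance

err_single = np.mean((h1 - signal) ** 2)
err_network = np.mean((recovered - signal) ** 2)
print(err_network < err_single)
```

Once a waveform estimate is recovered, source properties (mass, rotation, bounce depth) are constrained by comparing it against a catalogue of model predictions, as the abstract outlines.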
Fitting the integrated Spectral Energy Distributions of Galaxies
Fitting the spectral energy distributions (SEDs) of galaxies is an almost
universally used technique that has matured significantly in the last decade.
Model predictions and fitting procedures have improved significantly over this
time, attempting to keep up with the vastly increased volume and quality of
available data. We review here the field of SED fitting, describing the
modelling of ultraviolet to infrared galaxy SEDs, the creation of
multiwavelength data sets, and the methods used to fit model SEDs to observed
galaxy data sets. We touch upon the achievements and challenges in the major
ingredients of SED fitting, with a special emphasis on describing the interplay
between the quality of the available data, the quality of the available models,
and the best fitting technique to use in order to obtain a realistic
measurement as well as realistic uncertainties. We conclude that SED fitting
can be used effectively to derive a range of physical properties of galaxies,
such as redshift, stellar masses, star formation rates, dust masses, and
metallicities, with care taken not to over-interpret the available data. Yet
there still exist many issues such as estimating the age of the oldest stars in
a galaxy, finer details of dust properties and dust-star geometry, and the
influences of poorly understood, luminous stellar types and phases. The
challenge for the coming years will be to improve both the models and the
observational data sets to resolve these uncertainties. The present review will
be made available on an interactive, moderated web page (sedfitting.org), where
the community can access and change the text. The intention is to expand the
text and keep it up to date over the coming years.
Comment: 54 pages, 26 figures, Accepted for publication in Astrophysics &
Space Science
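A common core of the SED-fitting procedures reviewed above is chi-square matching of observed photometry against a grid of model SEDs, with a free amplitude (e.g., stellar mass scaling) per model. The sketch below shows that step on an invented two-template, four-band example; the templates, fluxes, and errors are illustrative only:

```python
import numpy as np

# Hypothetical model grid: two template SEDs sampled in four photometric bands.
models = {
    "young": np.array([5.0, 3.0, 1.0, 0.5]),   # blue, UV-bright template
    "old":   np.array([0.5, 1.0, 3.0, 5.0]),   # red, IR-bright template
}

obs = np.array([0.6, 1.1, 2.9, 4.8])   # observed fluxes in the four bands
err = np.array([0.1, 0.1, 0.2, 0.3])   # photometric uncertainties

def chi2(model, obs, err):
    # The best-fit amplitude (e.g. a stellar-mass scaling) has a closed form
    # for linear scaling of a fixed template.
    a = np.sum(model * obs / err**2) / np.sum(model**2 / err**2)
    return np.sum(((obs - a * model) / err) ** 2)

best = min(models, key=lambda k: chi2(models[k], obs, err))
print(best)  # → old
```

The review's caution about realistic uncertainties corresponds to looking at the full chi-square surface (or a Bayesian posterior) rather than only the best-fitting template.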
What governs star formation in galaxies? A modern statistical approach
Understanding the process of star formation is one of the key steps in understanding the formation and evolution of galaxies. In this thesis, I address the empirical star formation laws, and study the properties of galaxies that can affect the star formation rate.
The Andromeda galaxy (M31) is the nearest large spiral galaxy, and therefore high-resolution images of this galaxy are available. These images provide data from various regions with different physical properties. Star formation rate and gas mass surface densities of M31 have been measured using three different methods, and have been used to compare different star formation laws over the whole galaxy and in spatially resolved regions. Using hierarchical Bayesian regression analysis, I conclude that there is a correlation between the surface density of star formation and the stellar mass surface density. A weak correlation between star formation rate, stellar mass, and metallicity is also found.
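Empirical star formation laws of the kind fitted above are power laws, Sigma_SFR ∝ Sigma_*^n, which become linear in log-log space. The sketch below recovers the slope with ordinary least squares on synthetic data; the thesis uses hierarchical Bayesian regression, and the slope, normalization, and scatter here are invented toy values, not the measured M31 relation:

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic spatially-resolved "regions": a power-law star formation relation
# with log-normal scatter, linear in log-log space.
log_sigma_star = rng.uniform(0.0, 3.0, 100)       # log stellar mass density
true_slope, true_norm = 1.0, -4.0                 # assumed toy values
log_sigma_sfr = true_norm + true_slope * log_sigma_star \
    + rng.normal(0.0, 0.1, 100)                   # intrinsic scatter

# Ordinary least squares in log-log space recovers the power-law index.
slope, norm = np.polyfit(log_sigma_star, log_sigma_sfr, 1)
print(round(slope, 2))
```

A hierarchical Bayesian version additionally models measurement uncertainties and region-to-region variation in the relation, which is why it is preferred for the spatially resolved analysis described in the text.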
To study the effect of other properties of a galaxy on the star formation rate, I utilize an unsupervised data mining method (specifically, the self-organizing map) on measurements of both nearby and high-redshift galaxies. Both observed data and derived quantities (e.g., star formation rate, stellar mass) of star-forming regions in M31 and the nearby spiral galaxy M101 are used as inputs to the self-organizing map. Clustering the M31 regions in the feature space reveals some (anti-)correlations between the properties of the galaxy that are not apparent when considering data from all regions of the galaxy together. The self-organizing map can be used to predict star formation rates for spatially resolved regions in galaxies using other properties of those regions.
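The self-organizing map used above maps high-dimensional region measurements onto a low-dimensional grid of nodes so that similar regions land on nearby nodes. A minimal numpy sketch of the training loop follows; the two features, grid size, and schedules are illustrative choices, not those of the thesis:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy feature vectors standing in for per-region measurements
# (e.g. gas density, stellar mass density; names are illustrative).
data = rng.random((300, 2))
grid = rng.random((4, 4, 2))            # 4x4 map, one weight vector per node
ii, jj = np.mgrid[0:4, 0:4]             # node coordinates on the map

n_steps = 1000
for step in range(n_steps):
    x = data[rng.integers(len(data))]
    # Best-matching unit: the node whose weights are closest to the sample.
    d = np.linalg.norm(grid - x, axis=2)
    bi, bj = np.unravel_index(np.argmin(d), d.shape)
    # Move the BMU and its neighbours toward the sample, with decaying
    # learning rate and neighbourhood radius.
    frac = 1.0 - step / n_steps
    lr = 0.5 * frac
    radius = 1.0 + 3.0 * frac
    dist2 = (ii - bi) ** 2 + (jj - bj) ** 2
    h = np.exp(-dist2 / (2.0 * radius**2))
    grid += lr * h[:, :, None] * (x - grid)

# Each update is a convex move toward a data point, so trained node weights
# remain inside the data range.
print(bool(0.0 <= grid.min() <= grid.max() <= 1.0))  # → True
```

After training, predicting star formation rates for a new region amounts to finding its best-matching node using the other features and reading off the SFR values of the training regions mapped to that node.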
I also apply the self-organizing map method to spectral energy distributions of high-redshift galaxies. Template spectra made from galaxies with known morphological type are used to train self-organizing maps. The trained maps are used to classify a sample of galaxy spectral energy distributions derived from fitting models to photometry data of 142 high-redshift galaxies. The grouped properties of the classified galaxies are found to be more tightly correlated in mean values of age, specific star formation rate, stellar mass, and far-UV extinction than in previous studies.
Three decades of the Shuffled Complex Evolution (SCE-UA) optimization algorithm: Review and applications
Adaptive Feature Engineering Modeling for Ultrasound Image Classification for Decision Support
Ultrasonography is considered a relatively safe option for the diagnosis of benign and malignant cancer lesions due to the low-energy sound waves used. However, visual interpretation of the ultrasound images is time-consuming and usually produces high false-alert rates due to speckle noise. Improved methods of collecting image-based data have been proposed to reduce noise in the images; however, this has not solved the problem, owing to the complex nature of the images and the exponential growth of biomedical datasets. Secondly, the target class in real-world biomedical datasets, that is, the focus of interest of a biopsy, is usually significantly underrepresented compared to the non-target class. This makes it difficult to train standard classification models such as Support Vector Machines (SVMs), Decision Trees, and Nearest Neighbor techniques on biomedical datasets, because these models assume an equal class distribution or an equal misclassification cost. Resampling techniques that either oversample the minority class or undersample the majority class have been proposed to mitigate the class imbalance problem, but with minimal success. We propose to resolve the class imbalance problem with the design of a novel data-adaptive feature engineering model for extracting, selecting, and transforming textural features into a feature space that is inherently relevant to the application domain.
We hypothesize that maximizing the variance and preserving as much variability as possible in well-engineered features, prior to applying a classifier model, will boost the differentiation of thyroid nodules (benign or malignant) through effective model building. We propose a hybrid approach that applies Regression and Rule-Based techniques to build the Feature Engineering model and a Bayesian Classifier, respectively.
In the Feature Engineering model, we transformed image pixel intensity values into a high-dimensional structured dataset and fitted a regression analysis model to estimate the relevant kernel parameters to be applied to the proposed filter method. We adopted an Elastic Net regularization path to control the maximum log-likelihood estimation of the Regression model. Finally, we applied Bayesian network inference to estimate a subset of the textural features with a significant conditional dependency in the classification of the thyroid lesion. This is performed to establish the conditional influence of the textural features on the random factors generated through our feature engineering model and to evaluate the success criterion of our approach.
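The Elastic Net regularization path mentioned above combines L1 and L2 penalties; for a single standardized predictor the solution has a well-known closed form, a soft-thresholded and shrunken least-squares coefficient. The sketch below traces that closed form along an increasing penalty; the alpha and l1_ratio values are illustrative, not the paper's settings:

```python
import numpy as np

def elastic_net_1d(beta_ols, alpha, l1_ratio):
    """Closed-form elastic net solution for one standardized predictor:
    soft-threshold the OLS coefficient by the L1 penalty, then shrink
    by the L2 penalty."""
    l1 = alpha * l1_ratio
    l2 = alpha * (1.0 - l1_ratio)
    soft = np.sign(beta_ols) * max(abs(beta_ols) - l1, 0.0)
    return soft / (1.0 + l2)

# Moving along the regularization path shrinks the coefficient toward zero;
# a large enough alpha sets it exactly to zero (feature selection).
path = [elastic_net_1d(2.0, a, 0.5) for a in (0.0, 0.5, 1.0, 4.0)]
print(path)  # → [2.0, 1.4, 1.0, 0.0]
```

This is the mechanism by which an elastic net path both regularizes the regression fit and discards uninformative features, which is how it is used to control the estimation in the model described above.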
The proposed approach was tested and evaluated on a public dataset of thyroid cancer ultrasound diagnostic data. The analyses of the results showed that classification performance improved significantly overall, in both accuracy and area under the curve, when the proposed feature engineering model was applied to the data. We show that a high performance of 96.00% accuracy, with a sensitivity of 99.64% and a specificity of 90.23%, was achieved for a filter size of 13 × 13.
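The class-imbalance problem the abstract describes is often first attacked by resampling, which the authors note has had limited success before proposing their feature engineering alternative. For reference, the sketch below shows the baseline technique, random oversampling of the minority class; the class ratio and feature values are invented:

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy imbalanced dataset: 10% "malignant" (target) vs 90% "benign" samples,
# with illustrative feature vectors.
X = rng.normal(0.0, 1.0, (100, 5))
y = np.array([1] * 10 + [0] * 90)

minority = np.where(y == 1)[0]
majority = np.where(y == 0)[0]

# Resample the minority class with replacement up to the majority count,
# producing a balanced training set for a standard classifier.
extra = rng.choice(minority, size=len(majority) - len(minority), replace=True)
keep = np.concatenate([majority, minority, extra])

X_bal, y_bal = X[keep], y[keep]
print(y_bal.mean())  # → 0.5 (balanced classes)
```

Because oversampling only duplicates existing minority samples, it cannot add new information, which is one reason the text reports minimal success for resampling and instead pursues a feature space that separates the classes better.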