441 research outputs found

    Principal Component Analysis and Radiative Transfer modelling of Spitzer IRS Spectra of Ultra Luminous Infrared Galaxies

    The mid-infrared spectra of ultraluminous infrared galaxies (ULIRGs) contain a variety of spectral features that can be used as diagnostics to characterise the spectra. However, such diagnostics are biased by our prior prejudices on the origin of the features. Moreover, by using only part of the spectrum they do not utilise the full information content of the spectra. Blind statistical techniques such as principal component analysis (PCA) consider the whole spectrum, find correlated features and separate them out into distinct components. We further investigate the principal components (PCs) of ULIRGs derived in Wang et al. (2011). We quantitatively show that five PCs are optimal for describing the IRS spectra. These five components (PC1-PC5) and the mean spectrum provide a template basis set that reproduces spectra of all z<0.35 ULIRGs within the noise. For comparison, the spectra are also modelled with a combination of radiative transfer models of both starbursts and the dusty torus surrounding active galactic nuclei. The five PCs typically provide better fits than the models. We argue that the radiative transfer models require a colder dust component and have difficulty in modelling strong PAH features. Aided by the models we also interpret the physical processes that the principal components represent. The third principal component is shown to indicate the nature of the dominant power source, while PC1 is related to the inclination of the AGN torus. Finally, we use the five PCs to define a new classification scheme using 5D Gaussian mixture modelling trained on widely used optical classifications. The five PCs, average spectra for the four classifications and the code to classify objects are made available at: http://www.phys.susx.ac.uk/~pdh21/PCA/ Comment: 11 pages, 12 figures, accepted for publication in MNRA

    Creating longitudinal datasets and cleaning existing data identifiers in a cystic fibrosis registry using a novel Bayesian probabilistic approach from astronomy

    Patient registry data are commonly collected as annual snapshots that need to be amalgamated to understand the longitudinal progress of each patient. However, patient identifiers can either change or may not be available for legal reasons when longitudinal data are collated from patients living in different countries. Here, we apply astronomical statistical matching techniques to link individual patient records that can be used where identifiers are absent or to validate uncertain identifiers. We adopt a Bayesian model framework used for probabilistically linking records in astronomy. We adapt this and validate it across blinded, annually collected data. This is a high-quality (Danish) sub-set of data held in the European Cystic Fibrosis Society Patient Registry (ECFSPR). Our initial experiments achieved a precision of 0.990 at a recall value of 0.987. However, detailed investigation of the discrepancies uncovered typing errors in 27 of the identifiers in the original Danish sub-set. After fixing these errors to create a new gold standard our algorithm correctly linked individual records across years achieving a precision of 0.997 at a recall value of 0.987 without recourse to identifiers. Our Bayesian framework provides the probability of whether a pair of records belong to the same patient. Unlike other record linkage approaches, our algorithm can also use physical models, such as body mass index curves, as prior information for record linkage. We have shown our framework can create longitudinal samples where none existed and validate pre-existing patient identifiers. We have demonstrated that in this specific case this automated approach is better than the existing identifiers
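
    The pairwise probability described in this abstract can be illustrated with Bayes' theorem. The sketch below is not the authors' model: it assumes a single hypothetical measurement (for example, adult height) whose difference between two records is Gaussian, narrow if the records belong to the same patient and broad otherwise. The function names, the widths and the prior are all illustrative assumptions.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Density of a normal distribution with the given mean and width."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def match_probability(delta, sigma_match, sigma_nonmatch, prior_match):
    """Posterior probability that two records belong to the same patient.

    delta          -- difference in the measurement between the two records
    sigma_match    -- assumed spread of delta for the same patient
    sigma_nonmatch -- assumed spread of delta for two unrelated patients
    prior_match    -- assumed prior probability that a random pair matches
    """
    like_match = gaussian_pdf(delta, 0.0, sigma_match)
    like_nonmatch = gaussian_pdf(delta, 0.0, sigma_nonmatch)
    numerator = prior_match * like_match
    return numerator / (numerator + (1.0 - prior_match) * like_nonmatch)


# Hypothetical values: a small difference is far more consistent with a match.
p_close = match_probability(0.5, sigma_match=1.0, sigma_nonmatch=10.0, prior_match=0.01)
p_far = match_probability(8.0, sigma_match=1.0, sigma_nonmatch=10.0, prior_match=0.01)
print(p_close, p_far)
```

    A prior informed by a physical model, such as a body mass index curve, would replace the fixed Gaussian for the matching case in this sketch.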

    Intersensory integration and reading : a theory / IREC Papers Vol. 1, No. 2

    Includes bibliographic references (p. 34-37)

    De-blending Deep Herschel Surveys: A Multi-wavelength Approach

    Cosmological surveys in the far infrared are known to suffer from confusion. The Bayesian de-blending tool, XID+, currently provides one of the best ways to de-confuse deep Herschel SPIRE images, using a flat flux density prior. This work demonstrates that existing multi-wavelength data sets can be exploited to improve XID+ by providing an informed prior, resulting in more accurate and precise extracted flux densities. Photometric data for galaxies in the COSMOS field were used to constrain spectral energy distributions (SEDs) using the fitting tool CIGALE. These SEDs were used to create Gaussian prior estimates in the SPIRE bands for XID+. The multi-wavelength photometry and the extracted SPIRE flux densities were run through CIGALE again to allow us to compare the performance of the two priors. Inferred ALMA flux densities (F_i), at 870 μm and 1250 μm, from the best fitting SEDs from the second CIGALE run were compared with measured ALMA flux densities (F_m) as an independent performance validation. Similar validations were conducted with the SED modelling and fitting tool MAGPHYS and modified black body functions to test for model dependency. We demonstrate a clear improvement in agreement between the flux densities extracted with XID+ and existing data at other wavelengths when using the new informed Gaussian prior over the original uninformed prior. The residuals between F_m and F_i were calculated. For the Gaussian prior, these residuals, expressed as a multiple of the ALMA error (σ), have a smaller standard deviation (7.95σ compared to 12.21σ for the flat prior), a reduced mean (1.83σ compared to 3.44σ), and a reduced skew to positive values (7.97 compared to 11.50). These results were determined to not be significantly model dependent. This results in statistically more reliable SPIRE flux densities. Comment: 8 pages, 7 figures, 3 tables. Accepted for publication in A&
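
    The residual statistics quoted in this abstract (mean, standard deviation and skew of the residuals in units of the ALMA error) can be computed along the following lines. This is a minimal sketch: the sign convention (measured minus inferred), the population form of the skewness and the toy numbers are assumptions, not values or definitions taken from the paper.

```python
import statistics


def residuals_in_sigma(measured, inferred, errors):
    """Residual of each source, (measured - inferred) / error (assumed sign)."""
    return [(m - i) / e for m, i, e in zip(measured, inferred, errors)]


def skewness(values):
    """Population skewness: mean of the cubed standardised deviations."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return sum(((v - mean) / spread) ** 3 for v in values) / len(values)


# Toy flux densities in arbitrary units; these are NOT data from the paper.
measured = [2.0, 3.5, 5.0, 9.0]
inferred = [1.5, 3.0, 5.5, 4.0]
errors = [0.5, 0.5, 0.5, 1.0]

residuals = residuals_in_sigma(measured, inferred, errors)
summary = {
    "mean": statistics.fmean(residuals),
    "std": statistics.pstdev(residuals),
    "skew": skewness(residuals),
}
print(residuals, summary)
```

    A positive skew in this convention means a tail of sources whose measured flux density exceeds the value inferred from the SED fit.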

    Extreme star formation events in quasar hosts over 0.5 < z < 4

    We explore the relationship between active galactic nuclei and star formation in a sample of 513 optically luminous type 1 quasars up to redshifts of ∼4 hosting extremely high star formation rates (SFRs). The quasars are selected to be individually detected by the Herschel SPIRE instrument at > 3σ at 250 μm, leading to typical SFRs of order of 1000 M⊙ yr⁻¹. We find the average SFRs to increase by almost a factor 10 from z ∼ 0.5 to z ∼ 3, mirroring the rise in the comoving SFR density over the same epoch. However, we find that the SFRs remain approximately constant with increasing accretion luminosity for accretion luminosities above 10^12 L⊙. We also find that the SFRs do not correlate with black hole mass. Both of these results are most plausibly explained by the existence of a self-regulation process by the starburst at high SFRs, which controls SFRs on time-scales comparable to or shorter than the AGN or starburst duty cycles. We additionally find that SFRs do not depend on Eddington ratio at any redshift, consistent with no relation between SFR and black hole growth rate per unit black hole mass. Finally, we find that high-ionisation broad absorption line (HiBAL) quasars have indistinguishable far-infrared properties to those of classical quasars, consistent with HiBAL quasars being normal quasars observed along a particular line of sight, with the outflows in HiBAL quasars not having any measurable effect on the star formation in their hosts. Comment: 12 pages, 6 figure

    Identifying undetected dementia in UK primary care patients: a retrospective case-control study comparing machine-learning and standard epidemiological approaches

    Background Identifying dementia early in time, using real world data, is a public health challenge. As only two-thirds of people with dementia now ultimately receive a formal diagnosis in United Kingdom health systems and many receive it late in the disease process, there is ample room for improvement. The policy of the UK government and National Health Service (NHS) is to increase rates of timely dementia diagnosis. We used data from general practice (GP) patient records to create a machine-learning model to identify patients who have or who are developing dementia, but are currently undetected as having the condition by the GP. Methods We used electronic patient records from Clinical Practice Research Datalink (CPRD). Using a case-control design, we selected patients aged >65y with a diagnosis of dementia (cases) and matched them 1:1 by sex and age to patients with no evidence of dementia (controls). We developed a list of 70 clinical entities related to the onset of dementia and recorded in the 5 years before diagnosis. After creating binary features, we trialled machine learning classifiers to discriminate between cases and controls (logistic regression, naïve Bayes, support vector machines, random forest and neural networks). We examined the most important features contributing to discrimination. Results The final analysis included data on 93,120 patients, with a median age of 82.6 years; 64.8% were female. The naïve Bayes model performed least well. The logistic regression, support vector machine, neural network and random forest performed very similarly with an AUROC of 0.74. The top features retained in the logistic regression model were disorientation and wandering, behaviour change, schizophrenia, self-neglect, and difficulty managing. Conclusions Our model could aid GPs or health service planners with the early detection of dementia. 
    Future work could improve the model by exploring the longitudinal nature of patient data and modelling decline in function over time.
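
    The AUROC quoted in the Results is the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control, with ties counted as one half. A minimal sketch of that calculation follows; the labels and scores are invented for illustration and have no connection to the CPRD data or the authors' classifiers.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    labels -- 1 for a case, 0 for a control
    scores -- classifier output for each patient (higher means more case-like)
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(cases) * len(controls))


# Hypothetical scores for three cases and three controls.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
toy_auc = auroc(toy_labels, toy_scores)
print(toy_auc)
```

    The pairwise loop is quadratic in the number of patients, so a rank-based implementation would be preferable for a sample as large as the one described here.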

    Learning the fundamental mid-infrared spectral components of galaxies with non-negative matrix factorization

    The mid-infrared (MIR) spectra observed with the Spitzer Infrared Spectrograph (IRS) provide a valuable data set for untangling the physical processes and conditions within galaxies. This paper presents the first attempt to blindly learn fundamental spectral components of MIR galaxy spectra, using non-negative matrix factorization (NMF). NMF is a recently developed multivariate technique shown to be successful in blind source separation problems. Unlike the more popular multivariate analysis technique, principal component analysis, NMF imposes the condition that weights and spectral components are non-negative. This more closely resembles the physical process of emission in the MIR, resulting in physically intuitive components. By applying NMF to galaxy spectra in the Cornell Atlas of Spitzer/IRS sources, we find similar components amongst different NMF sets. These similar components include two for active galactic nucleus (AGN) emission and one for star formation. The first AGN component is dominated by fine structure emission lines and hot dust, the second by broad silicate emission at 10 and 18 μm. The star formation component contains all the polycyclic aromatic hydrocarbon features and molecular hydrogen lines. Other components include rising continuums at longer wavelengths, indicative of colder grey-body dust emission. We show an NMF set with seven components can reconstruct the general spectral shape of a wide variety of objects, though it struggles to fit the varying strength of emission lines. We also show that the seven components can be used to separate out different types of objects. We model this separation with Gaussian mixtures modelling and use the result to provide a classification tool. We also show that the NMF components can be used to separate out the emission from AGN and star formation regions and define a new star formation/AGN diagnostic which is consistent with all MIR diagnostics already in use but has the advantage that it can be applied to MIR spectra with low signal-to-noise ratio or with limited spectral range. The seven NMF components and code for classification are available at https://github.com/pdh21/NMF_software/

    Main sequence of star forming galaxies beyond the Herschel confusion limit

    Context. Deep far-infrared (FIR) cosmological surveys are known to be affected by confusion, causing issues when examining the main sequence of star forming galaxies (MS). In the past this has typically been partially tackled by the use of stacking. However, stacking only provides the average properties of the objects in the stack. Aims. This work aims to trace the MS over 0.2 ≤ z < 6.0 using the latest de-blended Herschel photometry, which reaches ≈ 10 times deeper than the 5σ confusion limit in SPIRE. This provides more reliable star formation rates (SFRs), especially for the fainter galaxies, and hence a more reliable MS. Methods. We built a pipeline that uses the spectral energy distribution (SED) modelling and fitting tool CIGALE to generate flux density priors in the Herschel SPIRE bands. These priors were then fed into the de-blending tool XID+ to extract flux densities from the SPIRE maps. In the final step, multi-wavelength data were combined with the extracted SPIRE flux densities to constrain SEDs and provide stellar mass (M⋆) and SFRs. These M⋆ and SFRs were then used to populate the SFR-M⋆ plane over 0.2 ≤ z < 6.0. Results. No significant evidence of a high-mass turn-over was found, resulting in the best fit being a simple two-parameter power law of the form log(SFR) = α[log(M⋆) − 10.5] + β. The normalisation of the power law increased with redshift, rapidly at z ≲ 1.8, from 0.58 ± 0.09 at z ≈ 0.37 to 1.31 ± 0.08 at z ≈ 1.8. The slope was also found to increase with redshift, perhaps with an excess around 1.8 ≤ z < 2.9. Conclusions. The increasing slope indicates that galaxies become more self-similar as redshift increases. This implies that high-mass galaxies’ specific SFR increases with redshift, from 0.2 to 6.0, becoming closer to that of low-mass galaxies. The excess in the slope at 1.8 ≤ z < 2.9, if present, coincides with the peak of the cosmic star formation history.
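
    The two-parameter power law in the Results is a straight line in log space, with slope α and normalisation β defined at a log stellar mass of 10.5. The sketch below fits such a line by ordinary least squares. The abstract does not state the fitting method, so least squares is an assumption, and the toy galaxies are synthetic points placed exactly on a line with made-up parameters.

```python
def fit_main_sequence(log_mass, log_sfr, pivot=10.5):
    """Least-squares fit of log(SFR) = alpha * (log(M) - pivot) + beta.

    log_mass -- log10 of stellar mass for each galaxy
    log_sfr  -- log10 of star formation rate for each galaxy
    pivot    -- log mass at which the normalisation beta is defined
    """
    x = [m - pivot for m in log_mass]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(log_sfr) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, log_sfr))
    alpha = sxy / sxx
    beta = y_mean - alpha * x_mean
    return alpha, beta


# Synthetic galaxies on a line with made-up slope 0.8 and normalisation 1.0.
toy_log_mass = [9.5, 10.0, 10.5, 11.0, 11.5]
toy_log_sfr = [0.8 * (m - 10.5) + 1.0 for m in toy_log_mass]
alpha_fit, beta_fit = fit_main_sequence(toy_log_mass, toy_log_sfr)
print(alpha_fit, beta_fit)
```

    Centring the mass on the pivot makes β the log SFR of a galaxy at that mass, which is why the normalisation can be compared directly between redshift bins.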