46 research outputs found

    Leveraging knowledge graphs to update scientific word embeddings using latent semantic imputation

    The most interesting words in scientific texts will often be novel or rare. This presents a challenge for scientific word embedding models to determine quality embedding vectors for useful terms that are infrequent or newly emerging. We demonstrate how latent semantic imputation (LSI) can address this problem by imputing embeddings for domain-specific words from up-to-date knowledge graphs while otherwise preserving the original word embedding model. We use the MeSH knowledge graph to impute embedding vectors for biomedical terminology without retraining and evaluate the resulting embedding model on a domain-specific word-pair similarity task. We show that LSI can produce reliable embedding vectors for rare and out-of-vocabulary (OOV) terms in the biomedical domain. Comment: Accepted for the Workshop on Information Extraction from Scientific Publications at AACL-IJCNLP 202

    The GALEX Arecibo SDSS Survey. VIII. Final Data Release -- The Effect of Group Environment on the Gas Content of Massive Galaxies

    We present the final data release from the GALEX Arecibo SDSS Survey (GASS), a large Arecibo program that measured the HI properties for an unbiased sample of ~800 galaxies with stellar masses greater than 10^10 Msun and redshifts 0.025<z<0.05. This release includes new Arecibo observations for 250 galaxies. We use the full GASS sample to investigate environmental effects on the cold gas content of massive galaxies at fixed stellar mass. The environment is characterized in terms of dark matter halo mass, obtained by cross-matching our sample with the SDSS group catalog of Yang et al. Our analysis provides, for the first time, clear statistical evidence that massive galaxies located in halos with masses of 10^13-10^14 Msun have at least 0.4 dex less HI than objects in lower density environments. The process responsible for the suppression of gas in group galaxies most likely drives the observed quenching of the star formation in these systems. Our findings strongly support the importance of the group environment for galaxy evolution, and have profound implications for semi-analytic models of galaxy formation, which currently do not allow for stripping of the cold interstellar medium in galaxy groups. Comment: 36 pages, 16 figures. Accepted for publication in MNRAS. Version with supplementary material available at http://www.mpa-garching.mpg.de/GASS/pubs.php . GASS released data can be found at http://www.mpa-garching.mpg.de/GASS/data.ph

    The Aromatic Features in Very Faint Dwarf Galaxies

    We present optical and mid-infrared photometry of a statistically complete sample of 29 very faint dwarf galaxies (M_r > -15 mag) selected from the SDSS spectroscopic sample and observed in the mid-infrared with Spitzer IRAC. This sample contains nearby (redshift z<0.005) galaxies three magnitudes fainter than previously studied samples. We compare our sample with other star-forming galaxies that have been observed with both IRAC and SDSS. We examine the relationship of the infrared color, sensitive to PAH abundance, with star-formation rates, gas-phase metallicities and radiation hardness, all estimated from optical emission lines. Consistent with studies of more luminous dwarfs, we find that the very faint dwarf galaxies show much weaker PAH emission than more luminous galaxies with similar specific star-formation rates. Unlike more luminous galaxies, we find that the very faint dwarf galaxies show no significant dependence at all of PAH emission on star-formation rate, metallicity, or radiation hardness, despite the fact that the sample spans a significant range in all of these quantities. When the very faint dwarfs in our sample are compared with more luminous (M_r ~ -18 mag) dwarfs, we find that PAH emission depends on metallicity and radiation hardness. These two parameters are correlated; we look at the PAH-metallicity relation at fixed radiation hardness and the PAH-hardness relation at fixed metallicity. This test shows that the PAH emission in dwarf galaxies depends most directly on metallicity. Comment: submitted to Ap

    Benchmarking and analyzing in-context learning, fine-tuning and supervised learning for biomedical knowledge curation: a focused study on chemical entities of biological interest

    Automated knowledge curation for biomedical ontologies is key to ensure that they remain comprehensive, high-quality and up-to-date. In the era of foundational language models, this study compares and analyzes three NLP paradigms for curation tasks: in-context learning (ICL), fine-tuning (FT), and supervised learning (ML). Using the Chemical Entities of Biological Interest (ChEBI) database as a model ontology, three curation tasks were devised. For ICL, three prompting strategies were employed with GPT-4, GPT-3.5, BioGPT. PubmedBERT was chosen for the FT paradigm. For ML, six embedding models were utilized for training Random Forest and Long-Short Term Memory models. Five setups were designed to assess ML and FT model performance across different data availability scenarios. Datasets for curation tasks included: task 1 (620,386), task 2 (611,430), and task 3 (617,381), maintaining a 50:50 positive versus negative ratio. For ICL models, GPT-4 achieved best accuracy scores of 0.916, 0.766 and 0.874 for tasks 1-3 respectively. In a direct comparison, ML (trained on ~260,000 triples) outperformed ICL in accuracy across all tasks (accuracy differences: +.11, +.22 and +.17). Fine-tuned PubmedBERT performed similarly to leading ML models in tasks 1 & 2 (F1 differences: -.014 and +.002), but worse in task 3 (-.048). Simulations revealed performance declines in both ML and FT models with smaller and more highly imbalanced training data, where ICL (particularly GPT-4) excelled in tasks 1 & 3. GPT-4 excelled in tasks 1 and 3 with fewer than 6,000 triples, surpassing ML/FT. ICL underperformed ML/FT in task 2. ICL-augmented foundation models can be good assistants for knowledge curation with correct prompting, but they do not make the ML and FT paradigms obsolete. The latter two require task-specific data to beat ICL. In such cases, ML relies on small pretrained embeddings, minimizing computational demands.

    Herschel SPIRE-FTS Observations of Excited CO and [CI] in the Antennae (NGC 4038/39): Warm and Cold Molecular Gas

    We present Herschel SPIRE-FTS observations of the Antennae (NGC 4038/39), a well studied, nearby ($22$ Mpc) ongoing merger between two gas rich spiral galaxies. We detect 5 CO transitions ($J=4-3$ to $J=8-7$), both [CI] transitions and the [NII]$205\mu m$ transition across the entire system, which we supplement with ground based observations of the CO $J=1-0$, $J=2-1$ and $J=3-2$ transitions, and Herschel PACS observations of [CII] and [OI]$63\mu m$. Using the CO and [CI] transitions, we perform both a LTE analysis of [CI], and a non-LTE radiative transfer analysis of CO and [CI] using the radiative transfer code RADEX along with a Bayesian likelihood analysis. We find that there are two components to the molecular gas: a cold ($T_{kin}\sim 10-30$ K) and a warm ($T_{kin} \gtrsim 100$ K) component. By comparing the warm gas mass to previously observed values, we determine a CO abundance in the warm gas of $x_{CO} \sim 5\times 10^{-5}$. If the CO abundance is the same in the warm and cold gas phases, this abundance corresponds to a CO $J=1-0$ luminosity-to-mass conversion factor of $\alpha_{CO} \sim 7 \ M_{\odot}{pc^{-2} \ (K \ km \ s^{-1})^{-1}}$ in the cold component, similar to the value for normal spiral galaxies. We estimate the cooling from H$_2$, [CII], CO and [OI]$63\mu m$ to be $\sim 0.01 L_{\odot}/M_{\odot}$. We compare PDR models to the ratio of the flux of various CO transitions, along with the ratio of the CO flux to the far-infrared flux in NGC 4038, NGC 4039 and the overlap region. We find that the densities recovered from our non-LTE analysis are consistent with a background far-ultraviolet field of strength $G_0\sim 1000$. Finally, we find that a combination of turbulent heating, due to the ongoing merger, and supernova and stellar winds are sufficient to heat the molecular gas. Comment: 50 pages, 15 figures, 8 tables, Accepted for publication in The Astrophysical Journa

    Observing Extended Sources with the Herschel SPIRE Fourier Transform Spectrometer

    The Spectral and Photometric Imaging Receiver (SPIRE) on the European Space Agency's Herschel Space Observatory utilizes a pioneering design for its imaging spectrometer in the form of a Fourier Transform Spectrometer (FTS). The standard FTS data reduction and calibration schemes are aimed at objects with either a spatial extent much larger than the beam size or a source that can be approximated as a point source within the beam. However, when sources are of intermediate spatial extent, neither of these calibration schemes is appropriate: both the spatial response of the instrument and the source's light profile must be taken into account, and the coupling between them explicitly derived. To that end, we derive the necessary corrections using an observed spectrum of a fully extended source with the beam profile and the source's light profile taken into account. We apply the derived correction to several observations of planets and compare the corrected spectra with their spectral models to study the beam coupling efficiency of the instrument in the case of partially extended sources. We find that we can apply these correction factors for sources with angular sizes up to \theta_{D} ~ 17". We demonstrate how the angular size of an extended source can be estimated using the difference between the sub-spectra observed at the overlap bandwidth of the two frequency channels in the spectrometer, at 959<\nu<989 GHz. Using this technique on an observation of Saturn, we estimate a size of 17.2", which is 3% larger than its true size on the day of observation. Finally, we show the results of the correction applied on observations of a nearby galaxy, M82, and the compact core of a Galactic molecular cloud, Sgr B2. Comment: Accepted for publication by A&

    Radiative and mechanical feedback into the molecular gas in the Large Magellanic Cloud. I. N159W

    We present Herschel SPIRE Fourier Transform Spectrometer (FTS) observations of N159W, an active star-forming region in the Large Magellanic Cloud (LMC). In our observations, a number of far-infrared cooling lines including CO(4-3) to CO(12-11), [CI] 609 and 370 micron, and [NII] 205 micron are clearly detected. With an aim of investigating the physical conditions and excitation processes of molecular gas, we first construct CO spectral line energy distributions (SLEDs) on 10 pc scales by combining the FTS CO transitions with ground-based low-J CO data and analyze the observed CO SLEDs using non-LTE radiative transfer models. We find that the CO-traced molecular gas in N159W is warm (kinetic temperature of 153-754 K) and moderately dense (H2 number density of (1.1-4.5)e3 cm-3). To assess the impact of the energetic processes in the interstellar medium on the physical conditions of the CO-emitting gas, we then compare the observed CO line intensities with the models of photodissociation regions (PDRs) and shocks. We first constrain the properties of PDRs by modelling Herschel observations of [OI] 145, [CII] 158, and [CI] 370 micron fine-structure lines and find that the constrained PDR components produce very weak CO emission. X-rays and cosmic-rays are also found to provide a negligible contribution to the CO emission, essentially ruling out ionizing sources (ultraviolet photons, X-rays, and cosmic-rays) as the dominant heating source for CO in N159W. On the other hand, mechanical heating by low-velocity C-type shocks with ~10 km/s appears sufficient to reproduce the observed warm CO. Comment: accepted for publication in A&

    The GALEX Arecibo SDSS Survey. IV. Baryonic Mass-Velocity-Size Relations of Massive Galaxies

    We present dynamical scaling relations for a homogeneous and representative sample of ~500 massive galaxies, selected only by stellar mass (>10^10 Msun) and redshift (0.025<z<0.05) as part of the ongoing GALEX Arecibo SDSS Survey. We compare baryonic Tully-Fisher (BTF) and Faber-Jackson (BFJ) relations for this sample, and investigate how galaxies scatter around the best fits obtained for pruned subsets of disk-dominated and bulge-dominated systems. The BFJ relation is significantly less scattered than the BTF when the relations are applied to their maximum samples, and is not affected by the inclination problems that plague the BTF. Disk-dominated, gas-rich galaxies systematically deviate from the BFJ relation defined by the spheroids. We demonstrate that by applying a simple correction to the stellar velocity dispersions that depends only on the concentration index of the galaxy, we are able to bring disks and spheroids onto the same dynamical relation -- in other words, we obtain a generalized BFJ relation that holds for all the galaxies in our sample, regardless of morphology, inclination or gas content, and has a scatter smaller than 0.1 dex. We find that disks and spheroids are offset in the stellar dispersion-size relation, and that the offset is removed when corrected dispersions are used instead. The generalized BFJ relation represents a fundamental correlation between the global dark matter and baryonic content of galaxies, which is obeyed by all (massive) systems regardless of morphology. [abridged] Comment: 20 pages, 15 figures. Accepted for publication in MNRAS. GASS publications and released data can be found at http://www.mpa-garching.mpg.de/GASS/index.ph