Generalizability in Document Layout Analysis for Scientific Article Figure & Caption Extraction
The lack of generalizability -- in which a model trained on one dataset
cannot provide accurate results for a different dataset -- is a known problem
in the field of document layout analysis. Thus, when a model is used to locate
important page objects in scientific literature such as figures, tables,
captions, and math formulas, the model often cannot be applied successfully to
new domains. While several solutions have been proposed, including newer and
updated deep learning models, larger hand-annotated datasets, and the
generation of large synthetic datasets, so far there is no "magic bullet" for
translating a model trained on a particular domain or historical time period to
a new field. Here we present our ongoing work in translating our document
layout analysis model from the historical astrophysical literature to the
larger corpus of scientific documents within the HathiTrust U.S. Federal
Documents collection. We use this example as an avenue to highlight some of the
problems with generalizability in the document layout analysis community and
discuss several challenges and possible solutions to address these issues. All
code for this work is available on The Reading Time Machine GitHub repository
(https://github.com/ReadingTimeMachine/htrc_short_conf).
Comment: 9 pages, 3 figures, submitted as part of AEOLIAN Workshop 5: Making
More Sense With Machines: AI/ML Methods for Interrogating and Understanding
Our Textual Heritage in the Humanities, Natural Sciences, and Social Sciences
First results from the IllustrisTNG simulations: the stellar mass content of groups and clusters of galaxies
The IllustrisTNG project is a new suite of cosmological
magneto-hydrodynamical simulations of galaxy formation performed with the Arepo
code and updated models for feedback physics. Here we introduce the first two
simulations of the series, TNG100 and TNG300, and quantify the stellar mass
content of about 4000 massive galaxy groups and clusters () at recent times (). The richest
clusters have half of their total stellar mass bound to satellite galaxies,
with the other half being associated with the central galaxy and the diffuse
intra-cluster light. The exact ICL fraction depends sensitively on the
definition of a central galaxy's mass and varies in our most massive clusters
between 20 and 40% of the total stellar mass. Haloes of and above have more diffuse stellar mass outside 100 kpc than within 100
kpc, with power-law slopes of the radial mass density distribution as shallow
as the dark matter's ( ). Total halo mass is a
very good predictor of stellar mass, and vice versa: at , the 3D stellar
mass measured within 30 kpc scales as with a
dex scatter. This is possibly too steep in comparison to the
available observational constraints, even though the abundance of less
massive TNG galaxies ( in stars) is in good agreement with
the measured galaxy stellar mass functions at recent epochs. The 3D sizes of
massive galaxies also fall on a tight (0.16 dex scatter) power-law
relation with halo mass, with . Even more fundamentally, halo mass alone is a good predictor
for the whole stellar mass profiles beyond the inner few kpc, and we show how
on average these can be precisely recovered given a single mass measurement of
the galaxy or its halo.
Comment: Accepted by MNRAS, updated to match published version. Highlights:
Figures 5, 9, 11. The IllustrisTNG website can be found at
http://www.tng-project.org
Supermassive black holes and their feedback effects in the IllustrisTNG simulation
We study the population of supermassive black holes (SMBHs) and their effects
on massive central galaxies in the IllustrisTNG cosmological hydrodynamical
simulations of galaxy formation. The employed model for SMBH growth and
feedback assumes a two-mode scenario in which the feedback from active galactic
nuclei occurs through a kinetic, comparatively efficient mode at low accretion
rates relative to the Eddington limit, and in the form of a thermal, less
efficient mode at high accretion rates. We show that the quenching of massive
central galaxies happens coincidently with kinetic-mode feedback, consistent
with the notion that active supermassive black holes cause the low specific star
formation rates observed in massive galaxies. However, major galaxy mergers are
not responsible for initiating most of the quenching events in our model. Up to
black hole masses of about , the dominant growth
channel for SMBHs is in the thermal mode. Higher mass black holes stay mainly
in the kinetic mode and gas accretion is self-regulated via their feedback,
which causes their Eddington ratios to drop, with SMBH mergers becoming the
main channel for residual mass growth. As a consequence, the quasar luminosity
function is dominated by rapidly accreting, moderately massive black holes in
the thermal mode. We show that the associated growth history of SMBHs produces
a low-redshift quasar luminosity function and a redshift zero black hole
mass-stellar bulge mass relation in good agreement with observations, whereas
the simulation tends to over-predict the high-redshift quasar luminosity
function.
Comment: 16 pages, 11 figures, submitted to MNRAS, the IllustrisTNG project
website: www.tng-project.org, comments welcome
First results from the IllustrisTNG simulations: radio haloes and magnetic fields
We introduce the IllustrisTNG project, a new suite of cosmological
magnetohydrodynamical simulations performed with the moving-mesh code AREPO
employing an updated Illustris galaxy formation model. Here we focus on the
general properties of magnetic fields and the diffuse radio emission in galaxy
clusters. Magnetic fields are prevalent in galaxies, and their build-up is
closely linked to structure formation. We find that structure formation
amplifies the initial seed fields ( comoving Gauss) to the values
observed in low-redshift galaxies (). The magnetic field
topology is closely connected to galaxy morphology such that irregular fields
are hosted by early-type galaxies, while large-scale, ordered fields are
present in disc galaxies. Using two simple models for the energy distribution
of relativistic electrons we predict the diffuse radio emission of
clusters with a baryonic mass resolution of , and generate mock observations for VLA, LOFAR, ASKAP and SKA. Our
simulated clusters show extended radio emission, whose detectability correlates
with their virial mass. We reproduce the observed scaling relations between
total radio power and X-ray emission, , and the Sunyaev-Zel'dovich
parameter. The radio emission surface brightness profiles of our
most massive clusters are in reasonable agreement with VLA measurements of Coma
and Perseus. Finally, we discuss the fraction of detected extended radio haloes
as a function of virial mass and source count functions for different
instruments. Overall our results agree encouragingly well with observations,
but a refined analysis requires a more sophisticated treatment of relativistic
particles in large-scale galaxy formation simulations.
Comment: 28 pages, 18 figures, 2 tables, 3 appendices. Added a new
relativistic electron energy parametrization and text modifications to match
the accepted version for publication in MNRAS. More information, images and
movies of the IllustrisTNG project can be found at http://www.tng-project.org