62 research outputs found
Progressive Neural Architecture Search
We propose a new method for learning the structure of convolutional neural
networks (CNNs) that is more efficient than recent state-of-the-art methods
based on reinforcement learning and evolutionary algorithms. Our approach uses
a sequential model-based optimization (SMBO) strategy, in which we search for
structures in order of increasing complexity, while simultaneously learning a
surrogate model to guide the search through structure space. Direct comparison
under the same search space shows that our method is up to 5 times more
efficient than the RL method of Zoph et al. (2018) in terms of number of models
evaluated, and 8 times faster in terms of total compute. The structures we
discover in this way achieve state of the art classification accuracies on
CIFAR-10 and ImageNet.
Comment: To appear in ECCV 2018 as oral. The code and checkpoint for PNASNet-5
trained on ImageNet (both Mobile and Large) can now be downloaded from
https://github.com/tensorflow/models/tree/master/research/slim#Pretrained.
Also see https://github.com/chenxi116/PNASNet.TF for refactored and
simplified TensorFlow code; see https://github.com/chenxi116/PNASNet.pytorch
for exact conversion to PyTorch.
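The progressive search described in the abstract above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the operation names in `OPS`, the `toy_accuracy` stand-in for training a model, and the very simple `MeanOpSurrogate` are hypothetical and are not the search space, evaluation procedure, or surrogate used by the authors. The sketch only shows the general pattern: evaluate small structures first, fit a surrogate on what has been evaluated, and use it to decide which larger structures are worth evaluating.

```python
from collections import defaultdict

# Hypothetical operation set; not the search space from the paper.
OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]


def toy_accuracy(cell):
    """Stand-in for training and validating a model (purely illustrative)."""
    weights = {"conv3x3": 0.30, "conv5x5": 0.25, "maxpool": 0.10, "identity": 0.05}
    seen = defaultdict(int)
    score = 0.0
    for op in cell:
        # Repeating an operation gives diminishing returns in this toy function.
        score += weights[op] / (1 + seen[op])
        seen[op] += 1
    return score


class MeanOpSurrogate:
    """Toy surrogate: scores a cell by the mean accuracy seen for its operations."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    def update(self, cell, accuracy):
        for op in cell:
            self.totals[op] += accuracy
            self.counts[op] += 1

    def predict(self, cell):
        scores = [self.totals[op] / self.counts[op] for op in cell if self.counts[op]]
        return sum(scores) / len(scores) if scores else 0.0


def progressive_search(max_blocks=3, beam_width=2):
    """Search cells in order of increasing size, guided by the surrogate."""
    surrogate = MeanOpSurrogate()
    evaluated = {}
    beam = [(op,) for op in OPS]  # start with every one-block cell
    for size in range(1, max_blocks + 1):
        # Evaluate the current beam and update the surrogate with the results.
        for cell in beam:
            evaluated[cell] = toy_accuracy(cell)
            surrogate.update(cell, evaluated[cell])
        if size == max_blocks:
            break
        # Expand each cell by one block, then keep only the candidates
        # the surrogate ranks highest; only those get evaluated next.
        candidates = [cell + (op,) for cell in beam for op in OPS]
        candidates.sort(key=surrogate.predict, reverse=True)
        beam = candidates[:beam_width]
    best = max(evaluated, key=evaluated.get)
    return best, evaluated


if __name__ == "__main__":
    best, evaluated = progressive_search()
    print("best cell:", best)
    print("models evaluated:", len(evaluated))
```

With four operations, three blocks, and a beam width of two, this sketch evaluates 8 cells, whereas enumerating every cell of up to three blocks would require 4 + 16 + 64 = 84 evaluations. The saving comes from the toy setup, not from the paper's reported figures.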
Pore water exchange-driven inorganic carbon export from intertidal salt marshes
© The Author(s), 2021. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Tamborski, J. J., Eagle, M., Kurylyk, B. L., Kroeger, K. D., Wang, Z. A., Henderson, P., & Charette: 1774-1792, https://doi.org/10.1002/lno.11721.
Respiration in intertidal salt marshes generates dissolved inorganic carbon (DIC) that is exported to the coastal ocean by tidal exchange with the marsh platform. Understanding the link between physical drivers of water exchange and chemical flux is a key to constraining coastal wetland contributions to regional carbon budgets. The spatial and temporal (seasonal, annual) variability of marsh pore water exchange and DIC export was assessed from a microtidal salt marsh (Sage Lot Pond, Massachusetts). Spatial variability was constrained from 224Ra : 228Th disequilibria across two hydrologic units within the marsh sediments. Disequilibrium between the more soluble 224Ra and its sediment-bound parent 228Th reveals significant pore water exchange in the upper 5 cm of the marsh surface (0–36 L m⁻² d⁻¹) that is most intense in low marsh elevation zones, driven by tidal overtopping. Surficial sediment DIC transport ranges from 0.0 to 0.7 g C m⁻² d⁻¹. The sub-surface sediment horizon intersected by mean low tide was disproportionately impacted by tidal pumping (20–80 L m⁻² d⁻¹) and supplied a seasonal DIC flux of 1.7–5.4 g C m⁻² d⁻¹. Export exceeded 10 g C m⁻² d⁻¹ for another marsh unit, demonstrating that fluxes can vary substantially across salt marshes under similar conditions within the same estuary. Seasonal and annual variability in marsh pore water exchange, constrained from tidal time-series of radium isotopes, was driven in part by variability in mean sea level. Rising sea levels will further inundate high marsh elevation zones, which may lead to greater DIC export.
This research was undertaken thanks in part to funding from the Canada First Research Excellence Fund, through the Ocean Frontier Institute. Additional funding was provided by the U.S. Geological Survey (USGS) Coastal & Marine Geology Program and the USGS Land Change Science Program's LandCarbon program.
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
Pretraining is the preliminary and fundamental step in developing capable
language models (LM). Despite this, pretraining data design is critically
under-documented and often guided by empirically unsupported intuitions. To
address this, we pretrain 28 1.5B parameter decoder-only models, training on
data curated (1) at different times, (2) with varying toxicity and quality
filters, and (3) with different domain compositions. First, we quantify the
effect of pretraining data age. A temporal shift between evaluation data and
pretraining data leads to performance degradation, which is not overcome by
finetuning. Second, we explore the effect of quality and toxicity filters,
showing a trade-off between performance on standard benchmarks and risk of
toxic generations. Our findings indicate there does not exist a
one-size-fits-all solution to filtering training data. We also find that the
effects of different types of filtering are not predictable from text domain
characteristics. Lastly, we empirically validate that the inclusion of
heterogeneous data sources, like books and web, is broadly beneficial and
warrants greater prioritization. These findings constitute the largest set of
experiments to validate, quantify, and expose many undocumented intuitions
about text pretraining, which we hope will help support more informed
data-centric decisions in LM development.
PaLM: Scaling Language Modeling with Pathways
Large language models have been shown to achieve remarkable performance
across a variety of natural language tasks using few-shot learning, which
drastically reduces the number of task-specific training examples needed to
adapt the model to a particular application. To further our understanding of
the impact of scale on few-shot learning, we trained a 540-billion parameter,
densely activated, Transformer language model, which we call Pathways Language
Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML
system which enables highly efficient training across multiple TPU Pods. We
demonstrate continued benefits of scaling by achieving state-of-the-art
few-shot learning results on hundreds of language understanding and generation
benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough
performance, outperforming the finetuned state-of-the-art on a suite of
multi-step reasoning tasks, and outperforming average human performance on the
recently released BIG-bench benchmark. A significant number of BIG-bench tasks
showed discontinuous improvements from model scale, meaning that performance
steeply increased as we scaled to our largest model. PaLM also has strong
capabilities in multilingual tasks and source code generation, which we
demonstrate on a wide array of benchmarks. We additionally provide a
comprehensive analysis on bias and toxicity, and study the extent of training
data memorization with respect to model scale. Finally, we discuss the ethical
considerations related to large language models and discuss potential
mitigation strategies.
The Athena X-ray Integral Field Unit (X-IFU)
The X-ray Integral Field Unit (X-IFU) is the high resolution X-ray spectrometer of the ESA Athena X-ray observatory. Over a field of view of 5' equivalent diameter, it will deliver X-ray spectra from 0.2 to 12 keV with a spectral resolution of 2.5 eV up to 7 keV on ~5'' pixels. The X-IFU is based on a large format array of super-conducting molybdenum-gold Transition Edge Sensors cooled at ~90 mK, each coupled with an absorber made of gold and bismuth with a pitch of 249 μm. A cryogenic anti-coincidence detector located underneath the prime TES array enables the non-X-ray background to be reduced. A bath temperature of ~50 mK is obtained by a series of mechanical coolers combining 15 K Pulse Tubes, 4 K and 2 K Joule-Thomson coolers which pre-cool a sub-Kelvin cooler made of a He-3 sorption cooler coupled with an Adiabatic Demagnetization Refrigerator. Frequency domain multiplexing enables 40 pixels to be read out in a single channel. A photon interacting with an absorber leads to a current pulse, which is amplified by the readout electronics and whose shape is reconstructed on board to recover its energy with high accuracy. The defocusing capability offered by the Athena movable mirror assembly enables the X-IFU to observe the brightest X-ray sources of the sky (up to Crab-like intensities) by spreading the telescope point spread function over hundreds of pixels. Thus the X-IFU delivers low pile-up, high throughput (> 50%), and typically 10 eV spectral resolution at 1 Crab intensities, i.e. a factor of 10 or more better than silicon-based X-ray detectors. In this paper, the current X-IFU baseline is presented, together with an assessment of its anticipated performance in terms of spectral resolution, background, and count rate capability. The X-IFU baseline configuration will be subject to a preliminary requirement review that is scheduled for the end of 2018. The X-IFU will be provided by an international consortium led by France, the Netherlands and Italy, with further ESA member state contributions from Belgium, Czech Republic, Finland, Germany, Ireland, Poland, Spain, Switzerland and contributions from Japan and the United States.
Peer reviewed
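The headline figures in the abstract above lend themselves to a quick back-of-envelope check. The sketch below is only a rough estimate under stated assumptions: the field of view is treated as a circle of 5 arcminute diameter (taking "equivalent diameter" to mean a circle of equal area), the pixels as squares exactly 5 arcseconds on a side, and the readout as exactly 40 pixels per channel. The function names are hypothetical, and the resulting pixel and channel counts are order-of-magnitude estimates from rounded abstract values, not the instrument's actual design numbers.

```python
import math


def resolving_power(energy_ev, delta_e_ev):
    """Spectral resolving power E / dE."""
    return energy_ev / delta_e_ev


def approx_pixel_count(fov_diameter_arcmin, pixel_size_arcsec):
    """Rough pixel count: circular field-of-view area divided by square pixel area.

    Assumes a circular field of view and square pixels, which is a simplification.
    """
    fov_radius_arcsec = fov_diameter_arcmin * 60.0 / 2.0
    fov_area = math.pi * fov_radius_arcsec ** 2
    return fov_area / pixel_size_arcsec ** 2


# Values taken from the abstract: 2.5 eV resolution at 7 keV,
# 5' equivalent diameter, ~5'' pixels, 40 pixels per readout channel.
power = resolving_power(7000.0, 2.5)
pixels = approx_pixel_count(5.0, 5.0)
channels = math.ceil(pixels / 40)

if __name__ == "__main__":
    print(f"resolving power at 7 keV: {power:.0f}")
    print(f"approximate pixel count: {pixels:.0f}")
    print(f"approximate readout channels: {channels}")
```

Under these assumptions the resolving power at 7 keV is 2800, and the field of view holds on the order of a few thousand pixels, which is consistent with the abstract's description of a large format array.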
The Athena X-ray Integral Field Unit: a consolidated design for the system requirement review of the preliminary definition phase
The Athena X-ray Integral Field Unit (X-IFU) is the high resolution X-ray
spectrometer, studied since 2015 for flying in the mid-30s on the Athena space
X-ray Observatory, a versatile observatory designed to address the Hot and
Energetic Universe science theme, selected in November 2013 by the Survey
Science Committee. Based on a large format array of Transition Edge Sensors
(TES), it aims to provide spatially resolved X-ray spectroscopy, with a
spectral resolution of 2.5 eV (up to 7 keV) over a hexagonal field of view of
5 arc minutes (equivalent diameter). The X-IFU entered its System Requirement
Review (SRR) in June 2022, at about the same time that ESA called for an
overall X-IFU redesign (including the X-IFU cryostat and the cooling chain),
due to an unanticipated cost overrun of Athena. In this paper, after
illustrating the breakthrough capabilities of the X-IFU, we describe the
instrument as presented at its SRR, browsing through all the subsystems and
associated requirements. We then show the instrument budgets, with a particular
emphasis on the anticipated budgets of some of its key performance parameters.
Finally, we briefly discuss the ongoing key technology demonstration
activities, the calibration and the activities foreseen in the X-IFU Instrument
Science Center, and touch on communication and outreach activities, the
consortium organisation, and finally on the life cycle assessment of X-IFU
aiming at minimising the environmental footprint associated with the
development of the instrument. Thanks to the studies conducted so far on X-IFU,
it is expected that along the design-to-cost exercise requested by ESA, the
X-IFU will maintain flagship capabilities in spatially resolved high resolution
X-ray spectroscopy, enabling most of the original X-IFU related scientific
objectives of the Athena mission to be retained. (abridged).
Comment: 48 pages, 29 figures, Accepted for publication in Experimental
Astronomy with minor editing
The Athena X-ray Integral Field Unit: a consolidated design for the system requirement review of the preliminary definition phase
The Athena X-ray Integral Field Unit (X-IFU) is the high resolution X-ray spectrometer studied since 2015 for flying in the mid-30s on the Athena space X-ray Observatory. Athena is a versatile observatory designed to address the Hot and Energetic Universe science theme, as selected in November 2013 by the Survey Science Committee. Based on a large format array of Transition Edge Sensors (TES), X-IFU aims to provide spatially resolved X-ray spectroscopy, with a spectral resolution of 2.5 eV (up to 7 keV) over a hexagonal field of view of 5 arc minutes (equivalent diameter). The X-IFU entered its System Requirement Review (SRR) in June 2022, at about the same time that ESA called for an overall X-IFU redesign (including the X-IFU cryostat and the cooling chain), due to an unanticipated cost overrun of Athena. In this paper, after illustrating the breakthrough capabilities of the X-IFU, we describe the instrument as presented at its SRR (i.e. in the course of its preliminary definition phase, so-called B1), browsing through all the subsystems and associated requirements. We then show the instrument budgets, with a particular emphasis on the anticipated budgets of some of its key performance parameters, such as the instrument efficiency, spectral resolution, energy scale knowledge, count rate capability, non-X-ray background and target of opportunity efficiency. Finally, we briefly discuss the ongoing key technology demonstration activities, the calibration and the activities foreseen in the X-IFU Instrument Science Center, touch on communication and outreach activities, the consortium organisation and the life cycle assessment of X-IFU aiming at minimising the environmental footprint associated with the development of the instrument. Thanks to the studies conducted so far on X-IFU, it is expected that along the design-to-cost exercise requested by ESA, the X-IFU will maintain flagship capabilities in spatially resolved high resolution X-ray spectroscopy, enabling most of the original X-IFU related scientific objectives of the Athena mission to be retained. The X-IFU will be provided by an international consortium led by France, The Netherlands and Italy, with ESA member state contributions from Belgium, Czech Republic, Finland, Germany, Poland, Spain, Switzerland, with additional contributions from the United States and Japan.
The French contribution to X-IFU is funded by CNES, CNRS and CEA. This work has been also supported by ASI (Italian Space Agency) through the Contract 2019-27-HH.0, and by the ESA (European Space Agency) Core Technology Program (CTP) Contract No. 4000114932/15/NL/BW and the AREMBES - ESA CTP No. 4000116655/16/NL/BW. This publication is part of grant RTI2018-096686-B-C21 funded by MCIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe”. This publication is part of grant RTI2018-096686-B-C21 and PID2020-115325GB-C31 funded by MCIN/AEI/10.13039/501100011033.