62 research outputs found

    Progressive Neural Architecture Search

    We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms. Our approach uses a sequential model-based optimization (SMBO) strategy, in which we search for structures in order of increasing complexity, while simultaneously learning a surrogate model to guide the search through structure space. Direct comparison under the same search space shows that our method is up to 5 times more efficient than the RL method of Zoph et al. (2018) in terms of number of models evaluated, and 8 times faster in terms of total compute. The structures we discover in this way achieve state-of-the-art classification accuracies on CIFAR-10 and ImageNet. Comment: To appear in ECCV 2018 as oral. The code and checkpoint for PNASNet-5 trained on ImageNet (both Mobile and Large) can now be downloaded from https://github.com/tensorflow/models/tree/master/research/slim#Pretrained. Also see https://github.com/chenxi116/PNASNet.TF for refactored and simplified TensorFlow code; see https://github.com/chenxi116/PNASNet.pytorch for exact conversion to PyTorch.

    Pore water exchange-driven inorganic carbon export from intertidal salt marshes

    © The Author(s), 2021. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Tamborski, J. J., Eagle, M., Kurylyk, B. L., Kroeger, K. D., Wang, Z. A., Henderson, P., & Charette: 1774-1792, https://doi.org/10.1002/lno.11721. Respiration in intertidal salt marshes generates dissolved inorganic carbon (DIC) that is exported to the coastal ocean by tidal exchange with the marsh platform. Understanding the link between physical drivers of water exchange and chemical flux is key to constraining coastal wetland contributions to regional carbon budgets. The spatial and temporal (seasonal, annual) variability of marsh pore water exchange and DIC export was assessed from a microtidal salt marsh (Sage Lot Pond, Massachusetts). Spatial variability was constrained from ²²⁴Ra:²²⁸Th disequilibria across two hydrologic units within the marsh sediments. Disequilibrium between the more soluble ²²⁴Ra and its sediment-bound parent ²²⁸Th reveals significant pore water exchange in the upper 5 cm of the marsh surface (0–36 L m⁻² d⁻¹) that is most intense in low marsh elevation zones, driven by tidal overtopping. Surficial sediment DIC transport ranges from 0.0 to 0.7 g C m⁻² d⁻¹. The sub-surface sediment horizon intersected by mean low tide was disproportionately impacted by tidal pumping (20–80 L m⁻² d⁻¹) and supplied a seasonal DIC flux of 1.7–5.4 g C m⁻² d⁻¹. Export exceeded 10 g C m⁻² d⁻¹ for another marsh unit, demonstrating that fluxes can vary substantially across salt marshes under similar conditions within the same estuary. Seasonal and annual variability in marsh pore water exchange, constrained from tidal time-series of radium isotopes, was driven in part by variability in mean sea level. Rising sea levels will further inundate high marsh elevation zones, which may lead to greater DIC export. This research was undertaken thanks in part to funding from the Canada First Research Excellence Fund, through the Ocean Frontier Institute. Additional funding was provided by the U.S. Geological Survey (USGS) Coastal & Marine Geology Program and the USGS Land Change Science Program's LandCarbon program.

    A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity

    Pretraining is the preliminary and fundamental step in developing capable language models (LMs). Despite this, pretraining data design is critically under-documented and often guided by empirically unsupported intuitions. To address this, we pretrain 28 decoder-only models of 1.5B parameters each, training on data curated (1) at different times, (2) with varying toxicity and quality filters, and (3) with different domain compositions. First, we quantify the effect of pretraining data age. A temporal shift between evaluation data and pretraining data leads to performance degradation, which is not overcome by finetuning. Second, we explore the effect of quality and toxicity filters, showing a trade-off between performance on standard benchmarks and risk of toxic generations. Our findings indicate that there is no one-size-fits-all solution to filtering training data. We also find that the effects of different types of filtering are not predictable from text domain characteristics. Lastly, we empirically validate that the inclusion of heterogeneous data sources, like books and web, is broadly beneficial and warrants greater prioritization. These findings constitute the largest set of experiments to validate, quantify, and expose many undocumented intuitions about text pretraining, which we hope will help support more informed data-centric decisions in LM development.

    PaLM: Scaling Language Modeling with Pathways

    Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call the Pathways Language Model (PaLM). We trained PaLM on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks. On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the finetuned state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark. A significant number of BIG-bench tasks showed discontinuous improvements from model scale, meaning that performance steeply increased as we scaled to our largest model. PaLM also has strong capabilities in multilingual tasks and source code generation, which we demonstrate on a wide array of benchmarks. We additionally provide a comprehensive analysis of bias and toxicity, and study the extent of training data memorization with respect to model scale. Finally, we discuss the ethical considerations related to large language models and potential mitigation strategies.

    The Athena X-ray Integral Field Unit (X-IFU)

    The X-ray Integral Field Unit (X-IFU) is the high resolution X-ray spectrometer of the ESA Athena X-ray observatory. Over a field of view of 5' equivalent diameter, it will deliver X-ray spectra from 0.2 to 12 keV with a spectral resolution of 2.5 eV up to 7 keV on ~5'' pixels. The X-IFU is based on a large format array of superconducting molybdenum-gold Transition Edge Sensors cooled at ~90 mK, each coupled with an absorber made of gold and bismuth with a pitch of 249 μm. A cryogenic anti-coincidence detector located underneath the prime TES array enables the non X-ray background to be reduced. A bath temperature of ~50 mK is obtained by a series of mechanical coolers combining 15 K Pulse Tubes, 4 K and 2 K Joule-Thomson coolers which pre-cool a sub-Kelvin cooler made of a He-3 sorption cooler coupled with an Adiabatic Demagnetization Refrigerator. Frequency domain multiplexing enables 40 pixels to be read out in a single channel. A photon interacting with an absorber leads to a current pulse, which is amplified by the readout electronics and whose shape is reconstructed on board to recover its energy with high accuracy. The defocusing capability offered by the Athena movable mirror assembly enables the X-IFU to observe the brightest X-ray sources of the sky (up to Crab-like intensities) by spreading the telescope point spread function over hundreds of pixels. Thus the X-IFU delivers low pile-up, high throughput (> 50%), and typically 10 eV spectral resolution at 1 Crab intensities, i.e. a factor of 10 or more better than silicon-based X-ray detectors. In this paper, the current X-IFU baseline is presented, together with an assessment of its anticipated performance in terms of spectral resolution, background, and count rate capability. The X-IFU baseline configuration will be subject to a preliminary requirement review that is scheduled at the end of 2018. The X-IFU will be provided by an international consortium led by France, the Netherlands and Italy, with further ESA member state contributions from Belgium, Czech Republic, Finland, Germany, Ireland, Poland, Spain, Switzerland and contributions from Japan and the United States. Peer reviewed.

    The Athena X-ray Integral Field Unit: a consolidated design for the system requirement review of the preliminary definition phase

    The Athena X-ray Integral Field Unit (X-IFU) is the high resolution X-ray spectrometer, studied since 2015 for flying in the mid-2030s on the Athena space X-ray Observatory, a versatile observatory designed to address the Hot and Energetic Universe science theme, selected in November 2013 by the Survey Science Committee. Based on a large format array of Transition Edge Sensors (TES), it aims to provide spatially resolved X-ray spectroscopy, with a spectral resolution of 2.5 eV (up to 7 keV) over a hexagonal field of view of 5 arc minutes (equivalent diameter). The X-IFU entered its System Requirement Review (SRR) in June 2022, at about the same time that ESA called for an overall X-IFU redesign (including the X-IFU cryostat and the cooling chain), due to an unanticipated cost overrun of Athena. In this paper, after illustrating the breakthrough capabilities of the X-IFU, we describe the instrument as presented at its SRR, browsing through all the subsystems and associated requirements. We then show the instrument budgets, with a particular emphasis on the anticipated budgets of some of its key performance parameters. Finally, we briefly discuss the ongoing key technology demonstration activities, the calibration and the activities foreseen in the X-IFU Instrument Science Center, and touch on communication and outreach activities, the consortium organisation, and the life cycle assessment of X-IFU, which aims at minimising the environmental footprint associated with the development of the instrument. Thanks to the studies conducted so far on X-IFU, it is expected that along the design-to-cost exercise requested by ESA, the X-IFU will maintain flagship capabilities in spatially resolved high resolution X-ray spectroscopy, enabling most of the original X-IFU related scientific objectives of the Athena mission to be retained. (abridged). Comment: 48 pages, 29 figures, Accepted for publication in Experimental Astronomy with minor editing.

    The Athena X-ray Integral Field Unit: a consolidated design for the system requirement review of the preliminary definition phase

    The Athena X-ray Integral Field Unit (X-IFU) is the high resolution X-ray spectrometer studied since 2015 for flying in the mid-2030s on the Athena space X-ray Observatory. Athena is a versatile observatory designed to address the Hot and Energetic Universe science theme, as selected in November 2013 by the Survey Science Committee. Based on a large format array of Transition Edge Sensors (TES), X-IFU aims to provide spatially resolved X-ray spectroscopy, with a spectral resolution of 2.5 eV (up to 7 keV) over a hexagonal field of view of 5 arc minutes (equivalent diameter). The X-IFU entered its System Requirement Review (SRR) in June 2022, at about the same time that ESA called for an overall X-IFU redesign (including the X-IFU cryostat and the cooling chain), due to an unanticipated cost overrun of Athena. In this paper, after illustrating the breakthrough capabilities of the X-IFU, we describe the instrument as presented at its SRR (i.e. in the course of its preliminary definition phase, so-called B1), browsing through all the subsystems and associated requirements. We then show the instrument budgets, with a particular emphasis on the anticipated budgets of some of its key performance parameters, such as the instrument efficiency, spectral resolution, energy scale knowledge, count rate capability, non X-ray background and target of opportunity efficiency. Finally, we briefly discuss the ongoing key technology demonstration activities, the calibration and the activities foreseen in the X-IFU Instrument Science Center, and touch on communication and outreach activities, the consortium organisation and the life cycle assessment of X-IFU, which aims at minimising the environmental footprint associated with the development of the instrument. Thanks to the studies conducted so far on X-IFU, it is expected that along the design-to-cost exercise requested by ESA, the X-IFU will maintain flagship capabilities in spatially resolved high resolution X-ray spectroscopy, enabling most of the original X-IFU related scientific objectives of the Athena mission to be retained. The X-IFU will be provided by an international consortium led by France, The Netherlands and Italy, with ESA member state contributions from Belgium, Czech Republic, Finland, Germany, Poland, Spain, Switzerland, with additional contributions from the United States and Japan. The French contribution to X-IFU is funded by CNES, CNRS and CEA. This work has also been supported by ASI (Italian Space Agency) through the Contract 2019-27-HH.0, and by the ESA (European Space Agency) Core Technology Program (CTP) Contract No. 4000114932/15/NL/BW and the AREMBES - ESA CTP No. 4000116655/16/NL/BW. This publication is part of grant RTI2018-096686-B-C21 funded by MCIN/AEI/10.13039/501100011033 and by “ERDF A way of making Europe”. This publication is part of grant RTI2018-096686-B-C21 and PID2020-115325GB-C31 funded by MCIN/AEI/10.13039/501100011033.