1,219 research outputs found

    Small-scale proxies for large-scale Transformer training instabilities

    Teams that have trained large Transformer-based models have reported training instabilities at large scale that did not appear when training with the same hyperparameters at smaller scales. Although the causes of such instabilities are of scientific interest, the amount of resources required to reproduce them has made investigation difficult. In this work, we seek ways to reproduce and study training stability and instability at smaller scales. First, we focus on two sources of training instability described in previous work: the growth of logits in attention layers (Dehghani et al., 2023) and divergence of the output logits from the log probabilities (Chowdhery et al., 2022). By measuring the relationship between learning rate and loss across scales, we show that these instabilities also appear in small models when training at high learning rates, and that mitigations previously employed at large scales are equally effective in this regime. This prompts us to investigate the extent to which other known optimizer and model interventions influence the sensitivity of the final loss to changes in the learning rate. To this end, we study methods such as warm-up, weight decay, and the μParam (Yang et al., 2022), and combine techniques to train small models that achieve similar losses across orders of magnitude of learning rate variation. Finally, to conclude our exploration, we study two cases where instabilities can be predicted before they emerge by examining the scaling behavior of model activation and gradient norms.

    Measurements of elliptic and triangular flow in high-multiplicity $^{3}$He+Au collisions at $\sqrt{s_{NN}}=200$ GeV

    We present the first measurement of elliptic ($v_2$) and triangular ($v_3$) flow in high-multiplicity $^{3}$He+Au collisions at $\sqrt{s_{NN}}=200$ GeV. Two-particle correlations, where the particles have a large separation in pseudorapidity, are compared in $^{3}$He+Au and in $p$+$p$ collisions and indicate that collective effects dominate the second and third Fourier components for the correlations observed in the $^{3}$He+Au system. The collective behavior is quantified in terms of elliptic $v_2$ and triangular $v_3$ anisotropy coefficients measured with respect to their corresponding event planes. The $v_2$ values are comparable to those previously measured in $d$+Au collisions at the same nucleon-nucleon center-of-mass energy. Comparisons with various theoretical predictions are made, including to models where the hot spots created by the impact of the three $^{3}$He nucleons on the Au nucleus expand hydrodynamically to generate the triangular flow. The agreement of these models with data may indicate the formation of low-viscosity quark-gluon plasma even in these small collision systems. Comment: 630 authors, 9 pages, 4 figures, 2 tables. v2 is the version accepted for publication by Physical Review Letters. Plain text data tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.htm

    Natural selection shaped the rise and fall of passenger pigeon genomic diversity.

    The extinct passenger pigeon was once the most abundant bird in North America, and possibly the world. Although theory predicts that large populations will be more genetically diverse, passenger pigeon genetic diversity was surprisingly low. To investigate this disconnect, we analyzed 41 mitochondrial and 4 nuclear genomes from passenger pigeons and 2 genomes from band-tailed pigeons, which are passenger pigeons' closest living relatives. Passenger pigeons' large population size appears to have allowed for faster adaptive evolution and removal of harmful mutations, driving a huge loss in their neutral genetic diversity. These results demonstrate the effect that selection can have on a vertebrate genome and contradict results that suggested that population instability contributed to this species's surprisingly rapid extinction.

    Bistability and Oscillations in Gene Regulation Mediated by Small Noncoding RNAs

    The interplay of small noncoding RNAs (sRNAs), mRNAs, and proteins has been shown to play crucial roles in almost all cellular processes. Although sRNAs are key post-transcriptional regulators of gene expression, their mechanisms and roles in various cellular processes are not yet fully understood. When participating in cellular processes, sRNAs mainly mediate mRNA degradation or translational repression. Here, we show how the dynamics of two minimal architectures is drastically affected by these two mechanisms. A comparison is also given to reveal the implications of the fundamental differences. This study may help us to analyze complex networks assembled by simple modules more easily. A better knowledge of the sRNA-mediated motifs is also of interest for bio-engineering and artificial control.

    Snazer: the simulations and networks analyzer

    Background: Networks are widely recognized as key determinants of structure and function in systems that span the biological, physical, and social sciences. They are static pictures of the interactions among the components of complex systems. Often, much effort is required to identify networks as part of particular patterns as well as to visualize and interpret them. From a pure dynamical perspective, simulation represents a relevant way out. Many simulator tools capitalized on the "noisy" behavior of some systems and used formal models to represent cellular activities as temporal trajectories. Statistical methods have been applied to a fairly large number of replicated trajectories in order to infer knowledge. A tool which both graphically manipulates reactive models and deals with sets of simulation time-course data by aggregation, interpretation and statistical analysis is missing and could add value to simulators. Results: We designed and implemented Snazer, the simulations and networks analyzer. Its goal is to aid the processes of visualizing and manipulating reactive models, as well as to share and interpret time-course data produced by stochastic simulators or by any other means. Conclusions: Snazer is a solid prototype that integrates biological network and simulation time-course data analysis techniques.

    Quality of Reporting and Study Design of CKD Cohort Studies Assessing Mortality in the Elderly Before and After STROBE: A Systematic Review

    BACKGROUND: The STrengthening the Reporting of OBservational studies in Epidemiology (STROBE) statement was published in October 2007 to improve quality of reporting of observational studies. The aim of this review was to assess the impact of the STROBE statement on observational study reporting and study design quality in the nephrology literature. STUDY DESIGN: Systematic literature review. SETTING & POPULATION: European and North American pre-dialysis Chronic Kidney Disease (CKD) cohort studies. SELECTION CRITERIA FOR STUDIES: Studies assessing the association between CKD and mortality in the elderly (>65 years) published from 1st January 2002 to 31st December 2013 were included, following systematic searching of MEDLINE & EMBASE. PREDICTOR: Time period before and after the publication of the STROBE statement. OUTCOME: Quality of study reporting using the STROBE statement and quality of study design using the Newcastle Ottawa Scale (NOS), Scottish Intercollegiate Guidelines Network (SIGN) and Critical Appraisal Skills Programme (CASP) tools. RESULTS: 37 papers (11 Pre & 26 Post STROBE) were identified from 3621 potential articles. Only four of the 22 STROBE items and their sub-criteria (objectives reporting, choice of quantitative groups and description of and carrying out sensitivity analysis) showed improvements, with the majority of items showing little change between the period before and after publication of the STROBE statement. Pre- and post-period analysis revealed a Manuscript STROBE score increase (median score 77.8% [interquartile range (IQR), 64.7-82.0] vs 83% [IQR, 78.4-84.9]; p = 0.05). There was no change in quality of study design, with identical median scores in the two periods for NOS (Manuscript NOS score 88.9), SIGN (Manuscript SIGN score 83.3) and CASP (Manuscript CASP score 91.7) tools. LIMITATIONS: Only 37 studies from Europe and North America were included, from one medical specialty. Assessment of study design was largely reliant on good reporting. CONCLUSIONS: This study highlights continuing deficiencies in the reporting of STROBE items and their sub-criteria in cohort studies in nephrology. There was weak evidence of improvement in the overall reporting quality, with no improvement in methodological quality of CKD cohort studies between the period before and after publication of the STROBE statement.

    Intended Consequences Statement in Conservation Science and Practice

    As the biodiversity crisis accelerates, the stakes are higher for threatened plants and animals. Rebuilding the health of our planet will require addressing underlying threats at many scales, including habitat loss and climate change. Conservation interventions such as habitat protection, management, restoration, predator control, translocation, genetic rescue, and biological control have the potential to help threatened or endangered species avert extinction. These existing, well-tested methods can be complemented and augmented by more frequent and faster adoption of new technologies, such as powerful new genetic tools. In addition, synthetic biology might offer solutions to currently intractable conservation problems. We believe that conservation needs to be bold and clear-eyed in this moment of great urgency.