75 research outputs found
Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers
Transformers have achieved great success in machine learning applications.
Normalization techniques, such as Layer Normalization (LayerNorm, LN) and Root
Mean Square Normalization (RMSNorm), play a critical role in accelerating and
stabilizing the training of Transformers. While LayerNorm recenters and
rescales input vectors, RMSNorm only rescales the vectors by their RMS value.
Despite being more computationally efficient, RMSNorm may compromise the
representation ability of Transformers. There is currently no consensus
regarding the preferred normalization technique, as some models employ
LayerNorm while others utilize RMSNorm, especially in recent large language
models. It is challenging to convert Transformers with one normalization to the
other type. While there is an ongoing disagreement between the two
normalization types, we propose a solution to unify two mainstream Transformer
architectures, Pre-LN and Pre-RMSNorm Transformers. By removing the inherent
redundant mean information in the main branch of Pre-LN Transformers, we can
reduce LayerNorm to RMSNorm, achieving higher efficiency. We further propose
the Compressed RMSNorm (CRMSNorm) and Pre-CRMSNorm Transformer based on a
lossless compression of the zero-mean vectors. We formally establish the
equivalence of Pre-LN, Pre-RMSNorm, and Pre-CRMSNorm Transformer variants in
both training and inference. It implies that Pre-LN Transformers can be
substituted with Pre-(C)RMSNorm counterparts at almost no cost, offering the
same arithmetic functionality along with free efficiency improvement.
Experiments demonstrate that we can reduce the training and inference time of
Pre-LN Transformers by up to 10%.
Comment: 15 pages, 5 tables, code available at
https://github.com/ZixuanJiang/pre-rmsnorm-transforme
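The observation behind reducing LayerNorm to RMSNorm can be checked numerically: on a zero-mean vector the two normalizations coincide. The sketch below is a minimal illustration, not the authors' implementation. It assumes NumPy, omits the learnable scale and shift, and uses hypothetical function names.

```python
import numpy as np

def layer_norm(x, eps=0.0):
    """LayerNorm without learnable scale/shift: recenter, then rescale."""
    centered = x - x.mean(axis=-1, keepdims=True)
    return centered / np.sqrt((centered ** 2).mean(axis=-1, keepdims=True) + eps)

def rms_norm(x, eps=0.0):
    """RMSNorm without learnable scale: rescale by the root mean square only."""
    return x / np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(4, 16))  # arbitrary inputs with nonzero mean
x_zero_mean = x - x.mean(axis=-1, keepdims=True)

# On zero-mean inputs, RMSNorm reproduces LayerNorm of the original vector.
same_on_zero_mean = np.allclose(layer_norm(x), rms_norm(x_zero_mean))
# On inputs with a nonzero mean, the two normalizations generally differ.
differs_in_general = not np.allclose(layer_norm(x), rms_norm(x))
print(same_on_zero_mean, differs_in_general)
```

The abstract's claim is stronger than this check: it states that the mean can be removed once from the main branch so that every later LayerNorm sees zero-mean input. The snippet only verifies the per-vector identity that makes such a substitution possible.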
Crowdfunding Dynamics Tracking: A Reinforcement Learning Approach
Recent years have witnessed increasing interest in research on
crowdfunding mechanisms. In this area, dynamics tracking is a significant issue
but is still under exploration. Existing studies either fit the fluctuations of
time-series or employ regularization terms to constrain learned tendencies.
However, few of them take into account the inherent decision-making process
between investors and crowdfunding dynamics. To address the problem, in this
paper, we propose a Trajectory-based Continuous Control for Crowdfunding (TC3)
algorithm to predict the funding progress in crowdfunding. Specifically,
actor-critic frameworks are employed to model the relationship between
investors and campaigns, where all of the investors are viewed as an agent that
could interact with the environment derived from the real dynamics of
campaigns. Then, to further explore the in-depth implications of patterns
(i.e., typical characters) in funding series, we propose to subdivide them into
and ones. Moreover, for the
purpose of switching between different kinds of patterns, the actor component of
TC3 is extended with a structure of options, which yields TC3-Options.
Finally, extensive experiments on the Indiegogo dataset not only demonstrate
the effectiveness of our methods, but also validate our assumption that the
entire pattern learned by TC3-Options is indeed the U-shaped one
DREAMPlaceFPGA-MP: An Open-Source GPU-Accelerated Macro Placer for Modern FPGAs with Cascade Shapes and Region Constraints
FPGA macro placement plays a pivotal role in routability and timing closure in
the modern FPGA physical design flow. In modern FPGAs, macros could be subject
to complex cascade shape constraints requiring instances to be placed in
consecutive sites. In addition, in real-world FPGA macro placement scenarios,
designs could have various region constraints that specify boundaries within
which certain design instances and macros should be placed. In this work, we
present DREAMPlaceFPGA-MP, an open-source GPU-accelerated FPGA macro-placer
that efficiently generates legal placements for macros while honoring cascade
shape requirements and region constraints. Treating multiple macros in a
cascade shape as a large single instance and restricting instances to their
respective regions, DREAMPlaceFPGA-MP obtains roughly legal placements. The
macros are legalized in multiple steps to efficiently handle cascade shapes and
region constraints. Our experimental results demonstrate that DREAMPlaceFPGA-MP
is among the top contestants of the MLCAD 2023 FPGA Macro-Placement Contest
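The two ideas named in the abstract, treating a cascade as one large instance and restricting instances to their regions, can be illustrated with a toy geometric sketch. This is not the DREAMPlaceFPGA-MP algorithm. The `Region` class, the function names, the vertical stacking, and the assumption that the merged instance fits inside its region are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    x_lo: float
    y_lo: float
    x_hi: float
    y_hi: float

def merge_cascade(heights, width=1.0):
    """Treat a vertical cascade of macros as one instance (width, total height)."""
    return width, float(sum(heights))

def clamp_to_region(x, y, width, height, region):
    """Move the lower-left corner so the whole instance lies inside the region."""
    x = min(max(x, region.x_lo), region.x_hi - width)
    y = min(max(y, region.y_lo), region.y_hi - height)
    return x, y

def expand_cascade(x, y, heights):
    """Recover per-macro positions stacked in consecutive sites."""
    positions = []
    for h in heights:
        positions.append((x, y))
        y += h
    return positions

heights = [2.0, 2.0, 2.0]                    # three cascaded macros (hypothetical sizes)
width, total_height = merge_cascade(heights)
region = Region(0.0, 0.0, 10.0, 10.0)        # hypothetical region constraint
x, y = clamp_to_region(9.5, 7.0, width, total_height, region)
macro_positions = expand_cascade(x, y, heights)
print(macro_positions)  # [(9.0, 4.0), (9.0, 6.0), (9.0, 8.0)]
```

Clamping the merged instance first keeps the cascade contiguous by construction, so the individual macros never have to be re-aligned after the region constraint is enforced.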
A compact butterfly-style silicon photonic-electronic neural chip for hardware-efficient deep learning
The optical neural network (ONN) is a promising hardware platform for
next-generation neurocomputing due to its high parallelism, low latency, and
low energy consumption. Previous ONN architectures are mainly designed for
general matrix multiplication (GEMM), leading to unnecessarily large area cost
and high control complexity. Here, we move beyond classical GEMM-based ONNs and
propose an optical subspace neural network (OSNN) architecture, which trades
the universality of weight representation for lower optical component usage,
area cost, and energy consumption. We devise a butterfly-style
photonic-electronic neural chip to implement our OSNN with up to 7x fewer
trainable optical components compared to GEMM-based ONNs. Additionally, a
hardware-aware training framework is provided to minimize the required device
programming precision, lessen the chip area, and boost the noise robustness. We
experimentally demonstrate the utility of our neural chip in practical image
recognition tasks, showing that a measured accuracy of 94.16% can be achieved
in hand-written digit recognition tasks with 3-bit weight programming
precision.
Comment: 17 pages, 5 figures
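To make the trade-off between universality and component count concrete, the sketch below applies a generic butterfly-structured linear map and compares its parameter count with a dense matrix. It is a mathematical illustration only. The real-valued 2x2 blocks, the function names, and the Hadamard test case are assumptions and do not model the chip's optical devices or reproduce the reported 7x figure.

```python
import numpy as np

def butterfly_apply(x, stages):
    """Apply log2(N) butterfly stages; stage s mixes pairs of entries 2**s apart.

    stages[s] has shape (N // 2, 2, 2): one 2x2 block per pair of entries.
    """
    n = x.shape[0]
    y = x.astype(float).copy()
    for s, blocks in enumerate(stages):
        stride = 2 ** s
        out = np.empty_like(y)
        pair = 0
        for start in range(0, n, 2 * stride):
            for offset in range(stride):
                i, j = start + offset, start + offset + stride
                out[i], out[j] = blocks[pair] @ np.array([y[i], y[j]])
                pair += 1
        y = out
    return y

def hadamard_stages(n):
    """Stages whose 2x2 blocks are all [[1, 1], [1, -1]] (unnormalized Hadamard)."""
    h2 = np.array([[1.0, 1.0], [1.0, -1.0]])
    return [np.tile(h2, (n // 2, 1, 1)) for _ in range(int(np.log2(n)))]

n = 8
x = np.arange(n, dtype=float)
y = butterfly_apply(x, hadamard_stages(n))

# Dense reference: the Kronecker product of three 2x2 Hadamard blocks.
h2 = np.array([[1.0, 1.0], [1.0, -1.0]])
dense = np.kron(np.kron(h2, h2), h2)
matches_dense = np.allclose(y, dense @ x)

butterfly_params = 4 * (n // 2) * int(np.log2(n))  # 2x2 entries across all stages
dense_params = n * n
print(matches_dense, butterfly_params, dense_params)
```

The butterfly map uses 2N log2(N) entries instead of N^2, so the gap widens as N grows, at the price of representing only a subspace of all N x N matrices.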
ADEPT: Automatic Differentiable DEsign of Photonic Tensor Cores
Photonic tensor cores (PTCs) are essential building blocks for optical
artificial intelligence (AI) accelerators based on programmable photonic
integrated circuits. PTCs can achieve ultra-fast and efficient tensor
operations for neural network (NN) acceleration. Current PTC designs are either
manually constructed or based on matrix decomposition theory, which lacks the
adaptability to meet various hardware constraints and device specifications. To
our best knowledge, automatic PTC design methodology is still unexplored. It
will be promising to move beyond the manual design paradigm and "nurture"
photonic neurocomputing with AI and design automation. Therefore, in this work,
for the first time, we propose a fully differentiable framework, dubbed ADEPT,
that can efficiently search PTC designs adaptive to various circuit footprint
constraints and foundry PDKs. Extensive experiments show superior flexibility
and effectiveness of the proposed ADEPT framework to explore a large PTC design
space. On various NN models and benchmarks, our searched PTC topology
outperforms prior manually-designed structures with competitive matrix
representability, 2-30x higher footprint compactness, and better noise
robustness, demonstrating a new paradigm in photonic neural chip design. The
code of ADEPT is available at https://github.com/JeremieMelo/ADEPT using the
https://github.com/JeremieMelo/pytorch-onn (TorchONN) library.
Comment: Accepted to ACM/IEEE Design Automation Conference (DAC), 202
DOTA: A Dynamically-Operated Photonic Tensor Core for Energy-Efficient Transformer Accelerator
The wide adoption and significant computing resource consumption of
attention-based Transformers, e.g., Vision Transformer and large language
models, have driven the demands for efficient hardware accelerators. While
electronic accelerators have been commonly used, there is a growing interest in
exploring photonics as an alternative technology due to its high energy
efficiency and ultra-fast processing speed. Optical neural networks (ONNs) have
demonstrated promising results for convolutional neural network (CNN) workloads
that only require weight-static linear operations. However, they fail to
efficiently support Transformer architectures with attention operations due to
the lack of ability to process dynamic full-range tensor multiplication. In
this work, we propose a customized high-performance and energy-efficient
photonic Transformer accelerator, DOTA. To overcome the fundamental limitation
of existing ONNs, we introduce a novel photonic tensor core, consisting of a
crossbar array of interference-based optical vector dot-product engines, that
supports highly-parallel, dynamic, and full-range matrix-matrix multiplication.
Our comprehensive evaluation demonstrates that DOTA achieves a >4x energy and a
>10x latency reduction compared to prior photonic accelerators, and delivers
over 20x energy reduction and 2 to 3 orders of magnitude lower latency compared
to the electronic Transformer accelerator. Our work highlights the immense
potential of photonic computing for efficient hardware accelerators,
particularly for advanced machine learning workloads.
Comment: The short version is accepted by Next-Gen AI System Workshop at MLSys 202
Thermal Activation of Methane by MgO+: Temperature Dependent Kinetics, Reactive Molecular Dynamics Simulations and Statistical Modeling
The kinetics of MgO$^+$ + CH$_4$ was studied experimentally using the variable ion source, temperature adjustable selected ion flow tube (VISTA-SIFT) apparatus from 300–600 K and computationally by running and analyzing reactive atomistic simulations. Rate coefficients and product branching fractions were determined as a function of temperature. The reaction proceeded with a rate of $k = (5.9 \pm 1.5) \times 10^{-10} (T/300\,\mathrm{K})^{-0.5 \pm 0.2}$ cm$^3$ s$^{-1}$. MgOH$^+$ was the dominant product at all temperatures, but Mg$^+$, the co-product of oxygen-atom transfer to form methanol, was observed with a product branching fraction of $(0.08 \pm 0.03)(T/300\,\mathrm{K})^{-0.8 \pm 0.7}$. Reactive molecular dynamics simulations using a reactive force field, as well as a neural network trained on thousands of structures yield rate coefficients about one order of magnitude lower. This underestimation of the rates is traced back to the multireference character of the transition state [MgOCH$_4$]$^+$. Statistical modeling of the temperature-dependent kinetics provides further insight into the reactive potential surface. The rate limiting step was found to be consistent with a four-centered activation of the C-H bond, consistent with previous calculations. The product branching was modeled as a competition between dissociation of an insertion intermediate directly after the rate-limiting transition state, and traversing a transition state corresponding to a methyl migration leading to a Mg-CH$_3$OH$^+$ complex, though only if this transition state is stabilized significantly relative to the dissociated MgOH$^+$ + CH$_3$ product channel. An alternative non-statistical mechanism is discussed, whereby a post-transition state bifurcation in the potential surface could allow the reaction to proceed directly from the four-centered TS to the Mg-CH$_3$OH$^+$ complex thereby allowing a more robust competition between the product channels
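The power-law fits quoted in this abstract can be evaluated directly. The sketch below uses only the central values (5.9e-10 cm^3 s^-1 with exponent -0.5 for the rate coefficient, and 0.08 with exponent -0.8 for the Mg+ branching fraction) and ignores the stated uncertainties. The function names are hypothetical.

```python
def rate_coefficient(temperature_k, k300=5.9e-10, exponent=-0.5):
    """Central-value fit k(T) = k300 * (T / 300 K)**exponent, in cm^3 s^-1."""
    return k300 * (temperature_k / 300.0) ** exponent

def mg_branching_fraction(temperature_k, f300=0.08, exponent=-0.8):
    """Central-value fit for the Mg+ product branching fraction."""
    return f300 * (temperature_k / 300.0) ** exponent

for t in (300.0, 450.0, 600.0):
    print(f"T = {t:5.0f} K  k = {rate_coefficient(t):.2e} cm^3/s  "
          f"Mg+ fraction = {mg_branching_fraction(t):.3f}")
```

With these central values, both the rate coefficient and the Mg+ branching fraction decrease as the temperature rises from 300 K to 600 K. The large uncertainty on the branching exponent (0.8 ± 0.7) means the trend in the branching fraction is much less well constrained than the trend in the rate.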
Global, regional, and national burden of disorders affecting the nervous system, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021
Background: Disorders affecting the nervous system are diverse and include neurodevelopmental disorders, late-life neurodegeneration, and newly emergent conditions, such as cognitive impairment following COVID-19. Previous publications from the Global Burden of Disease, Injuries, and Risk Factor Study estimated the burden of 15 neurological conditions in 2015 and 2016, but these analyses did not include neurodevelopmental disorders, as defined by the International Classification of Diseases (ICD)-11, or a subset of cases of congenital, neonatal, and infectious conditions that cause neurological damage. Here, we estimate nervous system health loss caused by 37 unique conditions and their associated risk factors globally, regionally, and nationally from 1990 to 2021. Methods: We estimated mortality, prevalence, years lived with disability (YLDs), years of life lost (YLLs), and disability-adjusted life-years (DALYs), with corresponding 95% uncertainty intervals (UIs), by age and sex in 204 countries and territories, from 1990 to 2021. We included morbidity and deaths due to neurological conditions, for which health loss is directly due to damage to the CNS or peripheral nervous system. We also isolated neurological health loss from conditions for which nervous system morbidity is a consequence, but not the primary feature, including a subset of congenital conditions (ie, chromosomal anomalies and congenital birth defects), neonatal conditions (ie, jaundice, preterm birth, and sepsis), infectious diseases (ie, COVID-19, cystic echinococcosis, malaria, syphilis, and Zika virus disease), and diabetic neuropathy. By conducting a sequela-level analysis of the health outcomes for these conditions, only cases where nervous system damage occurred were included, and YLDs were recalculated to isolate the non-fatal burden directly attributable to nervous system health loss.
A comorbidity correction was used to calculate total prevalence of all conditions that affect the nervous system combined. Findings: Globally, the 37 conditions affecting the nervous system were collectively ranked as the leading group cause of DALYs in 2021 (443 million, 95% UI 378–521), affecting 3·40 billion (3·20–3·62) individuals (43·1%, 40·5–45·9 of the global population); global DALY counts attributed to these conditions increased by 18·2% (8·7–26·7) between 1990 and 2021. Age-standardised rates of deaths per 100 000 people attributed to these conditions decreased from 1990 to 2021 by 33·6% (27·6–38·8), and age-standardised rates of DALYs attributed to these conditions decreased by 27·0% (21·5–32·4). Age-standardised prevalence was almost stable, with a change of 1·5% (0·7–2·4). The ten conditions with the highest age-standardised DALYs in 2021 were stroke, neonatal encephalopathy, migraine, Alzheimer's disease and other dementias, diabetic neuropathy, meningitis, epilepsy, neurological complications due to preterm birth, autism spectrum disorder, and nervous system cancer. Interpretation: As the leading cause of overall disease burden in the world, with increasing global DALY counts, effective prevention, treatment, and rehabilitation strategies for disorders affecting the nervous system are needed
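Two pieces of arithmetic sit behind these figures. DALYs are the sum of years of life lost and years lived with disability, and the reported prevalence (3.40 billion people, 43.1% of the global population) implies a global population size. The sketch below shows both. The function name and the example YLL and YLD values are hypothetical and are not taken from the study.

```python
def dalys(ylls, ylds):
    """Disability-adjusted life-years: fatal burden (YLLs) plus non-fatal burden (YLDs)."""
    return ylls + ylds

# Hypothetical inputs, in millions of years, used only to show the definition.
example_dalys = dalys(ylls=120.0, ylds=80.0)

# Consistency check using the central values quoted in the abstract.
prevalent_cases = 3.40e9        # individuals affected in 2021
share_of_population = 0.431     # 43.1% of the global population
implied_population = prevalent_cases / share_of_population
print(example_dalys, f"{implied_population:.3e}")
```

The implied population of roughly 7.9 billion is in line with the world population in 2021, which suggests the two prevalence figures in the abstract are mutually consistent.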
Global burden of 288 causes of death and life expectancy decomposition in 204 countries and territories and 811 subnational locations, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021
BACKGROUND Regular, detailed reporting on population health by underlying cause of death is fundamental for public health decision making. Cause-specific estimates of mortality and the subsequent effects on life expectancy worldwide are valuable metrics to gauge progress in reducing mortality rates. These estimates are particularly important following large-scale mortality spikes, such as the COVID-19 pandemic. When systematically analysed, mortality rates and life expectancy allow comparisons of the consequences of causes of death globally and over time, providing a nuanced understanding of the effect of these causes on global populations. METHODS The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2021 cause-of-death analysis estimated mortality and years of life lost (YLLs) from 288 causes of death by age-sex-location-year in 204 countries and territories and 811 subnational locations for each year from 1990 until 2021. The analysis used 56 604 data sources, including data from vital registration and verbal autopsy as well as surveys, censuses, surveillance systems, and cancer registries, among others. As with previous GBD rounds, cause-specific death rates for most causes were estimated using the Cause of Death Ensemble model (a modelling tool developed for GBD to assess the out-of-sample predictive validity of different statistical models and covariate permutations and combine those results to produce cause-specific mortality estimates), with alternative strategies adapted to model causes with insufficient data, substantial changes in reporting over the study period, or unusual epidemiology. YLLs were computed as the product of the number of deaths for each cause-age-sex-location-year and the standard life expectancy at each age. As part of the modelling process, uncertainty intervals (UIs) were generated using the 2·5th and 97·5th percentiles from a 1000-draw distribution for each metric.
We decomposed life expectancy by cause of death, location, and year to show cause-specific effects on life expectancy from 1990 to 2021. We also used the coefficient of variation and the fraction of population affected by 90% of deaths to highlight concentrations of mortality. Findings are reported in counts and age-standardised rates. Methodological improvements for cause-of-death estimates in GBD 2021 include the expansion of under-5-years age group to include four new age groups, enhanced methods to account for stochastic variation of sparse data, and the inclusion of COVID-19 and other pandemic-related mortality, which includes excess mortality associated with the pandemic, excluding COVID-19, lower respiratory infections, measles, malaria, and pertussis. For this analysis, 199 new country-years of vital registration cause-of-death data, 5 country-years of surveillance data, 21 country-years of verbal autopsy data, and 94 country-years of other data types were added to those used in previous GBD rounds. FINDINGS The leading causes of age-standardised deaths globally were the same in 2019 as they were in 1990; in descending order, these were ischaemic heart disease, stroke, chronic obstructive pulmonary disease, and lower respiratory infections. In 2021, however, COVID-19 replaced stroke as the second-leading age-standardised cause of death, with 94·0 deaths (95% UI 89·2-100·0) per 100 000 population. The COVID-19 pandemic shifted the rankings of the leading five causes, lowering stroke to the third-leading and chronic obstructive pulmonary disease to the fourth-leading position. In 2021, the highest age-standardised death rates from COVID-19 occurred in sub-Saharan Africa (271·0 deaths [250·1-290·7] per 100 000 population) and Latin America and the Caribbean (195·4 deaths [182·1-211·4] per 100 000 population).
The lowest age-standardised death rates from COVID-19 were in the high-income super-region (48·1 deaths [47·4-48·8] per 100 000 population) and southeast Asia, east Asia, and Oceania (23·2 deaths [16·3-37·2] per 100 000 population). Globally, life expectancy steadily improved between 1990 and 2019 for 18 of the 22 investigated causes. Decomposition of global and regional life expectancy showed the positive effect that reductions in deaths from enteric infections, lower respiratory infections, stroke, and neonatal deaths, among others have contributed to improved survival over the study period. However, a net reduction of 1·6 years occurred in global life expectancy between 2019 and 2021, primarily due to increased death rates from COVID-19 and other pandemic-related mortality. Life expectancy was highly variable between super-regions over the study period, with southeast Asia, east Asia, and Oceania gaining 8·3 years (6·7-9·9) overall, while having the smallest reduction in life expectancy due to COVID-19 (0·4 years). The largest reduction in life expectancy due to COVID-19 occurred in Latin America and the Caribbean (3·6 years). Additionally, 53 of the 288 causes of death were highly concentrated in locations with less than 50% of the global population as of 2021, and these causes of death became progressively more concentrated since 1990, when only 44 causes showed this pattern. The concentration phenomenon is discussed heuristically with respect to enteric and lower respiratory infections, malaria, HIV/AIDS, neonatal disorders, tuberculosis, and measles. INTERPRETATION Long-standing gains in life expectancy and reductions in many of the leading causes of death have been disrupted by the COVID-19 pandemic, the adverse effects of which were spread unevenly among populations. Despite the pandemic, there has been continued progress in combatting several notable causes of death, leading to improved global life expectancy over the study period. 
Each of the seven GBD super-regions showed an overall improvement between 1990 and 2021, obscuring the negative effect in the years of the pandemic. Additionally, our findings regarding regional variation in causes of death driving increases in life expectancy hold clear policy utility. Analyses of shifting mortality trends reveal that several causes, once widespread globally, are now increasingly concentrated geographically. These changes in mortality concentration, alongside further investigation of changing risks, interventions, and relevant policy, present an important opportunity to deepen our understanding of mortality-reduction strategies. Examining patterns in mortality concentration might reveal areas where successful public health interventions have been implemented. Translating these successes to locations where certain causes of death remain entrenched can inform policies that work to improve life expectancy for people everywhere. FUNDING Bill & Melinda Gates Foundation
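The Methods of this abstract describe two computations that are easy to sketch: YLLs as deaths multiplied by the standard life expectancy at the age of death, and uncertainty intervals taken as the 2.5th and 97.5th percentiles of 1000 draws. The example below assumes NumPy and uses synthetic draws and a made-up life expectancy. The lognormal draw distribution and the use of the mean as the point estimate are assumptions for illustration, not details from the study.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 1000

# Hypothetical draws of deaths for one cause-age-sex-location-year.
deaths_draws = rng.lognormal(mean=np.log(500.0), sigma=0.1, size=n_draws)
standard_life_expectancy = 30.0  # hypothetical remaining years at this age

# YLLs = deaths x standard life expectancy, computed draw by draw.
yll_draws = deaths_draws * standard_life_expectancy

point_estimate = yll_draws.mean()  # assumption: mean of draws as the point estimate
lower, upper = np.percentile(yll_draws, [2.5, 97.5])  # 95% uncertainty interval
print(f"YLLs: {point_estimate:.0f} (95% UI {lower:.0f}-{upper:.0f})")
```

Carrying the draws through each calculation, rather than propagating a single interval, lets the uncertainty in any derived quantity be read off the same percentiles.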