255 research outputs found

    Taming Fat-Tailed ("Heavier-Tailed" with Potentially Infinite Variance) Noise in Federated Learning

    A key assumption in most existing works on FL algorithms' convergence analysis is that the noise in stochastic first-order information has a finite variance. Although this assumption covers all light-tailed (i.e., sub-exponential) and some heavy-tailed noise distributions (e.g., log-normal, Weibull, and some Pareto distributions), it fails for many fat-tailed noise distributions (i.e., "heavier-tailed" with potentially infinite variance) that have been empirically observed in the FL literature. To date, it remains unclear whether one can design convergent algorithms for FL systems that experience fat-tailed noise. This motivates us to fill this gap in this paper by proposing an algorithmic framework called FAT-Clipping (federated averaging with two-sided learning rates and clipping), which contains two variants: FAT-Clipping per-round (FAT-Clipping-PR) and FAT-Clipping per-iteration (FAT-Clipping-PI). Specifically, for the largest $\alpha \in (1,2]$ such that the fat-tailed noise in FL still has a bounded $\alpha$-moment, we show that both variants achieve $\mathcal{O}((mT)^{\frac{2-\alpha}{\alpha}})$ and $\mathcal{O}((mT)^{\frac{1-\alpha}{3\alpha-2}})$ convergence rates in the strongly-convex and general non-convex settings, respectively, where $m$ and $T$ are the numbers of clients and communication rounds. Moreover, at the expense of more clipping operations compared to FAT-Clipping-PR, FAT-Clipping-PI further enjoys a linear speedup effect with respect to the number of local updates at each client and is lower-bound-matching (i.e., order-optimal). Collectively, our results advance the understanding of designing efficient algorithms for FL systems that exhibit fat-tailed first-order oracle information. Comment: Published as a conference paper at NeurIPS 202
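The difference between clipping per round and clipping per iteration can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`clip`, `fedavg_clipped_round`), the norm-based clipping rule, the threshold `tau`, and the toy quadratic objective with Student-t noise are all assumptions chosen only to show where the clipping operation sits relative to the local and global learning rates.

```python
import numpy as np


def clip(v, tau):
    """Rescale v so that its Euclidean norm is at most tau (assumed clipping rule)."""
    norm = np.linalg.norm(v)
    return v if norm <= tau else v * (tau / norm)


def fedavg_clipped_round(x, client_grads, local_lr, global_lr, tau,
                         local_steps, per_iteration):
    """One communication round of federated averaging with clipping.

    client_grads: list of callables, each returning a stochastic gradient at a point.
    per_iteration=True  clips every local stochastic gradient (PI-style sketch).
    per_iteration=False clips each client's net update once per round (PR-style sketch).
    """
    deltas = []
    for grad in client_grads:
        y = x.copy()
        for _ in range(local_steps):
            g = grad(y)
            if per_iteration:
                g = clip(g, tau)
            y = y - local_lr * g          # local (client-side) learning rate
        delta = y - x
        if not per_iteration:
            delta = clip(delta, tau)
        deltas.append(delta)
    # global (server-side) learning rate applied to the averaged client update
    return x + global_lr * np.mean(deltas, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Toy objective f(x) = 0.5 * ||x||^2 with Student-t noise (df = 1.5):
    # the noise has a finite mean but infinite variance.
    def noisy_grad(y):
        return y + rng.standard_t(1.5, size=y.shape)

    x = np.ones(5)
    for _ in range(200):
        x = fedavg_clipped_round(x, [noisy_grad] * 4, local_lr=0.05,
                                 global_lr=1.0, tau=1.0, local_steps=5,
                                 per_iteration=True)
    print("final distance to optimum:", np.linalg.norm(x))
```

In this sketch the per-iteration variant performs `local_steps` clipping operations per client per round, while the per-round variant performs one, which mirrors the trade-off the abstract describes.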

    Ultrafast Optical Control And Characterization Of Carrier And Spin Dynamics In Novel Magnetic Topological Insulator Systems

    Magnetic topological insulators (MTIs) are of considerable interest in developing novel spintronics and quantum computing applications. Under the topological protection of the time-reversal Z2 invariant, magnetic topological insulators have electronic and magnetic properties that are robust against local perturbations. The quantum anomalous Hall effect (QAHE), which harbors dissipationless chiral edge states in MTIs, provides a competitive platform for future low-power, high-speed spintronic devices. Although studies of both bulk and surface magnetic properties in MTIs have made significant progress, the in-depth understanding of the exchange couplings and the interaction between the two magnetization sources is far from complete. In addition, the optical control of the non-trivial properties in MTIs is important in achieving novel applications for ultrafast optoelectronics and optical spintronics. The goal of this dissertation is to understand and manipulate the dynamical spin coupling as well as the carrier relaxation dynamics in MTIs, using the static magneto-optical Kerr effect (MOKE) and the time-resolved magneto-optical Kerr effect (TRMOKE) techniques. First, a pronounced spin-valve-like structure of dynamical magnetization is observed in a Cr-(Bi,Sb)2Te3/CrSb bilayer heterostructure through pump-modulated MOKE characterization. The characters of the soft and resilient ferromagnetic phases are distinguished in terms of the spin coupling between the dynamical surface and bulk ferromagnetism. The dynamical bulk ferromagnetic ordering is softened by a laser-induced heat effect on the lattice, while the dynamical surface magnetization is enhanced via the strengthened Dirac-hole-mediated exchange coupling. In addition, the pump-fluence-dependent measurement of the exchange-bias effect provides further evidence for the enhancement of MTI surface magnetization at the MTI/AFM interface. Lastly, we propose a theoretical model that includes the long-range p-d exchange coupling and a Dirac-hole-mediated exchange interaction and estimate the exchange coupling energies in the MTI/AFM bilayer structure. Second, ultralong carrier lifetimes (3 to 20 ns) of the optically pumped surface states are observed in Cr-(Bi,Sb)2Te3 MTI, which correspond to slow radiative recombination within the gapped Dirac cone. The photoinjection dependence of the radiative lifetimes suggests a strong Coulomb screening effect of the electron-hole plasma on surface excitons. The experimental results are consistent with the theoretical simulation. On the other hand, the nonradiative nature of bulk electron relaxation is identified with a lifetime of ~1000 ps by photoinjection- and temperature-dependent reflectivity measurements. The finding of long-lived excited carriers in MTIs improves the understanding of the general carrier dynamics in topological-insulator-based materials

    ANAct: Adaptive Normalization for Activation Functions

    In this paper, we investigate the negative effect of activation functions on forward and backward propagation and how to counteract this effect. First, we examine how activation functions affect the forward and backward propagation of neural networks and derive a general form for gradient variance that extends the previous work in this area. We use mini-batch statistics to dynamically update the normalization factor so that the normalization property holds throughout the training process, rather than only accounting for the state of the neural network after weight initialization. Second, we propose ANAct, a method that normalizes activation functions to maintain consistent gradient variance across layers, and demonstrate its effectiveness through experiments. We observe that the convergence rate is roughly related to the normalization property. We compare ANAct with several common activation functions on CNNs and residual networks and show that ANAct consistently improves their performance. For instance, normalized Swish achieves 1.4% higher top-1 accuracy than vanilla Swish on ResNet50 with the Tiny ImageNet dataset and more than 1.2% higher with CIFAR-100. Comment: 14 pages, 6 figure

    More than Vanilla Fusion: a Simple, Decoupling-free, Attention Module for Multimodal Fusion Based on Signal Theory

    Vanilla fusion methods still dominate a large share of mainstream audio-visual tasks. However, the effectiveness of vanilla fusion from a theoretical perspective is still worth discussing. Thus, this paper reconsiders the signal fused in the multimodal case from a bionics perspective and proposes a simple, plug-and-play attention module for vanilla fusion based on fundamental signal theory and uncertainty theory. In addition, previous work on multimodal dynamic gradient modulation still relies on decoupling the modalities. Therefore, a decoupling-free gradient modulation scheme has been designed in conjunction with the aforementioned attention module, which has various advantages over the decoupled one. Experimental results show that just a few lines of code can yield performance improvements of up to 2.0% for several multimodal classification methods. Finally, quantitative evaluation of other fusion tasks reveals the potential for additional application scenarios

    Full one-loop electroweak corrections to $e^+e^- \to ZH\gamma$ at a Higgs factory

    Motivated by the future precision test of the Higgs boson at an $e^+e^-$ Higgs factory, we calculate the production $e^+e^- \to ZH\gamma$ in the Standard Model with complete next-to-leading order electroweak corrections. We find that for $\sqrt{s}=240$ (350) GeV the cross section of this production is sizably reduced by the electroweak corrections, which is 1.03 (5.32) fb at leading order and 0.72 (4.79) fb at next-to-leading order. The transverse momentum distribution of the photon in the final states is also presented. Comment: discussions added, version accepted by JHE

    DIAMOND: Taming Sample and Communication Complexities in Decentralized Bilevel Optimization

    Decentralized bilevel optimization has received increasing attention recently due to its foundational role in many emerging multi-agent learning paradigms (e.g., multi-agent meta-learning and multi-agent reinforcement learning) over peer-to-peer edge networks. However, to work with the limited computation and communication capabilities of edge networks, a major challenge in developing decentralized bilevel optimization techniques is to lower sample and communication complexities. This motivates us to develop a new decentralized bilevel optimization algorithm called DIAMOND (decentralized single-timescale stochastic approximation with momentum and gradient-tracking). The contributions of this paper are as follows: i) our DIAMOND algorithm adopts a single-loop structure rather than following the natural double-loop structure of bilevel optimization, which offers low computation and implementation complexity; ii) compared to existing approaches, the DIAMOND algorithm does not require any full gradient evaluations, which further reduces both sample and computational complexities; iii) through a careful integration of momentum information and gradient tracking techniques, we show that the DIAMOND algorithm enjoys $\mathcal{O}(\epsilon^{-3/2})$ sample and communication complexities for achieving an $\epsilon$-stationary solution, both of which are independent of the dataset sizes and significantly outperform existing works. Extensive experiments also verify our theoretical findings

    Metagenomic analysis reveals the virome profiles of Aedes albopictus in Guangzhou, China

    Introduction: Aedes albopictus is an aggressive invasive mosquito species widely distributed around the world, and it is also a known vector of arboviruses. Virus metagenomics and RNA interference (RNAi) are important in studying the biology and antiviral defense of Ae. albopictus. However, the virome and potential transmission of plant viruses by Ae. albopictus remain understudied. Methods: Mosquito samples of Ae. albopictus were collected from Guangzhou, China, and small RNA sequencing was performed. Raw data were filtered, and virus-associated contigs were generated using VirusDetect. The small RNA profiles were analyzed, and maximum-likelihood phylogenetic trees were constructed. Results: The small RNA sequencing of pooled Ae. albopictus revealed the presence of five known viruses, including Wenzhou sobemo-like virus 4, mosquito nodavirus, Aedes flavivirus, Hubei chryso-like virus 1, and Tobacco rattle virus RNA1. Additionally, 21 new viruses that had not been previously reported were identified. The mapping of reads and contig assembly provided insights into the viral diversity and genomic characteristics of these viruses. Field survey confirmed the detection of the identified viruses in Ae. albopictus collected from Guangzhou. Discussion: The comprehensive analysis of the virus metagenomics of Ae. albopictus in this study sheds light on the diversity and prevalence of viruses in mosquito populations. The presence of known and novel viruses highlights the need for continued surveillance and investigation into their potential impact on public health. The findings also emphasize the importance of understanding the virome and potential transmission of plant viruses by Ae. albopictus. Conclusion: This study provides valuable insights into the virome of Ae. albopictus and its potential role as a vector for both known and novel viruses. Further research is needed to expand the sample size, explore additional viruses, and investigate the implications for public health

    Examining resilience in local adaptation policies – pilot studies in Taipei and Tainan, Taiwan

    Resilience has gained considerable attention over recent years in both theories and decision-making practices. In Taiwan, the term resilience is generally considered as a synonym for adaptation. This may limit the use of the notion. By understanding resilience in terms of adaptation and mitigation, we identify six attributes for assessment. The assessment is addressed in local level climate change adaptation policies in two selected cities. The city of Taipei represents places where local adaptation policies were directed mainly by the national government. The city of Tainan represents places where the municipal government plays a more critical role in framing these policies. This can result in different policymaking considerations. The assessment points out that the proposed actions of these policies are broader than a general understanding of adaptation. Mitigation strategies are addressed and sometimes highly recommended. Because of this, we can interpret these actions as resilience strategies covered under the use of the term adaptation. The notion of resilience does not stay on the rhetorical level alone. It is happening in shaping decisions – without using the terminology directly. The broadness of the resilience notion, in spite of being abstract, can provide a more general framework for cross-sectorial discussion and collaboration in policy-making. This is particularly important for dealing with complex issues, such as climate-related disturbances, which cannot be managed by a single group of professions

    The antioxidant activity of polysaccharides from Armillaria gallica

    The purpose of this study was to investigate the antioxidant activity of Armillaria gallica polysaccharides (AgP), in particular whether AgP could protect HepG2 cells from H2O2-induced oxidative damage. The results demonstrated that AgP significantly protected HepG2 cells and efficiently suppressed the production of reactive oxygen species (ROS) in them. Additionally, AgP significantly decreased the abnormal leakage of alanine aminotransferase (ALT) and lactate dehydrogenase (LDH) caused by H2O2, protecting cell membrane integrity. AgP was also found to regulate the activities of the antioxidant enzymes superoxide dismutase (SOD), catalase (CAT), and glutathione peroxidase (GSH-PX), while reducing malondialdehyde (MDA), thus protecting cells from oxidative damage. According to the flow cytometry analysis and measurement of caspase-3, caspase-8, and caspase-9 activities, AgP could modulate apoptosis-related proteins and attenuate ROS-mediated cell apoptosis