116 research outputs found

    PUMA: Secure Inference of LLaMA-7B in Five Minutes

    With ChatGPT as a representative example, many companies have begun to provide services based on large Transformer models. However, using such a service inevitably leaks users' prompts to the model provider. Previous work has studied secure inference for Transformer models using secure multiparty computation (MPC), where model parameters and clients' prompts are kept secret. Despite this, these frameworks are still limited in terms of model performance, efficiency, and deployment. To address these limitations, we propose the framework PUMA to enable fast and secure Transformer model inference. Our framework designs high-quality approximations for expensive functions, such as GeLU and Softmax, which significantly reduce the cost of secure inference while preserving model performance. Additionally, we design secure Embedding and LayerNorm procedures that faithfully implement the desired functionality without undermining the Transformer architecture. PUMA is about 2x faster than the state-of-the-art MPC framework MPCFORMER (ICLR 2023) and has accuracy similar to plaintext models without fine-tuning (which previous works failed to achieve). Moreover, PUMA can evaluate LLaMA-7B in around 5 minutes to generate 1 token. To the best of our knowledge, this is the first time that a model of this parameter size has been evaluated under MPC. PUMA has been open-sourced in the GitHub repository of SecretFlow-SPU.
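
    As a rough illustration of how MPC can keep inputs secret while still allowing computation, the sketch below shows generic additive secret sharing over a 64-bit ring. It is not taken from PUMA: the abstract does not say which sharing scheme the framework uses, and the names `share`, `reconstruct`, `add_shares`, and `scale_shares` are hypothetical.

```python
import secrets

RING = 1 << 64  # assumed ring size Z_{2^64}; chosen only for illustration


def share(value, parties=3):
    """Split `value` into additive shares that sum to it modulo RING."""
    shares = [secrets.randbelow(RING) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % RING)
    return shares


def reconstruct(shares):
    """Recombine all shares; any strict subset looks uniformly random."""
    return sum(shares) % RING


def add_shares(a, b):
    """Each party adds its own two shares locally, with no communication."""
    return [(x + y) % RING for x, y in zip(a, b)]


def scale_shares(a, public_constant):
    """Multiplying by a public constant is also a purely local operation."""
    return [(x * public_constant) % RING for x in a]


if __name__ == "__main__":
    x_shares = share(20)
    y_shares = share(22)
    print(reconstruct(add_shares(x_shares, y_shares)))  # sum of the two secrets
    print(reconstruct(scale_shares(x_shares, 3)))       # secret times a public 3
```

    Linear operations like these need no interaction between parties. Nonlinear functions such as GeLU and Softmax do require interaction, which is why the abstract singles them out as expensive.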

    WebBrain: Learning to Generate Factually Correct Articles for Queries by Grounding on Large Web Corpus

    In this paper, we introduce a new NLP task: generating short factual articles with references for queries by mining supporting evidence from the Web. In this task, called WebBrain, the ultimate goal is to generate a fluent, informative, and factually correct short article (e.g., a Wikipedia article) for a factual query unseen in Wikipedia. To enable experiments on WebBrain, we construct a large-scale dataset, WebBrain-Raw, by extracting English Wikipedia articles and their crawlable Wikipedia references. WebBrain-Raw is ten times larger than the previous biggest peer dataset, which can greatly benefit the research community. From WebBrain-Raw, we construct two task-specific datasets, WebBrain-R and WebBrain-G, which are used to train an in-domain retriever and generator, respectively. In addition, we empirically analyze the performance of current state-of-the-art NLP techniques on WebBrain and introduce a new framework, ReGen, which improves the factual accuracy of generation through improved evidence retrieval and task-specific pre-training for generation. Experimental results show that ReGen outperforms all baselines in both automatic and human evaluations.
    Comment: Code in https://github.com/qhjqhj00/WebBrai

    Elastic MSM: A Fast, Elastic and Modular Preprocessing Technique for Multi-Scalar Multiplication Algorithm on GPUs

    Zero-knowledge proof (ZKP) is a cryptographic primitive that enables a prover to convince a verifier that a statement is true without revealing any information beyond the correctness of the statement itself. Due to its powerful capabilities, its most practical type, called zero-knowledge Succinct Non-interactive ARgument of Knowledge (zkSNARK), has been widely deployed in various privacy-preserving applications such as cryptocurrencies and verifiable computation. Although state-of-the-art zkSNARKs are highly efficient for the verifier, the computational overhead for the prover is still orders of magnitude too high to warrant use in many applications. This overhead is due to several time-consuming operations, including large-scale matrix-vector multiplication (MUL), number-theoretic transform (NTT), and especially multi-scalar multiplication (MSM), which accounts for the highest proportion. Thus, further efficiency improvements are needed. In this paper we focus on comprehensive optimization of the running time and storage space needed by the MSM algorithm on GPUs. Specifically, we propose a new modular and adaptive parameter configuration technique, elastic MSM, which lets us change the scale of MSM as desired by performing a corresponding amount of preprocessing. This technique enables us to fully unleash the potential of various efficient parallel MSM algorithms. From another perspective, our technique could also be regarded as a preprocessing technique over the well-known Pippenger algorithm, which is modular and could be used to accelerate almost all of the most advanced parallel Pippenger algorithms on GPUs. Meanwhile, our technique provides an adaptive trade-off between the running time and the extra storage space needed by parallel Pippenger algorithms on GPUs. We implemented and tested elastic MSM over two prevailing parallel Pippenger algorithms on GPUs. Given a range of practical parameters, across various preprocessing space limitations (across various MSM scales), our construction achieves up to about 28× and 45× (25× and 40×) speedup versus two state-of-the-art preprocessing parallel Pippenger algorithms on GPUs, respectively.
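
    For readers unfamiliar with the Pippenger algorithm mentioned above, the following is a minimal sequential sketch of the textbook bucket method, not the paper's elastic MSM and not a GPU implementation. As a loudly labelled simplification, integers under addition modulo a prime stand in for elliptic-curve points, so that "point addition" is modular addition; `msm_naive`, `msm_pippenger`, `window_bits`, and `scalar_bits` are hypothetical names.

```python
def msm_naive(scalars, points, modulus):
    """Reference result: sum of k_i * P_i in the toy additive group Z_modulus."""
    return sum(k * p for k, p in zip(scalars, points)) % modulus


def msm_pippenger(scalars, points, modulus, window_bits=4, scalar_bits=32):
    """Bucket-method MSM; scalars must be non-negative and below 2**scalar_bits."""
    num_windows = (scalar_bits + window_bits - 1) // window_bits
    mask = (1 << window_bits) - 1
    result = 0
    # Process windows from most significant to least significant.
    for w in reversed(range(num_windows)):
        # Shift the accumulator left by one window using doublings only.
        for _ in range(window_bits):
            result = (result + result) % modulus
        # Place each point into the bucket named by its digit in this window.
        buckets = [0] * (mask + 1)
        for k, p in zip(scalars, points):
            digit = (k >> (w * window_bits)) & mask
            if digit:
                buckets[digit] = (buckets[digit] + p) % modulus
        # Running-sum trick: computes sum_d d * buckets[d] with additions only.
        running = 0
        window_sum = 0
        for d in range(mask, 0, -1):
            running = (running + buckets[d]) % modulus
            window_sum = (window_sum + running) % modulus
        result = (result + window_sum) % modulus
    return result


if __name__ == "__main__":
    prime = (1 << 61) - 1
    ks = [3, 17, 2**31 + 5]
    ps = [11, 23, 42]
    print(msm_pippenger(ks, ps, prime) == msm_naive(ks, ps, prime))
```

    The window width trades the number of buckets against the number of windows. This is one of the parameters that preprocessing-based variants typically tune, though the abstract does not specify how elastic MSM chooses it.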

    Spin-orbit microlaser emitting in a four-dimensional Hilbert space

    A step towards the next generation of high-capacity, noise-resilient communication and computing technologies is a substantial increase in the dimensionality of information space and the synthesis of superposition states on an N-dimensional (N > 2) Hilbert space featuring exotic group symmetries. Despite the rapid development of photonic devices and systems, on-chip information technologies are mostly limited to two-level systems owing to the lack of sufficient reconfigurability to satisfy the stringent requirement for 2(N - 1) degrees of freedom, intrinsically associated with the increase of synthetic dimensionalities. Even with extensive efforts dedicated to recently emerged vector lasers and microcavities for the expansion of dimensionalities [1-10], it still remains a challenge to actively tune the diversified, high-dimensional superposition states of light on demand. Here we demonstrate a hyperdimensional, spin-orbit microlaser for chip-scale flexible generation and manipulation of arbitrary four-level states. Two microcavities coupled through a non-Hermitian synthetic gauge field are designed to emit spin-orbit-coupled states of light with six degrees of freedom. The vectorial state of the emitted laser beam in free space can be mapped on a Bloch hypersphere defining an SU(4) symmetry, demonstrating dynamical generation and reconfiguration of high-dimensional superposition states with high fidelity.
    We acknowledge the support from the US Army Research Office (ARO) (W911NF-19-1-0249 and W911NF-21-1-0148), National Science Foundation (NSF) (ECCS-1932803, ECCS-1842612, OMA-1936276 and PHY-1847240), Defense Advanced Research Projects Agency (DARPA) (W91NF-21-1-0340), Office of Naval Research (ONR) (N00014-20-1-2558) and King Abdullah University of Science & Technology (OSR-2020-CRG9-4374.3). L.F. also acknowledges the support from Sloan Research Fellowship. This work was partially supported by NSF through the University of Pennsylvania Materials Research Science and Engineering Center (MRSEC) (DMR-1720530) and carried out in part at the Singh Center for Nanotechnology, which is supported by the NSF National Nanotechnology Coordinated Infrastructure Program under grant NNCI-1542153.
    Peer reviewed

    Neutrino Physics with JUNO

    The Jiangmen Underground Neutrino Observatory (JUNO), a 20 kton multi-purpose underground liquid scintillator detector, was proposed with the determination of the neutrino mass hierarchy as a primary physics goal. It is also capable of observing neutrinos from terrestrial and extra-terrestrial sources, including supernova burst neutrinos, diffuse supernova neutrino background, geoneutrinos, atmospheric neutrinos, solar neutrinos, as well as exotic searches such as nucleon decays, dark matter, sterile neutrinos, etc. We present the physics motivations and the anticipated performance of the JUNO detector for various proposed measurements. By detecting reactor antineutrinos from two power plants at 53-km distance, JUNO will determine the neutrino mass hierarchy at a 3-4 sigma significance with six years of running. The measurement of antineutrino spectrum will also lead to the precise determination of three out of the six oscillation parameters to an accuracy of better than 1%. Neutrino burst from a typical core-collapse supernova at 10 kpc would lead to ~5000 inverse-beta-decay events and ~2000 all-flavor neutrino-proton elastic scattering events in JUNO. Detection of DSNB would provide valuable information on the cosmic star-formation rate and the average core-collapsed neutrino energy spectrum. Geo-neutrinos can be detected in JUNO with a rate of ~400 events per year, significantly improving the statistics of existing geoneutrino samples. The JUNO detector is sensitive to several exotic searches, e.g. proton decay via the $p\to K^++\bar\nu$ decay channel. The JUNO detector will provide a unique facility to address many outstanding crucial questions in particle and astrophysics. It holds the great potential for further advancing our quest to understanding the fundamental properties of neutrinos, one of the building blocks of our Universe.
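
    To give a feel for how the quoted supernova event counts depend on distance, the sketch below rescales the approximate 10 kpc figures from the abstract by the inverse square of the distance. This is only a back-of-the-envelope assumption: it ignores the supernova model, oscillation effects, and detector thresholds, and the names `REFERENCE_EVENTS` and `expected_events` are hypothetical.

```python
REFERENCE_DISTANCE_KPC = 10.0

# Approximate event counts quoted in the abstract for a supernova at 10 kpc.
REFERENCE_EVENTS = {
    "ibd": 5000,        # inverse-beta-decay events
    "proton_es": 2000,  # all-flavor neutrino-proton elastic scattering events
}


def expected_events(distance_kpc, channel="ibd"):
    """Scale the 10 kpc reference count by the inverse-square law (assumed)."""
    if distance_kpc <= 0:
        raise ValueError("distance must be positive")
    scale = (REFERENCE_DISTANCE_KPC / distance_kpc) ** 2
    return REFERENCE_EVENTS[channel] * scale


if __name__ == "__main__":
    for d in (5.0, 10.0, 20.0):
        print(d, expected_events(d), expected_events(d, "proton_es"))
```

    Under this assumption, halving the distance quadruples the expected count in each channel.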

    Potential of Core-Collapse Supernova Neutrino Detection at JUNO

    JUNO is an underground neutrino observatory under construction in Jiangmen, China. It uses 20 kton of liquid scintillator as its target, which enables it to detect supernova burst neutrinos with large statistics for the next galactic core-collapse supernova (CCSN), as well as pre-supernova neutrinos from nearby CCSN progenitors. All flavors of supernova burst neutrinos can be detected by JUNO via several interaction channels, including inverse beta decay, elastic scattering on electrons and protons, interactions on 12C nuclei, etc. This gives JUNO the possibility of reconstructing the energy spectra of supernova burst neutrinos of all flavors. Real-time monitoring systems based on FPGA and DAQ are under development in JUNO, which allow prompt alerts and trigger-less data acquisition of CCSN events. The alert performance of both monitoring systems has been thoroughly studied using simulations. Moreover, once a CCSN is tagged, the system can give fast characterizations, such as directionality and light curve.

    Detection of the Diffuse Supernova Neutrino Background with JUNO

    As an underground multi-purpose neutrino detector with 20 kton liquid scintillator, Jiangmen Underground Neutrino Observatory (JUNO) is competitive with and complementary to the water-Cherenkov detectors on the search for the diffuse supernova neutrino background (DSNB). Typical supernova models predict 2-4 events per year within the optimal observation window in the JUNO detector. The dominant background is from the neutral-current (NC) interaction of atmospheric neutrinos with 12C nuclei, which surpasses the DSNB by more than one order of magnitude. We evaluated the systematic uncertainty of NC background from the spread of a variety of data-driven models and further developed a method to determine NC background within 15% with in situ measurements after ten years of running. Besides, the NC-like backgrounds can be effectively suppressed by the intrinsic pulse-shape discrimination (PSD) capabilities of liquid scintillators. In this talk, I will present in detail the improvements on NC background uncertainty evaluation, PSD discriminator development, and finally, the potential of DSNB sensitivity in JUNO.

    Real-time Monitoring for the Next Core-Collapse Supernova in JUNO

    The core-collapse supernova (CCSN) is one of the most energetic astrophysical events in the Universe. The early and prompt detection of neutrinos before (pre-SN) and during the SN burst is a unique opportunity to realize the multi-messenger observation of CCSN events. In this work, we describe the monitoring concept and present the sensitivity of the system to pre-SN and SN neutrinos at the Jiangmen Underground Neutrino Observatory (JUNO), which is a 20 kton liquid scintillator detector under construction in South China. The real-time monitoring system is designed with both prompt monitors on the electronic board and online monitors at the data acquisition stage, in order to ensure both the alert speed and the alert coverage of progenitor stars. Assuming a false alert rate of 1 per year, this monitoring system can be sensitive to pre-SN neutrinos up to a distance of about 1.6 (0.9) kpc and SN neutrinos up to about 370 (360) kpc for a progenitor mass of $30 M_{\odot}$ in the case of normal (inverted) mass ordering. The pointing ability of the CCSN is evaluated by using the accumulated event anisotropy of the inverse beta decay interactions from pre-SN or SN neutrinos, which, along with the early alert, can play important roles in the follow-up multi-messenger observations of the next Galactic or nearby extragalactic CCSN.
    Comment: 24 pages, 9 figures

    25th annual computational neuroscience meeting: CNS-2016

    The same neuron may play different functional roles in the neural circuits to which it belongs. For example, neurons in the Tritonia pedal ganglia may participate in variable phases of the swim motor rhythms [1]. While such neuronal functional variability is likely to play a major role in the delivery of the functionality of neural systems, it is difficult to study in most nervous systems. We work on the pyloric rhythm network of the crustacean stomatogastric ganglion (STG) [2]. Typically, network models of the STG treat neurons of the same functional type as a single model neuron (e.g. PD neurons), assuming the same conductance parameters for these neurons and implying their synchronous firing [3, 4]. However, simultaneous recording of PD neurons shows differences between the timings of spikes of these neurons. This may indicate functional variability of these neurons. Here we modelled separately the two PD neurons of the STG in a multi-neuron model of the pyloric network. Our neuron models comply with known correlations between conductance parameters of ionic currents. Our results reproduce the experimental finding of increasing spike time distance between spikes originating from the two model PD neurons during their synchronised burst phase. The PD neuron with the larger calcium conductance generates its spikes before the other PD neuron. Larger potassium conductance values in the follower neuron imply longer delays between spikes, see Fig. 17. Neuromodulators change the conductance parameters of neurons and maintain the ratios of these parameters [5]. Our results show that such changes may shift the individual contribution of two PD neurons to the PD-phase of the pyloric rhythm, altering their functionality within this rhythm. Our work paves the way towards an accessible experimental and computational framework for the analysis of the mechanisms and impact of functional variability of neurons within the neural circuits to which they belong.