    MMSD2.0: Towards a Reliable Multi-modal Sarcasm Detection System

    Multi-modal sarcasm detection has attracted much recent attention. Nevertheless, the existing benchmark (MMSD) has shortcomings that hinder the development of reliable multi-modal sarcasm detection systems: (1) MMSD contains spurious cues, which lead models to learn biases; (2) the negative samples in MMSD are not always reasonable. To address these issues, we introduce MMSD2.0, a corrected dataset that fixes the shortcomings of MMSD by removing the spurious cues and re-annotating the unreasonable samples. We also present a novel framework, multi-view CLIP, which leverages multi-grained cues from multiple perspectives (i.e., the text, image, and text-image interaction views) for multi-modal sarcasm detection. Extensive experiments show that MMSD2.0 is a valuable benchmark for building reliable multi-modal sarcasm detection systems and that multi-view CLIP significantly outperforms the previous best baselines. (Comment: Accepted by ACL 2023 Findings)
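    The multi-view idea above can be sketched as a late fusion of per-view logits. This is a toy illustration only: the feature extractors are stubbed with random vectors and random linear heads, whereas the paper's views come from CLIP encoders, which are not reproduced here; all names are hypothetical.

```python
import numpy as np

# Toy sketch: sarcasm logits computed independently from a text view, an
# image view, and a text-image interaction view, then fused by averaging.
# The 512-dim "embeddings" and the linear heads are random stand-ins.
rng = np.random.default_rng(0)

def view_logits(features, w):
    """Per-view linear head mapping features to (n_classes,) logits."""
    return features @ w

text_feat = rng.standard_normal(512)        # stand-in for a text embedding
image_feat = rng.standard_normal(512)       # stand-in for an image embedding
interaction_feat = text_feat * image_feat   # one simple interaction view

heads = {name: rng.standard_normal((512, 2)) * 0.01
         for name in ("text", "image", "interaction")}

# fuse the three views by averaging their logits, then softmax
fused = (view_logits(text_feat, heads["text"])
         + view_logits(image_feat, heads["image"])
         + view_logits(interaction_feat, heads["interaction"])) / 3.0
probs = np.exp(fused) / np.exp(fused).sum()  # {not-sarcastic, sarcastic}
```

    Averaging logits is only one of several plausible fusion schemes; attention-weighted fusion over the views would follow the same interface.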

    Pre-training on Synthetic Driving Data for Trajectory Prediction

    Accumulating substantial volumes of real-world driving data is pivotal for trajectory forecasting in autonomous driving. Given the heavy reliance of current trajectory forecasting models on data-driven methodologies, we aim to tackle the challenge of learning general trajectory forecasting representations under limited data availability. We propose to augment both HD maps and trajectories and to apply pre-training strategies on top of them. Specifically, we take advantage of graph representations of HD maps and apply vector transformations to reshape the maps, easily enriching the limited number of scenes. Additionally, we employ a rule-based model to generate trajectories based on the augmented scenes, thus enlarging the set of trajectories beyond the collected real ones. To foster the learning of general representations within this augmented dataset, we comprehensively explore different pre-training strategies, including extending the concept of a Masked AutoEncoder (MAE) to trajectory forecasting. Extensive experiments demonstrate the effectiveness of our data expansion and pre-training strategies, which outperform the baseline prediction model by large margins, e.g. 5.04%, 3.84% and 8.30% in terms of $MR_6$, $minADE_6$ and $minFDE_6$.
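    The map-augmentation step can be pictured as applying geometric vector transformations to the polylines that make up an HD map, producing new but consistent scenes from a small collection. The sketch below, with assumed function names and an assumed rotation/mirror transformation set, is an illustration of the idea rather than the paper's implementation.

```python
import numpy as np

# Sketch: an HD map as a set of (N, 2) polylines; rotating or mirroring
# every polyline by the same transform yields a new, geometrically
# consistent scene for augmentation.
def augment_polylines(polylines, angle_rad=0.0, mirror=False):
    """Rotate (and optionally mirror) each (N, 2) polyline about the origin."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, -s], [s, c]])
    out = []
    for pl in polylines:
        pts = pl @ rot.T
        if mirror:
            pts = pts * np.array([1.0, -1.0])  # flip across the x-axis
        out.append(pts)
    return out

lane = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
rotated = augment_polylines([lane], angle_rad=np.pi / 2)
# the straight lane along the x-axis now runs along the y-axis
```

    A rule-based trajectory generator would then be run on each augmented scene to produce matching synthetic trajectories, as the abstract describes.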

    First Steps Toward an Autonomous Accelerator, a Common Project Between DESY and KIT

    Reinforcement Learning (RL) algorithms have risen in popularity in recent years in the accelerator physics community, showing potential in beam control and in the optimization and automation of tasks in accelerator operation. The Helmholtz AI project "Machine Learning toward Autonomous Accelerators" is a collaboration between DESY and KIT that investigates and develops RL applications for the automatic start-up of electron linear accelerators. The work is carried out in parallel at two similar research accelerators: ARES at DESY and FLUTE at KIT, giving the unique opportunity of transfer learning between facilities. One of the first steps of this project is the establishment of a common interface between the simulations and the machines, in order to test and apply various optimization approaches interchangeably at the two accelerators. In this paper we present first results on the common interface and its application to beam focusing at ARES, as well as the idea of laser shaping with spatial light modulators at FLUTE.
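    A "common interface" in the spirit described above can be sketched as a minimal Gym-style environment API that an optimiser (RL agent or classical routine) uses identically against a simulation or the real machine. The class and method names below are illustrative assumptions, not the project's actual code.

```python
from abc import ABC, abstractmethod

class TransverseTuningEnv(ABC):
    """Shared interface: set actuators, observe the beam, score the result.

    A simulation backend and a machine backend both implement this, so the
    same optimisation code runs against either without modification.
    """

    @abstractmethod
    def set_actuators(self, values):
        """Apply actuator settings, e.g. quadrupole strengths."""

    @abstractmethod
    def observe(self):
        """Return the tuning objective, e.g. beam size on a screen."""

class SimulatedEnv(TransverseTuningEnv):
    """Toy simulation backend: objective minimised at (0.5, 0.5)."""

    def __init__(self):
        self.values = [0.0, 0.0]

    def set_actuators(self, values):
        self.values = list(values)

    def observe(self):
        # toy "beam size": quadratic bowl around the optimal settings
        return sum((v - 0.5) ** 2 for v in self.values)

env = SimulatedEnv()
env.set_actuators([0.5, 0.5])
size_at_optimum = env.observe()
```

    A machine-backed implementation would replace `set_actuators`/`observe` with control-system calls while leaving the optimiser untouched.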

    Machine Learning Based Spatial Light Modulator Control for the Photoinjector Laser at FLUTE

    FLUTE (Ferninfrarot Linac- und Test-Experiment) at KIT is a compact linac-based test facility for novel accelerator technology and a source of intense THz radiation. FLUTE is designed to provide a wide range of electron bunch charges from the pC to nC range, high electric fields up to 1.2 GV/m, and ultra-short THz pulses down to the fs timescale. The electrons are generated at the RF photoinjector, where the electron gun is driven by a commercial titanium-sapphire laser. In this kind of setup, the electron beam properties are determined by the photoinjector and, more importantly, by the characteristics of the laser pulses. Spatial light modulators can be used to shape the laser pulse transversely and longitudinally, offering a flexible way to shape the laser beam and subsequently the electron beam, thereby influencing the produced THz pulses. However, nonlinear effects inherent to the laser manipulation (transport, compression, third-harmonic generation) can distort the original pulse. In this paper we propose to use machine learning methods to manipulate the laser and electron bunch, aiming to generate tailor-made THz pulses. The method is demonstrated experimentally in a test setup.

    Learning to Do or Learning While Doing: Reinforcement Learning and Bayesian Optimisation for Online Continuous Tuning

    Online tuning of real-world plants is a complex optimisation problem that continues to require manual intervention by experienced human operators. Autonomous tuning is a rapidly expanding field of research, where learning-based methods, such as Reinforcement Learning-trained Optimisation (RLO) and Bayesian optimisation (BO), hold great promise for achieving outstanding plant performance and reducing tuning times. Which algorithm to choose in different scenarios, however, remains an open question. Here we present a comparative study, using a routine task at a real particle accelerator as an example, showing that RLO generally outperforms BO, but is not always the best choice. Based on the study's results, we provide a clear set of criteria to guide the choice of algorithm for a given tuning task. These can ease the adoption of learning-based autonomous tuning solutions in the operation of complex real-world plants, ultimately improving the availability and pushing the limits of operability of these facilities, thereby enabling scientific and engineering advancements. (Comment: 17 pages, 8 figures, 2 tables)
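    To make the BO side of the comparison concrete, here is a minimal sketch of Bayesian optimisation on a one-dimensional tuning task, assuming a Gaussian-process surrogate with an RBF kernel and an upper-confidence-bound acquisition. The quadratic objective is a toy stand-in for a plant's tuning signal, not the accelerator task from the paper, and the RLO side (a trained policy) is not shown.

```python
import numpy as np

def rbf(a, b, length=0.3):
    """RBF kernel between two 1-D point sets (unit prior variance)."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """GP posterior mean and std at x_query given noisy observations."""
    k = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    k_star = rbf(x_query, x_train)
    k_inv = np.linalg.inv(k)
    mu = k_star @ k_inv @ y_train
    var = 1.0 - np.sum((k_star @ k_inv) * k_star, axis=1)
    return mu, np.sqrt(np.clip(var, 0.0, None))

def objective(x):
    """Toy 'plant response' to one actuator setting; optimum at x = 0.7."""
    return -((x - 0.7) ** 2)

grid = np.linspace(0.0, 1.0, 201)
xs, ys = [0.1, 0.9], [objective(0.1), objective(0.9)]  # two seed samples
for _ in range(10):
    mu, sigma = gp_posterior(np.array(xs), np.array(ys), grid)
    x_next = grid[np.argmax(mu + 2.0 * sigma)]  # UCB acquisition
    xs.append(float(x_next))
    ys.append(objective(x_next))

best = xs[int(np.argmax(ys))]  # should land near the optimum at 0.7
```

    The loop structure (observe, update surrogate, pick the next setting) is what a real tuning task shares with this sketch; only the objective and the actuator dimensionality change.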

    Synaptic transistor with multiple biological functions based on metal-organic frameworks combined with the LIF model of a spiking neural network to recognize temporal information

    Spiking neural networks (SNNs) have immense potential due to their utilization of synaptic plasticity, their ability to take advantage of temporal correlation, and their low power consumption. The leaky integrate-and-fire (LIF) model and spike-timing-dependent plasticity (STDP) are fundamental components of SNNs. Here, a neural device based on zeolitic imidazolate frameworks (ZIFs) is first demonstrated as an essential part of a synaptic transistor to simulate SNNs. Significantly, three typical kinds of functions between neurons, namely the memory function achieved through the hippocampus, synaptic weight regulation, and the membrane potential triggered by ion migration, are effectively described through short-term/long-term memory (STM/LTM), long-term depression/long-term potentiation (LTD/LTP) and LIF, respectively. Furthermore, the update rule for the iteration weight in backpropagation, based on the time interval between presynaptic and postsynaptic pulses, is extracted and fitted from the STDP. In addition, the postsynaptic currents of the channel connect directly to a very-large-scale integration (VLSI) implementation of the LIF model that can convert high-frequency information into sparse pulses based on the threshold of the membrane potential. The leaky integrator block, firing/detector block and frequency adaptation block instantaneously release the accumulated voltage to form pulses. Finally, we recode the steady-state visual evoked potentials (SSVEPs) belonging to the electroencephalogram (EEG) with the filter characteristics of the LIF model. SNNs deeply fused with synaptic transistors are designed to recognize 40 different EEG frequencies, improving the accuracy to 95.1%. This work represents an advanced contribution to brain-like chips and promotes the systematization and diversification of artificial intelligence.
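    The LIF mechanism the abstract relies on, a membrane potential that integrates input current, leaks toward rest, and fires then resets on crossing a threshold, can be sketched in a few lines. All constants below are illustrative, not values from the paper.

```python
import numpy as np

def lif_simulate(current, dt=1e-3, tau=20e-3, v_rest=0.0,
                 v_thresh=1.0, v_reset=0.0, r_m=1.0):
    """Euler simulation of a leaky integrate-and-fire neuron.

    Returns the membrane-potential trace and the spike times (seconds).
    """
    v = v_rest
    trace, spikes = [], []
    for step, i_in in enumerate(current):
        # leaky integration: dv/dt = (-(v - v_rest) + R_m * I) / tau
        v += dt * (-(v - v_rest) + r_m * i_in) / tau
        if v >= v_thresh:
            spikes.append(step * dt)  # fire ...
            v = v_reset               # ... and reset
        trace.append(v)
    return trace, spikes

# a constant supra-threshold drive yields a regular, sparse spike train,
# i.e. high-rate input information compressed into threshold-gated pulses
i_const = np.full(200, 1.5)  # 200 ms of constant input current
_, spike_times = lif_simulate(i_const)
```

    Varying the input frequency changes the spike pattern, which is the property the SSVEP recognition described above exploits.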