
    MPR-Net: Multi-Scale Pattern Reproduction Guided Universality Time Series Interpretable Forecasting

    Time series forecasting has received wide interest due to its broad applications and inherent difficulty. The core research challenge lies in identifying effective patterns in historical series and applying them to future forecasting. Advanced models based on point-wise connected MLP and Transformer architectures have strong fitting power, but their quadratic computational complexity limits their practicality. Additionally, these structures inherently disrupt the temporal order, reducing information utilization and making the forecasting process uninterpretable. To address these problems, this paper proposes a forecasting model, MPR-Net. It first adaptively decomposes multi-scale historical series patterns using convolution operations, then constructs a pattern-extension forecasting method based on the prior knowledge of pattern reproduction, and finally reconstructs future patterns into the future series using deconvolution operations. By leveraging the temporal dependencies present in the time series, MPR-Net not only achieves linear time complexity but also makes the forecasting process interpretable. In extensive experiments on more than ten real-world datasets covering both short- and long-term forecasting tasks, MPR-Net achieves state-of-the-art forecasting performance, as well as good generalization and robustness.
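    The three-stage pipeline named in the abstract (convolutional pattern decomposition, pattern extension, deconvolutional reconstruction) can be illustrated with a minimal sketch. The module below is an assumption-laden toy, not the authors' MPR-Net implementation: a single pattern scale, a plain linear map as the "extension" step, and placeholder sizes (`lookback`, `horizon`, `kernel_size`, `n_patterns`).

```python
# Toy conv -> pattern-extension -> deconv forecaster, loosely following the
# three stages named in the abstract. Illustrative only.
import torch
import torch.nn as nn

class ConvPatternForecaster(nn.Module):
    def __init__(self, lookback=336, horizon=96, kernel_size=24, n_patterns=8):
        super().__init__()
        # 1) decompose the history into pattern activations (convolution)
        self.encode = nn.Conv1d(1, n_patterns, kernel_size, stride=kernel_size)
        steps_in = (lookback - kernel_size) // kernel_size + 1
        steps_out = horizon // kernel_size
        # 2) map historical pattern activations to future ones
        self.extend = nn.Linear(steps_in, steps_out)
        # 3) reconstruct the future series from future patterns (deconvolution)
        self.decode = nn.ConvTranspose1d(n_patterns, 1, kernel_size, stride=kernel_size)

    def forward(self, x):                          # x: (batch, 1, lookback)
        patterns = self.encode(x)                  # (batch, n_patterns, steps_in)
        future_patterns = self.extend(patterns)    # (batch, n_patterns, steps_out)
        return self.decode(future_patterns)        # (batch, 1, horizon)

y_hat = ConvPatternForecaster()(torch.randn(4, 1, 336))
print(y_hat.shape)  # torch.Size([4, 1, 96])
```

    Every stage is a convolution or a linear map over pattern activations, which is consistent with the linear time complexity claimed for this style of architecture.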

    Towards Understanding Hierarchical Learning: Benefits of Neural Representations

    Deep neural networks can empirically perform efficient hierarchical learning, in which the layers learn useful representations of the data. However, how they make use of these intermediate representations is not explained by recent theories that relate them to "shallow learners" such as kernels. In this work, we demonstrate that intermediate neural representations add more flexibility to neural networks and can be advantageous over raw inputs. We consider a fixed, randomly initialized neural network as a representation function fed into another trainable network. When the trainable network is the quadratic Taylor model of a wide two-layer network, we show that the neural representation can achieve improved sample complexity compared with the raw input: for learning a low-rank degree-p polynomial (p ≥ 4) in d dimensions, the neural representation requires only Õ(d^⌈p/2⌉) samples, while the best-known sample complexity upper bound for the raw input is Õ(d^(p-1)). We contrast our result with a lower bound showing that neural representations do not improve over the raw input (in the infinite-width limit) when the trainable network is instead a neural tangent kernel. Our results characterize when neural representations are beneficial, and may provide a new perspective on why depth is important in deep learning. (Comment: 41 pages, published in NeurIPS 2020.)
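    The setup analyzed here, a fixed randomly initialized network whose output feeds a trainable network, can be sketched in a few lines. The sizes below are placeholders, and the trainable head is an ordinary two-layer network rather than its quadratic Taylor model, so this only illustrates the architecture the paper studies, not the analysis itself.

```python
# Sketch: a frozen random network produces the "neural representation";
# only the network on top of it is trained. Dimensions are placeholders.
import torch
import torch.nn as nn

d, m, width = 32, 256, 512                     # input dim, representation dim, head width

representation = nn.Sequential(nn.Linear(d, m), nn.ReLU())
for p in representation.parameters():
    p.requires_grad_(False)                    # frozen at random initialization

trainable_head = nn.Sequential(nn.Linear(m, width), nn.ReLU(), nn.Linear(width, 1))

x = torch.randn(16, d)
y_hat = trainable_head(representation(x))      # gradients flow into the head only
```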

    Fair Abstractive Summarization of Diverse Perspectives

    People from different social and demographic groups express diverse perspectives and conflicting opinions on a broad set of topics such as product reviews, healthcare, law, and politics. A fair summary should provide comprehensive coverage of diverse perspectives without underrepresenting certain groups. However, current work on summarization metrics and Large Language Model (LLM) evaluation has not explored fair abstractive summarization. In this paper, we systematically investigate fair abstractive summarization for user-generated data. We first formally define fairness in abstractive summarization as not underrepresenting the perspectives of any group of people, and we propose four reference-free automatic metrics that measure the differences between target and source perspectives. We evaluate five LLMs, including three GPT models, Alpaca, and Claude, on six datasets collected from social media, online reviews, and recorded transcripts. Experiments show that both the model-generated and the human-written reference summaries suffer from low fairness. We conduct a comprehensive analysis of the common factors influencing fairness and propose three simple but effective methods to alleviate unfair summarization. Our dataset and code are available at https://github.com/psunlpgroup/FairSumm. (Comment: 19 pages, 10 figures.)
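    One simple, reference-free way to quantify the kind of source-versus-summary perspective mismatch described above is to compare the distribution of group labels in the source documents against the summary. The sketch below uses total variation distance over classifier-assigned labels; the labels, the classifier that would produce them, and this particular distance are illustrative assumptions, not the paper's four metrics.

```python
# Illustrative reference-free check of perspective coverage.
from collections import Counter

def perspective_distribution(labels):
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def perspective_gap(source_labels, summary_labels):
    """0.0 = perfectly proportional coverage; 1.0 = a source group is absent."""
    p = perspective_distribution(source_labels)
    q = perspective_distribution(summary_labels)
    groups = set(p) | set(q)
    return 0.5 * sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in groups)

# hypothetical labels assigned by some perspective classifier
source_labels = ["pro", "pro", "con", "con", "con", "neutral"]
summary_labels = ["pro", "pro", "pro", "con"]
print(perspective_gap(source_labels, summary_labels))  # higher = less fair
```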

    Preliminary Study: Learning the Impact of Simulation Time on Reentry Location and Morphology Induced by Personalized Cardiac Modeling

    Personalized cardiac modeling is widely used for studying the mechanisms of cardiac arrhythmias. Because such models are computationally demanding, induced arrhythmias are usually simulated for only a few seconds. In the clinic, however, arrhythmias commonly last for more than several minutes and the morphologies of reentries are not always stable, so it is unclear whether simulating arrhythmias for just a few seconds is long enough to match the arrhythmias detected in patients. This study aimed to determine how long the arrhythmias induced in personalized cardiac models must be simulated to match the arrhythmias detected in patients. A total of 5 contrast-enhanced MRI datasets of patient hearts with myocardial infarction were used. A classification method based on a Gaussian mixture model was then used to detect the infarct tissue. Each reentry was simulated for both 3 s and 10 s, and the characteristics of each reentry over the two durations were compared. Reentries were induced in all 5 ventricular models, and sustained reentries were induced at 39 stimulation sites. Analysis of the simulation results showed that 41% of the sustained reentries in the 3 s simulations terminated in the longer (10 s) simulations. Second, only 23.1% of the sustained reentries in the 3 s simulations kept the same location and morphology in the extended 10 s simulations. Third, 35.9% of reentries appeared stable in the 3 s simulations yet still required a longer simulation time. Fourth, the 10 s simulation results matched the clinical measurements better than the 3 s results, indicating that 10 s of simulation was sufficient for the results to stabilize. These findings not only improve simulation accuracy but also avoid unnecessary simulation time, making optimal use of computational resources, improving simulation efficiency, and shortening simulation time to meet the scheduling requirements of clinical procedures on patients.
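    A minimal sketch of the Gaussian-mixture tissue classification step mentioned above: fit a two-component mixture to myocardial voxel intensities from the contrast-enhanced MRI and label the brighter component as infarct. The component count, the synthetic intensities, and the labeling rule are assumptions about the study's pipeline, not its actual code.

```python
# Sketch: classify infarct vs. healthy myocardium from voxel intensities
# with a two-component Gaussian mixture model (assumptions noted above).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
intensities = np.concatenate([rng.normal(300, 40, 5000),      # healthy myocardium
                              rng.normal(650, 60, 800)])       # enhanced infarct
intensities = intensities.reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(intensities)
infarct_component = int(np.argmax(gmm.means_.ravel()))         # brighter tissue
infarct_mask = gmm.predict(intensities) == infarct_component
print(f"infarct fraction: {infarct_mask.mean():.3f}")
```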

    Lemur: Harmonizing Natural Language and Code for Language Agents

    We introduce Lemur and Lemur-Chat, openly accessible language models optimized for both natural language and coding capabilities to serve as the backbone of versatile language agents. The evolution from language chat models to functional language agents demands that models not only master human interaction, reasoning, and planning but also ensure grounding in the relevant environments. This calls for a harmonious blend of language and coding capabilities in the models. Lemur and Lemur-Chat are proposed to address this necessity, demonstrating balanced proficiencies in both domains, unlike existing open-source models that tend to specialize in either. Through meticulous pre-training on a code-intensive corpus and instruction fine-tuning on text and code data, our models achieve state-of-the-art averaged performance across diverse text and coding benchmarks among open-source models. Comprehensive experiments demonstrate Lemur's superiority over existing open-source models and its proficiency across various agent tasks involving human communication, tool usage, and interaction under fully and partially observable environments. The harmonization between natural and programming languages enables Lemur-Chat to significantly narrow the gap with proprietary models on agent abilities, providing key insights into developing advanced open-source agents adept at reasoning, planning, and operating seamlessly across environments. Code and models are available at https://github.com/OpenLemur/Lemur.

    All‐In‐One OsciDrop Digital PCR System for Automated and Highly Multiplexed Molecular Diagnostics

    Digital PCR (dPCR) holds immense potential for precisely detecting nucleic acid markers essential for personalized medicine. However, its broader application is hindered by high consumable costs, complex procedures, and restricted multiplexing capabilities. To address these challenges, an all-in-one dPCR system is introduced that eliminates the need for microfabricated chips, offering fully automated operation and enhanced multiplexing capabilities. Built on the oscillation-induced droplet generation technique OsciDrop, the system supports a comprehensive dPCR workflow, including precise liquid handling, pipette-based droplet printing, in situ thermocycling, multicolor fluorescence imaging, and machine learning-driven analysis. The system's reliability is demonstrated by quantifying reference materials and evaluating HER2 copy number variation in breast cancer. Its multiplexing capability is showcased with a quadruplex dPCR assay that detects key EGFR mutations, including 19Del, L858R, and T790M, in lung cancer. Moreover, the digital stepwise melting analysis (dSMA) technique is introduced, enabling high-multiplex profiling of seven major EGFR variants spanning 35 subtypes. This innovative dPCR system presents a cost-effective and versatile alternative, overcoming existing limitations and paving the way for transformative advances in precision diagnostics.
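    For context on how droplet-based dPCR turns positive/negative droplet counts into an absolute concentration, the standard Poisson correction is sketched below. This is textbook dPCR math shown for orientation, not the OsciDrop system's own analysis pipeline; the counts and droplet volume are placeholders.

```python
# Textbook Poisson correction for droplet dPCR quantification.
import math

def dpcr_copies_per_ul(positive, total, droplet_volume_ul):
    p = positive / total
    lam = -math.log(1.0 - p)           # mean copies per droplet (Poisson)
    return lam / droplet_volume_ul     # copies per microliter of partitioned sample

# e.g., 4,200 positive droplets out of 20,000 at an assumed 0.85 nL each
print(f"{dpcr_copies_per_ul(4200, 20000, 0.85e-3):.0f} copies/uL")
```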

    ERMAS: Becoming Robust to Reward Function Sim-to-Real Gaps in Multi-Agent Simulations

    Multi-agent simulations provide a scalable environment for learning policies that interact with rational agents. However, such policies may fail to generalize to the real world, where agents may differ from their simulated counterparts due to unmodeled irrationality and misspecified reward functions. We introduce Epsilon-Robust Multi-Agent Simulation (ERMAS), a robust optimization framework for learning AI policies that are robust to such multi-agent sim-to-real gaps. While existing notions of multi-agent robustness concern perturbations in the actions of agents, we address a novel robustness objective concerning perturbations in the reward functions of agents. ERMAS provides this robustness by anticipating suboptimal behaviors from other agents, formalized as the worst-case epsilon-equilibrium. We show empirically that ERMAS yields robust policies for repeated bimatrix games and optimal taxation problems in economic simulations. In particular, in the two-level RL problem posed by the AI Economist (Zheng et al., 2020), ERMAS learns tax policies that are robust to changes in agent risk aversion, improving social welfare by up to 15% in complex spatiotemporal simulations.
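    The robustness objective described above, optimizing the planner against the worst within-epsilon perturbation of the other agents' reward functions, can be written as a small min-max sketch. The quadratic toy objective and the alternating gradient updates below are placeholders chosen for brevity; this is not the ERMAS algorithm itself.

```python
# Toy min-max sketch: planner ascends its return while an inner adversary
# perturbs the agents' reward parameters inside an epsilon ball.
import torch

epsilon = 0.5
theta = torch.zeros(4, requires_grad=True)    # planner policy parameters
phi = torch.zeros(4, requires_grad=True)      # reward-function perturbation

def planner_return(theta, phi):
    # toy coupling: perturbed agent rewards shift the planner's best response
    return -(theta - 1.0 - phi).pow(2).sum()

opt_theta = torch.optim.SGD([theta], lr=0.1)
opt_phi = torch.optim.SGD([phi], lr=0.1)

for _ in range(200):
    # inner step: make the reward perturbation as harmful as possible
    opt_phi.zero_grad()
    planner_return(theta, phi).backward()
    opt_phi.step()
    with torch.no_grad():
        phi.clamp_(-epsilon, epsilon)          # stay within the epsilon ball
    # outer step: planner improves against the worst perturbation found so far
    opt_theta.zero_grad()
    (-planner_return(theta, phi)).backward()
    opt_theta.step()

print(theta.detach())  # tracks 1 + phi once the perturbation saturates at epsilon
```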

    Learning to Play General-Sum Games against Multiple Boundedly Rational Agents

    We study the problem of training a principal in a multi-agent general-sum game using reinforcement learning (RL). Learning a robust principal policy requires anticipating the worst possible strategic responses of other agents, which is generally NP-hard. However, we show that no-regret dynamics can identify these worst-case responses in polynomial time in smooth games. We propose a framework that uses this policy evaluation method to efficiently learn a robust principal policy using RL. The framework can also be extended to provide robustness to boundedly rational agents. Our motivating application is automated mechanism design: we empirically demonstrate that our framework learns robust mechanisms in both matrix games and complex spatiotemporal games. In particular, we learn a dynamic tax policy that improves the welfare of a simulated trade-and-barter economy by 15%, even when facing previously unseen boundedly rational RL taxpayers.
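    A minimal sketch of the no-regret policy-evaluation idea in a bimatrix game: hold the principal's mixed strategy fixed and let the other agent run a multiplicative-weights learner on its own payoffs, then score the principal under the learner's average play. The payoff matrices, learning rate, and horizon are placeholders, and handling epsilon-best responses and general smooth games, as the paper does, is omitted.

```python
# Evaluate a fixed principal strategy against a no-regret (Hedge) learner.
import numpy as np

A = np.array([[3.0, 0.0], [5.0, 1.0]])   # principal's payoffs
B = np.array([[3.0, 5.0], [0.0, 1.0]])   # agent's payoffs
principal = np.array([0.6, 0.4])          # fixed principal mixed strategy

eta, T = 0.1, 2000
cum_payoff = np.zeros(B.shape[1])         # agent's cumulative payoff per action
avg_play = np.zeros(B.shape[1])

for _ in range(T):
    shifted = eta * (cum_payoff - cum_payoff.max())   # numerically stable Hedge weights
    agent = np.exp(shifted) / np.exp(shifted).sum()
    avg_play += agent / T
    cum_payoff += principal @ B                       # expected payoff of each agent action

print(f"principal value vs. no-regret agent: {principal @ A @ avg_play:.3f}")
```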