1,900 research outputs found

    Towards Mitigating Architecture Overfitting in Dataset Distillation

    Dataset distillation methods have demonstrated remarkable performance for neural networks trained with very limited training data. However, a significant challenge arises in the form of architecture overfitting: the distilled training data synthesized by a specific network architecture (i.e., training network) generates poor performance when trained by other network architectures (i.e., test networks). This paper addresses this issue and proposes a series of approaches in both architecture designs and training schemes which can be adopted together to boost the generalization performance across different network architectures on the distilled training data. We conduct extensive experiments to demonstrate the effectiveness and generality of our methods. Particularly, across various scenarios involving different sizes of distilled data, our approaches achieve comparable or superior performance to existing methods when training on the distilled data using networks with larger capacities

    Finite-time analysis of single-timescale actor-critic

    Actor-critic methods have achieved significant success in many challenging applications. However, its finite-time convergence is still poorly understood in the most practical single-timescale form. Existing works on analyzing single-timescale actor-critic have been limited to i.i.d. sampling or tabular setting for simplicity. We investigate the more practical online single-timescale actor-critic algorithm on continuous state space, where the critic assumes linear function approximation and updates with a single Markovian sample per actor step. Previous analysis has been unable to establish the convergence for such a challenging scenario. We demonstrate that the online single-timescale actor-critic method provably finds an $\epsilon$-approximate stationary point with $\widetilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity under standard assumptions, which can be further improved to $\mathcal{O}(\epsilon^{-2})$ under the i.i.d. sampling. Our novel framework systematically evaluates and controls the error propagation between the actor and critic. It offers a promising approach for analyzing other single-timescale reinforcement learning algorithms as well

    The Global Minimum Tax, Investment Incentives and Asymmetric Tax Competition

    This paper investigates how the OECD's global minimum tax (GMT) affects multinational enterprises (MNEs) behavior and countries' corporate taxes. We consider both profit shifting and capital investment responses of the MNE in a formal model of tax competition between asymmetric countries. The GMT reduces the true tax rate differential and benefits the large country, while the revenue effect is generally ambiguous for the small country. In the short run where tax rates are fixed, due to tax deduction of the substance-based income exclusion (SBIE), a higher minimum rate exerts investment incentives but also incurs a larger revenue loss for the small country. We show that under high (low) profit shifting costs the former (latter) effect dominates so that the small country's revenue increases (decreases). In the long run where countries can adjust tax rates, the GMT reshapes the tax game and the competition pattern. In contrast to the existing literature, we reveal that the minimum rate binds the small country only if it is low. With the rise of the GMT rate, countries will undercut the minimum to boost real investments and collect top-up taxes. For small market-size asymmetry and intermediate profit shifting cost, the revenue loss from the elimination of profit shifting may dominate the revenue gain from taxing the true profits generated by substantive activities, so that even a marginal GMT reform may harm the small country. Otherwise, it can raise the small country's tax revenue

    Behavior and elastic buckling analysis for design of thin-walled channel sections with narrow flanges in shear

    This thesis presents a numerical study on elastic shear buckling of thin-walled channel sections with narrow flanges under predominantly shear. The aim is to investigate the elastic shear buckling behaviour by different numerical methods and to develop a new explicit approach to determine the shear buckling coefficient (kv) of such sections. The research involves developing numerical modelling to perform shear buckling analysis using the finite element method (FEM), semi-analytical finite strip method (SAFSM) and resemi-analytical finite strip method (reSAFSM). SAFSM assumes that the ends of the half-wavelength are free to distort. reSAFSM assumes no cross-sectional distortion at both section ends, and hence all edges of the channels can be treated as simply supported. The FE models are developed using the commercial ABAQUS software package. The results are benchmarked against those from the reSAFSM models developed using the computer program bfinst8R.cpp. reSAFSM models only allow pure shear action without any effect of bending on the sections, while FE models can consider the bending moment by adding distributed normal stress at two cross sections to maintain the static equilibrium. The SAFSM models by computer program bfinst7R.cpp are used to investigate the effect of different boundary conditions on the shear buckling capacity. Both computer programs were developed by Professor Gregory J. Hancock at the University of Sydney. The results of shear buckling analyses are compared with the current design rules in AS/NZS 4600:2018. The research finds that the shear buckling mode for sections with very narrow flanges is governed by twisting buckling, and switches from twisting to local buckling with the gradual increase of flange sizes. This is related to the additional fixity provided by the flanges to the web panel. Based on the research, a new explicit approach for determining kv of both lipped and un-lipped sections with narrow flanges in shear is introduced

    Modeling Human Performance on Statistical Word Segmentation Tasks

    Harnessing the orbital angular momentum (OAM) of light is an appealing approach to developing photonic technologies for future applications in optical communications and high-dimensional quantum key distribution (QKD) systems. An outstanding challenge to the widespread uptake of the OAM resource is its efficient generation. In this work we design a new device that can directly emit an OAM-carrying light beam from a low-cost semiconductor laser. By fabricating micro-scale spiral phase plates within the aperture of a vertical-cavity surface-emitting laser (VCSEL), the linearly polarized Gaussian beam emitted by the VCSEL is converted into a beam carrying specific OAM modes and their superposition states, with high efficiency and high beam quality. This new approach to OAM generation may be particularly useful in the field of OAM-based optical and quantum communications, especially for short-reach data interconnects and QKD

    VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders

    Large-scale text-to-image diffusion models have shown impressive capabilities across various generative tasks, enabled by strong vision-language alignment obtained through pre-training. However, most vision-language discriminative tasks require extensive fine-tuning on carefully-labeled datasets to acquire such alignment, with great cost in time and computing resources. In this work, we explore directly applying a pre-trained generative diffusion model to the challenging discriminative task of visual grounding without any fine-tuning and additional training dataset. Specifically, we propose VGDiffZero, a simple yet effective zero-shot visual grounding framework based on text-to-image diffusion models. We also design a comprehensive region-scoring method considering both global and local contexts of each isolated proposal. Extensive experiments on RefCOCO, RefCOCO+, and RefCOCOg show that VGDiffZero achieves strong performance on zero-shot visual grounding

    Beneath Surface Similarity: Large Language Models Make Reasonable Scientific Analogies after Structure Abduction

    The vital role of analogical reasoning in human cognition allows us to grasp novel concepts by linking them with familiar ones through shared relational structures. Despite the attention previous research has given to word analogies, this work suggests that Large Language Models (LLMs) often overlook the structures that underpin these analogies, raising questions about the efficacy of word analogies as a measure of analogical reasoning skills akin to human cognition. In response to this, our paper introduces a task of analogical structure abduction, grounded in cognitive psychology, designed to abduce structures that form an analogy between two systems. In support of this task, we establish a benchmark called SCAR, containing 400 scientific analogies from 13 distinct fields, tailored for evaluating analogical reasoning with structure abduction. The empirical evidence underlines the continued challenges faced by LLMs, including ChatGPT and GPT-4, in mastering this task, signifying the need for future exploration to enhance their abilities. Comment: Accepted to EMNLP 2023 (Findings)

    Molecular Mechanism Study on the Effect of Nonionic Surfactants with Different Degrees of Ethoxylation on the Wettability of Anthracite

    A serious risk to the production safety of coal mines is coal dust. The wettability of coal may be successfully changed by adding surfactants to water. However, the creation of very effective dust suppressants is constrained by the lack of knowledge about the microscopic interaction mechanism between coal dust and surfactants. In this investigation, we explained macroscopic experimental phenomena from a molecular perspective. The lauryl polyoxyethylene ethers (C12(EO)n, n = 7, 15, 23) were selected. The macromolecular model of anthracite with 55 different components was constructed. Surface tension experiments and hydrophilic lipophilic balance (HLB) calculations showed that the ability of surface hydrophilicization followed the order of C12(EO)7 > C12(EO)15 > C12(EO)23. Contact angle, XPS and FTIR experiments proved that after the surfactants were adsorbed on the surface of anthracite, the content of carbon element decreased and the content of oxygen element increased, indicating the enhanced surface hydrophilicity. The simulation results showed that as the degree of ethoxylation increases, the adsorption strength of surfactants becomes stronger, and the hydrophilic head group of surfactant on anthracite surface is more uniformly distributed. The greater the degree of ethoxylation, the more powerfully the modified coal surface can bind to water molecules