
    TUNet: A Block-online Bandwidth Extension Model based on Transformers and Self-supervised Pretraining

    We introduce a block-online variant of the temporal feature-wise linear modulation (TFiLM) model to achieve bandwidth extension. The proposed architecture simplifies the UNet backbone of TFiLM to reduce inference time and employs an efficient transformer at the bottleneck to alleviate performance degradation. We also utilize self-supervised pretraining and data augmentation to enhance the quality of bandwidth-extended signals and reduce sensitivity to the downsampling method. Experimental results on the VCTK dataset show that the proposed method outperforms several recent baselines in both intrusive and non-intrusive metrics. Pretraining and filter augmentation also help stabilize and enhance the overall performance. Comment: Published as a conference paper at ICASSP 2022, 5 pages, 4 figures, 3 tables
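
    The block-online modulation the abstract describes can be sketched roughly as follows. This is an illustrative reconstruction, not the paper's code: `tfilm`, `toy_step`, and the max-pool summary are assumptions standing in for the actual network, and the recurrent step is reduced to a scalar state so the streaming structure is visible.

    ```python
    # Hedged sketch of temporal feature-wise linear modulation (TFiLM):
    # each fixed-size block of the feature sequence is pooled, a recurrent
    # step turns the summary into an affine pair (gamma, beta), and the
    # block is modulated as gamma * x + beta. Because the state carries
    # over between blocks, the model can run block-online (streaming).

    def tfilm(features, block_size, rnn_step, state=0.0):
        """Apply per-block affine modulation conditioned on past blocks.

        `rnn_step(pooled, state) -> (gamma, beta, state)` stands in for
        the recurrent network in the real architecture.
        """
        out = []
        for start in range(0, len(features), block_size):
            block = features[start:start + block_size]
            pooled = max(block)  # max-pool summary of the current block
            gamma, beta, state = rnn_step(pooled, state)
            out.extend(gamma * x + beta for x in block)
        return out

    # Toy recurrent step: the modulation depends on a running state,
    # so later blocks are conditioned on earlier ones.
    def toy_step(pooled, state):
        state = 0.5 * state + 0.5 * pooled
        return 1.0 + 0.1 * state, -0.1 * state, state
    ```

    Only past blocks influence the current modulation, which is what lets the variant process audio incrementally instead of waiting for the whole utterance.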

    Channel and spatial attention mechanism for fashion image captioning

    Image captioning aims to automatically generate one or more description sentences for a given input image. Most existing captioning methods use an encoder-decoder model that mainly focuses on recognizing and capturing the relationships between objects appearing in the input image. However, when generating captions for fashion images, it is important not only to describe the items and their relationships, but also to mention attribute features of clothes (shape, texture, style, fabric, and more). In this study, a novel model is proposed for the fashion image captioning task which can capture not only the items and their relationships, but also their attribute features. Two different attention mechanisms (spatial attention and channel-wise attention) are incorporated into the traditional encoder-decoder model, which dynamically interprets the caption sentence in the multi-layer feature map in addition to the depth dimension of the feature map. We evaluate our proposed architecture on Fashion-Gen using three different metrics (CIDEr, ROUGE-L, and BLEU-1), and achieve scores of 89.7, 50.6 and 45.6, respectively. Based on these experiments, our proposed method shows significant performance improvement on the task of fashion-image captioning and outperforms other state-of-the-art image captioning methods.
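
    The two attention mechanisms the abstract combines can be illustrated as follows. This is a minimal sketch under stated assumptions, not the paper's architecture: `attend`, the score inputs, and the flat list-of-lists feature map are all illustrative, and the real model would compute scores from the decoder state rather than take them as arguments.

    ```python
    # Hedged sketch of channel-wise + spatial attention over a feature map:
    # channel attention reweights whole channels (useful for attribute
    # features like texture or fabric), while spatial attention reweights
    # positions within each channel (useful for locating items).
    import math

    def softmax(xs):
        m = max(xs)
        exps = [math.exp(x - m) for x in xs]
        s = sum(exps)
        return [e / s for e in exps]

    def attend(feature_map, channel_scores, spatial_scores):
        """feature_map: C channels, each a flat H*W list of floats.

        Returns the map reweighted by both attention distributions.
        """
        ca = softmax(channel_scores)   # one weight per channel
        sa = softmax(spatial_scores)   # one weight per spatial position
        return [[c_w * s_w * v for s_w, v in zip(sa, channel)]
                for c_w, channel in zip(ca, feature_map)]
    ```

    Applying both distributions multiplicatively is one common design choice; sequential application (channel first, then spatial) is another, and the abstract does not say which the authors use.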

    Conditional Support Alignment for Domain Adaptation with Label Shift

    Unsupervised domain adaptation (UDA) refers to a domain adaptation framework in which a learning model is trained on labeled samples from the source domain and unlabeled ones from the target domain. The dominant existing methods in the field, which rely on the classical covariate shift assumption to learn domain-invariant feature representations, have yielded suboptimal performance under label distribution shift between the source and target domains. In this paper, we propose a novel conditional adversarial support alignment (CASA) method that minimizes the conditional symmetric support divergence between the source and target domains' feature representation distributions, aiming at a more helpful representation for the classification task. We also introduce a novel theoretical target risk bound, which justifies the merits of aligning the supports of conditional feature distributions over the existing marginal support alignment approach in the UDA setting. We then provide a complete training process in which the objective functions are derived precisely from the proposed target risk bound. Our empirical results demonstrate that CASA outperforms other state-of-the-art methods on different UDA benchmark tasks under label shift conditions.

    Irreducible representations of Upq[gl(2/2)]

    The two-parametric quantum superalgebra U_{pq}[gl(2/2)] and its representations are considered. All finite-dimensional irreducible representations of this quantum superalgebra can be constructed and classified into typical and nontypical ones according to a proposition proved in the present paper. This proposition is a nontrivial deformation of the one for the classical superalgebra gl(2/2), unlike the case of one-parametric deformations. Comment: Latex, 8 pages. A reference added in v.

    POLLUTION OF GROUNDWATER BY LEACHATE FROM DONG THANH LANDFILL DISPOSAL SITE

    Joint Research on Environmental Science and Technology for the Earth

    Anisotropic Magneto-Thermopower: the Contribution of Interband Relaxation

    Spin injection in metallic normal/ferromagnetic junctions is investigated taking into account the anisotropic magnetoresistance (AMR) occurring in the ferromagnetic layer. It is shown, on the basis of a generalized two-channel model, that there is an interface resistance contribution due to anisotropic scattering, beyond spin accumulation and giant magnetoresistance (GMR). The corresponding expression for the thermopower is derived and compared with the expression for the thermopower produced by the GMR. First measurements of anisotropic magnetothermopower are presented in electrodeposited Ni nanowires contacted with Ni, Au and Cu. The results of this study show that while the giant magnetoresistance and the corresponding thermopower demonstrate the role of spin-flip scattering, the observed anisotropic magnetothermopower indicates interband s-d relaxation mechanisms. Comment: 20 pages, 4 figures

    Federated Deep Reinforcement Learning-based Bitrate Adaptation for Dynamic Adaptive Streaming over HTTP

    In video streaming over HTTP, bitrate adaptation selects the quality of video chunks depending on the current network condition. Some previous works have applied deep reinforcement learning (DRL) algorithms to determine each chunk's bitrate from the observed states so as to maximize the quality of experience (QoE). However, to build an intelligent model that can predict across various environments, such as 3G, 4G, Wi-Fi, etc., the states observed in these environments must be sent to a server for centralized training. In this work, we integrate federated learning (FL) into DRL-based rate adaptation to train a model appropriate for different environments. The clients in the proposed framework train their models locally and send only the weight updates to the server. The simulations show that our federated DRL-based rate adaptations, called FDRLABR, with different DRL algorithms, such as deep Q-learning, advantage actor-critic, and proximal policy optimization, yield better performance than traditional bitrate adaptation methods in various environments. Comment: 13 pages, 1 column
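
    The training protocol the abstract describes (clients train locally, only weights travel to the server) is essentially federated averaging, which can be sketched as follows. This is an illustrative sketch, not the paper's code: `fedavg_round`, `toy_local_train`, and the flat weight vectors are assumptions, and the real clients would run a full DRL update rather than one gradient-like step.

    ```python
    # Hedged sketch of one federated-averaging round: the server
    # broadcasts the global weights, each client trains locally on its
    # own (private) data, and only the resulting weights are averaged
    # back at the server. Raw observations never leave the client.

    def fedavg_round(global_weights, clients, local_train):
        """One FL round over a list of per-client datasets."""
        client_weights = [local_train(list(global_weights), data)
                          for data in clients]
        n = len(client_weights)
        # Coordinate-wise average of the clients' weight vectors.
        return [sum(ws) / n for ws in zip(*client_weights)]

    # Toy local update: one step of size 0.1 toward the client's target,
    # standing in for the client's DRL training on local traces.
    def toy_local_train(weights, target):
        return [w + 0.1 * (t - w) for w, t in zip(weights, target)]
    ```

    Repeating such rounds lets a single model absorb network conditions from 3G, 4G, and Wi-Fi clients without centralizing their observed states.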