
    FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction

    Click-through rate (CTR) prediction is one of the fundamental tasks for online advertising and recommendation. While the multi-layer perceptron (MLP) serves as a core component in many deep CTR prediction models, it has been widely recognized that applying a vanilla MLP network alone is inefficient in learning multiplicative feature interactions. As such, many two-stream interaction models (e.g., DeepFM and DCN) have been proposed by integrating an MLP network with another dedicated network for enhanced CTR prediction. As the MLP stream learns feature interactions implicitly, existing research focuses mainly on enhancing explicit feature interactions in the complementary stream. In contrast, our empirical study shows that a well-tuned two-stream MLP model that simply combines two MLPs can achieve surprisingly good performance, which has never been reported before by existing work. Based on this observation, we further propose feature gating and interaction aggregation layers that can be easily plugged in to make an enhanced two-stream MLP model, FinalMLP. In this way, it not only enables differentiated feature inputs but also effectively fuses stream-level interactions across two streams. Our evaluation results on four open benchmark datasets as well as an online A/B test in our industrial system show that FinalMLP achieves better performance than many sophisticated two-stream CTR models. Our source code will be available at MindSpore/models.
    Comment: Accepted by AAAI 2023. Code available at https://xpai.github.io/FinalML
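
The two-stream idea in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: the class `GatedStream`, the function `predict_ctr`, the `2 * sigmoid` gate computed from the full input, the single hidden layer per stream, and the single bilinear term used to fuse the two stream outputs. Weights are random, so the output is only a well-formed probability, not a trained prediction.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def linear(x, weights, bias):
    # weights is a list of rows, one row per output unit.
    return [dot(row, x) + b for row, b in zip(weights, bias)]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GatedStream:
    """One stream: an element-wise feature gate followed by a one-layer MLP (hypothetical)."""

    def __init__(self, rng, in_dim, hidden_dim):
        self.gate_w = rand_matrix(rng, in_dim, in_dim)
        self.gate_b = [0.0] * in_dim
        self.w = rand_matrix(rng, hidden_dim, in_dim)
        self.b = [0.0] * hidden_dim

    def gate(self, x):
        # Assumed gate form: values in (0, 2), so a feature can be damped or amplified.
        return [2.0 * sigmoid(g) for g in linear(x, self.gate_w, self.gate_b)]

    def forward(self, x):
        gated = [g * v for g, v in zip(self.gate(x), x)]
        return [max(0.0, h) for h in linear(gated, self.w, self.b)]  # ReLU


def predict_ctr(x, stream_a, stream_b, w_a, w_b, w_ab, bias=0.0):
    """Fuse the two stream outputs with first-order terms plus one bilinear term."""
    out_a = stream_a.forward(x)
    out_b = stream_b.forward(x)
    cross = sum(
        out_a[i] * w_ab[i][j] * out_b[j]
        for i in range(len(out_a))
        for j in range(len(out_b))
    )
    return sigmoid(bias + dot(w_a, out_a) + dot(w_b, out_b) + cross)


rng = random.Random(0)
in_dim, hidden_dim = 4, 3
stream_a = GatedStream(rng, in_dim, hidden_dim)
stream_b = GatedStream(rng, in_dim, hidden_dim)
w_a = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
w_b = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
w_ab = rand_matrix(rng, hidden_dim, hidden_dim)

x = [0.2, -0.1, 0.7, 0.4]  # stand-in for concatenated feature embeddings
p = predict_ctr(x, stream_a, stream_b, w_a, w_b, w_ab)
print(p)
```

Because each stream owns its gate, the two MLPs see differently weighted versions of the same input, which is one way to read "differentiated feature inputs".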

    Non-invasive Self-attention for Side Information Fusion in Sequential Recommendation

    Sequential recommender systems aim to model users' evolving interests from their historical behaviors, and hence make customized time-relevant recommendations. Compared with traditional models, deep learning approaches such as CNN and RNN have achieved remarkable advancements in recommendation tasks. Recently, the BERT framework has also emerged as a promising method, benefiting from its self-attention mechanism in processing sequential data. However, one limitation of the original BERT framework is that it only considers one input source, the natural language tokens. How to leverage various types of information under the BERT framework is still an open question. Nonetheless, it is intuitively appealing to utilize other side information, such as item category or tag, for more comprehensive depictions and better recommendations. In our pilot experiments, we found that naive approaches, which directly fuse types of side information into the item embeddings, usually bring very little or even negative effects. Therefore, in this paper, we propose the NOninVasive self-attention mechanism (NOVA) to leverage side information effectively under the BERT framework. NOVA makes use of side information to generate better attention distribution, rather than directly altering the item embedding, which may cause information overwhelming. We validate the NOVA-BERT model on both public and commercial datasets, and our method can stably outperform the state-of-the-art models with negligible computational overheads.
    Comment: Accepted at AAAI 202
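
A small sketch may help show what "non-invasive" means in the abstract above: side information shapes the attention weights, while the values being mixed stay pure item embeddings. The function name `noninvasive_attention`, the additive fusion of item and side embeddings, and the omission of learned query/key/value projections are all simplifying assumptions made here, not details taken from the paper.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def noninvasive_attention(item_emb, side_emb):
    """Single-head self-attention over a sequence (hypothetical simplification).

    Queries and keys use item + side embeddings (assumed additive fusion);
    values use the item embeddings only, so side information never
    overwrites the item representation itself.
    """
    fused = [[i + s for i, s in zip(item, side)]
             for item, side in zip(item_emb, side_emb)]
    dim = len(item_emb[0])
    outputs = []
    for query in fused:
        weights = softmax([dot(query, key) / math.sqrt(dim) for key in fused])
        outputs.append([
            sum(w * item[c] for w, item in zip(weights, item_emb))
            for c in range(dim)
        ])
    return outputs


items = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
no_side = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
with_side = [[2.0, 0.0], [0.0, 0.0], [0.0, 0.0]]

plain = noninvasive_attention(items, no_side)
informed = noninvasive_attention(items, with_side)
print(plain[0], informed[0])
```

In this toy input, changing the side embeddings changes the attention distribution, yet every output row remains a weighted average of the original item embeddings.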

    ReLoop2: Building Self-Adaptive Recommendation Models via Responsive Error Compensation Loop

    Industrial recommender systems face the challenge of operating in non-stationary environments, where data distribution shifts arise from evolving user behaviors over time. To tackle this challenge, a common approach is to periodically re-train or incrementally update deployed deep models with newly observed data, resulting in a continual training process. However, the conventional learning paradigm of neural networks relies on iterative gradient-based updates with a small learning rate, making it slow for large recommendation models to adapt. In this paper, we introduce ReLoop2, a self-correcting learning loop that facilitates fast model adaptation in online recommender systems through responsive error compensation. Inspired by the slow-fast complementary learning system observed in human brains, we propose an error memory module that directly stores error samples from incoming data streams. These stored samples are subsequently leveraged to compensate for model prediction errors during testing, particularly under distribution shifts. The error memory module is designed with fast access capabilities and undergoes continual refreshing with newly observed data samples during the model serving phase to support fast model adaptation. We evaluate the effectiveness of ReLoop2 on three open benchmark datasets as well as a real-world production dataset. The results demonstrate the potential of ReLoop2 in enhancing the responsiveness and adaptiveness of recommender systems operating in non-stationary environments.
    Comment: Accepted by KDD 2023. See the project page at https://xpai.github.io/ReLoo
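
The error memory described above can be pictured with a toy sketch. The abstract does not specify the retrieval or compensation rule, so the `ErrorMemory` class, the nearest-neighbour lookup, the inverse-distance weighting, and the `strength` parameter below are all assumptions chosen for illustration only.

```python
import math
from collections import deque


class ErrorMemory:
    """Bounded store of (sample key, observed error) pairs (hypothetical design)."""

    def __init__(self, capacity=1000, k=3, strength=0.5):
        self.entries = deque(maxlen=capacity)  # oldest entries are evicted first
        self.k = k
        self.strength = strength

    def write(self, key, prediction, label):
        # Store the signed error the slow model made on this sample.
        self.entries.append((list(key), label - prediction))

    def compensate(self, key, prediction):
        # With nothing stored yet, fall back to the model's own prediction.
        if not self.entries:
            return prediction
        nearest = sorted(self.entries, key=lambda e: math.dist(e[0], key))[: self.k]
        weights = [1.0 / (1.0 + math.dist(k_, key)) for k_, _ in nearest]
        estimated_error = (
            sum(w * err for w, (_, err) in zip(weights, nearest)) / sum(weights)
        )
        corrected = prediction + self.strength * estimated_error
        return min(1.0, max(0.0, corrected))  # keep a valid probability


memory = ErrorMemory(capacity=100, k=3, strength=0.5)
before = memory.compensate([0.0, 0.0], 0.2)

# The model predicted 0.2 for a sample that turned out to be a click.
memory.write([0.0, 0.0], prediction=0.2, label=1.0)
after = memory.compensate([0.0, 0.0], 0.2)
print(before, after)
```

The point of the design is that `write` is a cheap append during serving, so corrections take effect immediately, whereas gradient updates to the base model remain slow.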

    Uncovering User Interest from Biased and Noised Watch Time in Video Recommendation

    In video recommendation, watch time is commonly adopted as an indicator of user interest. However, watch time is not only influenced by the matching of users' interests but also by other factors, such as duration bias and noisy watching. Duration bias refers to the tendency for users to spend more time on videos with longer durations, regardless of their actual interest level. Noisy watching, on the other hand, describes users taking time to determine whether they like a video or not, which can result in users spending time watching videos they do not like. Consequently, the existence of duration bias and noisy watching makes watch time an inadequate label for indicating user interest. Furthermore, current methods primarily address duration bias and ignore the impact of noisy watching, which may limit their effectiveness in uncovering user interest from watch time. In this study, we first analyze the generation mechanism of users' watch time from a unified causal viewpoint. Specifically, we consider the watch time as a mixture of the user's actual interest level, the duration-biased watch time, and the noisy watch time. To mitigate both the duration bias and noisy watching, we propose Debiased and Denoised watch time Correction (D$^2$Co), which can be divided into two steps: First, we employ a duration-wise Gaussian Mixture Model plus frequency-weighted moving average for estimating the bias and noise terms; then we utilize a sensitivity-controlled correction function to separate the user interest from the watch time, which is robust to the estimation error of bias and noise terms. The experiments on two public video recommendation datasets and online A/B testing indicate the effectiveness of the proposed method.
    Comment: Accepted by Recsys'2
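
The abstract above names a frequency-weighted moving average without giving its form. As one plausible reading, offered purely as an assumption, per-duration estimates can be smoothed across neighbouring duration buckets with each bucket weighted by how many samples it contains. The function name, the symmetric `window`, and the bucket layout are hypothetical.

```python
def frequency_weighted_moving_average(values, counts, window=1):
    """Smooth per-bucket estimates, weighting each bucket by its sample count.

    values[i] is an estimate for duration bucket i (for example a mean watch
    time) and counts[i] is the number of samples behind it. Both the bucket
    scheme and the symmetric window are assumptions for illustration.
    """
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - window)
        hi = min(len(values), i + window + 1)
        total = sum(counts[lo:hi])
        if total == 0:
            # No evidence in the window: keep the raw estimate.
            smoothed.append(values[i])
        else:
            weighted = sum(v * c for v, c in zip(values[lo:hi], counts[lo:hi]))
            smoothed.append(weighted / total)
    return smoothed


bucket_means = [1.0, 3.0, 5.0]
bucket_counts = [1, 1, 2]
result = frequency_weighted_moving_average(bucket_means, bucket_counts, window=1)
print(result)
```

Weighting by count keeps sparsely populated duration buckets from dominating the smoothed estimate, which matters because long videos tend to have few samples.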

    BARS: Towards Open Benchmarking for Recommender Systems

    The past two decades have witnessed the rapid development of personalized recommendation techniques. Despite significant progress made in both research and practice of recommender systems, to date, there is a lack of a widely-recognized benchmarking standard in this field. Many existing studies perform model evaluations and comparisons in an ad-hoc manner, for example, by employing their own private data splits or using different experimental settings. Such conventions not only increase the difficulty in reproducing existing studies, but also lead to inconsistent experimental results among them. This largely limits the credibility and practical value of research results in this field. To tackle these issues, we present an initiative project (namely BARS) aiming for open benchmarking for recommender systems. In comparison to some earlier attempts towards this goal, we take a further step by setting up a standardized benchmarking pipeline for reproducible research, which integrates all the details about datasets, source code, hyper-parameter settings, running logs, and evaluation results. The benchmark is designed with comprehensiveness and sustainability in mind. It covers both matching and ranking tasks, and also enables researchers to easily follow and contribute to the research in this field. This project will not only reduce the redundant efforts of researchers to re-implement or re-run existing baselines, but also drive more solid and reproducible research on recommender systems. We would like to call upon everyone to use the BARS benchmark for future evaluation, and contribute to the project through the portal at: https://openbenchmark.github.io/BARS.
    Comment: Accepted by SIGIR 2022. Note that version v5 is updated to keep consistency with the ACM camera-ready versio

    Multi-View Clustering and Semi-Supervised Classification with Adaptive Neighbours

    Due to their efficiency in learning relationships and complex structures hidden in data, graph-oriented methods have been widely investigated and achieve promising performance in multi-view learning. Generally, these learning algorithms construct an informative graph for each view or fuse different views into one graph, on which the subsequent procedures are based. However, in many real-world datasets, the original data contain noise and outlying entries that result in unreliable and inaccurate graphs, which cannot be ameliorated by the previous methods. In this paper, we propose a novel multi-view learning model which performs clustering/semi-supervised classification and local structure learning simultaneously. The obtained optimal graph can be partitioned into specific clusters directly. Moreover, our model can allocate an ideal weight for each view automatically without additional weight and penalty parameters. An efficient algorithm is proposed to optimize this model. Extensive experimental results on different real-world datasets show that the proposed model outperforms other state-of-the-art multi-view algorithms.

    SIRT4 functions as a tumor suppressor during prostate cancer by inducing apoptosis and inhibiting glutamine metabolism

    Localized in the mitochondria, SIRT4 is a nicotinamide adenine dinucleotide (NAD+)-dependent adenosine diphosphate (ADP)-ribosyltransferase and is one of the least characterized members of the sirtuin family. Although it is well known that it shows deacetylase activity for energy metabolism, little is understood about its function in tumorigenesis. Recent research suggests that SIRT4 may work as both a tumor suppressor gene and an oncogene. However, the clinical significance of SIRT4 in prostate cancer remains unknown. In this study, we evaluated SIRT4 protein levels in cancerous prostate tissue and corresponding non-tumor prostate tissue via immunohistochemical staining on a tissue microarray including tissues from 89 prostate cancer patients. The association between SIRT4 expression and Gleason score was also determined. Further, shSIRT4 or stable prostate cancer cell lines (22RV1) overexpressing SIRT4 were constructed via lentiviral infection. Using the Cell-Counting Kit-8 (CCK-8) assay, wound healing assay, and migration, invasion, and apoptosis assays, the effects of SIRT4 on the migration, invasion ability, and proliferation of prostate cancer cells were investigated. We also determined the effect of SIRT4 on glutamine metabolism in 22RV1 cells. We found the protein levels of SIRT4 in prostate cancer tissues were significantly lower than those in their non-neoplastic tissue counterparts (P < 0.01); a lower SIRT4 level was also significantly associated with a higher Gleason score (P < 0.01). SIRT4 suppressed the migration, invasion capabilities, and proliferation of prostate cancer cells and induced cellular apoptosis. Furthermore, the invasion and migration of 22RV1 cells were mechanistically inhibited by SIRT4 via glutamine metabolism inhibition. In conclusion, the present study's findings showed that SIRT4 protein levels are significantly associated with the Gleason score in patients with prostate cancer, and SIRT4 exerts a tumor-suppressive effect on prostate cancer cells by inhibiting glutamine metabolism. Thus, SIRT4 may serve as a potential novel therapeutic target for prostate cancer.

    Auto-Weighted Multi-View Learning for Image Clustering and Semi-Supervised Classification
