141 research outputs found

    Pre-trained Neural Recommenders: A Transferable Zero-Shot Framework for Recommendation Systems

    Modern neural collaborative filtering (NCF) techniques are critical to the success of e-commerce, social media, and content-sharing platforms. However, despite technical advances, every new application domain still requires training an NCF model from scratch. In contrast, pre-trained vision and language models are routinely applied to diverse applications directly (zero-shot) or with limited fine-tuning. Inspired by the impact of pre-trained models, we explore the possibility of pre-trained recommender models that support building recommender systems in new domains, with minimal or no retraining, without the use of any auxiliary user or item information. Zero-shot recommendation without auxiliary information is challenging because we cannot form associations between users and items across datasets when there are no overlapping users or items. Our fundamental insight is that the statistical characteristics of the user-item interaction matrix are universally available across different domains and datasets. Thus, we use these statistical characteristics to identify dataset-independent representations for users and items. We show how to learn universal (i.e., supporting zero-shot adaptation without user or item auxiliary information) representations for nodes and edges from the bipartite user-item interaction graph. We learn representations by exploiting the statistical properties of the interaction data, including user and item marginals, and the size and density distributions of their clusters.
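
    The dataset-independent statistics mentioned above can be illustrated with a minimal sketch. It assumes a small binary user-item interaction matrix and computes only the user marginals, item marginals, and overall density; the function name `interaction_statistics` and the toy data are hypothetical, and this is not the authors' representation-learning method.

```python
import numpy as np

def interaction_statistics(interactions):
    """Return (user_marginals, item_marginals, density) for a binary matrix.

    Rows are users, columns are items; an entry of 1 marks an interaction.
    Marginals are normalized by the total number of interactions.
    """
    matrix = np.asarray(interactions, dtype=float)
    n_users, n_items = matrix.shape
    total = matrix.sum()
    user_marginals = matrix.sum(axis=1) / total  # share of interactions per user
    item_marginals = matrix.sum(axis=0) / total  # share of interactions per item
    density = total / (n_users * n_items)        # fraction of filled cells
    return user_marginals, item_marginals, density

# Hypothetical toy data: 3 users, 4 items, 8 interactions.
toy = np.array([
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
])
user_m, item_m, dens = interaction_statistics(toy)
```

    These quantities depend only on the shape of the interaction data, not on user or item identities, which is why they can be computed for any dataset.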

    ICMRec: Item Cluster-Wise Multi-Objective Optimization for Unbiased Recommendation

    The traditional observed data used to train the recommender model suffers from severe bias issues (e.g., exposure bias, popularity bias). Interactions with a small fraction of head items account for almost all of the training data. The normal training paradigm on such biased data tends to repeatedly generate recommendations from the head items, which further exacerbates the biases and hinders the exploration of potentially interesting items from the niche set. In this work, distinct from existing methods, we explore unbiased recommendation from an item cluster-wise multi-objective optimization perspective. Aiming to balance the learning on various item clusters that differ in popularity during the training process, we characterize the recommendation task as an item cluster-wise multi-objective optimization problem. To this end, we propose a model-agnostic framework, namely Item Cluster-Wise Multi-Objective Recommendation (ICMRec), for unbiased recommendation. In detail, we define our item cluster-wise optimization target as requiring the recommender model to balance all item clusters that differ in popularity. Thus we set the model learning on each item cluster as a unique optimization objective. To achieve this goal, we first explore items' popularity levels from a novel causal reasoning perspective. Then, we devise popularity discrepancy-based bisecting clustering to separate the discriminated item clusters. Next, we adaptively find the overall harmonious gradient direction for multiple item cluster-wise optimization objectives using a Pareto-efficient solver. Finally, in the prediction stage, we perform counterfactual inference to further eliminate the impact of user conformity. Extensive experimental results demonstrate the superiority of ICMRec in overall recommendation performance and bias elimination. Code will be open-sourced upon acceptance.

    CodeTransOcean: A Comprehensive Multilingual Benchmark for Code Translation

    Recent code translation techniques exploit neural machine translation models to translate source code from one programming language to another to satisfy production compatibility or to improve the efficiency of codebase maintenance. Most existing code translation datasets focus on only a single pair of popular programming languages. To advance research on code translation and meet the diverse requirements of real-world applications, we construct CodeTransOcean, a large-scale comprehensive benchmark that supports the largest variety of programming languages for code translation. CodeTransOcean consists of three novel multilingual datasets, namely, MultilingualTrans supporting translations between multiple popular programming languages, NicheTrans for translating between niche programming languages and popular ones, and LLMTrans for evaluating executability of translated code by large language models (LLMs). CodeTransOcean also includes a novel cross-framework dataset, DLTrans, for translating deep learning code across different frameworks. We develop multilingual modeling approaches for code translation and demonstrate their great potential in improving the translation quality of both low-resource and high-resource language pairs and boosting the training efficiency. We also propose a novel evaluation metric, Debugging Success Rate@K, for program-level code translation. Last but not least, we evaluate the LLM ChatGPT on our datasets and investigate its potential for fuzzy execution predictions. We build baselines for CodeTransOcean and analyze challenges of code translation for guiding future research. The CodeTransOcean datasets and code are publicly available at https://github.com/WeixiangYAN/CodeTransOcean. Comment: Accepted by Findings of EMNLP 202

    Local Conditional Neural Fields for Versatile and Generalizable Large-Scale Reconstructions in Computational Imaging

    Deep learning has transformed computational imaging, but traditional pixel-based representations limit its ability to capture continuous, multiscale details of objects. Here we introduce a novel Local Conditional Neural Fields (LCNF) framework, leveraging a continuous implicit neural representation to address this limitation. LCNF enables flexible object representation and facilitates the reconstruction of multiscale information. We demonstrate the capabilities of LCNF in solving the highly ill-posed inverse problem in Fourier ptychographic microscopy (FPM) with multiplexed measurements, achieving robust, scalable, and generalizable large-scale phase retrieval. Unlike traditional neural fields frameworks, LCNF incorporates a local conditional representation that promotes model generalization, learning multiscale information, and efficient processing of large-scale imaging data. By combining an encoder and a decoder conditioned on a learned latent vector, LCNF achieves versatile continuous-domain super-resolution image reconstruction. We demonstrate accurate reconstruction of wide field-of-view, high-resolution phase images using only a few multiplexed measurements. LCNF robustly captures the continuous object priors and eliminates various phase artifacts, even when it is trained on imperfect datasets. The framework exhibits strong generalization, reconstructing diverse objects even with limited training data. Furthermore, LCNF can be trained on a physics simulator using natural images and successfully applied to experimental measurements on biological samples. Our results highlight the potential of LCNF for solving large-scale inverse problems in computational imaging, with broad applicability in various deep-learning-based techniques.

    Rapid Generation of Optimal Generalized Monkhorst-Pack Grids

    Computational modeling of the properties of crystalline materials has become an increasingly important aspect of materials research, consuming hundreds of millions of CPU-hours at scientific computing centres around the world each year, if not more. A routine operation in such calculations is the evaluation of integrals over the Brillouin zone. We have previously demonstrated that performing such integrals using generalized Monkhorst-Pack k-point grids can roughly double the speed of these calculations relative to the widely used traditional Monkhorst-Pack grids, and such grids can be rapidly generated by querying a free, internet-accessible database of pre-generated grids. To facilitate the widespread use of generalized k-point grids, we present new algorithms that allow rapid generation of optimized generalized Monkhorst-Pack grids on the fly, an open-source library to facilitate their integration into external software packages, and an open-source implementation of the database tool that can be used offline. We also present benchmarks of the speed of our algorithms on structures randomly selected from the Inorganic Crystal Structure Database. For grids that correspond to a real-space supercell with at least 50 angstroms between lattice points, which is sufficient to converge density functional theory calculations within 1 meV/atom for nearly all materials, our algorithm finds optimized grids in an average of 0.19 seconds on a single processing core. For 100 angstroms between real-space lattice points, our algorithm finds optimal grids in less than 5 seconds on average.
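
    For context, the traditional Monkhorst-Pack grid that serves as the baseline here can be sketched in a few lines. The sketch below uses the standard fractional-coordinate formula u_r = (2r - q - 1) / (2q) for r = 1..q along each reciprocal lattice vector; the function name `monkhorst_pack` is hypothetical, and the generalized grids and the search algorithms described in the abstract are not reproduced.

```python
from fractions import Fraction
from itertools import product

def monkhorst_pack(q1, q2, q3):
    """Return the traditional q1 x q2 x q3 Monkhorst-Pack k-points.

    Points are given in fractional coordinates of the reciprocal lattice,
    using exact rational arithmetic so the values are easy to inspect.
    """
    def axis(q):
        # u_r = (2r - q - 1) / (2q), r = 1..q
        return [Fraction(2 * r - q - 1, 2 * q) for r in range(1, q + 1)]

    return list(product(axis(q1), axis(q2), axis(q3)))

grid = monkhorst_pack(2, 2, 2)  # 8 points, each coordinate is -1/4 or 1/4
```

    An odd subdivision such as q = 1 or q = 3 places a point at the origin, while an even one does not.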

    Research on Water Pollution Control Based on STM32 Intelligent Vehicle

    To address the high cost and low efficiency of treating natural water resources in China that are polluted to different degrees, photocatalytic water purification technology is adopted to reduce the cost of water pollution treatment and improve treatment efficiency. An intelligent vehicle carrying photocatalytic materials is proposed. It is equipped with industrial cameras, communication and positioning modules, and sensors. It realizes dynamic planning of navigation routes through an improved ant colony algorithm, computer vision recognition, and ultrasonic obstacle avoidance, and it performs photocatalytic purification at fixed points. Advanced photoelectric catalytic performance is predicted based on density functional theory and machine learning, which addresses the photocorrosion and instability of BiVO4 and achieves efficient water purification at low cost.

    Surround-view Fisheye BEV-Perception for Valet Parking: Dataset, Baseline and Distortion-insensitive Multi-task Framework

    Surround-view fisheye perception in valet parking scenes is fundamental and crucial in autonomous driving. Environmental conditions in parking lots, such as imperfect light and opacity, differ from those in common public datasets and substantially affect perception performance. Most existing networks based on public datasets may generalize suboptimally to these valet parking scenes and are also affected by fisheye distortion. In this article, we introduce a new large-scale fisheye dataset called Fisheye Parking Dataset (FPD) to promote research on diverse real-world surround-view parking cases. Notably, our compiled FPD exhibits excellent characteristics for different surround-view perception tasks. In addition, we propose a real-time distortion-insensitive multi-task framework, Fisheye Perception Network (FPNet), which improves surround-view fisheye BEV perception by enhancing the fisheye distortion operation and multi-task lightweight designs. Extensive experiments validate the effectiveness of our approach and the dataset's exceptional generalizability. Comment: 12 pages, 11 figures

    What Matters for 3D Scene Flow Network

    3D scene flow estimation from point clouds is a low-level 3D motion perception task in computer vision. Flow embedding is a commonly used technique in scene flow estimation, and it encodes the point motion between two consecutive frames. Thus, it is critical for the flow embeddings to capture the correct overall direction of the motion. However, previous works only search locally to determine a soft correspondence, ignoring the distant points that turn out to be the actual matching ones. In addition, the estimated correspondence is usually from the forward direction of the adjacent point clouds, and may not be consistent with the estimated correspondence acquired from the backward direction. To tackle these problems, we propose a novel all-to-all flow embedding layer with backward reliability validation during the initial scene flow estimation. In addition, we investigate and compare several design choices in key components of the 3D scene flow network, including the point similarity calculation, input elements of the predictor, and predictor and refinement level design. After carefully choosing the most effective designs, we present a model that achieves state-of-the-art performance on the FlyingThings3D and KITTI Scene Flow datasets. Our proposed model surpasses all existing methods by at least 38.2% on the FlyingThings3D dataset and 24.7% on the KITTI Scene Flow dataset on the EPE3D metric. We release our code at https://github.com/IRMVLab/3DFlow. Comment: Accepted by ECCV 202
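
    The EPE3D metric quoted above is commonly defined as the mean Euclidean distance between predicted and ground-truth 3D flow vectors. A minimal sketch under that assumption follows; the function name `epe3d` and the toy arrays are hypothetical and are not taken from the released code.

```python
import numpy as np

def epe3d(predicted_flow, ground_truth_flow):
    """Mean 3D end-point error over all points.

    Both inputs have shape (num_points, 3); the result is the average
    L2 norm of the per-point difference between the two flow fields.
    """
    pred = np.asarray(predicted_flow, dtype=float)
    gt = np.asarray(ground_truth_flow, dtype=float)
    if pred.shape != gt.shape or pred.shape[-1] != 3:
        raise ValueError("flows must share the shape (num_points, 3)")
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Hypothetical toy example: per-point errors of 1 and 2 give a mean of 1.5.
pred = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
gt = np.zeros((2, 3))
error = epe3d(pred, gt)
```

    Lower values are better, so a relative improvement on this metric corresponds to a proportionally smaller average error.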

    Efficient Exploration Using Extra Safety Budget in Constrained Policy Optimization

    Reinforcement learning (RL) has achieved promising results on most robotic control tasks. The safety of learning-based controllers is essential to ensuring the effectiveness of the controllers. Current methods adopt whole consistency constraints during training, resulting in inefficient exploration in the early stage. In this paper, we propose an algorithm named Constrained Policy Optimization with Extra Safety Budget (ESB-CPO) to strike a balance between exploration efficiency and constraint satisfaction. In the early stage, our method loosens the practical constraints of unsafe transitions (adding extra safety budget) with the aid of a new metric we propose. As training proceeds, the constraints in our optimization problem become tighter. Meanwhile, theoretical analysis and practical experiments demonstrate that our method gradually meets the cost limit's demand in the final training stage. When evaluated on the Safety-Gym and Bullet-Safety-Gym benchmarks, our method shows advantages over baseline algorithms in terms of safety and optimality. Notably, our method achieves a remarkable performance improvement under the same cost limit compared with baselines. Comment: 7 pages, 8 figures

    Universal Sleep Decoder: Aligning awake and sleep neural representation across subjects

    Decoding memory content from brain activity during sleep has long been a goal in neuroscience. While spontaneous reactivation of memories during sleep in rodents is known to support memory consolidation and offline learning, capturing memory replay in humans is challenging due to the absence of well-annotated sleep datasets and the substantial differences in neural patterns between wakefulness and sleep. To address these challenges, we designed a novel cognitive neuroscience experiment and collected a comprehensive, well-annotated electroencephalography (EEG) dataset from 52 subjects during both wakefulness and sleep. Leveraging this benchmark dataset, we developed the Universal Sleep Decoder (USD) to align neural representations between wakefulness and sleep across subjects. Our model achieves up to 16.6% top-1 zero-shot accuracy on unseen subjects, comparable to decoding performance using individual sleep data. Furthermore, fine-tuning USD on test subjects raises top-1 decoding accuracy to 25.9%, a substantial improvement over the baseline chance of 6.7%. Model comparison and ablation analyses reveal that our design choices, including the use of (i) an additional contrastive objective to integrate awake and sleep neural signals and (ii) the pretrain-finetune paradigm to incorporate different subjects, significantly contribute to this performance. Collectively, our findings and methodologies represent a significant advancement in the field of sleep decoding.