33 research outputs found

    Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference

    Full text link
    While increasingly deep networks are in general still desired for achieving state-of-the-art performance, for many specific inputs a simpler network might already suffice. Existing works exploit this observation by learning to skip convolutional layers in an input-dependent manner. However, we argue that their binary decision scheme, i.e., either fully executing or completely bypassing one layer for a specific input, can be enhanced by introducing finer-grained, "softer" decisions. We therefore propose a Dynamic Fractional Skipping (DFS) framework. The core idea of DFS is to hypothesize layer-wise quantization (to different bitwidths) as intermediate "soft" choices between fully utilizing and skipping a layer. For each input, DFS dynamically assigns a bitwidth to both the weights and activations of each layer, where full execution and skipping can be viewed as the two "extremes" (i.e., full bitwidth and zero bitwidth). In this way, DFS can "fractionally" exploit a layer's expressive power during input-adaptive inference, enabling finer-grained accuracy-computational cost trade-offs. It also presents a unified view that links input-adaptive layer skipping and input-adaptive hybrid quantization. Extensive experimental results demonstrate the superior trade-off between computational cost and model expressive power (accuracy) achieved by DFS. Visualizations further indicate a smooth and consistent transition in DFS behaviors, especially in the learned choices between layer skipping and different quantizations as the total computational budget varies, validating our hypothesis that layer quantization can be viewed as an intermediate variant of layer skipping. Our source code and supplementary material are available at https://github.com/Torment123/DFS.
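    The bitwidth-as-skipping-spectrum idea can be illustrated with a minimal sketch (hypothetical helper names, not the authors' code): a zero bitwidth zeroes a layer's contribution, which in a residual layer amounts to a pure skip, while larger bitwidths fractionally recover full execution.

```python
def quantize(x, bits, x_max=1.0):
    # Uniform quantization of scalar x in [-x_max, x_max] to `bits` bits.
    # bits == 0 is the "skip" extreme: the quantized value contributes nothing.
    if bits == 0:
        return 0.0
    levels = 2 ** bits - 1
    step = 2 * x_max / levels
    x = max(-x_max, min(x_max, x))        # clamp to the representable range
    return round(x / step) * step


def fractional_layer(x, weight, bits):
    # Toy residual "layer" y = x + q(w) * x: bits=0 reduces to the identity
    # (layer skip), and large bits approaches full-precision execution.
    return x + quantize(weight, bits) * x
```

    With bits=0 the input passes through unchanged; with 8 bits the output is already within ~1% of the full-precision result, illustrating how intermediate bitwidths interpolate between the two extremes.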

    Dual Dynamic Inference: Enabling More Efficient, Adaptive and Controllable Deep Inference

    Full text link
    State-of-the-art convolutional neural networks (CNNs) yield record-breaking predictive performance, yet at the cost of energy-intensive inference, which prohibits their wide deployment in resource-constrained Internet of Things (IoT) applications. We propose a dual dynamic inference (DDI) framework that highlights the following aspects: 1) we integrate both input-dependent and resource-dependent dynamic inference mechanisms under a unified framework to fit the varying resource requirements of IoT deployments in practice; DDI both suppresses unnecessary computation for easy samples and can halt inference for all samples when hard resource constraints are enforced; 2) we propose a flexible multi-grained learning-to-skip (MGL2S) approach for input-dependent inference that allows simultaneous layer-wise and channel-wise skipping; 3) we extend DDI to complex CNN backbones such as DenseNet and show that DDI can be applied toward optimizing specific resource goals, including inference latency and energy cost. Extensive experiments demonstrate the superior inference accuracy-resource trade-off achieved by DDI, as well as its flexibility in controlling such trade-offs compared to existing peer methods. Specifically, DDI can achieve up to 4x computational savings at the same or even higher accuracy compared to existing competitive baselines.
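    The multi-grained skipping decision can be sketched as follows (illustrative only; `skip_decision` and its thresholds are assumptions, not the paper's implementation): a cheap per-input gate score selects between bypassing a layer entirely, executing a subset of its channels, or running it fully.

```python
def skip_decision(gate_score, n_channels, t_skip=0.2, t_full=0.8):
    # Multi-grained skipping sketch: map a gate score in [0, 1] to one of
    # three execution modes, returning (mode, number of channels to run).
    if gate_score < t_skip:
        return "layer_skip", 0            # coarse grain: bypass the layer
    if gate_score >= t_full:
        return "full", n_channels         # execute every channel
    # fine grain: keep a fraction of channels proportional to the score
    kept = max(1, int(round(gate_score * n_channels)))
    return "channel_skip", kept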

    ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation

    Full text link
    With large language models (LLMs) achieving remarkable breakthroughs in natural language processing (NLP), LLM-enhanced recommender systems have recently received much attention and are being actively explored. In this paper, we focus on adapting and empowering a pure large language model for zero-shot and few-shot recommendation tasks. First and foremost, we identify and formulate the lifelong sequential behavior incomprehension problem for LLMs in recommendation domains: LLMs fail to extract useful information from the textual context of a long user behavior sequence, even when the length of that context is far below the context limit of the LLM. To address this issue and improve the recommendation performance of LLMs, we propose a novel framework, Retrieval-enhanced Large Language models (ReLLa), for recommendation tasks in both zero-shot and few-shot settings. For zero-shot recommendation, we perform semantic user behavior retrieval (SUBR) to improve the data quality of test samples, which greatly reduces the difficulty for LLMs of extracting essential knowledge from user behavior sequences. For few-shot recommendation, we further design retrieval-enhanced instruction tuning (ReiT), adopting SUBR as a data augmentation technique for training samples. Specifically, we develop a mixed training dataset consisting of both the original data samples and their retrieval-enhanced counterparts. We conduct extensive experiments on a real-world public dataset (MovieLens-1M) to demonstrate the superiority of ReLLa over existing baseline models, as well as its capability for lifelong sequential behavior comprehension. Comment: under review.
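    The semantic user behavior retrieval (SUBR) step can be sketched as follows (a toy illustration under assumed names; the paper's retrieval operates on learned semantic embeddings of behavior items): instead of truncating the history to the most recent behaviors, rank it by semantic similarity to the target item and keep the top-k.

```python
import math


def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)


def semantic_retrieve(target_emb, history, k=2):
    # history: list of (item_name, embedding) pairs from the user's past
    # behaviors. Return the k items most similar to the target item, so the
    # LLM's limited context is filled with relevant rather than recent items.
    ranked = sorted(history, key=lambda item: cosine(item[1], target_emb),
                    reverse=True)
    return [name for name, _ in ranked[:k]]
```

    For example, with a target embedding of (1.0, 0.0), a history of items "A" (1.0, 0.1), "B" (0.0, 1.0), and "C" (0.9, 0.2) retrieves ["A", "C"], dropping the semantically unrelated "B" even if it is the most recent behavior.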

    Towards Open-World Recommendation with Knowledge Augmentation from Large Language Models

    Full text link
    Recommender systems play a vital role in various online services. However, being trained and deployed in isolation within a specific domain limits their access to open-world knowledge. Recently, the emergence of large language models (LLMs) has shown promise in bridging this gap by encoding extensive world knowledge and demonstrating reasoning capability. Nevertheless, previous attempts to use LLMs directly as recommenders have not achieved satisfactory results. In this work, we propose an Open-World Knowledge Augmented Recommendation Framework with Large Language Models, dubbed KAR, to acquire two types of external knowledge from LLMs: reasoning knowledge on user preferences and factual knowledge on items. We introduce factorization prompting to elicit accurate reasoning on user preferences. The generated reasoning and factual knowledge are transformed and condensed into augmented vectors by a hybrid-expert adaptor so as to be compatible with the recommendation task. The obtained vectors can then be directly used to enhance the performance of any recommendation model. We also ensure efficient inference by preprocessing and prestoring the knowledge from the LLM. Extensive experiments show that KAR significantly outperforms state-of-the-art baselines and is compatible with a wide range of recommendation algorithms. We deploy KAR on Huawei's news and music recommendation platforms and obtain improvements of 7% and 1.7%, respectively, in online A/B tests.
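    A minimal sketch of the hybrid-expert adaptor idea (hypothetical shapes and names; the actual module is a learned neural adaptor trained with the recommender): several expert projections transform the LLM knowledge encoding, and a softmax gate mixes their outputs into one augmented vector.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def matvec(M, v):
    # Multiply matrix M (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in M]


def hybrid_expert_adaptor(k_vec, experts, gate_W):
    # k_vec: encoding of the LLM-generated knowledge text.
    # experts: list of projection matrices; gate_W: gating matrix.
    # Each expert projects the knowledge, and a softmax gate mixes the
    # projections into a single augmented vector for the recommender.
    gate = softmax(matvec(gate_W, k_vec))
    outs = [matvec(E, k_vec) for E in experts]
    return [sum(g * o[i] for g, o in zip(gate, outs))
            for i in range(len(outs[0]))]
```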

    How Can Recommender Systems Benefit from Large Language Models: A Survey

    Full text link
    Recommender systems (RS) play an important role in matching users' information needs in Internet applications. In natural language processing (NLP), large language models (LLMs) have shown astonishing emergent abilities (e.g., instruction following, reasoning), giving rise to the promising research direction of adapting LLMs to RS for performance enhancement and user experience improvement. In this paper, we conduct a comprehensive survey of this research direction from an application-oriented view. We first summarize existing research along two orthogonal perspectives: where and how to adapt LLMs to RS. For the "WHERE" question, we discuss the roles that LLMs could play at different stages of the recommendation pipeline, i.e., feature engineering, feature encoder, scoring/ranking function, and pipeline controller. For the "HOW" question, we investigate training and inference strategies, resulting in two fine-grained taxonomy criteria: whether to tune the LLM, and whether to involve a conventional recommendation model (CRM) in inference. We provide detailed analysis and general development trajectories for both questions. We then highlight key challenges in adapting LLMs to RS from three aspects: efficiency, effectiveness, and ethics. Finally, we summarize the survey and discuss future prospects. We also actively maintain a GitHub repository of papers and other related resources in this rising direction: https://github.com/CHIANGEL/Awesome-LLM-for-RecSys. Comment: 15 pages; 3 figures; summarization table in appendix.

    CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models

    Full text link
    With the emergence of Large Language Models (LLMs), there has been a significant improvement in the programming capabilities of models, attracting growing attention from researchers. We propose CodeApex, a bilingual benchmark dataset focusing on the programming comprehension and code generation abilities of LLMs. CodeApex comprises three types of multiple-choice questions: conceptual understanding, commonsense reasoning, and multi-hop reasoning, designed to evaluate LLMs on programming comprehension tasks. Additionally, CodeApex uses algorithmic questions and corresponding test cases to assess the quality of code generated by LLMs. We evaluate 14 state-of-the-art LLMs, including both general-purpose and specialized models. GPT exhibits the best programming capabilities, achieving approximate accuracies of 50% and 56% on the two tasks, respectively; there is still significant room for improvement on programming tasks. We hope that CodeApex can serve as a reference for evaluating the coding capabilities of LLMs, further promoting their development and growth. Datasets are released at https://github.com/APEXLAB/CodeApex.git. The CodeApex submission website is https://apex.sjtu.edu.cn/codeapex/. Comment: 21 pages.
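    Test-case-based code-quality assessment of this kind can be illustrated with a small harness (an assumption about the general approach, not CodeApex's actual evaluation code): run the model-generated function on each test case and count passes, scoring exceptions as failures.

```python
def pass_rate(candidate, test_cases):
    # candidate: a callable produced from model-generated code.
    # test_cases: list of (args_tuple, expected_output) pairs.
    # Returns the fraction of cases the candidate passes; any exception
    # raised by the candidate counts as a failed case.
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            pass  # crashing code fails the case rather than the harness
    return passed / len(test_cases)
```

    For instance, a candidate `lambda a, b: a + b` checked against the cases `((1, 2), 3)` and `((2, 2), 5)` scores 0.5: it passes the first case and fails the (deliberately wrong) second one.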

    Case report of a Li-Fraumeni syndrome-like phenotype with a de novo mutation in CHEK2

    Get PDF
    BACKGROUND: Cases of multiple tumors are rarely reported in China. In our study, a 57-year-old female patient had concurrent squamous cell carcinoma, mucoepidermoid carcinoma, brain cancer, bone cancer, and thyroid cancer, a combination that has rarely been reported to date. METHODS: To determine the relationship among these multiple cancers, available DNA samples from the thyroid, lung, and skin tumors and from normal thyroid tissue were sequenced using whole-exome sequencing. RESULTS: The notable discrepancies in somatic mutations among the 3 tumor tissues indicated that they arose independently rather than metastasizing from 1 tumor. A novel deleterious germline mutation (chr22:29091846, G->A, p.H371Y) was identified in CHEK2, a Li-Fraumeni syndrome causal gene. Examining the status of this novel mutation in the patient's healthy siblings revealed its de novo origin. CONCLUSION: Our study reports the first Li-Fraumeni syndrome-like case in a Chinese patient and demonstrates the important contribution of de novo mutations to this type of rare disease.

    Full-length single-cell RNA-seq applied to a viral human cancer: applications to HPV expression and splicing analysis in HeLa S3 cells

    Get PDF
    Background: Viral infection causes multiple forms of human cancer, and HPV infection is the primary factor in cervical carcinomas. Recent single-cell RNA-seq studies highlight the tumor heterogeneity present in most cancers, but virally induced tumors have not been studied. HeLa is a well-characterized HPV+ cervical cancer cell line. Results: We developed a new high-throughput platform to prepare single-cell RNA on a nanoliter scale based on a customized microwell chip. Using this method, we successfully amplified full-length transcripts of 669 single HeLa S3 cells, 40 of which were randomly selected for single-cell RNA sequencing. Based on these data, we obtained a comprehensive understanding of the heterogeneity of HeLa S3 cells in gene expression, alternative splicing, and fusions. Furthermore, we identified a high diversity of HPV-18 expression and splicing at the single-cell level. By co-expression analysis we identified 283 E6- and E7-co-regulated genes, including CDC25, PCNA, PLK4, BUB1B, and IRF1, which are known to interact with HPV viral proteins. Conclusion: Our results reveal the heterogeneity of a virus-infected cell line. The study not only provides a transcriptome characterization of HeLa S3 cells at the single-cell level, but also demonstrates the power of single-cell RNA-seq analysis of virally infected cells and cancers.

    Multiple Sulfur Isotopes of Iron Sulfides From Thick Greigite‐Bearing Sediments Indicate Anaerobic Oxidation and Possible Leakages of Coastal Methane

    No full text
    Abstract Magnetic greigite may be a valuable indicator of methane emissions in the geological past, if its formation pathway and diagenetic environment can be unambiguously defined. Here, we investigate the sulfur isotopic compositions of iron sulfides and the ferrous iron concentrations of thick greigite-bearing sediments (TGBSs) in the South Yellow Sea, a shallow marginal sea with strong methane emissions. For the first time, isotopically heavy iron sulfides (up to 28.7‰ in δ34S and 0.19‰ in Δ33S) and enrichments of ferrous iron are observed in the TGBSs. We interpret these data as indicating the concurrent operation of anaerobic oxidation of methane coupled to sulfate and iron reduction, which occurs in coastal methanic zones with limited sulfate availability and therefore probably implies leakages of methane. Consequently, we suggest that greigite is a promising geological indicator for tracing methane liberated from coastal sediments, which account for ∼60% of the currently rising global marine methane budget.