31 research outputs found

    Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference

    Full text link
    While increasingly deep networks are still in general desired for achieving state-of-the-art performance, for many specific inputs a simpler network might already suffice. Existing works exploited this observation by learning to skip convolutional layers in an input-dependent manner. However, we argue their binary decision scheme, i.e., either fully executing or completely bypassing one layer for a specific input, can be enhanced by introducing finer-grained, "softer" decisions. We therefore propose a Dynamic Fractional Skipping (DFS) framework. The core idea of DFS is to hypothesize layer-wise quantization (to different bitwidths) as intermediate "soft" choices to be made between fully utilizing and skipping a layer. For each input, DFS dynamically assigns a bitwidth to both weights and activations of each layer, where fully executing and skipping could be viewed as two "extremes" (i.e., full bitwidth and zero bitwidth). In this way, DFS can "fractionally" exploit a layer's expressive power during input-adaptive inference, enabling finer-grained accuracy-computational cost trade-offs. It presents a unified view to link input-adaptive layer skipping and input-adaptive hybrid quantization. Extensive experimental results demonstrate the superior trade-off between computational cost and model expressive power (accuracy) achieved by DFS. More visualizations also indicate a smooth and consistent transition in the DFS behaviors, especially the learned choices between layer skipping and different quantizations when the total computational budgets vary, validating our hypothesis that layer quantization could be viewed as intermediate variants of layer skipping. Our source code and supplementary material are available at https://github.com/Torment123/DFS.
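    The "zero bitwidth equals skipping" idea can be illustrated with a toy uniform quantizer. This is only a minimal sketch of the concept, not the paper's implementation; the function name and the quantization scheme are illustrative assumptions.

```python
import numpy as np

def quantize(x, bits):
    """Uniformly quantize an array to the given bitwidth.

    bits = 0 is treated as skipping the layer (output zeroed),
    mirroring the idea that skipping is the zero-bitwidth extreme
    and full precision is the other extreme.
    """
    if bits == 0:
        return np.zeros_like(x)          # "skip": zero bitwidth
    scale = (2 ** bits - 1) / (np.abs(x).max() + 1e-8)
    return np.round(x * scale) / scale   # snap to 2^bits - 1 uniform levels

# A toy activation tensor executed at different "fractional" choices.
x = np.linspace(-1.0, 1.0, 5)
for b in (0, 2, 8):
    print(b, quantize(x, b))
```

    Low bitwidths coarsely approximate the layer's output and high bitwidths recover it almost exactly, which is the continuum DFS selects from per input.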

    Dual Dynamic Inference: Enabling More Efficient, Adaptive and Controllable Deep Inference

    Full text link
    State-of-the-art convolutional neural networks (CNNs) yield record-breaking predictive performance, yet at the cost of high-energy-consumption inference, which prohibits their wide deployment in resource-constrained Internet of Things (IoT) applications. We propose a dual dynamic inference (DDI) framework that highlights the following aspects: 1) we integrate both input-dependent and resource-dependent dynamic inference mechanisms under a unified framework to fit the varying IoT resource requirements in practice. DDI is able both to constantly suppress unnecessary costs for easy samples and to halt inference for all samples to meet enforced hard resource constraints; 2) we propose a flexible multi-grained learning to skip (MGL2S) approach for input-dependent inference which allows simultaneous layer-wise and channel-wise skipping; 3) we extend DDI to complex CNN backbones such as DenseNet and show that DDI can be applied toward optimizing any specific resource goal, including inference latency or energy cost. Extensive experiments demonstrate the superior inference accuracy-resource trade-off achieved by DDI, as well as the flexibility to control such trade-offs compared to existing peer methods. Specifically, DDI can achieve up to 4 times computational savings with the same or even higher accuracy compared to existing competitive baselines.
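    Input-dependent layer skipping of the kind DDI builds on can be sketched with a per-layer gate that decides, from a cheap statistic of the current sample, whether to execute a layer or pass the input through unchanged. All names and the gating heuristic below are illustrative assumptions, not the paper's actual controller.

```python
import numpy as np

rng = np.random.default_rng(0)

def gate(features, threshold=0.5):
    """Toy input-dependent gate: skip the layer when a cheap
    confidence score for this sample reaches the threshold."""
    score = 1.0 / (1.0 + np.exp(-features.mean()))  # sigmoid of mean activation
    return score < threshold                         # True -> execute the layer

def forward(x, layers, threshold=0.5):
    """Run a stack of toy layers, skipping some per-sample."""
    executed = 0
    for w in layers:                  # each "layer" is just a weight matrix
        if gate(x, threshold):
            x = np.maximum(x @ w, 0)  # execute: linear + ReLU
            executed += 1
        # else: identity skip, as in residual-style layer skipping
    return x, executed

layers = [rng.standard_normal((8, 8)) * 0.1 for _ in range(4)]
x = rng.standard_normal(8)
_, n = forward(x, layers)
```

    Raising or lowering the threshold trades accuracy for computation, which is the knob a resource-dependent mechanism would turn to meet a hard budget.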

    ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation

    Full text link
    With large language models (LLMs) achieving remarkable breakthroughs in natural language processing (NLP) domains, LLM-enhanced recommender systems have received much attention and are being actively explored. In this paper, we focus on adapting and empowering a pure large language model for zero-shot and few-shot recommendation tasks. First and foremost, we identify and formulate the lifelong sequential behavior incomprehension problem for LLMs in recommendation domains, i.e., LLMs fail to extract useful information from a textual context of long user behavior sequences, even if the context length is far from reaching the context limit of LLMs. To address this issue and improve the recommendation performance of LLMs, we propose a novel framework, namely Retrieval-enhanced Large Language models (ReLLa), for recommendation tasks in both zero-shot and few-shot settings. For zero-shot recommendation, we perform semantic user behavior retrieval (SUBR) to improve the data quality of testing samples, which greatly reduces the difficulty for LLMs to extract the essential knowledge from user behavior sequences. As for few-shot recommendation, we further design retrieval-enhanced instruction tuning (ReiT) by adopting SUBR as a data augmentation technique for training samples. Specifically, we develop a mixed training dataset consisting of both the original data samples and their retrieval-enhanced counterparts. We conduct extensive experiments on a real-world public dataset (i.e., MovieLens-1M) to demonstrate the superiority of ReLLa compared with existing baseline models, as well as its capability for lifelong sequential behavior comprehension. Comment: Under Review.
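    Semantic retrieval over a behavior history, as SUBR does, can be sketched as ranking behavior embeddings by similarity to the target item and keeping only the most relevant ones for the prompt. The embeddings and function names below are stand-ins for a real encoder, assumed purely for illustration.

```python
import numpy as np

def retrieve_top_k(target, behaviors, k=2):
    """Toy semantic retrieval: rank historical behavior embeddings
    by cosine similarity to the target item embedding and keep the
    top-k, so the LLM's context holds only the most relevant history."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    scores = [cos(target, b) for b in behaviors]
    order = np.argsort(scores)[::-1][:k]   # highest similarity first
    return sorted(order.tolist())          # indices of the k kept behaviors

target = np.array([1.0, 0.0])
behaviors = [np.array([1.0, 0.1]),   # very similar to target
             np.array([0.0, 1.0]),   # orthogonal, likely irrelevant
             np.array([0.9, -0.1])]  # similar to target
print(retrieve_top_k(target, behaviors, k=2))  # -> [0, 2]
```

    Filtering the sequence this way shortens and focuses the textual context, which is how retrieval eases the long-sequence incomprehension the paper identifies.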

    How Can Recommender Systems Benefit from Large Language Models: A Survey

    Full text link
    Recommender systems (RS) play important roles in matching users' information needs in Internet applications. In natural language processing (NLP) domains, large language models (LLMs) have shown astonishing emergent abilities (e.g., instruction following, reasoning), thus giving rise to the promising research direction of adapting LLMs to RS for performance enhancements and user experience improvements. In this paper, we conduct a comprehensive survey on this research direction from an application-oriented view. We first summarize existing research works from two orthogonal perspectives: where and how to adapt LLMs to RS. For the "WHERE" question, we discuss the roles that LLMs could play in different stages of the recommendation pipeline, i.e., feature engineering, feature encoder, scoring/ranking function, and pipeline controller. For the "HOW" question, we investigate the training and inference strategies, resulting in two fine-grained taxonomy criteria, i.e., whether to tune LLMs or not, and whether to involve a conventional recommendation model (CRM) for inference. Detailed analysis and general development trajectories are provided for both questions, respectively. Then, we highlight key challenges in adapting LLMs to RS from three aspects, i.e., efficiency, effectiveness, and ethics. Finally, we summarize the survey and discuss the future prospects. We also actively maintain a GitHub repository for papers and other related resources in this rising direction: https://github.com/CHIANGEL/Awesome-LLM-for-RecSys. Comment: 15 pages; 3 figures; summarization table in appendix.

    CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models

    Full text link
    With the emergence of Large Language Models (LLMs), there has been a significant improvement in the programming capabilities of models, attracting growing attention from researchers. We propose CodeApex, a bilingual benchmark dataset focusing on the programming comprehension and code generation abilities of LLMs. CodeApex comprises three types of multiple-choice questions: conceptual understanding, commonsense reasoning, and multi-hop reasoning, designed to evaluate LLMs on programming comprehension tasks. Additionally, CodeApex utilizes algorithmic questions and corresponding test cases to assess the quality of code generated by LLMs. We evaluate 14 state-of-the-art LLMs, including both general-purpose and specialized models. GPT exhibits the best programming capabilities, achieving approximate accuracies of 50% and 56% on the two tasks, respectively. There is still significant room for improvement in programming tasks. We hope that CodeApex can serve as a reference for evaluating the coding capabilities of LLMs, further promoting their development and growth. Datasets are released at https://github.com/APEXLAB/CodeApex.git. The CodeApex submission website is https://apex.sjtu.edu.cn/codeapex/. Comment: 21 pages.
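    Grading generated code against test cases, as the benchmark's code-generation track does, amounts to running each candidate on known input/output pairs and reporting the pass fraction. The sketch below is a generic illustration of that idea, not CodeApex's actual harness; all names are assumptions.

```python
def pass_rate(candidate_fn, test_cases):
    """Toy functional-correctness check: run the candidate on each
    (inputs, expected) pair and report the fraction that passes."""
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate_fn(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing candidate simply fails that case
    return passed / len(test_cases)

# A model-generated "add two numbers" solution with a bug on negatives.
generated = lambda a, b: a + b if a >= 0 else a - b
cases = [((1, 2), 3), ((0, 5), 5), ((-1, 1), 0)]
print(pass_rate(generated, cases))  # -> 2/3
```

    Aggregating this rate over many problems yields the kind of per-task accuracy figure the abstract reports for each model.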

    Case report of a Li-Fraumeni syndrome-like phenotype with a de novo mutation in CHEK2

    Get PDF
    BACKGROUND: Cases of multiple tumors are rarely reported in China. In our study, a 57-year-old female patient had concurrent squamous cell carcinoma, mucoepidermoid carcinoma, brain cancer, bone cancer, and thyroid cancer, a combination that has rarely been reported to date. METHODS: To determine the relationship among these multiple cancers, available DNA samples from the thyroid, lung, and skin tumors and from normal thyroid tissue were sequenced using whole exome sequencing. RESULTS: The notable discrepancies in somatic mutations among the 3 tumor tissues indicated that they arose independently rather than metastasizing from 1 tumor. A novel deleterious germline mutation (chr22:29091846, G->A, p.H371Y) was identified in CHEK2, a Li–Fraumeni syndrome causal gene. Examining the status of this novel mutation in the patient's healthy siblings revealed its de novo origin. CONCLUSION: Our study reports the first case of a Li–Fraumeni syndrome-like phenotype in a Chinese patient and demonstrates the important contribution of de novo mutations in this type of rare disease.

    Full-length single-cell RNA-seq applied to a viral human cancer: applications to HPV expression and splicing analysis in HeLa S3 cells

    Get PDF
    Background: Viral infection causes multiple forms of human cancer, and HPV infection is the primary factor in cervical carcinomas. Recent single-cell RNA-seq studies highlight the tumor heterogeneity present in most cancers, but virally induced tumors have not been studied. HeLa is a well-characterized HPV+ cervical cancer cell line. Result: We developed a new high-throughput platform to prepare single-cell RNA on a nanoliter scale based on a customized microwell chip. Using this method, we successfully amplified full-length transcripts of 669 single HeLa S3 cells, and 40 of them were randomly selected for single-cell RNA sequencing. Based on these data, we obtained a comprehensive understanding of the heterogeneity of HeLa S3 cells in gene expression, alternative splicing, and fusions. Furthermore, we identified a high diversity of HPV-18 expression and splicing at the single-cell level. By co-expression analysis, we identified 283 E6/E7 co-regulated genes, including CDC25, PCNA, PLK4, BUB1B, and IRF1, known to interact with HPV viral proteins. Conclusion: Our results reveal the heterogeneity of a virus-infected cell line. It not only provides a transcriptome characterization of HeLa S3 cells at the single-cell level, but also demonstrates the power of single-cell RNA-seq analysis of virally infected cells and cancers.

    Investigation of Sandstone Mesostructure Damage Caused by Freeze-Thaw Cycles via CT Image Enhancement Technology

    No full text
    The mesostructures of rocks determine their macromechanical properties. These rock mesostructures may be altered by the freeze-thaw cycles in cold regions. In this regard, this paper proposes a quantitative evaluation method based on computed tomography (CT) scanning technology for investigating the mesostructure and damage characteristics of sandstone subjected to freeze-thaw conditions. CT scan images of two sandstones with different grain sizes were obtained after 0, 20, 40, 60, 80, and 100 freeze-thaw cycles, using a high-precision CT scanner. Based on the microphysical information contained in these CT images, pseudo-color-enhancement of the CT images of rocks subjected to freeze-thaw cycles was realized. The use of such a pseudo-color-enhancement technique can improve the resolution of CT images. Thus, particle detachment, crack initiation, crack propagation, and increased porosity due to the volumetric expansion of water inside the rock could be detected and clearly observed. Furthermore, a numerical expression for the mesostructure and damage information contained in the pseudo-color-enhanced images is presented herein; this serves as a convenient method for quantitative analyses of sandstone damage under freeze-thaw cycles. An analysis of the pseudo-color-enhanced images shows that, under freeze-thaw cycles, damage propagation in sandstone originates from existing damage or defect sites. After the stages of crack (pore) formation, penetration, and propagation, the freeze-thaw cycle-induced damage increases gradually, while the effective bearing area of the rock decreases continuously. Herein, a schematic of a conceptual model for the freeze-thaw cycle-induced deterioration in sandstone mesostructures is presented. Damage propagation models for sandstones with two different grain sizes subjected to freeze-thaw cycles were also developed. Based on the damage mechanics theory, a damage variable expressed in terms of the pore area was defined. Moreover, the relationship between this damage variable and the freeze-thaw cycles was established.
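    A pore-area damage variable of the kind the abstract describes is typically the ratio of pore/crack area to total cross-section area in a CT slice. The sketch below assumes a binary segmentation mask as input; the segmentation step and the function name are illustrative, not the paper's exact formulation.

```python
import numpy as np

def damage_variable(ct_mask):
    """Pore-area damage variable for one CT slice: the fraction of
    pixels classified as pore/crack (mask value 1) over the whole
    cross-section, i.e. D = A_pore / A_total. The binary mask stands
    in for a segmentation of the pseudo-color-enhanced image."""
    return ct_mask.sum() / ct_mask.size

# 4x4 toy slice: 2 of 16 pixels are pore space.
mask = np.zeros((4, 4), dtype=int)
mask[1, 1] = mask[2, 3] = 1
print(damage_variable(mask))  # -> 0.125
```

    Tracking this ratio across 0 to 100 freeze-thaw cycles would give the damage-versus-cycles relationship the study establishes, with D rising as the effective bearing area shrinks.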