Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference
While increasingly deep networks are generally desired for achieving
state-of-the-art performance, for many specific inputs a simpler network may
already suffice. Existing works exploit this observation by learning to skip
convolutional layers in an input-dependent manner. However, we argue their
binary decision scheme, i.e., either fully executing or completely bypassing
one layer for a specific input, can be enhanced by introducing finer-grained,
"softer" decisions. We therefore propose a Dynamic Fractional Skipping (DFS)
framework. The core idea of DFS is to hypothesize layer-wise quantization (to
different bitwidths) as intermediate "soft" choices to be made between fully
utilizing and skipping a layer. For each input, DFS dynamically assigns a
bitwidth to both weights and activations of each layer, where fully executing
and skipping could be viewed as two "extremes" (i.e., full bitwidth and zero
bitwidth). In this way, DFS can "fractionally" exploit a layer's expressive
power during input-adaptive inference, enabling finer-grained
accuracy-computational cost trade-offs. It presents a unified view to link
input-adaptive layer skipping and input-adaptive hybrid quantization. Extensive
experimental results demonstrate the superior trade-off between computational
cost and model expressive power (accuracy) achieved by DFS. More visualizations
also indicate a smooth and consistent transition in the DFS behaviors,
especially the learned choices between layer skipping and different
quantizations when the total computational budgets vary, validating our
hypothesis that layer quantization could be viewed as intermediate variants of
layer skipping. Our source code and supplementary material are available at
https://github.com/Torment123/DFS.
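To make the "soft" skipping idea concrete, here is a minimal pure-Python sketch (not the paper's implementation; the function names and the bitwidth choices are illustrative): a zero bitwidth bypasses a layer via the skip path, while any positive bitwidth runs the layer on a quantized input.

```python
def quantize(x, bits):
    """Uniform quantization of a value in [-1, 1] to the given bitwidth.
    A bitwidth of 0 encodes 'skip this layer entirely'."""
    if bits == 0:
        return None  # zero bitwidth: the layer is bypassed
    levels = 2 ** (bits - 1) - 1  # symmetric signed quantization levels
    return round(x * levels) / levels

def fractional_skip_layer(x, layer_fn, bits):
    """Run layer_fn at the chosen bitwidth, or fall back to the skip path."""
    q = quantize(x, bits)
    if q is None:
        return x            # "extreme" choice: full skip (identity)
    return layer_fn(q)      # "soft" choice: execute at reduced precision
```

Full bitwidth recovers ordinary execution, so layer skipping and hybrid quantization sit on one spectrum, as the abstract describes.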
Dual Dynamic Inference: Enabling More Efficient, Adaptive and Controllable Deep Inference
State-of-the-art convolutional neural networks (CNNs) yield record-breaking
predictive performance, yet at the cost of energy-intensive inference, which
prohibits their wide deployment in resource-constrained Internet of Things
(IoT) applications. We propose a dual dynamic inference (DDI) framework that
highlights the following aspects: 1) we integrate both input-dependent and
resource-dependent dynamic inference mechanisms under a unified framework in
order to fit varying IoT resource requirements in practice. DDI can both
suppress unnecessary costs for easy samples and halt inference early for all
samples to meet hard resource constraints; 2) we
propose a flexible multi-grained learning to skip (MGL2S) approach for
input-dependent inference which allows simultaneous layer-wise and channel-wise
skipping; 3) we extend DDI to complex CNN backbones such as DenseNet and show
that DDI can be applied towards optimizing any specific resource goals
including inference latency or energy cost. Extensive experiments demonstrate
the superior inference accuracy-resource trade-off achieved by DDI, as well as
the flexibility to control such trade-offs compared to existing peer methods.
Specifically, DDI achieves up to 4x computational savings at the same or even
higher accuracy compared to existing competitive baselines.
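The interplay of the two mechanisms can be sketched as follows. This is a hypothetical pure-Python illustration (DDI's gates are learned and its cost model is hardware-specific; everything here is invented for exposition) of input-dependent layer/channel skipping under a hard resource budget:

```python
def ddi_inference(x, layers, layer_gates, channel_masks, budget):
    """Run a stack of layers with two dynamic mechanisms:
    - input-dependent: per-layer gates and channel masks decide what to
      skip for this particular input;
    - resource-dependent: inference halts once the cost budget is spent."""
    spent = 0.0
    for (cost, fn), keep, mask in zip(layers, layer_gates, channel_masks):
        if spent + cost > budget:   # hard constraint: halt for all samples
            break
        if not keep:                # easy input: skip this layer at no cost
            continue
        x = fn(x, mask)             # channel-wise skipping applied inside fn
        spent += cost
    return x, spent
```

The same loop covers both regimes the abstract names: an ample budget leaves only input-dependent skipping active, while a tight budget forces early halting regardless of the input.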
ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation
With large language models (LLMs) achieving remarkable breakthroughs in
natural language processing (NLP) domains, LLM-enhanced recommender systems
have received much attention and are now being actively explored. In this
paper, we focus on adapting and empowering a pure large language model for
zero-shot and few-shot recommendation tasks. First and foremost, we identify
and formulate the lifelong sequential behavior incomprehension problem for LLMs
in recommendation domains, i.e., LLMs fail to extract useful information from
the textual context of a long user behavior sequence, even when the context
length is far below the context limit of LLMs. To address this issue
and improve the recommendation performance of LLMs, we propose a novel
framework, namely Retrieval-enhanced Large Language models (ReLLa) for
recommendation tasks in both zero-shot and few-shot settings. For zero-shot
recommendation, we perform semantic user behavior retrieval (SUBR) to improve
the data quality of testing samples, which greatly reduces the difficulty for
LLMs to extract the essential knowledge from user behavior sequences. As for
few-shot recommendation, we further design retrieval-enhanced instruction
tuning (ReiT) by adopting SUBR as a data augmentation technique for training
samples. Specifically, we develop a mixed training dataset consisting of both
the original data samples and their retrieval-enhanced counterparts. We conduct
extensive experiments on a real-world public dataset (i.e., MovieLens-1M) to
demonstrate the superiority of ReLLa compared with existing baseline models, as
well as its capability for lifelong sequential behavior comprehension.
Comment: Under Review
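The retrieval step (SUBR) replaces "take the most recent k behaviors" with "take the k behaviors most semantically relevant to the target item." A minimal sketch with toy embeddings follows; the item names and vectors are hypothetical, and ReLLa's actual pipeline obtains embeddings from an LLM-based encoder rather than hand-written lists:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length, nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def semantic_user_behavior_retrieval(target_emb, history, k):
    """Pick the k past behaviors most similar to the target item, so the
    prompt's context stays short but information-dense."""
    ranked = sorted(history, key=lambda item: cosine(item[1], target_emb),
                    reverse=True)
    return [name for name, _ in ranked[:k]]
```

Because the retrieved behaviors are chosen for relevance rather than recency, the LLM sees a denser context, which is the mechanism behind the claimed reduction in comprehension difficulty.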
Towards Open-World Recommendation with Knowledge Augmentation from Large Language Models
Recommender systems play a vital role in various online services. However,
the insulated nature of training and deploying separately within a specific
domain limits their access to open-world knowledge. Recently, the emergence of
large language models (LLMs) has shown promise in bridging this gap by encoding
extensive world knowledge and demonstrating reasoning capability. Nevertheless,
previous attempts to directly use LLMs as recommenders have not achieved
satisfactory results. In this work, we propose an Open-World Knowledge
Augmented Recommendation Framework with Large Language Models, dubbed KAR, to
acquire two types of external knowledge from LLMs -- the reasoning knowledge on
user preferences and the factual knowledge on items. We introduce factorization
prompting to elicit accurate reasoning on user preferences. The generated
reasoning and factual knowledge are effectively transformed and condensed into
augmented vectors by a hybrid-expert adaptor in order to be compatible with the
recommendation task. The obtained vectors can then be directly used to enhance
the performance of any recommendation model. We also ensure efficient inference
by preprocessing and prestoring the knowledge from the LLM. Extensive
experiments show that KAR significantly outperforms the state-of-the-art
baselines and is compatible with a wide range of recommendation algorithms. We
deploy KAR to Huawei's news and music recommendation platforms and gain a 7\%
and 1.7\% improvement in the online A/B test, respectively
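How the hybrid-expert adaptor condenses LLM-generated knowledge into recommendation-ready vectors can be pictured as a gated mixture of small projections. The sketch below is purely illustrative (dimensions, gate values, and expert count are made up; it is not KAR's trained module):

```python
def linear_expert(vec, weights):
    """One small expert: a plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def hybrid_expert_adaptor(knowledge_vec, experts, gate):
    """Condense an encoded knowledge vector (reasoning or factual text from
    the LLM, already embedded) into an augmented vector by gating several
    expert projections; gate weights are assumed to sum to 1."""
    outputs = [linear_expert(knowledge_vec, w) for w in experts]
    dim = len(outputs[0])
    return [sum(g * out[i] for g, out in zip(gate, outputs))
            for i in range(dim)]
```

The resulting vector can be fed alongside a recommendation model's existing features, which is why the abstract can claim compatibility with any backbone.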
How Can Recommender Systems Benefit from Large Language Models: A Survey
Recommender systems (RS) play an important role in matching users' information
needs in Internet applications. In natural language processing (NLP) domains,
large language models (LLMs) have shown astonishing emergent abilities (e.g.,
instruction following, reasoning), giving rise to the promising research
direction of adapting LLMs to RS for performance enhancement and improved user
experience. In this paper, we conduct a comprehensive survey on
this research direction from an application-oriented view. We first summarize
existing research works from two orthogonal perspectives: where and how to
adapt LLM to RS. For the "WHERE" question, we discuss the roles that LLM could
play in different stages of the recommendation pipeline, i.e., feature
engineering, feature encoder, scoring/ranking function, and pipeline
controller. For the "HOW" question, we investigate the training and inference
strategies, resulting in two fine-grained taxonomy criteria, i.e., whether to
tune LLMs or not, and whether to involve a conventional recommendation model
(CRM) for inference. Detailed analyses and general development trajectories are
provided for both questions, respectively. Then, we highlight key challenges in
adapting LLM to RS from three aspects, i.e., efficiency, effectiveness, and
ethics. Finally, we summarize the survey and discuss the future prospects. We
also actively maintain a GitHub repository for papers and other related
resources in this rising direction:
https://github.com/CHIANGEL/Awesome-LLM-for-RecSys.
Comment: 15 pages; 3 figures; summarization table in appendix
CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models
With the emergence of Large Language Models (LLMs), there has been a
significant improvement in the programming capabilities of models, attracting
growing attention from researchers. We propose CodeApex, a bilingual benchmark
dataset focusing on the programming comprehension and code generation abilities
of LLMs. CodeApex comprises three types of multiple-choice questions:
conceptual understanding, commonsense reasoning, and multi-hop reasoning,
designed to evaluate LLMs on programming comprehension tasks. Additionally,
CodeApex utilizes algorithmic questions and corresponding test cases to assess
the code quality generated by LLMs. We evaluate 14 state-of-the-art LLMs,
including both general-purpose and specialized models. GPT exhibits the best
programming capabilities, achieving approximate accuracies of 50% and 56% on
the two tasks, respectively. There is still significant room for improvement in
programming tasks. We hope that CodeApex can serve as a reference for
evaluating the coding capabilities of LLMs, further promoting their development
and growth. Datasets are released at https://github.com/APEXLAB/CodeApex.git.
The CodeApex submission website is https://apex.sjtu.edu.cn/codeapex/.
Comment: 21 pages
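For the code-generation half of the benchmark, scoring a generated solution against its test cases can be sketched like this. It is a hypothetical grader, not CodeApex's actual harness (which would compile and sandbox real submissions):

```python
def pass_rate(candidate_fn, test_cases):
    """Fraction of (args, expected) pairs the generated function gets right;
    crashing on a case counts as failing it."""
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate_fn(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing solution fails that case
    return passed / len(test_cases)
```

Aggregating this rate over many algorithmic questions yields the kind of per-model accuracy the abstract reports for code quality.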
Case report of a Li-Fraumeni syndrome-like phenotype with a de novo mutation in <i>CHEK2</i>
BACKGROUND: Cases of multiple tumors are rarely reported in China. In our study, a 57-year-old female patient had concurrent squamous cell carcinoma, mucoepidermoid carcinoma, brain cancer, bone cancer, and thyroid cancer, a combination that has rarely been reported to date. METHODS: To determine the relationship among these multiple cancers, available DNA samples from the thyroid, lung, and skin tumors and from normal thyroid tissue were sequenced using whole exome sequencing. RESULTS: The notable discrepancies in somatic mutations among the 3 tumor tissues indicated that they arose independently, rather than metastasizing from 1 tumor. A novel deleterious germline mutation (chr22:29091846, G->A, p.H371Y) was identified in CHEK2, a Li–Fraumeni syndrome causal gene. Examining the status of this novel mutation in the patient's healthy siblings revealed its de novo origin. CONCLUSION: Our study reports the first case of a Li–Fraumeni syndrome-like phenotype in a Chinese patient and demonstrates the important contribution of de novo mutations to this type of rare disease.
Full-length single-cell RNA-seq applied to a viral human cancer: applications to HPV expression and splicing analysis in HeLa S3 cells
Background: Viral infection causes multiple forms of human cancer, and HPV infection is the primary factor in cervical carcinomas. Recent single-cell RNA-seq studies highlight the tumor heterogeneity present in most cancers, but virally induced tumors have not been studied. HeLa is a well-characterized HPV+ cervical cancer cell line. Results: We developed a new high-throughput platform to prepare single-cell RNA on a nanoliter scale based on a customized microwell chip. Using this method, we successfully amplified full-length transcripts of 669 single HeLa S3 cells, and 40 of them were randomly selected for single-cell RNA sequencing. Based on these data, we obtained a comprehensive understanding of the heterogeneity of HeLa S3 cells in gene expression, alternative splicing, and fusions. Furthermore, we identified a high diversity of HPV-18 expression and splicing at the single-cell level. By co-expression analysis, we identified 283 E6/E7 co-regulated genes, including CDC25, PCNA, PLK4, BUB1B, and IRF1, which are known to interact with HPV viral proteins. Conclusion: Our results reveal the heterogeneity of a virus-infected cell line. This work not only provides a transcriptome characterization of HeLa S3 cells at the single-cell level, but also demonstrates the power of single-cell RNA-seq analysis of virally infected cells and cancers.
Multiple Sulfur Isotopes of Iron Sulfides From Thick Greigite‐Bearing Sediments Indicate Anaerobic Oxidation and Possible Leakages of Coastal Methane
Magnetic greigite may be a valuable indicator of methane emissions in the geological past, if its formation pathway and diagenetic environment can be unambiguously defined. Here, we investigate the sulfur isotopic compositions of iron sulfides and the ferrous iron concentrations of thick greigite‐bearing sediments (TGBSs) in the South Yellow Sea, a shallow marginal sea with strong methane emissions. For the first time, isotopically heavy iron sulfides (up to 28.7‰ in δ34S and 0.19‰ in Δ33S) and enrichments of ferrous iron are observed in the TGBSs. We interpret the data as indicating the synchronous occurrence of anaerobic oxidation of methane coupled to sulfate and iron reduction, which occurs in coastal methanic zones with limited sulfate availability and therefore probably implies leakages of methane. Consequently, we suggest that greigite is a promising geological indicator for tracing methane liberated from coastal sediments, which account for ∼60% of the currently rising global marine methane budget.