195 research outputs found
Parameter-free Dynamic Graph Embedding for Link Prediction
Dynamic interaction graphs have been widely adopted to model the evolution of
user-item interactions over time. There are two crucial factors when modelling
user preferences for link prediction in dynamic interaction graphs: 1)
collaborative relationship among users and 2) user personalized interaction
patterns. Existing methods often implicitly consider these two factors
together, which may lead to noisy user modelling when the two factors diverge.
In addition, they usually require time-consuming parameter learning with
back-propagation, which is prohibitive for real-time user preference modelling.
To this end, this paper proposes FreeGEM, a parameter-free dynamic graph
embedding method for link prediction. Firstly, to take advantage of the
collaborative relationships, we propose an incremental graph embedding engine
to obtain user/item embeddings, which is an Online-Monitor-Offline architecture
consisting of an Online module to approximately embed users/items over time, a
Monitor module to estimate the approximation error in real time and an Offline
module to calibrate the user/item embeddings when the online approximation
errors exceed a threshold. Meanwhile, we integrate attribute information into
the model, which enables FreeGEM to better model users belonging to some
underrepresented groups. Secondly, we design a personalized dynamic interaction
pattern modeller, which combines dynamic time decay with attention mechanism to
model user short-term interests. Experimental results on two link prediction
tasks show that FreeGEM can outperform the state-of-the-art methods in accuracy
while achieving over 36X improvement in efficiency. All code and datasets can
be found at https://github.com/FudanCISL/FreeGEM.
Comment: 19 pages, 9 figures, 13 tables, Thirty-Sixth Conference on Neural
Information Processing Systems (NeurIPS 2022), preprint versio
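The FreeGEM abstract above describes combining dynamic time decay with an attention mechanism to model short-term interests. The following is a minimal sketch of that general idea only, not the paper's actual formulation: the function name `time_decay_attention`, the linear age penalty, and the `decay_rate` value are all assumptions made for illustration. It uses no learned parameters, in keeping with the parameter-free setting.

```python
import math

def time_decay_attention(query, item_vecs, timestamps, now, decay_rate=0.1):
    """Pool item vectors with softmax weights over
    (dot-product similarity minus a linear age penalty).
    Hypothetical helper; decay_rate is an assumed constant, not learned."""
    scores = []
    for vec, t in zip(item_vecs, timestamps):
        similarity = sum(q * v for q, v in zip(query, vec))
        scores.append(similarity - decay_rate * (now - t))
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(query)
    pooled = [sum(w * vec[i] for w, vec in zip(weights, item_vecs))
              for i in range(dim)]
    return weights, pooled

# Items 0 and 1 are equally similar to the query, but item 1 is more recent.
weights, pooled = time_decay_attention(
    query=[1.0, 0.0],
    item_vecs=[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    timestamps=[1.0, 9.0, 9.0],
    now=10.0,
)
```

In this toy input the recent, similar item receives the largest weight, the older but equally similar item comes second, and the recent but dissimilar item comes last.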
Enhancing CTR Prediction with Context-Aware Feature Representation Learning
CTR prediction has been widely used in the real world. Many methods model
feature interaction to improve their performance. However, most methods only
learn a fixed representation for each feature without considering the varying
importance of each feature under different contexts, resulting in inferior
performance. Recently, several methods tried to learn vector-level weights for
feature representations to address the fixed representation issue. However,
they only produce linear transformations to refine the fixed feature
representations, which are still not flexible enough to capture the varying
importance of each feature under different contexts. In this paper, we propose
a novel module named Feature Refinement Network (FRNet), which learns
context-aware feature representations at bit-level for each feature in
different contexts. FRNet consists of two key components: 1) Information
Extraction Unit (IEU), which captures contextual information and cross-feature
relationships to guide context-aware feature refinement; and 2) Complementary
Selection Gate (CSGate), which adaptively integrates the original and
complementary feature representations learned in IEU with bit-level weights.
Notably, FRNet is orthogonal to existing CTR methods and thus can be applied in
many existing methods to boost their performance. Comprehensive experiments are
conducted to verify the effectiveness, efficiency, and compatibility of FRNet.
Comment: SIGIR 202
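The FRNet abstract above says the Complementary Selection Gate integrates the original and complementary feature representations with bit-level weights. One common way to realise such a gate is an element-wise sigmoid blend, sketched below. This is an assumption about the general shape of the computation, not the paper's exact definition; `complementary_gate` and `gate_logits` are hypothetical names, and in a real model the logits would come from a learned network.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def complementary_gate(original, complementary, gate_logits):
    """Blend two representations element by element ("bit-level"):
    out[i] = g[i] * original[i] + (1 - g[i]) * complementary[i],
    where g[i] = sigmoid(gate_logits[i])."""
    refined = []
    for e, c, z in zip(original, complementary, gate_logits):
        g = sigmoid(z)
        refined.append(g * e + (1.0 - g) * c)
    return refined

# A logit of 0 averages the two inputs; a very negative logit
# selects the complementary value almost entirely.
refined = complementary_gate(
    original=[1.0, 2.0],
    complementary=[3.0, 6.0],
    gate_logits=[0.0, -100.0],
)
```

Because each dimension has its own gate value, the blend can keep some coordinates of the original embedding while replacing others, which is what distinguishes a bit-level gate from a single vector-level weight.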
AutoSeqRec: Autoencoder for Efficient Sequential Recommendation
Sequential recommendation demonstrates the capability to recommend items by
modeling the sequential behavior of users. Traditional methods typically treat
users as sequences of items, overlooking the collaborative relationships among
them. Graph-based methods incorporate collaborative information by utilizing
the user-item interaction graph. However, these methods sometimes face
challenges in terms of time complexity and computational efficiency. To address
these limitations, this paper presents AutoSeqRec, an incremental
recommendation model specifically designed for sequential recommendation tasks.
AutoSeqRec is based on autoencoders and consists of an encoder and three
decoders within the autoencoder architecture. These components consider both
the user-item interaction matrix and the rows and columns of the item
transition matrix. The reconstruction of the user-item interaction matrix
captures user long-term preferences through collaborative filtering. In
addition, the rows and columns of the item transition matrix represent the item
out-degree and in-degree hopping behavior, which allows for modeling the user's
short-term interests. When making incremental recommendations, only the input
matrices need to be updated, without the need to update parameters, which makes
AutoSeqRec very efficient. Comprehensive evaluations demonstrate that
AutoSeqRec outperforms existing methods in terms of accuracy, while showcasing
its robustness and efficiency.
Comment: 10 pages, accepted by CIKM 202
A Comprehensive Summarization and Evaluation of Feature Refinement Modules for CTR Prediction
Click-through rate (CTR) prediction is widely used in academia and industry.
Most CTR tasks fall into a feature embedding & feature interaction paradigm,
where the accuracy of CTR prediction is mainly improved by designing practical
feature interaction structures. However, recent studies have argued that the
fixed feature embedding learned only through the embedding layer limits the
performance of existing CTR models. Some works apply extra modules on top of
the embedding layer to dynamically refine feature representations in different
instances, making it effective and easy to integrate with existing CTR methods.
Despite the promising results, there is a lack of a systematic review and
summarization of this new promising direction on the CTR task. To fill this
gap, we comprehensively summarize and define a new module, namely
feature refinement (FR) module, that can be applied between feature
embedding and interaction layers. We extract 14 FR modules from previous works,
including instances where the FR module was proposed but not clearly defined or
explained. We fully assess the effectiveness and compatibility of existing FR
modules through comprehensive and extensive experiments with over 200 augmented
models and over 4,000 runs for more than 15,000 GPU hours. The results offer
insightful guidelines for researchers, and all benchmarking code and
experimental results are open-sourced. In addition, we present a new
architecture of assigning independent FR modules to separate sub-networks for
parallel CTR models, as opposed to the conventional method of inserting a
shared FR module on top of the embedding layer. Our approach is also supported
by comprehensive experiments demonstrating its effectiveness
Backdoor Attack with Mode Mixture Latent Modification
Backdoor attacks have become a significant security concern for deep neural
networks in recent years. An image classification model can be compromised if
malicious backdoors are injected into it. This corruption will cause the model
to function normally on clean images but predict a specific target label when
triggers are present. Previous research can be categorized into two genres:
poisoning a portion of the dataset with triggered images for users to train the
model from scratch, or training a backdoored model alongside a triggered image
generator. Both approaches require a significant number of attackable parameters
for optimization to establish a connection between the trigger and the target
label, which may raise suspicions as more people become aware of the existence
of backdoor attacks. In this paper, we propose a backdoor attack paradigm that
only requires minimal alterations (specifically, the output layer) to a clean
model in order to inject the backdoor under the guise of fine-tuning. To
achieve this, we leverage mode mixture samples, which are located between
different modes in latent space, and introduce a novel method for conducting
backdoor attacks. We evaluate the effectiveness of our method on four popular
benchmark datasets: MNIST, CIFAR-10, GTSRB, and TinyImageNet
The role of GLI2-ABCG2 signaling axis for 5Fu resistance in gastric cancer
Gastric cancer is a leading cause of cancer-related mortality worldwide, and options to treat gastric cancer are limited. Fluorouracil (5Fu)-based chemotherapy is frequently used as a neoadjuvant or an adjuvant agent for gastric cancer therapy. Most patients with advanced gastric cancer eventually succumb to the disease despite the fact that some patients respond initially to chemotherapy. Thus, identifying molecular mechanisms responsible for chemotherapy resistance will help design novel strategies to treat gastric cancer. In this study, we discovered that residual cancer cells following 5Fu treatment have elevated expression of hedgehog (Hh) target genes GLI1 and GLI2, suggestive of Hh signaling activation. Hh signaling, a pathway essential for embryonic development, is an important regulator for putative cancer stem cells/residual cancer cells. We found that high GLI1/GLI2 expression is associated with some features of putative cancer stem cells, such as increased side population. We demonstrated that GLI2 knockdown sensitized gastric cancer cells to 5Fu treatment, decreased ABCG2 expression, and reduced side population. Elevated GLI2 expression is also associated with an increase in tumor sphere size, another marker for putative cancer stem cells. We believe that GLI2 regulates putative cancer stem cells through direct regulation of ABCG2. ABCG2 can rescue the GLI2 shRNA effects in 5Fu response, tumor sphere formation and side population changes, suggesting that ABCG2 is an important mediator for GLI2-associated 5Fu resistance. The relevance of our studies to gastric cancer patient care is reflected by our discovery that high GLI1/GLI2/ABCG2 expression is associated with a high incidence of cancer relapse in two cohorts of gastric cancer patients who underwent chemotherapy (containing 5Fu). Taken together, we have identified a molecular mechanism by which gastric cancer cells gain 5Fu resistance
DocGraphLM: Documental Graph Language Model for Information Extraction
Advances in Visually Rich Document Understanding (VrDU) have enabled
information extraction and question answering over documents with complex
layouts. Two types of architectures have emerged -- transformer-based models
inspired by LLMs, and Graph Neural Networks. In this paper, we introduce
DocGraphLM, a novel framework that combines pre-trained language models with
graph semantics. To achieve this, we propose 1) a joint encoder architecture to
represent documents, and 2) a novel link prediction approach to reconstruct
document graphs. DocGraphLM predicts both directions and distances between
nodes using a convergent joint loss function that prioritizes neighborhood
restoration and downweighs distant node detection. Our experiments on three
SotA datasets show consistent improvement on IE and QA tasks with the adoption
of graph features. Moreover, we report that adopting the graph features
accelerates convergence in the learning process during training, despite being
solely constructed through link prediction.
Comment: Published at SIGIR'23 (repost for easier access
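The DocGraphLM abstract above describes a joint loss over predicted directions and distances that prioritizes neighborhood restoration and downweighs distant nodes. The sketch below shows one way such a loss could be assembled; it is not the paper's formula. The inverse-distance weight, the absolute-error distance term, the balancing factor `lam`, and the function name `joint_link_loss` are all assumptions chosen for illustration.

```python
import math

def joint_link_loss(direction_probs, true_directions,
                    pred_distances, true_distances, lam=1.0):
    """Average over edges of
        w(d) * (cross_entropy(direction) + lam * |d_hat - d|),
    with w(d) = 1 / (1 + d), so that nearby node pairs dominate.
    Hypothetical weighting, not the published loss."""
    total = 0.0
    for probs, direction, d_hat, d in zip(direction_probs, true_directions,
                                          pred_distances, true_distances):
        weight = 1.0 / (1.0 + d)
        cross_entropy = -math.log(probs[direction])
        distance_error = abs(d_hat - d)
        total += weight * (cross_entropy + lam * distance_error)
    return total / len(true_directions)

# Same prediction errors on a near edge (distance 1) and a far edge (distance 9).
loss_near = joint_link_loss([[0.5, 0.5]], [0], [2.0], [1.0])
loss_far = joint_link_loss([[0.5, 0.5]], [0], [10.0], [9.0])
```

With identical direction and distance errors, the near edge contributes more to the loss than the far edge, which is the downweighting behavior the abstract refers to.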
GLI1-mediated regulation of side population is responsible for drug resistance in gastric cancer
Gastric cancer is the third leading cause of cancer-related mortality worldwide. Chemotherapy is frequently used for gastric cancer treatment. Most patients with advanced gastric cancer eventually succumb to the disease although some patients respond initially to chemotherapy. Thus, identifying molecular mechanisms responsible for cancer relapse following chemotherapy will help design new ways to treat gastric cancer. In this study, we revealed that the residual cancer cells following treatment with the chemotherapeutic agent cisplatin (CDDP) have elevated expression of hedgehog target genes GLI1, GLI2 and PTCH1, suggestive of hedgehog signaling activation. We showed that GLI1 knockdown sensitized gastric cancer cells to CDDP whereas ectopic GLI1 expression decreased the sensitivity. Further analyses indicate elevated GLI1 expression is associated with an increase in tumor sphere formation, side population and cell surface markers for putative cancer stem cells. We have evidence to support that GLI1 is critical for maintenance of putative cancer stem cells through direct regulation of ABCG2. In fact, GLI1 protein was shown to be associated with the promoter fragment of ABCG2 through a Gli-binding consensus site in gastric cancer cells. Disruption of ABCG2 function, through ectopic expression of an ABCG2 dominant negative construct or a specific ABCG2 inhibitor, increased drug sensitivity of cancer cells both in culture and in mice. The relevance of our studies to gastric cancer patient care is reflected by our discovery that high ABCG2 expression was associated with poor survival in the gastric cancer patients who underwent chemotherapy. Taken together, we have identified a molecular mechanism by which gastric cancer cells gain chemotherapy resistance
Expression of fatty acid and lipid biosynthetic genes in developing endosperm of Jatropha curcas
BACKGROUND: Temporal and spatial expression of fatty acid and lipid biosynthetic genes is associated with the accumulation of storage lipids in the seeds of oil plants. In jatropha (Jatropha curcas L.), a potential biofuel plant, the storage lipids are mainly synthesized and accumulated in the endosperm of seeds. Although the fatty acid and lipid biosynthetic genes in jatropha have been identified, the expression of these genes at different developmental stages of the endosperm has not been systematically investigated. RESULTS: Transmission electron microscopy revealed that oil body formation in the developing endosperm of jatropha seeds first appeared at 28 days after fertilization (DAF), was actively developing at 42 DAF and reached the maximum number and size at 56 DAF. Sixty-eight genes that encode enzymes, proteins or their subunits involved in fatty acid and lipid biosynthesis were identified from a normalized cDNA library of jatropha developing endosperm. Gene expression analysis with quantitative reverse-transcription polymerase chain reaction demonstrated that the 68 genes could be grouped into five categories based on the patterns of relative expression of the genes during endosperm development. Category I has 47 genes, which displayed a bell-shaped expression pattern with peak expression at 28 or 42 DAF, but low expression at 14 and 56 DAF. Category II contains 8 genes, whose expression increased steadily from 14 to 56 DAF. Category III comprises 2 genes, both of which were constitutively expressed throughout endosperm development. Category IV has 9 genes, which showed high expression at 14 and 28 DAF, but decreased expression from 42 to 56 DAF. Category V consists of 2 genes, both of which showed medium expression at 14 DAF, the lowest expression at 28 or 42 DAF, and the highest expression at 56 DAF.
In addition, genes encoding enzymes or proteins with similar function were differentially expressed during endosperm development. CONCLUSION: The formation of oil bodies in jatropha endosperm is developmentally regulated. The expression of the majority of fatty acid and lipid biosynthetic genes is highly consistent with the development of oil bodies and endosperm in jatropha seeds, while the genes encoding enzymes with similar function may be differentially expressed during endosperm development. These results not only provide the initial information on spatial and temporal expression of fatty acid and lipid biosynthetic genes in jatropha developing endosperm, but are also valuable to identify the rate-limiting genes for storage lipid biosynthesis and accumulation during seed development