
    Graph Kernels via Functional Embedding

    We propose a representation of a graph as a functional object derived from the power iteration of the underlying adjacency matrix. The proposed functional representation is a graph invariant, i.e., the functional remains unchanged under any reordering of the vertices. This property eliminates the difficulty of handling exponentially many isomorphic forms. The Bhattacharyya kernel constructed between these functionals significantly outperforms the state-of-the-art graph kernels on 3 out of the 4 standard benchmark graph classification datasets, demonstrating the superiority of our approach. The proposed methodology is simple and runs in time linear in the number of edges, which makes our kernel more efficient and scalable compared to many widely adopted graph kernels with running time cubic in the number of vertices.
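The invariance claim in the abstract above can be illustrated with a minimal sketch. It is not the authors' construction: the choice of an all-ones start vector, L2 normalization, a Gaussian summary (mean and covariance of the per-vertex power-iteration features), the regularizer `eps`, and all function names are assumptions made for illustration. The kernel is the standard closed-form Bhattacharyya coefficient between two Gaussians.

```python
import numpy as np

def power_iteration_features(adj, k=4):
    """Return an n-by-k matrix whose t-th column is A^t 1, L2-normalized."""
    x = np.ones(adj.shape[0])
    cols = []
    for _ in range(k):
        x = adj @ x  # one matrix-vector product; O(edges) if adj is stored sparsely
        norm = np.linalg.norm(x)
        if norm > 0:
            x = x / norm
        cols.append(x)
    return np.stack(cols, axis=1)

def graph_summary(adj, k=4, eps=1e-6):
    """Assumed summary: mean and covariance of the per-vertex features (k >= 2)."""
    feats = power_iteration_features(adj, k)
    mean = feats.mean(axis=0)
    # eps * I keeps the covariance invertible (an assumption of this sketch).
    cov = np.cov(feats, rowvar=False) + eps * np.eye(k)
    return mean, cov

def bhattacharyya_kernel(s1, s2):
    """Bhattacharyya coefficient between two Gaussians given as (mean, cov)."""
    (m1, c1), (m2, c2) = s1, s2
    c = 0.5 * (c1 + c2)
    diff = m1 - m2
    term1 = 0.125 * diff @ np.linalg.solve(c, diff)
    _, ld = np.linalg.slogdet(c)
    _, ld1 = np.linalg.slogdet(c1)
    _, ld2 = np.linalg.slogdet(c2)
    term2 = 0.5 * (ld - 0.5 * (ld1 + ld2))
    return float(np.exp(-(term1 + term2)))

# Usage: a small graph and a vertex-relabeled copy of it.
A = np.zeros((5, 5))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4), (0, 2)]:
    A[u, v] = A[v, u] = 1.0
perm = [2, 0, 4, 1, 3]
A_perm = A[np.ix_(perm, perm)]
k_same = bhattacharyya_kernel(graph_summary(A), graph_summary(A_perm))
```

Relabeling the vertices only permutes the rows of the feature matrix, so the mean and covariance, and hence the kernel value, are unchanged up to floating-point error.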

    Integrating Summarization and Retrieval for Enhanced Personalization via Large Language Models

    Personalization, the ability to tailor a system to individual users, is an essential factor in user experience with natural language processing (NLP) systems. With the emergence of Large Language Models (LLMs), a key question is how to leverage these models to better personalize user experiences. To personalize a language model's output, a straightforward approach is to incorporate past user data into the language model prompt, but this approach can result in lengthy inputs that exceed input length limits and incur latency and cost issues. Existing approaches tackle such challenges by selectively extracting relevant user data (i.e., selective retrieval) to construct a prompt for downstream tasks. However, retrieval-based methods are limited by potential information loss, lack of more profound user understanding, and cold-start challenges. To overcome these limitations, we propose a novel summary-augmented approach by extending retrieval-augmented personalization with task-aware user summaries generated by LLMs. The summaries can be generated and stored offline, enabling real-world systems with runtime constraints like voice assistants to leverage the power of LLMs. Experiments show that our method, with 75% less retrieved user data, is on par with or outperforms retrieval augmentation on most tasks in the LaMP personalization benchmark. We demonstrate that offline summarization via LLMs and runtime retrieval enables better performance for personalization on a range of tasks under practical constraints. Comment: 4 pages, International Workshop on Personalized Generative AI (@CIKM 2023)
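A minimal sketch of how a summary-augmented prompt might be assembled at runtime is given below. Everything in it is hypothetical: the word-overlap retriever stands in for whatever retriever a real system uses, the summary string is assumed to have been produced offline by an LLM, and the function names and prompt layout are illustrative only.

```python
def retrieve(query, history, k):
    """Rank past user records by word overlap with the query; keep the top k."""
    q = set(query.lower().split())
    ranked = sorted(
        history,
        key=lambda record: len(q & set(record.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, history, summary, k=1):
    """Combine a stored user summary with a few retrieved records."""
    lines = ["User summary: " + summary]
    lines += ["Past record: " + record for record in retrieve(query, history, k)]
    lines.append("Task: " + query)
    return "\n".join(lines)

# Usage: the summary is assumed to be generated and stored offline.
history = [
    "rated the sci-fi movie Dune five stars",
    "bought running shoes",
    "rated one romance film two stars",
    "searched for pasta recipes",
]
summary = "Enjoys science fiction; rarely watches romance."
prompt = build_prompt("suggest a sci-fi movie", history, summary, k=1)
```

With `k=1` out of four records, the prompt carries 75% fewer retrieved records than passing the full history, while the stored summary supplies the broader user context.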

    Information overload in structured data

    Information overload refers to the difficulty of making decisions caused by too much information. In this dissertation, we address the information overload problem in two separate structured domains, namely, graphs and text. Graph kernels have been proposed as an efficient and theoretically sound approach to compute graph similarity. They decompose graphs into certain sub-structures, such as subtrees or subgraphs. However, existing graph kernels suffer from a few drawbacks. First, the dimension of the feature space associated with the kernel often grows exponentially as the complexity of sub-structures increases. One immediate consequence of this behavior is that small, non-informative sub-structures occur more frequently and cause information overload. Second, as the number of features increases, we encounter sparsity: only a few informative sub-structures will co-occur in multiple graphs. In the first part of this dissertation, we propose to tackle the above problems by exploiting the dependency relationship among sub-structures. First, we propose a novel framework that learns the latent representations of sub-structures by leveraging recent advancements in deep learning. Second, we propose a general smoothing framework that takes structural similarity into account, inspired by state-of-the-art smoothing techniques used in natural language processing. Both of the proposed frameworks are applicable to popular graph kernel families, and achieve significant performance improvements over state-of-the-art graph kernels. In the second part of this dissertation, we tackle information overload in text. We first focus on a popular social news aggregation website, Reddit, and design a submodular recommender system that tailors a personalized frontpage for individual users. Second, we propose a novel submodular framework to summarize videos, where both transcript and comments are available. 
Third, we demonstrate how to apply filtering techniques to select a small subset of informative features from virtual machine logs in order to predict resource usage.
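The abstract does not state the submodular objectives used for the recommender or the summarizer. As a generic illustration of the idea only, the sketch below greedily maximizes topic coverage, a textbook monotone submodular function; the objective, the data, and the function name are assumptions, not the dissertation's method.

```python
def greedy_cover(candidates, budget):
    """Greedily pick up to `budget` items that maximize the number of covered topics.

    `candidates` maps an item name to the set of topics it covers.
    """
    chosen, covered = [], set()
    remaining = dict(candidates)
    for _ in range(budget):
        # Marginal gain of an item = number of topics it adds beyond `covered`.
        best = max(
            remaining,
            key=lambda name: len(remaining[name] - covered),
            default=None,
        )
        if best is None or not (remaining[best] - covered):
            break  # nothing left that adds new topics
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen, covered

# Usage: pick two posts for a hypothetical front page.
posts = {
    "p1": {"ml", "graphs"},
    "p2": {"ml"},
    "p3": {"cooking", "travel", "ml"},
    "p4": {"travel"},
}
chosen, covered = greedy_cover(posts, budget=2)
```

Because each item's marginal gain can only shrink as more topics are covered, the greedy rule favors diverse selections over redundant ones.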

    Deep Attention Networks for Images and Graphs

    Deep learning has achieved great success in various machine learning areas, such as computer vision, natural language processing, and graph representation learning. While numerous deep neural networks (DNNs) have been proposed, the set of fundamental building blocks of DNNs remains small, including fully-connected layers, convolutions and recurrent units. Recently, the attention mechanism has shown promise in serving as a new kind of fundamental building block. Deep attention networks (DANs), i.e., DNNs that use the attention mechanism as a fundamental building block, have revolutionized the area of natural language processing. However, developing DANs for computer vision and graph representation learning applications is still challenging. Due to the intrinsic differences in data and applications, directly migrating DANs from textual data to images and graphs is usually either infeasible or ineffective. In this dissertation, we address this challenge by analyzing the functionality of the attention mechanism and exploring scenarios where DANs can push the limits of current DNNs. We propose several effective DANs for images and graphs. For images, we build DANs for a variety of image-to-image transformation applications by proposing powerful attention-based building blocks. First, we start the exploration by studying a common problem in dilated convolutions, which naturally results in the use of the attention mechanism. Dilated convolutions, a variant of convolutions, have been widely applied in deep convolutional neural networks (DCNNs) for image segmentation. However, dilated convolutions suffer from gridding artifacts, which hamper performance. We propose two simple yet effective degridding methods by studying a decomposition of dilated convolutions, and generalize them by defining separable and shared (SS) operators. 
Then we connect the SS operators with the attention mechanism and propose the SS output layer, which is able to smooth the entire DCNNs by only replacing the output layer and improves the performance significantly. Second, we notice an interesting fact from the first study that, as the attention mechanism allows the SS output layer to have a receptive field of any size, the best performance is achieved when using a global receptive field. This fact motivates us to think of the attention mechanism as global operators, as opposed to local operators like convolutions. With this insight, we propose the non-local U-Nets, which are equipped with flexible attention-based global aggregation blocks, for biomedical image segmentation. In particular, we are the first to enable the attention mechanism for down-sampling and up-sampling processes. Finally, we go beyond biomedical image segmentation and extend the non-local U-Nets to global voxel transformer networks (GVTNets), which serve as a powerful open-source tool for 3D image-to-image transformation tasks. In addition to leveraging the non-local property of the attention mechanism under the supervised learning setting, we also investigate the generalization ability of the attention mechanism under the transfer learning setting. We perform thorough experiments on a wide range of real-world image-to-image transformation tasks, whose results clearly demonstrate the effectiveness and efficiency of our proposed DANs. For graphs, we develop DANs for both graph and node classification applications. First, we focus on graph pooling, which is necessary for graph neural networks (GNNs) to perform graph classification tasks. In particular, we point out that the second-order pooling naturally satisfies the requirement of graph pooling but encounters practical problems. To overcome these problems, we propose attentional second-order pooling. 
Specifically, we bridge the second-order pooling with the attention mechanism and design an attention-based pooling method that can be flexibly used as either global or hierarchical graph pooling. Second, on node classification tasks, we focus on the problem that most GNNs lack the ability to perform effective non-local aggregation, which greatly limits the performance on disassortative graphs. In particular, it even leads to worse performance of GNNs than simple multi-layer perceptrons on some disassortative graphs. In order to address this problem, we propose a simple yet effective non-local aggregation framework with an efficient attention-guided sorting for GNNs, based on which we develop non-local GNNs. Experimental results on various graph and node classification benchmark datasets show that our DANs improve the performance significantly and consistently.
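The remark that second-order pooling naturally satisfies the requirement of graph pooling can be checked with a small sketch. It shows plain second-order pooling only (the product X^T X of an n-by-d node feature matrix X), not the attentional variant proposed in the dissertation; the function name and the random data are assumptions for illustration.

```python
import numpy as np

def second_order_pool(node_feats):
    """Map an n-by-d node feature matrix X to a vector of length d*d via X^T X."""
    return (node_feats.T @ node_feats).reshape(-1)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))          # 6 nodes, 3 features each
perm = rng.permutation(6)            # an arbitrary reordering of the nodes

pooled = second_order_pool(X)
pooled_perm = second_order_pool(X[perm])
pooled_other = second_order_pool(rng.normal(size=(10, 3)))  # a 10-node graph
```

The output has the same length for graphs of any size and does not depend on the node order, which is what a graph pooling layer needs. The d*d output length also hints at a practical cost as the feature dimension grows.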