Self-Supervised Pretraining for Heterogeneous Hypergraph Neural Networks
Recently, pretraining methods for Graph Neural Networks (GNNs) have been
successful at learning effective representations from unlabeled graph data.
However, most of these methods rely on pairwise relations in the graph and do
not capture the underlying higher-order relations between entities. Hypergraphs
are versatile and expressive structures that can effectively model higher-order
relationships among entities in the data. Despite the efforts to adapt GNNs to
hypergraphs (HyperGNNs), there are currently no fully self-supervised
pretraining methods for HyperGNNs on heterogeneous hypergraphs. In this paper,
we present SPHH, a novel self-supervised pretraining framework for
heterogeneous HyperGNNs. Our method is able to effectively capture higher-order
relations among entities in the data in a self-supervised manner. SPHH
consists of two self-supervised pretraining tasks that aim to simultaneously
learn both local and global representations of the entities in the hypergraph
by using informative representations derived from the hypergraph structure.
Overall, our work presents a significant advancement in the field of
self-supervised pretraining of HyperGNNs, and has the potential to improve the
performance of various graph-based downstream tasks such as node classification
and link prediction tasks, which are mapped to a hypergraph configuration. Our
experiments on two real-world benchmarks using four different HyperGNN models
show that our proposed SPHH framework consistently outperforms state-of-the-art
baselines in various downstream tasks. The results demonstrate that SPHH is
able to improve the performance of various HyperGNN models in various
downstream tasks, regardless of their architecture or complexity, which
highlights the robustness of our framework.
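The abstract does not spell out SPHH's two pretraining tasks; the following is a purely illustrative sketch of the generic pattern it describes, jointly optimizing a local and a global self-supervised objective on top of a hypergraph encoder. The encoder interface, the loss choices, and all names are assumptions, not SPHH's actual design.

```python
# Hypothetical sketch only: combines a "local" and a "global" self-supervised
# loss on node embeddings produced by a generic HyperGNN encoder.
import torch.nn as nn

class PretrainWrapper(nn.Module):
    def __init__(self, encoder: nn.Module, hidden_dim: int):
        super().__init__()
        self.encoder = encoder                          # any HyperGNN producing node embeddings
        self.local_head = nn.Linear(hidden_dim, hidden_dim)
        self.global_head = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, hypergraph):
        z = self.encoder(hypergraph)                    # [num_nodes, hidden_dim]
        return self.local_head(z), self.global_head(z)

def pretrain_step(model, hypergraph, local_target, global_target, optimizer, alpha=0.5):
    """One step jointly minimizing a local and a global self-supervised loss."""
    z_local, z_global = model(hypergraph)
    loss = alpha * nn.functional.mse_loss(z_local, local_target) \
        + (1 - alpha) * nn.functional.mse_loss(z_global, global_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Here local_target and global_target stand in for whatever structure-derived supervision signals a concrete method would compute from the hypergraph.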
Is Meta-Learning the Right Approach for the Cold-Start Problem in Recommender Systems?
Recommender systems have become fundamental building blocks of modern online
products and services, and have a substantial impact on user experience. In the
past few years, deep learning methods have attracted a lot of research
attention and are now heavily used in modern real-world recommender systems.
Nevertheless, dealing with recommendations in the cold-start setting, e.g., when
a user has had only limited interactions with the system, remains a problem that is far from
solved. Meta-learning techniques, and in particular optimization-based
meta-learning, have recently become the most popular approaches in the academic
research literature for tackling the cold-start problem in deep learning models
for recommender systems. However, current meta-learning approaches are not
practical for real-world recommender systems, which have billions of users and
items, and strict latency requirements. In this paper we show that it is
possible to obtain similar, or higher, performance on commonly used
benchmarks for the cold-start problem without using meta-learning techniques.
In more detail, we show that, when tuned correctly, standard and widely adopted
deep learning models perform just as well as newer meta-learning models. We
further show that an extremely simple modular approach using common
representation learning techniques can perform comparably to meta-learning
techniques specifically designed for the cold-start setting while being much
more easily deployable in real-world applications.
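The abstract does not detail the modular approach; as a hedged illustration of one common representation-learning technique for cold-start users, the sketch below represents a new user as the mean of the embeddings of their few observed items and scores candidates by dot product. All names are hypothetical, and this is not necessarily the paper's method.

```python
# Illustrative cold-start scoring: average the embeddings of the few items a
# new user has interacted with, then rank all items by inner product.
import numpy as np

def cold_start_scores(item_embeddings: np.ndarray, observed_items: list) -> np.ndarray:
    """item_embeddings: [num_items, dim]; observed_items: indices of the few
    items the new user has interacted with. Returns one score per item."""
    user_vector = item_embeddings[observed_items].mean(axis=0)
    return item_embeddings @ user_vector

# Usage sketch: top-5 recommendations for a user with two observed interactions.
# scores = cold_start_scores(item_embeddings, [12, 845])
# top5 = np.argsort(-scores)[:5]
```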
Concept Matching for Low-Resource Classification
We propose a model to tackle classification tasks in the presence of very
little training data. To this aim, we approximate the notion of exact match
with a theoretically sound mechanism that computes a probability of matching in
the input space. Importantly, the model learns to focus on elements of the
input that are relevant for the task at hand; by leveraging highlighted
portions of the training data, an error boosting technique guides the learning
process. In practice, it increases the error associated with relevant parts of
the input by a given factor. Remarkable results on text classification tasks
confirm the benefits of the proposed approach in both balanced and unbalanced
cases, making it of practical use when labeling new examples is expensive. In
addition, by inspecting its weights, it is often possible to gather insights
into what the model has learned.
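As a minimal sketch of the error-boosting idea described above, assuming token-level highlight masks and a hypothetical boost_factor, the per-token loss on highlighted portions of the input is simply scaled by a constant; the names are illustrative and not taken from the paper.

```python
# Hypothetical error boosting: scale the loss contribution of highlighted
# (task-relevant) input tokens by a given factor.
import torch

def boosted_loss(token_losses: torch.Tensor,
                 highlight_mask: torch.Tensor,
                 boost_factor: float = 2.0) -> torch.Tensor:
    """token_losses: [batch, seq_len] per-token losses (e.g. cross-entropy).
    highlight_mask: [batch, seq_len], 1.0 where the input was highlighted."""
    weights = 1.0 + (boost_factor - 1.0) * highlight_mask  # boosted where highlighted
    return (weights * token_losses).mean()
```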
How Decoding Strategies Affect the Verifiability of Generated Text
Recent progress in pre-trained language models has led to systems that are able
to generate text of an increasingly high quality. While several works have
investigated the fluency and grammatical correctness of such models, it is
still unclear to what extent the generated text is consistent with factual
world knowledge. Here, we go beyond fluency and also investigate the
verifiability of text generated by state-of-the-art pre-trained language
models. A generated sentence is verifiable if it can be corroborated or
disproved by Wikipedia, and we find that the verifiability of generated text
strongly depends on the decoding strategy. In particular, we discover a
tradeoff between factuality (i.e., the ability to generate Wikipedia-corroborated
text) and repetitiveness. While decoding strategies such as top-k
and nucleus sampling lead to less repetitive generations, they also produce
less verifiable text. Based on these findings, we introduce a simple and
effective decoding strategy which, in comparison to previously used decoding
strategies, produces less repetitive and more verifiable text.
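For context on the decoding strategies compared above, the following is a generic sketch of top-k and nucleus (top-p) sampling over a vector of next-token logits; it illustrates the standard strategies discussed in the abstract, not the paper's proposed decoding method.

```python
# Standard top-k and nucleus (top-p) sampling over next-token logits.
import torch

def top_k_sample(logits: torch.Tensor, k: int = 50) -> int:
    """Sample from the k most probable tokens."""
    topk_logits, topk_idx = torch.topk(logits, k)
    probs = torch.softmax(topk_logits, dim=-1)
    return topk_idx[torch.multinomial(probs, 1)].item()

def nucleus_sample(logits: torch.Tensor, p: float = 0.9) -> int:
    """Sample from the smallest set of tokens whose cumulative probability exceeds p."""
    probs = torch.softmax(logits, dim=-1)
    sorted_probs, sorted_idx = torch.sort(probs, descending=True)
    cumulative = torch.cumsum(sorted_probs, dim=-1)
    keep = (cumulative - sorted_probs) < p        # preceding mass below p; keeps >= 1 token
    kept_probs = sorted_probs * keep
    choice = torch.multinomial(kept_probs / kept_probs.sum(), 1)
    return sorted_idx[choice].item()
```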
KILT: a Benchmark for Knowledge Intensive Language Tasks
Challenging problems such as open-domain question answering, fact checking,
slot filling and entity linking require access to large, external knowledge
sources. While some models do well on individual tasks, developing general
models is difficult as each task might require computationally expensive
indexing of custom knowledge sources, in addition to dedicated infrastructure.
To catalyze research on models that condition on specific information in large
textual resources, we present a benchmark for knowledge-intensive language
tasks (KILT). All tasks in KILT are grounded in the same snapshot of Wikipedia,
reducing engineering turnaround through the re-use of components, as well as
accelerating research into task-agnostic memory architectures. We test both
task-specific and general baselines, evaluating downstream performance in
addition to the ability of the models to provide provenance. We find that a
shared dense vector index coupled with a seq2seq model is a strong baseline,
outperforming more tailor-made approaches for fact checking, open-domain
question answering and dialogue, and yielding competitive results on entity
linking and slot filling, by generating disambiguated text. KILT data and code
are available at https://github.com/facebookresearch/KILT.
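The strong baseline mentioned above couples a shared dense vector index with a seq2seq model; the following is a minimal sketch of that retrieve-then-generate pattern. encode_query, encode_passage, and seq2seq_generate are placeholders for real models (e.g. dense passage encoders and a fine-tuned seq2seq generator) and are assumptions, not the KILT codebase's API.

```python
# Generic retrieve-then-generate sketch: a shared dense index over Wikipedia
# passages, queried with a dense question encoder, with the retrieved evidence
# fed to a seq2seq model.
import numpy as np

def build_index(passages, encode_passage):
    """Precompute one dense vector per passage (the shared index)."""
    return np.stack([encode_passage(p) for p in passages])

def retrieve(query, index, passages, encode_query, k=5):
    """Return the k passages with the highest inner-product score."""
    scores = index @ encode_query(query)
    top = np.argsort(-scores)[:k]
    return [passages[i] for i in top]

def answer(query, index, passages, encode_query, seq2seq_generate, k=5):
    """Condition the seq2seq model on the query plus retrieved evidence."""
    evidence = retrieve(query, index, passages, encode_query, k)
    return seq2seq_generate(query + " [SEP] " + " ".join(evidence))
```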
ARCOMEM Crawling Architecture
The World Wide Web is the largest information repository available today. However, this information is very volatile, and Web archiving is essential to preserve it for the future. Existing approaches to Web archiving are based on simple definitions of the scope of Web pages to crawl and are limited to basic interactions with Web servers. The aim of the ARCOMEM project is to overcome these limitations and to provide flexible, adaptive and intelligent content acquisition, relying on social media to create topical Web archives. In this article, we focus on ARCOMEM’s crawling architecture. We introduce the overall architecture and describe its modules, such as the online analysis module, which computes a priority for the Web pages to be crawled, and the Application-Aware Helper, which takes into account the type of Web sites and applications to extract structure from crawled content. We also describe a large-scale distributed crawler that has been developed, as well as the modifications we have implemented to adapt Heritrix, an open-source crawler, to the needs of the project. Our experimental results from real crawls show that ARCOMEM’s crawling architecture is effective in acquiring focused information about a topic and leveraging the information from social media.
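As an illustration of the priority-driven, focused crawling pattern described in the abstract, the sketch below keeps a frontier of URLs in a priority queue and scores each fetched page for topical relevance before enqueueing its outlinks. fetch, extract_links, and score_relevance are hypothetical placeholders, not ARCOMEM modules.

```python
# Illustrative focused crawler: URLs are dequeued by priority, each fetched
# page is scored online for topical relevance, and outlinks inherit that score.
import heapq

def focused_crawl(seed_urls, fetch, extract_links, score_relevance, max_pages=1000):
    frontier = [(-1.0, url) for url in seed_urls]      # max-heap via negated priority
    heapq.heapify(frontier)
    seen, archive = set(seed_urls), []
    while frontier and len(archive) < max_pages:
        _, url = heapq.heappop(frontier)
        page = fetch(url)
        relevance = score_relevance(page)              # "online analysis" step
        if relevance > 0.5:                            # archive only on-topic pages
            archive.append((url, page))
        for link in extract_links(page):
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-relevance, link))  # outlinks inherit parent's score
    return archive
```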