2,611 research outputs found

    Multi-task Neural Network for Non-discrete Attribute Prediction in Knowledge Graphs

    Many popular knowledge graphs such as Freebase, YAGO, or DBpedia maintain a list of non-discrete attributes for each entity. Intuitively, attributes such as height, price, or population count can richly characterize entities in knowledge graphs. This additional source of information may help to alleviate the sparsity and incompleteness problems that are prevalent in knowledge graphs. Unfortunately, many state-of-the-art relational learning models ignore this information because of the challenge of handling non-discrete data types in inherently binary-natured knowledge graphs. In this paper, we propose a novel multi-task neural network approach for both encoding and predicting non-discrete attribute information in a relational setting. Specifically, we train a neural network for triplet prediction along with a separate network for attribute value regression. Via multi-task learning, we are able to learn representations of entities, relations, and attributes that encode information about both tasks. Moreover, such attributes are central to many predictive tasks not only as an information source but also as a prediction target. Therefore, models that can encode, incorporate, and predict such information in a relational learning context are highly attractive. We show that our approach outperforms many state-of-the-art methods on the tasks of relational triplet classification and attribute value prediction. Comment: Accepted at CIKM 201
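The shared-representation idea in this abstract can be sketched in a few lines of numpy. This is a minimal illustrative example, not the paper's actual architecture: the DistMult-style triplet scorer, the linear attribute head, the margin loss, and the mixing weight `alpha` are all assumptions introduced here; the point is only that both losses flow into the same entity embeddings `E`.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_entities, n_relations, n_attributes = 8, 5, 3, 2

# Entity embeddings are shared by both tasks; relation and attribute
# parameters are task-specific, so gradients from either task update E.
E = rng.normal(size=(n_entities, dim))    # entity embeddings (shared)
R = rng.normal(size=(n_relations, dim))   # relation embeddings
A = rng.normal(size=(n_attributes, dim))  # attribute regression weights
b = np.zeros(n_attributes)                # attribute regression biases

def triplet_score(h, r, t):
    # DistMult-style score for a (head, relation, tail) triplet.
    return float(np.sum(E[h] * R[r] * E[t]))

def attribute_value(e, a):
    # Linear regression head on the shared entity embedding.
    return float(E[e] @ A[a] + b[a])

def joint_loss(pos, neg, attr_obs, alpha=0.5):
    # Margin ranking loss on triplets plus squared error on observed
    # attribute values, mixed with weight alpha (multi-task objective).
    rank = max(0.0, 1.0 - triplet_score(*pos) + triplet_score(*neg))
    reg = sum((attribute_value(e, a) - v) ** 2 for e, a, v in attr_obs)
    return rank + alpha * reg
```

Minimizing `joint_loss` over batches of triplets and attribute observations would update the shared entity embeddings from both signals, which is the multi-task effect the abstract describes.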

    Learning Correspondence Structures for Person Re-identification

    This paper addresses the problem of handling spatial misalignments due to camera-view changes or human-pose variations in person re-identification. We first introduce a boosting-based approach to learn a correspondence structure which indicates the patch-wise matching probabilities between images from a target camera pair. The learned correspondence structure can not only capture the spatial correspondence pattern between cameras but also handle the viewpoint or human-pose variation in individual images. We further introduce a global constraint-based matching process. It integrates a global matching constraint over the learned correspondence structure to exclude cross-view misalignments during the image patch matching process, hence achieving a more reliable matching score between images. Finally, we also extend our approach by introducing a multi-structure scheme, which learns a set of local correspondence structures to capture the spatial correspondence sub-patterns between a camera pair, so as to handle the spatial misalignments between individual images in a more precise way. Experimental results on various datasets demonstrate the effectiveness of our approach. Comment: IEEE Trans. Image Processing, vol. 26, no. 5, pp. 2438-2453, 2017. The project page for this paper is available at http://min.sjtu.edu.cn/lwydemo/personReID.htm arXiv admin note: text overlap with arXiv:1504.0624
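The effect of the global matching constraint, excluding a patch from further matches once it is used so that cross-view misalignments cannot double-count, can be illustrated with a simplified greedy variant. The function below is an assumption for illustration only; the paper's actual optimization over the correspondence structure differs.

```python
import numpy as np

def matching_score(sim, corr):
    """Score an image pair under a learned correspondence structure.

    sim  : (m, n) appearance similarities between patches of images A and B.
    corr : (m, n) correspondence-structure weights, i.e. the patch-wise
           matching probabilities learned for a given camera pair.
    A greedy one-to-one selection stands in for the paper's global
    matching optimization: once a patch is matched it is excluded,
    which suppresses cross-view misalignments.
    """
    weighted = sim * corr
    n = weighted.shape[1]
    used_a, used_b, total = set(), set(), 0.0
    # Visit candidate pairs in descending weighted-similarity order.
    for idx in np.argsort(weighted, axis=None)[::-1]:
        i, j = divmod(int(idx), n)
        if i in used_a or j in used_b:
            continue  # global constraint: each patch matches at most once
        used_a.add(i)
        used_b.add(j)
        total += float(weighted[i, j])
    return total
```

Without the exclusion sets, a single distinctive patch could match several patches in the other view, inflating the score exactly in the misaligned cases the constraint is meant to rule out.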

    Equivariant Contrastive Learning for Sequential Recommendation

    Contrastive learning (CL) benefits the training of sequential recommendation models with informative self-supervision signals. Existing solutions apply general sequential data augmentation strategies to generate positive pairs and encourage their representations to be invariant. However, due to the inherent properties of user behavior sequences, some augmentation strategies, such as item substitution, can change user intent, so learning indiscriminately invariant representations for all augmentation strategies may be suboptimal. Therefore, we propose Equivariant Contrastive Learning for Sequential Recommendation (ECL-SR), which endows SR models with greater discriminative power, making the learned user behavior representations sensitive to invasive augmentations (e.g., item substitution) and insensitive to mild augmentations (e.g., feature-level dropout masking). In detail, we use a conditional discriminator to capture differences in behavior due to item substitution, which encourages the user behavior encoder to be equivariant to invasive augmentations. Comprehensive experiments on four benchmark datasets show that the proposed ECL-SR framework achieves competitive performance compared to state-of-the-art SR models. The source code is available at https://github.com/Tokkiu/ECL. Comment: Accepted by RecSys 202
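The split between invariance to mild augmentations and equivariance to invasive ones can be sketched as two loss terms. This is an illustrative simplification, not ECL-SR's exact objective: the cosine pull term, the linear discriminator `disc_w`, and the function name `ecl_losses` are all assumptions.

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def ecl_losses(z, z_mild, z_invasive, disc_w):
    """Illustrative ECL-style objective (not the paper's exact losses).

    z          : embedding of the original behavior sequence
    z_mild     : embedding after a mild augmentation (e.g. dropout masking)
    z_invasive : embedding after an invasive augmentation (item substitution)
    disc_w     : weights of a linear discriminator over concatenated pairs
    """
    # Invariance: mild views should stay close in embedding space.
    inv_loss = 1.0 - cosine(z, z_mild)
    # Equivariance: a discriminator should detect item substitution from
    # the embedding pair, forcing the encoder to retain that information
    # rather than collapsing substituted and original sequences together.
    logit = float(disc_w @ np.concatenate([z, z_invasive]))
    p_sub = 1.0 / (1.0 + np.exp(-logit))   # P(pair contains a substitution)
    equiv_loss = -np.log(p_sub + 1e-8)     # label = 1 for this pair
    return inv_loss, equiv_loss
```

Training the encoder against both terms yields representations that ignore harmless perturbations while staying sensitive to intent-changing ones, which is the behavior the abstract argues for.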

    Modeling Relationships Between Sentences Using Deep Neural Network-Based Sentence Encoders

    Ph.D. dissertation, Seoul National University, Department of Computer Science and Engineering, February 2020 (advisor: Sang-goo Lee). Sentence matching is the task of predicting the degree of semantic match between two sentences. Since a model needs a high level of natural language understanding to identify the relationship between two sentences effectively, sentence matching is an important component of various natural language processing applications.
In this dissertation, we seek to improve the sentence matching module along three dimensions: the sentence encoder, the matching function, and semi-supervised learning. To enhance the sentence encoder network, which is responsible for extracting useful features from a sentence, we propose two new architectures: Gumbel Tree-LSTM and Cell-aware Stacked LSTM (CAS-LSTM). Gumbel Tree-LSTM is based on a recursive neural network (RvNN) architecture, but unlike typical RvNNs it does not require structured input; instead, it learns from data a parsing strategy optimized for a specific task. The latter, CAS-LSTM, extends the stacked long short-term memory (LSTM) architecture by introducing an additional forget gate for better control of the vertical information flow. As a new matching function, we present the element-wise bilinear sentence matching (ElBiS) function. It aims to automatically find an aggregation scheme that fuses two sentence representations into a single vector suitable for a specific task. From the fact that the same sentence encoder is shared across inputs, we hypothesize, and empirically verify, that considering only the element-wise bilinear interaction is sufficient for comparing two sentence vectors. By restricting the interaction, we greatly reduce the number of required parameters compared with full bilinear pooling, without losing the advantage of automatically discovering useful aggregation schemes. Finally, to facilitate semi-supervised training, i.e. to make use of both labeled and unlabeled data, we propose the cross-sentence latent variable model (CS-LVM). Its generative model assumes that a target sentence is generated from the latent representation of a source sentence together with a variable indicating the relationship between the source and target sentences.
As it considers the two sentences in a pair together in a single model, its training objectives are defined more naturally than in prior approaches based on the variational auto-encoder (VAE). We also define semantic constraints that push the generator toward semantically more plausible sentences. We believe the improvements proposed in this dissertation will advance the effectiveness of various natural language processing applications that involve modeling sentence pairs.
Contents: Chapter 1 Introduction (1.1 Sentence Matching; 1.2 Deep Neural Networks for Sentence Matching; 1.3 Scope of the Dissertation). Chapter 2 Background and Related Work (2.1 Sentence Encoders; 2.2 Matching Functions; 2.3 Semi-Supervised Training). Chapter 3 Sentence Encoder: Gumbel Tree-LSTM (3.1 Motivation; 3.2 Preliminaries: Recursive Neural Networks, Training RvNNs without Tree Information; 3.3 Model Description: Tree-LSTM, Gumbel-Softmax, Gumbel Tree-LSTM; 3.4 Implementation Details; 3.5 Experiments: Natural Language Inference, Sentiment Analysis, Qualitative Analysis; 3.6 Summary). Chapter 4 Sentence Encoder: Cell-aware Stacked LSTM (4.1 Motivation; 4.2 Related Work; 4.3 Model Description: Stacked LSTMs, Cell-aware Stacked LSTMs, Sentence Encoders; 4.4 Experiments: Natural Language Inference, Paraphrase Identification, Sentiment Classification, Machine Translation, Forget Gate Analysis, Model Variations; 4.5 Summary). Chapter 5 Matching Function: Element-wise Bilinear Sentence Matching (5.1 Motivation; 5.2 Proposed Method: ElBiS; 5.3 Experiments: Natural Language Inference, Paraphrase Identification; 5.4 Summary and Discussion). Chapter 6 Semi-Supervised Training: Cross-Sentence Latent Variable Model (6.1 Motivation; 6.2 Preliminaries: Variational Auto-Encoders, von Mises-Fisher Distribution; 6.3 Proposed Framework: CS-LVM; 6.4 Experiments: Natural Language Inference, Paraphrase Identification, Ablation Study, Generated Sentences, Implementation Details; 6.5 Summary and Discussion). Chapter 7 Conclusion. Appendix A (A.1 Sentences Generated from CS-LVM).
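One plausible reading of the element-wise restriction described in this abstract: full bilinear pooling computes h_k = u^T W_k v with a dense d-by-d matrix per output dimension, and keeping only element-wise interactions reduces each W_k to a diagonal, i.e. a linear map over the element-wise product u * v. The sketch below follows that assumption and is not the dissertation's exact formulation.

```python
import numpy as np

def elbis(u, v, W, b):
    # Element-wise bilinear matching under the diagonal-restriction
    # reading: h[k] = sum_i W[k, i] * u[i] * v[i] + b[k], which is
    # just a linear layer applied to the element-wise product u * v.
    return W @ (u * v) + b

# Parameter-count comparison (the efficiency argument in the abstract):
# full bilinear pooling needs d_out * d * d weights, while the
# element-wise restriction needs only d_out * d (plus d_out biases).
```

For d = d_out = 300, that is 27,000,000 weights for full bilinear pooling versus 90,000 here, while the map over u * v is still learned from data, preserving the automatic discovery of a useful aggregation scheme.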

    Semi-Supervised and Unsupervised Deep Visual Learning: A Survey

    State-of-the-art deep learning models are often trained with a large amount of costly labeled training data. However, requiring exhaustive manual annotations may degrade the model's generalizability in the limited-label regime. Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data. Recent progress in these paradigms has indicated the strong benefits of leveraging unlabeled data to improve model generalization and provide better model initialization. In this survey, we review the recent advanced deep learning algorithms on semi-supervised learning (SSL) and unsupervised learning (UL) for visual recognition from a unified perspective. To offer a holistic understanding of the state-of-the-art in these areas, we propose a unified taxonomy. We categorize existing representative SSL and UL methods with comprehensive and insightful analysis to highlight their design rationales in different learning scenarios and applications in different computer vision tasks. Lastly, we discuss the emerging trends and open challenges in SSL and UL to shed light on future critical research directions.
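One of the simplest semi-supervised recipes in the family this survey covers is pseudo-labeling: treat a model's confident predictions on unlabeled data as training targets. The helper below is a generic sketch of that idea, not any specific method from the survey; the function name and threshold are assumptions.

```python
import numpy as np

def pseudo_labels(probs, threshold=0.95):
    """Select confident predictions on unlabeled data as training targets.

    probs : (n, c) predicted class probabilities for n unlabeled samples.
    Returns (indices, labels) for the samples whose maximum class
    probability meets the confidence threshold; the rest are left
    unlabeled for this round.
    """
    conf = probs.max(axis=1)
    keep = np.flatnonzero(conf >= threshold)
    return keep, probs[keep].argmax(axis=1)
```

The selected (sample, label) pairs are then mixed into the labeled set for further training, which is one concrete way unlabeled data can improve generalization as the abstract describes.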

    Discovery of Visual Semantics by Unsupervised and Self-Supervised Representation Learning

    The success of deep learning in computer vision is rooted in the ability of deep networks to scale up model complexity as demanded by challenging visual tasks. As complexity is increased, so is the need for large amounts of labeled data to train the model, which entails a costly human annotation effort. To address this concern, with the long-term goal of leveraging the abundance of cheap unlabeled data, we explore methods of unsupervised "pre-training." In particular, we propose to use self-supervised automatic image colorization. We show that traditional methods for unsupervised learning, such as layer-wise clustering or autoencoders, remain inferior to supervised pre-training. In search of an alternative, we develop a fully automatic image colorization method. Our method sets a new state-of-the-art in revitalizing old black-and-white photography, without requiring human effort or expertise. Additionally, it gives us a method for self-supervised representation learning: in order for the model to appropriately re-color a grayscale object, it must first be able to identify it. This ability, learned entirely self-supervised, can be used to improve other visual tasks, such as classification and semantic segmentation. As a future direction for self-supervision, we investigate whether multiple proxy tasks can be combined to improve generalization, which turns out to be a challenging open problem. We hope that our contributions to this endeavor will provide a foundation for future efforts in making self-supervision compete with supervised pre-training. Comment: Ph.D. thesis
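The reason colorization works as a self-supervised proxy task is that the training pair is constructed entirely from unlabeled data: the input is the grayscale image and the target is the color the model must recover. A minimal sketch of that pair construction, assuming Rec. 601 luma weights (the thesis's actual color-space choices may differ):

```python
import numpy as np

def colorization_pair(rgb):
    """Build a self-supervised training pair from an unlabeled RGB image.

    rgb : (H, W, 3) uint8 image.
    Returns (gray, target): the grayscale (luma) input the model sees,
    and the original color image it must predict. No human labels are
    involved anywhere in this construction.
    """
    rgb = rgb.astype(np.float64) / 255.0
    gray = rgb @ np.array([0.299, 0.587, 0.114])  # (H, W) Rec. 601 luma
    return gray, rgb
```

A model trained to map `gray` back to `target` must implicitly recognize objects (grass is green, sky is blue), which is the representation-learning signal the abstract describes.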