51 research outputs found

Multi-Entity Dependence Learning with Rich Context via Conditional Variational Auto-encoder

Author: Chen Di
Gomes Carla P.
Tang Luming
Xue Yexiang
Publication date: 17/09/2017

Multi-Entity Dependence Learning (MEDL) explores conditional correlations among multiple entities. The availability of rich contextual information requires a nimble learning scheme that tightly integrates with deep neural networks and has the ability to capture correlation structures among exponentially many outcomes. We propose MEDL_CVAE, which encodes a conditional multivariate distribution as a generating process. As a result, the variational lower bound of the joint likelihood can be optimized via a conditional variational auto-encoder and trained end-to-end on GPUs. Our MEDL_CVAE was motivated by two real-world applications in computational sustainability: one studies the spatial correlation among multiple bird species using the eBird data and the other models multi-dimensional landscape composition and human footprint in the Amazon rainforest with satellite images. We show that MEDL_CVAE captures rich dependency structures, scales better than previous methods, and further improves on the joint likelihood by taking advantage of very large datasets that are beyond the capacity of previous methods. Comment: The first two authors contribute equally.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications
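
The MEDL_CVAE abstract above refers to a variational lower bound optimized by a conditional variational auto-encoder. As a reminder, the generic CVAE bound is written out below. The symbols are assumptions chosen for illustration and are not taken from the paper: x is the context, y the joint outcome for all entities, z a latent variable, q_phi the encoder, and p_theta the decoder and conditional prior. The paper's exact objective may differ.

```latex
\[
\log p_\theta(y \mid x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x, y)}\bigl[\log p_\theta(y \mid x, z)\bigr]
\;-\; \mathrm{KL}\bigl(q_\phi(z \mid x, y) \,\|\, p_\theta(z \mid x)\bigr)
\]
```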

Nearest Labelset Using Double Distances for Multi-label Classification

Author: Gweon Hyukjun
Schonlau Matthias
Steiner Stefan
Publication date: 15/02/2017

Multi-label classification is a type of supervised learning where an instance may belong to multiple labels simultaneously. Predicting each label independently has been criticized for not exploiting any correlation between labels. In this paper we propose a novel approach, Nearest Labelset using Double Distances (NLDD), that predicts the labelset observed in the training data that minimizes a weighted sum of the distances in both the feature space and the label space to the new instance. The weights specify the relative tradeoff between the two distances. The weights are estimated from a binomial regression of the number of misclassified labels as a function of the two distances. Model parameters are estimated by maximum likelihood. NLDD only considers labelsets observed in the training data, thus implicitly taking into account label dependencies. Experiments on benchmark multi-label data sets show that the proposed method on average outperforms other well-known approaches in terms of Hamming loss, 0/1 loss, and multi-label accuracy and ranks second after ECC on the F-measure.

arXiv.org e-Print Archive
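
The selection rule described in the NLDD abstract above (pick the training labelset that minimizes a weighted sum of a feature-space distance and a label-space distance) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes Euclidean distance in both spaces, assumes the new instance's position in label space comes from per-label probability estimates of some base classifier, and takes the weights `w_x` and `w_y` as given rather than fitting them by binomial regression. All function and parameter names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def nearest_labelset(train_x, train_y, new_x, new_label_probs, w_x=0.5, w_y=0.5):
    """Return the training labelset with the smallest weighted double distance.

    Assumption: new_label_probs are per-label probability estimates for the
    new instance; w_x and w_y are fixed here, not estimated from data.
    """
    best_labelset, best_score = None, float("inf")
    for x_i, y_i in zip(train_x, train_y):
        score = w_x * euclidean(x_i, new_x) + w_y * euclidean(y_i, new_label_probs)
        if score < best_score:
            best_labelset, best_score = y_i, score
    return tuple(best_labelset)


# Toy data: three training instances with three binary labels each.
train_x = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
train_y = [(1, 0, 0), (1, 1, 0), (0, 0, 1)]

prediction = nearest_labelset(
    train_x, train_y, new_x=(0.9, 1.2), new_label_probs=(0.8, 0.6, 0.1)
)
print(prediction)
```

Because only labelsets seen in training can be returned, label combinations that never co-occur are never predicted, which is how the approach accounts for label dependencies implicitly.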

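Hybrid Classifier System: Support Vector Machines Dikombinasikan dengan K-Nearest Neighbors untuk Menentukan Kelayakan Nasabah Bank dalam Pengajuan Kredit [Hybrid Classifier System: Support Vector Machines Combined with K-Nearest Neighbors to Determine Bank Customer Eligibility in Credit Applications]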

Author: Ginting Selvia Lorena Br
Permana Aldi Azhar
Publication venue: 'Universitas Komputer Indonesia'
Publication date: 30/04/2018

This research aims to build an application that analyzes bank customer data and determines each customer's creditworthiness, in order to avoid non-performing loans in the future. The method is a hybrid that combines two data mining classification techniques: Support Vector Machines (SVM) and K-Nearest Neighbors (KNN). SVM works by finding the optimal hyperplane and the support vectors. KNN then classifies the bank customer data based on the identified support vectors. With 2000 training records and 103 test records, a cost parameter of 0.1, and gamma = 2, the system identifies 1998 support vectors; with K = 16, it then gives matching results for 88.35% of the test data (91 of 103). In conclusion, the application works fairly well and can help credit analysts recommend prospective customers who deserve loans. Keywords: application; data mining; classification; hybrid method; SVM-KNN

Open Journal - Universitas Komputer Indonesia
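
One possible reading of the second stage in the abstract above is a K-Nearest Neighbors vote in which the support vectors found by the SVM serve as the reference set. The sketch below illustrates only that voting step. It assumes the support vectors and their class labels are already available, uses Euclidean distance, and uses made-up points and the hypothetical labels "approve" and "reject"; none of these details come from the paper.

```python
import math
from collections import Counter


def knn_predict(reference_points, reference_labels, query, k):
    """Majority vote among the k reference points closest to the query."""
    ranked = sorted(
        zip(reference_points, reference_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical support vectors (two features) and their class labels.
support_vectors = [(1.0, 1.0), (1.2, 0.8), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
support_labels = ["reject", "reject", "approve", "approve", "approve"]

decision = knn_predict(support_vectors, support_labels, query=(2.8, 3.1), k=3)
print(decision)
```

Restricting the reference set to support vectors keeps only the points near the decision boundary, so the neighbor search covers fewer points than a KNN over the full training set would.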

A Survey on Sentence Embedding Models Performance for Patent Analysis

Author: Bekamiri Hamid
Hain Daniel S.
Jurowetzki Roman
Publication date: 28/04/2022

VBN