Supervised Typing of Big Graphs using Semantic Embeddings
We propose a supervised algorithm for generating type embeddings in the same
semantic vector space as a given set of entity embeddings. The algorithm is
agnostic to the derivation of the underlying entity embeddings. It does not
require any manual feature engineering, generalizes well to hundreds of types
and achieves near-linear scaling on Big Graphs containing many millions of
triples and instances by virtue of an incremental execution. We demonstrate the
utility of the embeddings on a type recommendation task, outperforming a
non-parametric, feature-agnostic baseline while achieving a 15x speedup and
near-constant memory usage on a full partition of DBpedia. Using
state-of-the-art visualization, we illustrate the agreement of our
extensionally derived DBpedia type embeddings with the manually curated domain
ontology. Finally, we use the embeddings to probabilistically cluster about 4
million DBpedia instances into 415 types in the DBpedia ontology.
Comment: 6 pages, to be published in the Semantic Big Data Workshop at ACM
SIGMOD 2017; extended version in preparation for the Open Journal of Semantic
Web (OJSW).
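The paper's supervised algorithm is not spelled out in the abstract; as a loose, hypothetical sketch of the general idea (incrementally placing each type at the running mean of the embeddings of its typed entities, which gives one pass over the data and near-constant memory), one could write:

```python
import numpy as np

def incremental_type_embeddings(typed_entities, entity_embeddings, dim):
    """Incrementally average entity vectors per type.

    One pass over (entity, type) pairs; memory is proportional to the
    number of types, not the number of instances.

    typed_entities: iterable of (entity_id, type_label) pairs
    entity_embeddings: dict entity_id -> np.ndarray of shape (dim,)
    """
    sums = {}    # type -> running sum of entity vectors
    counts = {}  # type -> number of instances seen so far
    for entity, t in typed_entities:
        vec = entity_embeddings.get(entity)
        if vec is None:  # entity without an embedding: skip
            continue
        if t not in sums:
            sums[t] = np.zeros(dim)
            counts[t] = 0
        sums[t] += vec
        counts[t] += 1
    return {t: sums[t] / counts[t] for t in sums}

# toy example: two entities typed "City", one typed "Person"
emb = {"Paris": np.array([1.0, 0.0]),
       "Tokyo": np.array([0.0, 1.0]),
       "Ada":   np.array([1.0, 1.0])}
pairs = [("Paris", "City"), ("Tokyo", "City"), ("Ada", "Person")]
types = incremental_type_embeddings(pairs, emb, dim=2)
```

The resulting type vectors live in the same space as the entity vectors, so cosine or Euclidean comparisons between a new entity and all types are directly meaningful; the authors' actual method and scoring may differ.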
Personalized Purchase Prediction of Market Baskets with Wasserstein-Based Sequence Matching
Personalization in marketing aims at improving the shopping experience of
customers by tailoring services to individuals. In order to achieve this,
businesses must be able to make personalized predictions regarding the next
purchase. That is, one must forecast the exact list of items that will comprise
the next purchase, i.e., the so-called market basket. Despite its relevance to
firm operations, this problem has received surprisingly little attention in
prior research, largely due to its inherent complexity. In fact,
state-of-the-art approaches are limited to intuitive decision rules for pattern
extraction. However, the simplicity of the pre-coded rules impedes performance,
since decision rules operate in an autoregressive fashion: the rules can only
make inferences from past purchases of a single customer without taking into
account the knowledge transfer that takes place between customers. In contrast,
our research overcomes the limitations of pre-set rules by contributing a novel
predictor of market baskets from sequential purchase histories: our predictions
are based on similarity matching in order to identify similar purchase habits
among the complete shopping histories of all customers. Our contributions are
as follows: (1) We propose similarity matching based on subsequential dynamic
time warping (SDTW) as a novel predictor of market baskets. Thereby, we can
effectively identify cross-customer patterns. (2) We leverage the Wasserstein
distance for measuring the similarity among embedded purchase histories. (3) We
develop a fast approximation algorithm for computing a lower bound of the
Wasserstein distance in our setting. An extensive series of computational
experiments demonstrates the effectiveness of our approach: it outperforms
state-of-the-art decision rules from the literature by a factor of 4.0 in the
accuracy of identifying the exact market baskets.
Comment: Accepted for oral presentation at the 25th ACM SIGKDD Conference on
Knowledge Discovery and Data Mining (KDD 2019).
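The abstract's fast lower bound on the Wasserstein distance is not specified; a classic cheap bound of this kind (not necessarily the authors' construction) follows from Jensen's inequality: for Euclidean ground cost, the distance between the means of two embedded baskets never exceeds their 1-Wasserstein distance, so it can prune candidate matches before the expensive exact computation. A minimal sketch with hypothetical names:

```python
import numpy as np

def centroid_lower_bound(basket_a, basket_b):
    """Cheap lower bound on W1 between two baskets of embedded items.

    Each basket is a matrix whose rows are item embedding vectors.
    By Jensen's inequality, ||mean(A) - mean(B)|| <= W1(A, B) for
    Euclidean ground cost, so a large centroid gap lets us skip the
    exact Wasserstein computation for that candidate pair.
    """
    return float(np.linalg.norm(basket_a.mean(axis=0) - basket_b.mean(axis=0)))

# toy item embeddings: basket a clusters near (0, 0), basket b near (1, 1)
a = np.array([[0.0, 0.1], [0.1, 0.0]])
b = np.array([[1.0, 0.9], [0.9, 1.0]])
bound = centroid_lower_bound(a, b)
```

In a similarity-matching pipeline such as the one described, a bound like this serves as a filter: only basket pairs whose bound falls below the current best distance need the full Wasserstein (or DTW-aligned) comparison.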
Towards personalized data-driven bundle design with QoS constraint
Singapore National Research Foundation under its International Research Centre @ Singapore Funding Initiative.
Constraint-Based Ontology Induction From Online Customer Reviews
We present an unsupervised, domain-independent technique for inducing a product-specific ontology of product features based upon online customer reviews. We frame ontology induction as a logical assignment problem and solve it with a bounds consistency constrained logic program. Using shallow natural language processing techniques, we parse reviews into phrase sequences where each phrase refers to a single concept. Traditional document clustering techniques are adapted to collect phrases into initial concepts. We generate a token graph for each initial concept cluster and find a maximal clique to define the corresponding logical set of concept sub-elements. The logic program assigns tokens to clique sub-elements. We apply the technique to several thousand digital camera customer reviews and evaluate the results by comparing them to the ontologies represented by several prominent online buying guides. Because our results are drawn directly from customer comments, differences between our automatically induced product features and those in extant guides may reflect opportunities for better managing customer-producer relationships rather than errors in the process.
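The clique step above is concrete enough to illustrate: the per-concept token graphs are small, so even a brute-force search for a largest clique (which is in particular maximal) is feasible. A hypothetical sketch, not the paper's actual constraint program:

```python
from itertools import combinations

def maximum_clique(edges):
    """Return one largest clique of a small undirected graph.

    edges: iterable of (u, v) pairs. Brute force over vertex subsets,
    acceptable for the small per-concept token graphs described above.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    nodes = sorted(adj)
    for size in range(len(nodes), 0, -1):  # largest subsets first
        for cand in combinations(nodes, size):
            # a clique requires every pair of members to be adjacent
            if all(v in adj[u] for u, v in combinations(cand, 2)):
                return set(cand)
    return set()

# toy token co-occurrence graph for one concept cluster:
# "lens", "zoom", "optical" form a triangle; "battery" hangs off "zoom"
edges = [("lens", "zoom"), ("zoom", "optical"),
         ("lens", "optical"), ("zoom", "battery")]
clique = maximum_clique(edges)
```

The returned clique would define the sub-elements of that concept, to which the logic program then assigns the remaining tokens.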
A Similarity Measure for GPU Kernel Subgraph Matching
Accelerator architectures specialize in executing SIMD (single instruction,
multiple data) in lockstep. Because the majority of CUDA applications are
parallelized loops, control flow information can provide an in-depth
characterization of a kernel. CUDAflow is a tool that statically separates CUDA
binaries into basic block regions and dynamically measures instruction and
basic block frequencies. CUDAflow captures this information in a control flow
graph (CFG) and performs subgraph matching across various kernels' CFGs to gain
insights into an application's resource requirements, based on the shape and
traversal of the graph, the instruction operations executed, and the registers
allocated, among other information. The utility of CUDAflow is demonstrated
with SHOC and Rodinia application case studies on a variety of GPU
architectures, revealing novel thread divergence characteristics that help
end users, autotuners, and compilers generate high-performing code.
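CUDAflow's actual similarity measure is more elaborate than the abstract states; as a deliberately crude stand-in that conveys the idea of comparing kernels by CFG structure, one can take the Jaccard index over labeled edge sets (hypothetical sketch, not the tool's measure):

```python
def cfg_edge_jaccard(cfg_a, cfg_b):
    """Crude CFG similarity: Jaccard index over labeled edge sets.

    Each CFG is a set of (src_block, dst_block) edges; block labels are
    assumed to be coarse summaries (e.g. a dominant instruction class)
    so structurally similar kernels align even when raw block ids differ.
    Returns a value in [0, 1], with 1.0 for identical edge sets.
    """
    union = cfg_a | cfg_b
    if not union:  # two empty graphs are trivially identical
        return 1.0
    return len(cfg_a & cfg_b) / len(union)

# two toy kernels: a loop with a divergent branch vs. the same loop without
k1 = {("entry", "loop"), ("loop", "loop"),
      ("loop", "branch"), ("branch", "exit")}
k2 = {("entry", "loop"), ("loop", "loop"), ("loop", "exit")}
sim = cfg_edge_jaccard(k1, k2)
```

Real subgraph matching additionally accounts for graph shape, traversal frequencies, and register pressure, as the abstract notes; an edge-set overlap is only a first-cut filter.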