420 research outputs found

Effective Approaches to Attention-based Neural Machine Translation

Authors: Luong Minh-Thang; Manning Christopher D.; Pham Hieu
Publication date: 01/01/2015

An attentional mechanism has lately been used to improve neural machine translation (NMT) by selectively focusing on parts of the source sentence during translation. However, there has been little work exploring useful architectures for attention-based NMT. This paper examines two simple and effective classes of attentional mechanism: a global approach which always attends to all source words and a local one that only looks at a subset of source words at a time. We demonstrate the effectiveness of both approaches over the WMT translation tasks between English and German in both directions. With local attention, we achieve a significant gain of 5.0 BLEU points over non-attentional systems which already incorporate known techniques such as dropout. Our ensemble model using different attention architectures has established a new state-of-the-art result in the WMT'15 English to German translation task with 25.9 BLEU points, an improvement of 1.0 BLEU points over the existing best system backed by NMT and an n-gram reranker.

Comment: 11 pages, 7 figures, EMNLP 2015 camera-ready version, more training detail

arXiv.org e-Print Archive

CiteSeerX

Crossref

Deriving an Emergent Relational Schema from RDF data

Author: Pham M.-D. (Minh-Duc)
Publication date: 01/01/2015

CWI's Institutional Repository

Emergent relational schemas for RDF

Author: Pham M.-D. (Minh-Duc)
Publication date: 06/09/2018

CWI's Institutional Repository

Self-organizing Structured RDF in MonetDB

Author: Pham M.-D. (Minh-Duc)
Publication date: 01/04/2013

The semantic web uses RDF as its data model, providing ultimate flexibility for users to represent and evolve data without need of a schema. Yet this flexibility poses challenges in implementing efficient RDF stores: query plans with very many self-joins to a triple table, difficulty in optimizing these plans, and a lack of data locality, since clustered indexing opportunities are lost without a notion of a multi-attribute data structure. Apart from performance issues, users of huge RDF graphs often have problems formulating queries as they lack any system-supported notion of the structure in the data. In this research, we exploit the observation that real RDF data, while not as regularly structured as relational data, still has the great majority of triples conforming to regular patterns. We conjecture that a system that recognizes this structure automatically would make RDF stores both more efficient and easier to use. Concretely, we propose to derive self-organizing RDF that stores data in PSO format in such a way that the regular parts of the data physically correspond to relational columnar storage, and we propose RDFscan/RDFjoin algorithms that compute star-patterns over these without wasting effort in self-joins. These regular parts, i.e. tables, are identified on ingestion by a schema discovery algorithm; as such, users will gain an SQL view of the regular part of the RDF data. This research aims to produce a state-of-the-art SPARQL frontend for MonetDB as a by-product, and we already present some preliminary results on this platform.

CWI's Institutional Repository

Exploiting emergent schemas to make RDF systems more efficient

Authors: Boncz P.A. (Peter); Pham M.-D. (Minh-Duc)
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

We build on our earlier finding that more than 95% of the triples in actual RDF triple graphs have a remarkably tabular structure, whose schema does not necessarily follow from explicit metadata such as ontologies, but which an RDF store can automatically derive by looking at the data using so-called “emergent schema” detection techniques. In this paper we investigate how computers, and in particular RDF stores, can take advantage of this emergent schema to store RDF data more compactly and to optimize and execute SPARQL queries more efficiently. To this end, we contribute techniques for efficient emergent-schema-aware RDF storage and new query operator algorithms for emergent-schema-aware scans and joins. In all, these techniques allow RDF schema processors to fully catch up with relational database techniques in terms of rich physical database design options and efficiency, without requiring a rigid upfront schema structure definition.

VU Research Portal

CWI's Institutional Repository

S3G2: a Scalable Structure-correlated Social Graph Generator

Authors: Boncz P.A. (Peter); Erling O. (Orri); Pham M.-D. (Minh-Duc)
Publication venue: CWI
Publication date: 01/06/2012

Benchmarking graph-oriented database workloads and graph-oriented database systems is becoming increasingly relevant in analytical Big Data tasks, such as social network analysis. In graph data, structure is not mainly found inside the nodes, but especially in the way nodes happen to be connected, i.e. structural correlations. Because such structural correlations determine join fan-outs experienced by graph analysis algorithms and graph query executors, they are an essential, yet typically neglected, ingredient of synthetic graph generators. To address this, we present S3G2: a Scalable Structure-correlated Social Graph Generator. This graph generator creates a synthetic social graph, containing non-uniform value distributions and structural correlations, and is intended as a testbed for scalable graph analysis algorithms and graph database systems. We generalize the problem by decomposing correlated graph generation into multiple passes that each focus on one so-called "correlation dimension", each of which can be mapped to a MapReduce task. We show that S3G2 can generate social graphs that (i) share well-known graph connectivity characteristics typically found in real social graphs, (ii) contain certain plausible structural correlations that influence the performance of graph analysis algorithms and queries, and (iii) can be quickly generated at huge sizes on common cluster hardware.

CWI's Institutional Repository

Homophily-based social group formation in a spin-glass self-assembly framework

Authors: Hanel Rudolf; Korbel Jan; Lindner Simon D.; Pham Tuan Minh; Thurner Stefan
Publication date: 22/11/2022

Homophily, the tendency of humans to attract each other when sharing similar features, traits, or opinions, has been identified as one of the main driving forces behind the formation of structured societies. Here we ask to what extent homophily can explain the formation of social groups, particularly their size distribution. We propose a spin-glass-inspired framework of self-assembly, where opinions are represented as multidimensional spins that dynamically self-assemble into groups; individuals within a group tend to share similar opinions (intra-group homophily), and opinions between individuals belonging to different groups tend to be different (inter-group heterophily). We compute the associated non-trivial phase diagram by solving a self-consistency equation for 'magnetization' (combined average opinion). Below a critical temperature, there exist two stable phases: one ordered with non-zero magnetization and large clusters, the other disordered with zero magnetization and no clusters. The system exhibits a first-order transition to the disordered phase. We analytically derive the group-size distribution that successfully matches empirical group-size distributions from online communities.

Comment: 6 pages, 5 pages of SI, to appear in Phys. Rev. Let

arXiv.org e-Print Archive

Network Sensitivity of Systemic Risk

Authors: Barucca Paolo; Cimini Giulio; Di Gangi Domenico; Macchiati Valentina; Minh Tuan Pham; Pinotti Francesco; Ramadiah Amanah; Sardo D. Ruggiero Lo; Wilinski Mateusz
Publication date: 01/01/2020

A growing body of studies on systemic risk in financial markets has emphasized the key importance of taking into consideration the complex interconnections among financial institutions. Much effort has been put into modeling the contagion dynamics of financial shocks and into assessing the resilience of specific financial markets, using real network data, reconstruction techniques or simple toy networks. Here we address the more general problem of how shock propagation dynamics depends on the topological details of the underlying network. To this end we consider different realistic network topologies, all consistent with balance-sheet information obtained from real data on financial institutions. In particular, we consider networks of varying density and with different block structures, and we also vary the details of the shock propagation dynamics. We confirm that the systemic risk properties of a financial network are extremely sensitive to its network features. Our results can aid in the design of regulatory policies to improve the robustness of financial markets.

arXiv.org e-Print Archive

ART

sFuzz: An efficient adaptive fuzzer for solidity smart contracts

Authors: LIN Yun; NGUYEN Tai D.; PHAM Long H.; SUN Jun; TRAN Minh Quang
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 18/04/2020

Ministry of Education, Singapore under its Academic Research Funding Tier

arXiv.org e-Print Archive

Crossref

Institutional Knowledge at Singapore Management University

Adjustment of Vietnamese labour market in time of economic fluctuations and structural changes

Authors: Oudin Xavier; Pasquier-Doumer Laure; Pham Minh T.; Roubaud François; Vu Hoang D.
Publication venue: 'Dialogue and Discourse'
Publication date: 01/01/2014

In this article, we examine how the labour market adjusts to economic fluctuations, taking into account both ongoing structural transformations and short-term changes. We use data from population censuses or published in the statistical yearbooks of the General Statistics Office for long-term series, and the labour force surveys conducted between 2007 and 2012 for short-term data. The article highlights the profound transformation of the labour market over recent decades. The labour force doubled in 25 years, and the share of agriculture fell below the 50% threshold. Absorbing the labour supply was therefore one of the main challenges for the Vietnamese economy over this period. The sector of agricultural and non-agricultural household businesses was the main provider of jobs during these years. The labour market adapted to the recent economic slowdown through several channels. Unemployment remained stable, but the number of inactive people increased. The quantity of work was also affected by a significant reduction in the number of hours worked. While the non-agricultural sector generated more jobs for skilled workers, a flow of unskilled workers towards agriculture was observed. Owing to demographic factors, absorbing the labour supply and creating new jobs are no longer the main problem. Instead, recent labour market developments call for structural policies aimed at improving working conditions, and the period is particularly favourable for such policies because Vietnam is currently benefiting from the demographic dividend.

Horizon / Pleins textes