Search CORE

9,632 research outputs found

Exploratory topic modeling with distributional semantics

Author: A Treisman
DA Keim
DM Blei
J Risch
L Barth
M Bostock
S Fortunato
S Lohmann
S Palmer
Y Bengio
Publication venue
Publication date: 16/07/2015
Field of study

As we continue to collect and store textual data in a multitude of domains, we are regularly confronted with material whose largely unknown thematic structure we want to uncover. With unsupervised, exploratory analysis, no prior knowledge about the content is required and highly open-ended tasks can be supported. In the past few years, probabilistic topic modeling has emerged as a popular approach to this problem. Nevertheless, the representation of the latent topics as aggregations of semi-coherent terms limits their interpretability and level of detail. This paper presents an alternative approach to topic modeling that maps topics as a network for exploration, based on distributional semantics using learned word vectors. From the granular level of terms and their semantic similarity relations global topic structures emerge as clustered regions and gradients of concepts. Moreover, the paper discusses the visual interactive representation of the topic map, which plays an important role in supporting its exploration.Comment: Conference: The Fourteenth International Symposium on Intelligent Data Analysis (IDA 2015

arXiv.org e-Print Archive

Crossref

Recommended from our members

Proceedings ICPW'07: 2nd International Conference on the Pragmatic Web, 22-23 Oct. 2007, Tilburg: NL

Author
Publication venue: Tilburg University
Publication date: 01/10/2007
Field of study

Proceedings ICPW'07: 2nd International Conference on the Pragmatic Web, 22-23 Oct. 2007, Tilburg: N

Open Research Online (The Open University)

Collaboration in the Semantic Grid: a Basis for e-Learning

Author: Austin Tate
Danius Michaelides
David De Roure
Jeff Dalton
Jiri Komzak
Kesselman C.
Kevin Page
Kirschner P. A.
Kirschner P. A.
Marc Eisenstadt
Michelle Bachler
Nigel Shadbolt
Simon Buckingham Shum
Stephen Potter
Varis T.
Yun-Heh Chen-Burger
Publication venue
Publication date: 01/01/2005
Field of study

The CoAKTinG project aims to advance the state of the art in collaborative mediated spaces for the Semantic Grid. This paper presents an overview of the hypertext and knowledge based tools which have been deployed to augment existing collaborative environments, and the ontology which is used to exchange structure, promote enhanced process tracking, and aid navigation of resources before, after, and while a collaboration occurs. While the primary focus of the project has been supporting e-Science, this paper also explores the similarities and application of CoAKTinG technologies as part of a human-centred design approach to e-Learning

Southampton (e-Prints Soton)

Crossref

Open Research Online (The Open University)

Edinburgh Research Archive

Oxford University Research Archive

The influence of conceptual user models on the creation and interpretation of diagrams representing reactive systems

Author: Tabachneck-Schijf H.J.M.
Verpoorten J.H.
Weg R.L.W. van de
Wieringa R.J.
Publication venue: Open Institute of Knowledge
Publication date: 01/01/2006
Field of study

In system design, many diagrams of many different types are used. Diagrams communicate design aspects between members of the development team, and between these experts and the non-expert customers and future users. Mastering the creation of diagrams is often a challenging task, judging by particular errors persistently found in diagrams created by undergraduate computer science students. We assume a possible misalignment between human perception and cognition on the one hand and the diagrams’ structure and syntax on the other. This article presents the results of an investigation of such a misalignment. We focus on the deployment of so-called 'conceptual user models' (mental models, created by users in their mind) at the creation of diagrams. We propose a taxonomy for mental mappings, used for categorization of representations. We describe an experiment where naive and novice subjects created one or several diagrams of a familiar task. We use our taxonomy for analysing these diagrams, both for the represented task structure and the symbols used. The results indeed show a mismatch between mental models and currently used diagram techniques

University of Twente Research Information

Utrecht University Repository

The CIAO Multi-Dialect Compiler and System: An Experimentation Workbench for Future (C)LP Systems

Author: Bueno Carrillo Francisco
Cabeza Gras Daniel
Carro Liñares Manuel
García de la Banda M.
Hermenegildo Manuel V.
López García Pedro
Puebla Sánchez Alvaro Germán
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/1996
Field of study

CIAO is an advanced programming environment supporting Logic and Constraint programming. It offers a simple concurrent kernel on top of which declarative and non-declarative extensions are added via librarles. Librarles are available for supporting the ISOProlog standard, several constraint domains, functional and higher order programming, concurrent and distributed programming, internet programming, and others. The source language allows declaring properties of predicates via assertions, including types and modes. Such properties are checked at compile-time or at run-time. The compiler and system architecture are designed to natively support modular global analysis, with the two objectives of proving properties in assertions and performing program optimizations, including transparently exploiting parallelism in programs. The purpose of this paper is to report on recent progress made in the context of the CIAO system, with special emphasis on the capabilities of the compiler, the techniques used for supporting such capabilities, and the results in the áreas of program analysis and transformation already obtained with the system

Archivo Digital UPM

HHMM at SemEval-2019 Task 2: Unsupervised Frame Induction using Contextualized Word Embeddings

Author: Anwar Saba
Arefyev Nikolay
Biemann Chris
Panchenko Alexander
Ponzetto Simone Paolo
Ustalov Dmitry
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2019
Field of study

We present our system for semantic frame induction that showed the best performance in Subtask B.1 and finished as the runner-up in Subtask A of the SemEval 2019 Task 2 on unsupervised semantic frame induction (QasemiZadeh et al., 2019). Our approach separates this task into two independent steps: verb clustering using word and their context embeddings and role labeling by combining these embeddings with syntactical features. A simple combination of these steps shows very competitive results and can be extended to process other datasets and languages.Comment: 5 pages, 3 tables, accepted at SemEval 201

arXiv.org e-Print Archive

Crossref

Information Extraction in Illicit Domains

Author: Banko M.
Bauer F.
Chakrabarti S.
Kushmerick N.
Mikolov T.
Sahlgren M.
Wick M.
Zouaq A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/03/2017
Field of study

Extracting useful entities and attribute values from illicit domains such as human trafficking is a challenging problem with the potential for widespread social impact. Such domains employ atypical language models, have `long tails' and suffer from the problem of concept drift. In this paper, we propose a lightweight, feature-agnostic Information Extraction (IE) paradigm specifically designed for such domains. Our approach uses raw, unlabeled text from an initial corpus, and a few (12-120) seed annotations per domain-specific attribute, to learn robust IE models for unobserved pages and websites. Empirically, we demonstrate that our approach can outperform feature-centric Conditional Random Field baselines by over 18\% F-Measure on five annotated sets of real-world human trafficking datasets in both low-supervision and high-supervision settings. We also show that our approach is demonstrably robust to concept drift, and can be efficiently bootstrapped even in a serial computing environment.Comment: 10 pages, ACM WWW 201

arXiv.org e-Print Archive

Crossref