The BURCHAK corpus: a Challenge Data Set for Interactive Learning of Visually Grounded Word Meanings
We motivate and describe a new freely available human-human dialogue dataset
for interactive learning of visually grounded word meanings through ostensive
definition by a tutor to a learner. The data has been collected using a novel,
character-by-character variant of the DiET chat tool (Healey et al., 2003;
Mills and Healey, submitted) with a novel task, where a Learner needs to learn
invented visual attribute words (such as "burchak" for square) from a tutor.
As such, the text-based interactions closely resemble face-to-face conversation
and thus contain many of the linguistic phenomena encountered in natural,
spontaneous dialogue. These include self- and other-correction, mid-sentence
continuations, interruptions, overlaps, fillers, and hedges. We also present a
generic n-gram framework for building user (i.e. tutor) simulations from this
type of incremental data, which is freely available to researchers. We show
that the simulations produce outputs that are similar to the original data
(e.g. 78% turn match similarity). Finally, we train and evaluate a
Reinforcement Learning dialogue control agent for learning visually grounded
word meanings, trained from the BURCHAK corpus. The learned policy shows
comparable performance to a rule-based system built previously.
Comment: 10 pages, The 6th Workshop on Vision and Language (VL'17)
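The abstract above describes a generic n-gram framework for building tutor simulations from incremental dialogue data. The paper's actual framework is not reproduced here; the following is a minimal illustrative sketch, assuming a simple bigram-style model trained over token sequences (e.g. words or dialogue-act labels from tutor turns) and sampled to generate new turns. All function names and the toy training data are hypothetical.

```python
import random
from collections import defaultdict, Counter

def build_ngram_model(sequences, n=2):
    """Count n-gram continuations over token sequences, e.g. tokens
    drawn from incremental tutor turns in a dialogue corpus."""
    model = defaultdict(Counter)
    for seq in sequences:
        padded = ["<s>"] * (n - 1) + list(seq) + ["</s>"]
        for i in range(len(padded) - n + 1):
            context = tuple(padded[i:i + n - 1])
            model[context][padded[i + n - 1]] += 1
    return model

def sample(model, n=2, max_len=50):
    """Generate one simulated tutor turn by sampling tokens from the
    model's conditional counts until end-of-turn or max_len."""
    context = ("<s>",) * (n - 1)
    out = []
    while len(out) < max_len:
        counts = model.get(context)
        if not counts:
            break
        tokens, weights = zip(*counts.items())
        tok = random.choices(tokens, weights=weights)[0]
        if tok == "</s>":
            break
        out.append(tok)
        context = context[1:] + (tok,)
    return out
```

Turn-level similarity between such sampled outputs and held-out corpus turns is one way a "turn match" score like the 78% reported above could be computed, though the paper's exact metric is not specified here.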
An Analysis of Data Sets Used to Train and Validate Cost Prediction Systems
OBJECTIVE - the aim of this investigation is to build up a picture of the nature and type of data sets being used to develop and evaluate different software project effort prediction systems. We believe this to be important since there is a growing body of published work that seeks to assess different prediction approaches. Unfortunately, results to date are rather inconsistent, so we are interested in the extent to which this might be explained by different data sets.
METHOD - we performed an exhaustive search, from 1980 onwards, of three software engineering journals for research papers that used project data sets to compare cost prediction systems.
RESULTS - this identified a total of 50 papers that used, one or more times, a total of 74 unique project data sets. We observed that some of the better known and publicly accessible data sets were used repeatedly, making them potentially disproportionately influential. Such data sets also tend to be amongst the oldest, with potential problems of obsolescence. We also note that only about 70% of all data sets are in the public domain, which can be particularly problematic when the data set description is incomplete or limited. Finally, extracting relevant information from research papers has been time-consuming due to different styles of presentation and levels of contextual information.
CONCLUSIONS - we believe there are two lessons to learn. First, the community needs to consider the quality and appropriateness of the data set being utilised; not all data sets are equal. Second, we need to assess the way results are presented in order to facilitate meta-analysis, and to consider whether a standard protocol would be appropriate.
Coverage and Deployment Analysis of Narrowband Internet of Things in the Wild
Narrowband Internet of Things (NB-IoT) is gaining momentum as a promising
technology for massive Machine Type Communication (mMTC). Given that its
deployment is rapidly progressing worldwide, measurement campaigns and
performance analyses are needed to better understand the system and move toward
its enhancement. With this aim, this paper presents a large-scale measurement
campaign and empirical analysis of NB-IoT on operational networks, and
discloses valuable insights in terms of deployment strategies and radio
coverage performance. The reported results also serve as examples showing the
potential usage of the collected dataset, which we make open-source along with
a lightweight data visualization platform.
Comment: Accepted for publication in IEEE Communications Magazine (Internet of Things and Sensor Networks Series)
Fluctuations and the role of collision duration in reaction-diffusion systems
In a reaction-diffusion system, fluctuations in both diffusion and reaction
events have important effects on the steady-state statistics of the system.
Here, we argue through extensive lattice simulations, mean-field type
arguments, and the Doi-Peliti formalism that the collision duration statistics
-- i.e., the time two particles stay together in a lattice site -- plays a
leading role in determining the steady state of the system. We obtain
approximate expressions for the average densities of the chemical species and
for the critical diffusion coefficient required to sustain the reaction.
A multiarchitecture parallel-processing development environment
A description is given of the hardware and software of a multiprocessor test bed - the second generation Hypercluster system. The Hypercluster architecture consists of a standard hypercube distributed-memory topology, with multiprocessor shared-memory nodes. By using standard, off-the-shelf hardware, the system can be upgraded to use rapidly improving computer technology. The Hypercluster's multiarchitecture nature makes it suitable for researching parallel algorithms in computational field simulation applications (e.g., computational fluid dynamics). The dedicated test-bed environment of the Hypercluster and its custom-built software allows experiments with various parallel-processing concepts such as message passing algorithms, debugging tools, and computational 'steering'. Such research would be difficult, if not impossible, to achieve on shared, commercial systems.