23,236 research outputs found
Mining Hierarchical Scenario-Based Specifications
Abstract—Scalability over long traces, as well as the comprehensibility and expressivity of results, are major challenges for dynamic-analysis approaches to specification mining. In this work we present a novel use of object hierarchies over traces of inter-object method calls as an abstraction/refinement mechanism that enables user-guided, top-down or bottom-up mining of layered scenario-based specifications, broken down by hierarchies embedded in the system under investigation. We do this using data mining methods that provide statistically significant, sound, and complete results modulo user-defined thresholds, in the context of Damm and Harel's live sequence charts (LSC), a visual, modal, scenario-based, inter-object language. Thus, scalability, comprehensibility, and expressivity are all addressed. Our technical contribution includes a formal definition of hierarchical inter-object traces, and algorithms for 'zooming out' and 'zooming in', used to move between abstraction levels of the mined specifications. An evaluation of our approach based on several case studies shows promising results.
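To make the abstraction mechanism concrete, here is a minimal sketch of a 'zooming-out' step (our illustration, not the authors' implementation), assuming a trace is a list of (caller, callee, method) events and the object hierarchy is given as a child-to-parent map; all names are hypothetical:

```python
# Illustrative sketch of the 'zooming-out' abstraction described above.
# Assumptions (not from the paper): a trace is a list of
# (caller, callee, method) events, and the object hierarchy is a
# child -> parent mapping; all identifiers here are invented.

def zoom_out(trace, parent):
    """Lift each object one level up the hierarchy, dropping calls
    that become internal to a single abstract object."""
    def lift(obj):
        return parent.get(obj, obj)  # objects without a parent stay as-is

    abstract_trace = []
    for caller, callee, method in trace:
        a, b = lift(caller), lift(callee)
        if a != b:  # keep only calls that remain inter-object after lifting
            abstract_trace.append((a, b, method))
    return abstract_trace

# Example: two sensor objects collapse into one 'sensors' component.
parent = {"gps": "sensors", "imu": "sensors"}
trace = [("gps", "imu", "sync"), ("gps", "fusion", "update")]
print(zoom_out(trace, parent))  # [('sensors', 'fusion', 'update')]
```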
Ontology of core data mining entities
In this article, we present OntoDM-core, an ontology of core data mining
entities. OntoDM-core defines the most essential data mining entities in a three-layered
ontological structure comprising a specification, an implementation, and an application
layer. It provides a representational framework for the description of mining
structured data, and in addition provides taxonomies of datasets, data mining tasks,
generalizations, data mining algorithms and constraints, based on the type of data.
OntoDM-core is designed to support a wide range of applications/use cases, such as
semantic annotation of data mining algorithms, datasets and results; annotation of
QSAR studies in the context of drug discovery investigations; and disambiguation of
terms in text mining. The ontology has been thoroughly assessed following the practices
in ontology engineering, is fully interoperable with many domain resources and
is easy to extend.
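As an illustration of the kind of semantic annotation such an ontology supports, the following sketch uses rdflib with placeholder IRIs and class names (OntoDM-core's actual identifiers differ); it records one algorithm run at the specification, implementation, and application layers:

```python
# Hypothetical annotation sketch in the style OntoDM-core enables;
# the ontology IRI and class names below are placeholders.
from rdflib import Graph, Namespace, Literal, RDF

ONTODM = Namespace("http://example.org/OntoDM-core#")  # placeholder IRI
EX = Namespace("http://example.org/run#")

g = Graph()
# Describe a concrete algorithm execution across the three layers:
g.add((EX.c45_spec, RDF.type, ONTODM.DataMiningAlgorithm))      # specification
g.add((EX.weka_j48, RDF.type, ONTODM.AlgorithmImplementation))  # implementation
g.add((EX.run_001, RDF.type, ONTODM.AlgorithmApplication))      # application
g.add((EX.weka_j48, ONTODM.implements, EX.c45_spec))
g.add((EX.run_001, ONTODM.executes, EX.weka_j48))
g.add((EX.run_001, ONTODM.hasInput, Literal("iris.arff")))

print(g.serialize(format="turtle"))
```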
Locally adaptive smoothing with Markov random fields and shrinkage priors
We present a locally adaptive nonparametric curve fitting method that
operates within a fully Bayesian framework. This method uses shrinkage priors
to induce sparsity in order-k differences in the latent trend function,
providing a combination of local adaptation and global control. Using a scale
mixture of normals representation of shrinkage priors, we make explicit
connections between our method and kth order Gaussian Markov random field
smoothing. We call the resulting processes shrinkage prior Markov random fields
(SPMRFs). We use Hamiltonian Monte Carlo to approximate the posterior
distribution of model parameters because this method provides superior
performance in the presence of the high dimensionality and strong parameter
correlations exhibited by our models. We compare the performance of three prior
formulations using simulated data and find the horseshoe prior provides the
best compromise between bias and precision. We apply SPMRF models to two
benchmark data examples frequently used to test nonparametric methods. We find
that this method is flexible enough to accommodate a variety of data generating
models and offers the adaptive properties and computational tractability to
make it a useful addition to the Bayesian nonparametric toolbox.
Comment: 38 pages, to appear in Bayesian Analysis
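As a reading aid, the core construction can be sketched as follows (our reconstruction from the abstract using standard horseshoe notation, not the paper's exact formulation): with latent trend values \(\theta_1, \dots, \theta_n\), a horseshoe-type SPMRF places the shrinkage prior on order-k differences,

\[
\Delta^k \theta_j \mid \tau, \lambda_j \sim \mathcal{N}(0, \tau^2 \lambda_j^2), \qquad
\lambda_j \sim \mathcal{C}^{+}(0, 1), \qquad
\tau \sim \mathcal{C}^{+}(0, \gamma),
\]

where \(\Delta\) is the difference operator (e.g. \(\Delta\theta_j = \theta_j - \theta_{j-1}\)), the local scales \(\lambda_j\) permit abrupt changes in the trend (local adaptation) while the global scale \(\tau\) controls overall smoothness; fixing \(\lambda_j \equiv 1\) recovers an ordinary kth-order Gaussian Markov random field.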
Flexible shrinkage in high-dimensional Bayesian spatial autoregressive models
This article introduces two absolutely continuous global-local shrinkage
priors to enable stochastic variable selection in the context of
high-dimensional matrix exponential spatial specifications. Existing approaches
to dealing with overparameterization problems in spatial
autoregressive specifications typically rely on computationally demanding
Bayesian model-averaging techniques. The proposed shrinkage priors can be
implemented using Markov chain Monte Carlo methods in a flexible and efficient
way. A simulation study is conducted to evaluate the performance of each of the
shrinkage priors. Results suggest that they perform particularly well in
high-dimensional environments, especially when the number of parameters to
estimate exceeds the number of observations. For an empirical illustration we
use pan-European regional economic growth data.
Comment: Keywords: Matrix exponential spatial specification, model selection, shrinkage priors, hierarchical modeling; JEL: C11, C21, C5
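For orientation, a standard formulation from the matrix-exponential-spatial-specification (MESS) literature (not quoted from this article): with an \(n \times n\) spatial weight matrix \(W\), the MESS model replaces the usual autoregressive inverse with a matrix exponential,

\[
e^{\alpha W} y = X\beta + \varepsilon, \qquad
e^{\alpha W} = \sum_{s=0}^{\infty} \frac{(\alpha W)^s}{s!}, \qquad
\varepsilon \sim \mathcal{N}(0, \sigma^2 I),
\]

and a global-local shrinkage prior of the kind proposed here takes the form \(\beta_j \mid \tau, \lambda_j \sim \mathcal{N}(0, \tau^2 \lambda_j^2)\), with local scales \(\lambda_j\) and a single global scale \(\tau\), so most coefficients are pulled toward zero while a few strong signals escape shrinkage.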
ELICA: An Automated Tool for Dynamic Extraction of Requirements Relevant Information
Requirements elicitation requires extensive knowledge and deep understanding
of the problem domain where the final system will be situated. However, in many
software development projects, analysts are required to elicit the requirements
from an unfamiliar domain, which often causes communication barriers between
analysts and stakeholders. In this paper, we propose a requirements ELICitation
Aid tool (ELICA) to help analysts better understand the target application
domain by dynamic extraction and labeling of requirements-relevant knowledge.
To extract the relevant terms, we leverage the flexibility and power of
Weighted Finite State Transducers (WFSTs) in dynamic modeling of natural
language processing tasks. In addition to the information conveyed through
text, ELICA captures and processes non-linguistic information about the
intention of speakers such as their confidence level, analytical tone, and
emotions. The extracted information is made available to the analysts as a set
of labeled snippets with highlighted relevant terms which can also be exported
as an artifact of the Requirements Engineering (RE) process. The application
and usefulness of ELICA are demonstrated through a case study. This study shows
how pre-existing relevant information about the application domain and the
information captured during an elicitation meeting, such as the conversation
and stakeholders' intentions, can be captured and used to support analysts in
achieving their tasks.
Comment: 2018 IEEE 26th International Requirements Engineering Conference Workshops
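To give a flavor of WFST-based term labeling, here is a generic sketch (not ELICA's actual model; the lexicon, weights, and labels are invented) using a single-state weighted transducer that maps tokens to relevance labels by lowest arc cost:

```python
# Minimal illustrative weighted finite-state transducer (WFST) for tagging
# requirements-relevant terms; lexicon and weights are made up.
import math

# Transitions: state -> {input token: [(output label, next state, -log prob)]}
TRANS = {
    0: {
        "login":   [("RELEVANT", 0, -math.log(0.9)), ("OTHER", 0, -math.log(0.1))],
        "user":    [("RELEVANT", 0, -math.log(0.6)), ("OTHER", 0, -math.log(0.4))],
        "<other>": [("OTHER", 0, -math.log(1.0))],
    }
}

def best_labeling(tokens):
    """Best-path decoding; with a single state, the optimal path is
    simply the lowest-cost arc for each token."""
    state, labels, cost = 0, [], 0.0
    for tok in tokens:
        arcs = TRANS[state].get(tok, TRANS[state]["<other>"])
        label, state, w = min(arcs, key=lambda a: a[2])
        labels.append((tok, label))
        cost += w
    return labels, cost

print(best_labeling(["the", "user", "login", "fails"]))
# -> tags 'user' and 'login' as RELEVANT, the rest as OTHER
```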
ASlib: A Benchmark Library for Algorithm Selection
The task of algorithm selection involves choosing an algorithm from a set of
algorithms on a per-instance basis in order to exploit the varying performance
of algorithms over a set of instances. The algorithm selection problem is
attracting increasing attention from researchers and practitioners in AI. Years
of fruitful applications in a number of domains have resulted in a large amount
of data, but the community lacks a standard format or repository for this data.
This situation makes it difficult to share and compare different approaches
effectively, as is done in other, more established fields. It also
unnecessarily hinders new researchers who want to work in this area. To address
this problem, we introduce a standardized format for representing algorithm
selection scenarios and a repository that contains a growing number of data
sets from the literature. Our format has been designed to be able to express a
wide variety of different scenarios. Demonstrating the breadth and power of our
platform, we describe a set of example experiments that build and evaluate
algorithm selection models through a common interface. The results display the
potential of algorithm selection to achieve significant performance
improvements across a broad range of problems and algorithms.
Comment: Accepted to be published in Artificial Intelligence Journal
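As an illustration of what such scenario data enables, the following sketch builds a simple per-instance selector from ASlib-style feature and runtime tables; the arrays and solver names are toy stand-ins, and the one-regressor-per-algorithm design is a common baseline approach rather than anything mandated by the format:

```python
# Toy per-instance algorithm selection on ASlib-style data; the arrays
# below stand in for a scenario's feature_values / algorithm_runs tables.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

X = np.array([[0.1, 3.0], [0.9, 1.0], [0.5, 2.0], [0.8, 0.5]])  # instance features
runtimes = {  # observed runtime of each algorithm on each instance
    "solverA": np.array([10.0, 90.0, 40.0, 80.0]),
    "solverB": np.array([70.0, 15.0, 35.0, 12.0]),
}

# Train one runtime-regression model per algorithm (a common baseline).
models = {a: RandomForestRegressor(random_state=0).fit(X, y)
          for a, y in runtimes.items()}

def select(features):
    """Pick the algorithm with the lowest predicted runtime."""
    preds = {a: m.predict([features])[0] for a, m in models.items()}
    return min(preds, key=preds.get)

print(select([0.85, 0.7]))  # likely 'solverB' on this toy data
```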