Data generator for evaluating ETL process quality

Author: Abelló Gamazo Alberto
Jovanovic Petar
Nakuçi Emona
Theodorou Vasileios
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

Obtaining the right set of data for evaluating the fulfillment of different quality factors in extract-transform-load (ETL) process design is challenging. First, real data might be out of reach due to privacy constraints, while manually providing a synthetic dataset is a labor-intensive task that must take various combinations of process parameters into account. More importantly, a single dataset usually does not represent the evolution of data throughout the complete process lifespan, and hence misses many possible test cases. To facilitate this demanding task, in this paper we propose an automatic data generator, Bijoux. Starting from a given ETL process model, Bijoux extracts the semantics of data transformations, analyzes the constraints they imply over input data, and automatically generates testing datasets. Bijoux is highly modular and configurable, enabling end-users to generate datasets for a variety of test scenarios (e.g., evaluating specific parts of an input ETL process design, with different input dataset sizes, different distributions of data, and different operation selectivities). We have developed a running prototype that implements the functionality of our data generation framework, and here we report our experimental findings showing the effectiveness and scalability of our approach. Peer reviewed. Postprint (author's final draft).
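The idea of generating data from the constraints a transformation implies can be illustrated with a minimal sketch. It is not Bijoux's implementation: the function `generate_rows`, the `amount` field, and the filter `amount > threshold` are hypothetical names chosen for illustration. The sketch assumes a single filter operation and produces rows so that a chosen fraction of them (the operation selectivity) passes the filter.

```python
import random

def generate_rows(n, selectivity, threshold, seed=0):
    """Return n rows; round(n * selectivity) of them satisfy amount > threshold.

    Hypothetical helper: the filter predicate and field name are assumptions.
    """
    rng = random.Random(seed)
    n_pass = round(n * selectivity)
    rows = []
    for i in range(n):
        if i < n_pass:
            amount = threshold + rng.randint(1, 100)  # satisfies the filter
        else:
            amount = threshold - rng.randint(0, 100)  # violates the filter
        rows.append({"id": i, "amount": amount})
    rng.shuffle(rows)  # avoid ordering the rows by outcome
    return rows

rows = generate_rows(n=1000, selectivity=0.25, threshold=500)
passed = [r for r in rows if r["amount"] > 500]
print(len(passed) / len(rows))
```

Changing `n` or `selectivity` yields datasets for different test scenarios without touching real data.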

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Supervised Learning for Coverage-Directed Test Selection in Simulation-Based Verification

Author: Blackmore Tim
Eder Kerstin
Masamba Nyasha
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 26/09/2022

Explore Bristol Research

Targeted Automatic Integer Overflow Discovery Using Goal-Directed Conditional Branch Enforcement

Author: Brumley D.
Cadar C.
Cowan C.
Dietz W.
Drewry W.
Haller I.
Long F.
Rinard M. C.
Röning J.
Seacord R.
Sharif M. I.
Sidiroglou S.
Sutton M.
Tielei W.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/03/2015

We present a new technique and system, DIODE, for automatically generating inputs that trigger overflows at memory allocation sites. DIODE is designed to identify relevant sanity checks that inputs must satisfy to trigger overflows at target memory allocation sites, then generate inputs that satisfy these sanity checks to successfully trigger the overflow. DIODE works with off-the-shelf, production x86 binaries. Our results show that, for our benchmark set of applications, and for every target memory allocation site exercised by our seed inputs (which the applications process correctly with no overflows), either 1) DIODE is able to generate an input that triggers an overflow at that site or 2) there is no input that would trigger an overflow for the observed target expression at that site. United States. Defense Advanced Research Projects Agency (Grant FA8650-11-C-7192)
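The interplay between sanity checks and an overflowing size expression can be shown with a toy example. This is a sketch under stated assumptions, not DIODE itself: the target expression `width * height * 4`, the 32-bit wraparound, the bounds in `passes_sanity_checks`, and the candidate list are all hypothetical, and a plain search over candidates stands in for the input generation the abstract describes.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned arithmetic

def passes_sanity_checks(width, height):
    # Hypothetical checks an application might apply before allocating.
    return 0 < width <= 0xFFFF and 0 < height <= 0xFFFF

def alloc_size(width, height):
    # Size as a 32-bit program would compute it (wraps on overflow).
    return (width * height * 4) & MASK32

def overflows(width, height):
    # The exact product no longer fits in 32 bits.
    return width * height * 4 > MASK32

def find_overflow_input(candidates):
    """Return the first candidate that passes the checks and overflows."""
    for width, height in candidates:
        if passes_sanity_checks(width, height) and overflows(width, height):
            return width, height
    return None

seed_input = (640, 480)  # processed correctly, no overflow
candidates = [seed_input, (0x10000, 0x10000), (0x8000, 0x8000)]
found = find_overflow_input(candidates)
print(found, alloc_size(*found))
```

Here `(0x10000, 0x10000)` overflows but is rejected by the sanity checks, while `(0x8000, 0x8000)` passes them and wraps the computed size to zero.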

CiteSeerX

DSpace@MIT

Crossref

Dependency parsing resources for French: Converting acquired lexical functional grammar F-Structure annotations and parsing F-Structures directly

Author: Schluter Natalie
van Genabith Josef
Publication date: 01/01/2009

Recent years have seen considerable success in the automatic generation of wide-coverage deep grammars for natural language processing, given reliable and large CFG-like treebanks. For research within the Lexical Functional Grammar (LFG) framework, these deep grammars are typically based on an extended PCFG parsing scheme from which dependencies are extracted. However, increasing success in statistical dependency parsing suggests that such deep grammar approaches to statistical parsing could be streamlined. In this paper we explore this novel approach to deep grammar parsing within the framework of LFG, for French, showing that the best results (an f-score of 69.46) for the established integrated architecture may be obtained for French.

Irish Universities

DCU Online Research Access Service

DSpace at Tartu University Library

Practical applications of multi-agent systems in electric power systems

Author: Ai
Amin
Beck
Belkacemi
Bellifemine
Brown
Catterson
Catterson
Chao
Cockburn
Davidson
Dimeas
Finin
Franklin
Gómez-Pérez
Hayes-Roth
Hossack
Jennings
Jiang
Ko
Kok
Kok
Li
Liu
Maes
McArthur
McArthur
McArthur
McIlraith
Mullen
Muscettola
Russell
Saleem
Solanki
Somani
Staszesky
Talukdar
Vale
Wedde
Wittig
Wooldridge
Zabet
Zabet
Publication venue: 'Wiley'
Publication date: 01/03/2012

The transformation of energy networks from passive to active systems requires the embedding of intelligence within the network. One suitable approach to integrating distributed intelligent systems is multi-agent systems technology, where components of functionality run as autonomous agents capable of interaction through messaging. This provides loose coupling between components, which can benefit the complex systems envisioned for the smart grid. This paper reviews the key milestones of demonstrated agent systems in the power industry and considers which aspects of agent design must still be addressed for widespread application of agent technology to occur.

Crossref

University of Strathclyde Institutional Repository

ACETest: Automated Constraint Extraction for Testing Deep Learning Operators

Author: Chen Yufeng
Huo Wei
Li Yeting
Li Yuekang
Shi Jingyi
Su Hui
Xiao Yang
Yu Chendong
Yu Dongsong
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/05/2023

Deep learning (DL) applications are prevalent nowadays as they can help with multiple tasks. DL libraries are essential for building DL applications, and DL operators, which compute on multi-dimensional data (tensors), are the important building blocks of these libraries. Bugs in DL operators can therefore have great impact. Testing is a practical approach for detecting bugs in DL operators. To test DL operators effectively, test cases must pass the input validity check and reach the core function logic of the operators. Hence, extracting the input validation constraints is required for generating high-quality test cases. Existing techniques rely on either human effort or documentation of DL library APIs to extract the constraints. They cannot extract complex constraints, and the extracted constraints may differ from the actual code implementation. To address this challenge, we propose ACETest, a technique that automatically extracts input validation constraints from the code to build valid yet diverse test cases, which can effectively unveil bugs in the core function logic of DL operators. For this purpose, ACETest automatically identifies the input validation code in DL operators, extracts the related constraints, and generates test cases according to the constraints. The experimental results on the popular DL libraries TensorFlow and PyTorch demonstrate that ACETest can extract constraints of higher quality than state-of-the-art (SOTA) techniques. Moreover, ACETest is capable of extracting 96.4% more constraints and detecting 1.95 to 55 times more bugs than SOTA techniques. In total, we have used ACETest to detect 108 previously unknown bugs in TensorFlow and PyTorch, with 87 of them confirmed by the developers. Lastly, five of the bugs were assigned CVE IDs due to their security impact. Comment: Accepted by ISSTA 202
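The last step described above, generating test cases that satisfy extracted validation constraints, can be sketched as follows. This is an illustration under assumptions, not ACETest's method: the constraints are written by hand for a hypothetical matrix-multiply-like operator rather than extracted from library code, and simple rejection sampling stands in for whatever generation strategy the tool uses. All names (`CONSTRAINTS`, `is_valid`, `generate_valid_cases`) are hypothetical.

```python
import random

# Hand-written validation constraints over two input shapes (assumed, not
# extracted): both inputs are rank 2, all dimensions are positive, and the
# inner dimensions match.
CONSTRAINTS = [
    lambda a, b: len(a) == 2 and len(b) == 2,
    lambda a, b: all(d >= 1 for d in a + b),
    lambda a, b: a[1] == b[0],
]

def is_valid(a, b):
    # Constraints are checked in order; evaluation stops at the first failure.
    return all(check(a, b) for check in CONSTRAINTS)

def random_shape(rng):
    rank = rng.randint(1, 3)
    return tuple(rng.randint(0, 4) for _ in range(rank))

def generate_valid_cases(count, seed=0, max_tries=100_000):
    """Rejection-sample shape pairs until `count` pairs pass every constraint."""
    rng = random.Random(seed)
    cases = []
    for _ in range(max_tries):
        a, b = random_shape(rng), random_shape(rng)
        if is_valid(a, b):
            cases.append((a, b))
            if len(cases) == count:
                break
    return cases

cases = generate_valid_cases(5)
for a, b in cases:
    print(a, b)
```

Inputs produced this way would pass the assumed validity check, so a test harness could spend its effort on the operator's core logic rather than on rejected inputs.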

arXiv.org e-Print Archive