36,117 research outputs found
Discovering Dynamic Integrity Rules with a Rules-Based Tool for Data Quality Analyzing
Rules-based approaches to data quality often use business rules or integrity rules for data monitoring purposes. Integrity rules are constraints on data, derived from business rules and expressed in a formal form to allow computerization. One of the challenges of these approaches is rule discovery, which is usually performed manually by business experts or system analysts based on experience. In this paper, we present our rule-based approach to data quality analysis, in which we discuss a comprehensive method for discovering dynamic integrity rules.
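A minimal sketch of what a dynamic integrity rule can look like once formalized, assuming an illustrative order-status example (the status values and rule below are not from the paper): the rule constrains how a record may change over time, and a checker flags transitions that violate it.

```python
# Hypothetical dynamic integrity rule: an order's status may only move
# forward through this sequence, never backward.
STATUS_ORDER = ["created", "paid", "shipped", "delivered"]

def violates_status_rule(old_status: str, new_status: str) -> bool:
    """Return True if the transition breaks the monotonic-status rule."""
    return STATUS_ORDER.index(new_status) < STATUS_ORDER.index(old_status)

# Flag violating transitions observed in a change log.
transitions = [("created", "paid"), ("shipped", "paid"), ("paid", "shipped")]
violations = [t for t in transitions if violates_status_rule(*t)]
```

Once rules are in this executable form, a monitoring tool can evaluate them continuously against incoming data, which is the computerization the abstract refers to.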
Automatic extraction of knowledge from web documents
A large amount of the digital information available is written as text documents in the form of web pages, reports, papers, emails, etc. Extracting the knowledge of interest from such documents from multiple sources in a timely fashion is therefore crucial. This paper provides an update on the Artequakt system, which uses natural language tools to automatically extract knowledge about artists from multiple documents based on a predefined ontology. The ontology represents the type and form of knowledge to extract. This knowledge is then used to generate tailored biographies. The information extraction process of Artequakt is detailed and evaluated in this paper.
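A toy sketch of the ontology-guided extraction idea, assuming illustrative slot names and simple patterns (the real Artequakt system uses full natural language tools, not regular expressions): the ontology defines which facts to look for, and the extractor fills those slots from free text.

```python
import re

# Hypothetical ontology slots for an artist, each paired with a simple
# textual pattern that can fill it (illustrative only).
ONTOLOGY_SLOTS = {
    "birth_year": re.compile(r"born in (\d{4})"),
    "birth_place": re.compile(r"born in \d{4} in ([A-Z][a-z]+)"),
}

def extract_facts(text: str) -> dict:
    """Fill ontology slots from a document, skipping slots with no match."""
    facts = {}
    for slot, pattern in ONTOLOGY_SLOTS.items():
        match = pattern.search(text)
        if match:
            facts[slot] = match.group(1)
    return facts

doc = "Rembrandt was born in 1606 in Leiden and painted hundreds of works."
facts = extract_facts(doc)
```

Filled slots like these are what a downstream generator can assemble into a tailored biography.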
A unified view of data-intensive flows in business intelligence systems: a survey
Data-intensive flows are central processes in today's business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today's research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still are to be addressed, and how the current solutions can be applied for addressing these challenges.
Peer reviewed. Postprint (author's final draft).
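A minimal sketch of the two flow styles the survey contrasts, with all names illustrative: a batched ETL step that transforms and loads rows into a warehouse, and an on-demand flow that integrates fresh source rows at read time.

```python
def etl_batch(source_rows, warehouse):
    """Extract, transform (normalize names), and load into the warehouse."""
    for row in source_rows:
        warehouse.append({"name": row["name"].strip().title(),
                          "amount": row["amount"]})

def on_demand_view(warehouse, live_rows):
    """Integrate already-loaded warehouse data with fresh rows at query time."""
    return warehouse + [{"name": r["name"].title(), "amount": r["amount"]}
                        for r in live_rows]

warehouse = []
etl_batch([{"name": "  alice ", "amount": 10}], warehouse)
result = on_demand_view(warehouse, [{"name": "bob", "amount": 5}])
```

The point of the sketch is the split: the same transformation logic appears once in a scheduled batch path and once in a runtime path, and next generation BI systems need both to coexist.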
Who watches the watchers: Validating the ProB Validation Tool
Over the years, ProB has moved from a tool that complemented proving to a development environment that is now sometimes used instead of proving for applications such as exhaustive model checking or data validation. This has led to much more stringent requirements on the integrity of ProB. In this paper we present a summary of our validation efforts for ProB, in particular within the context of the norm EN 50128 and safety-critical applications in the railway domain.
Comment: In Proceedings F-IDE 2014, arXiv:1404.578
A Practical Approach to Protect IoT Devices against Attacks and Compile Security Incident Datasets
Open access article. The Internet of Things (IoT) introduced the opportunity of remotely manipulating home appliances (such as heating systems, ovens, blinds, etc.) using computers and mobile devices. This idea fascinated people and originated a boom of IoT devices, together with an increasing demand that was difficult to support. Many manufacturers quickly created hundreds of devices implementing functionalities but neglected some critical issues pertaining to device security. This oversight gave rise to the current situation, where thousands of devices remain unpatched, having many security issues that manufacturers cannot address after the devices have been produced and deployed. This article presents our novel research on protecting IoT devices using Berkeley Packet Filters (BPFs) and evaluates our findings with the aid of our Filter.tlk tool, which facilitates the development of BPF expressions that can be executed by GNU/Linux systems with a low impact on network packet throughput.
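An illustrative sketch of the allowlist idea behind BPF-based protection, assuming a hypothetical helper (the Filter.tlk tool mentioned in the article generates such expressions; the function and its inputs below are not from the article): compose a tcpdump/BPF-style filter expression that admits only the traffic an IoT device is expected to see.

```python
# Hypothetical helper: build a BPF filter expression from an allowlist of
# (protocol, port) pairs that an IoT device legitimately uses.
def build_allowlist_filter(allowed):
    """Compose a tcpdump/BPF-style expression, OR-ing one clause per pair."""
    clauses = [f"({proto} port {port})" for proto, port in allowed]
    return " or ".join(clauses)

# e.g. a device that only needs HTTPS and DNS:
expr = build_allowlist_filter([("tcp", 443), ("udp", 53)])
```

The resulting expression string can be handed to any BPF-capable consumer, such as tcpdump, which compiles it to a kernel-level filter; running the filter in the kernel is what keeps the packet-throughput impact low.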
Mining Techniques For Invariants In Cloud Computing
The increasing popularity of Software as a Service (SaaS) stresses the need for solutions to predict failures and avoid service interruptions, which invariably result in SLA violations and severe loss of revenue. A promising approach to continuously monitor the correct functioning of the system is to check the execution's conformance to a set of invariants, i.e., properties that must hold when the system is deemed to run correctly. This paper proposes a technique to spot true anomalies using various data mining techniques, such as clustering, association rules, and decision tree algorithms, which help find hidden and previously unknown information in the database. We assess the techniques in two applications of invariants, namely execution characterization and anomaly detection, using the metrics of coverage, recall, and precision. In this work, two real-world datasets have been used for detecting the anomalies: the publicly available Google datacenter dataset and a dataset of a commercial SaaS utility computing platform.
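A minimal sketch of invariant-based anomaly detection in the simplest form, a per-metric range invariant (the paper combines richer mining techniques such as clustering, association rules, and decision trees; the metric names and thresholds below are illustrative): learn bounds from executions deemed correct, then flag samples that fall outside them.

```python
def learn_invariants(samples):
    """Learn per-metric (min, max) range invariants from correct executions."""
    keys = samples[0].keys()
    return {k: (min(s[k] for s in samples), max(s[k] for s in samples))
            for k in keys}

def anomalies(invariants, samples):
    """Flag samples where any metric falls outside its learned range."""
    return [s for s in samples
            if any(not (lo <= s[k] <= hi)
                   for k, (lo, hi) in invariants.items())]

train = [{"cpu": 20, "mem": 40}, {"cpu": 35, "mem": 55}]
inv = learn_invariants(train)  # cpu in [20, 35], mem in [40, 55]
flagged = anomalies(inv, [{"cpu": 30, "mem": 50}, {"cpu": 90, "mem": 45}])
```

Coverage, recall, and precision then measure how well such learned invariants characterize correct executions and separate true anomalies from false alarms.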
Data Mining in Electronic Commerce
Modern business is rushing toward e-commerce. If the transition is done
properly, it enables better management, new services, lower transaction costs
and better customer relations. Success depends on skilled information
technologists, among whom are statisticians. This paper focuses on some of the
contributions that statisticians are making to help change the business world,
especially through the development and application of data mining methods. This
is a very large area, and the topics we cover are chosen to avoid overlap with
other papers in this special issue, as well as to respect the limitations of
our expertise. Inevitably, electronic commerce has raised and is raising fresh
research problems in a very wide range of statistical areas, and we try to
emphasize those challenges.
Comment: Published at http://dx.doi.org/10.1214/088342306000000204 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
- …