34,229 research outputs found

Benchmarking Summarizability Processing in XML Warehouses with Complex Hierarchies

Author: Darmont Jérôme
Hachicha Marouane
Kit Chantola
Publication date: 01/01/2012

Business Intelligence plays an important role in decision making. Based on data warehouses and Online Analytical Processing, a business intelligence tool can be used to analyze complex data. Still, summarizability issues in data warehouses cause ineffective analyses that may become critical problems to businesses. To settle this issue, many researchers have studied and proposed various solutions, both in relational and XML data warehouses. However, they have difficulty evaluating the performance of their proposals since the available benchmarks lack complex hierarchies. In order to contribute to summarizability analysis, this paper proposes an extension to the XML warehouse benchmark (XWeB) with complex hierarchies. The benchmark enables us to generate XML data warehouses with scalable complex hierarchies as well as summarizability processing. We experimentally demonstrated that complex hierarchies can be included in a benchmark dataset, and that our benchmark is able to compare two alternative approaches dealing with summarizability issues. Comment: 15th International Workshop on Data Warehousing and OLAP (DOLAP 2012), Maui, United States (2012)

arXiv.org e-Print Archive

GeneReg: integration of experimental data on the DNA transcription process

Author: Cortés-Calabuig Álvaro
De Moor Bart
Denecker Marc
Lemmens Karen
Marchal Kathleen
Pastor David
Publication date: 01/01/2007

Ghent University Academic Bibliography

Data Mining

Author: Parker Julian
Sloan Terence
Yau Hon
Publication date: 01/01/1998

Edinburgh Research Explorer

Finding needles in haystacks: linking scientific names, reference specimens and molecular data for Fungi

Author: Abarenkov K
Aime MC
Ariyawansa HA
Bidartondo M
Boekhout T
Buyck B
Cai Q
Cardinali G
Chen J
Crespo A
Crous PW
Damm U
De Beer ZW
Dentinger BTM
Dieguez Uribeondo J
Divakar PK
Duenas M
Duong V
Feau N
Federhen S
Fliegerova K
Garcia MA
Ge Z-W
Griffith G
Groenewald JZ
Groenewald M
Grube M
Gryzenhout M
Gueidan C
Guo L
Hambleton S
Hamelin R
Hansen K
Hofstetter V
Hong S-B
Houbraken J
Hughes K
Hyde KD
Inderbitzin P
Irinyi L
Johnston PR
Karunarathna SC
Kirk PM
Koljalg U
Kovacs GM
Kraichak E
Krizsan K
Kurtzman CP
Larsson K-H
Leavitt S
Letcher PM
Liimatainen K
Liu J-K
Lodge DJ
Luangsa-ard JJ
Lumbsch HT
Maharachchikumbura SSN
Manamgoda D
Martin MP
Meyer W
Miller AN
Minnis AM
Moncalvo J-M
Mule G
Nakasone KK
Nilsson RH
Niskanen T
Olariaga I
Papp T
Petkovits T
Pino-Bodas R
Powell MJ
Raja HA
Redecker D
Robbertse B
Robert V
Sarmiento-Ramirez JM
Schoch CL
Seifert KA
Shrestha B
Stenroos S
Stielow B
Subbarao KV
Suh S-O
Tanaka K
Tedersoo L
Teresa Telleria M
Udayanga D
Untereiner WA
Vagvoelgyi C
Visagie C
Voigt K
Walker DM
Weir BS
Weiss M
Wijayawardene NN
Wingfield MJ
Xu JP
Yang ZL
Zhang N
Zhuang W-Y
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2014

DNA phylogenetic comparisons have shown that morphology-based species recognition often underestimates fungal diversity. Therefore, the need for accurate DNA sequence data, tied to both correct taxonomic names and clearly annotated specimen data, has never been greater. Furthermore, the growing number of molecular ecology and microbiome projects using high-throughput sequencing requires fast and effective methods for en masse species assignments. In this article, we focus on selecting and re-annotating a set of marker reference sequences that represent each currently accepted order of Fungi. The particular focus is on sequences from the internal transcribed spacer region in the nuclear ribosomal cistron, derived from type specimens and/or ex-type cultures. Re-annotated and verified sequences were deposited in a curated public database at the National Center for Biotechnology Information (NCBI), namely the RefSeq Targeted Loci (RTL) database, and will be visible during routine sequence similarity searches with NR_prefixed accession numbers. A set of standards and protocols is proposed to improve the data quality of new sequences, and we suggest how type and other reference sequences can be used to improve identification of Fungi.

Shared Research Repository

Wageningen University & Research Publications

Spiral - Imperial College Digital Repository

A review of data visualization: opportunities in manufacturing sequence management.

Author: Al-Gaylani M. F.
Sackett Peter J.
Tiwari Ashutosh
Williams D.
Publication venue: 'Informa UK Limited'
Publication date: 01/10/2006

Data visualization now benefits from developments in technologies that offer innovative ways of presenting complex data. Potentially these have widespread application in communicating the complex information domains typical of manufacturing sequence management environments for global enterprises. In this paper the authors review the visualization functionalities, techniques and applications reported in the literature, map these to manufacturing sequence information presentation requirements, and identify the opportunities available and likely development paths. Current leading-edge practice in dynamic updating and communication with suppliers is not being exploited in manufacturing sequence management; it could provide significant benefits to manufacturing business. In the context of global manufacturing operations and broad-based user communities with differing needs served by common data sets, tool functionality is generally ahead of user application.

Crossref

Cranfield CERES

A unified view of data-intensive flows in business intelligence systems : a survey

Author: Abelló Gamazo Alberto
Jovanovic Petar
Romero Moral Óscar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that are still to be addressed, and how the current solutions can be applied to address these challenges. Peer Reviewed. Postprint (author's final draft).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

RACOFI: A Rule-Applying Collaborative Filtering System

Author: Boley Harold
Lemire Daniel
Publication date: 01/01/2003

In this paper we give an overview of the RACOFI (Rule-Applying Collaborative Filtering) multidimensional rating system and its related technologies. This will be exemplified with RACOFI Music, an implemented collaboration agent that assists on-line users in the rating and recommendation of audio (Learning) Objects. It lets users rate contemporary Canadian music in the five dimensions of impression, lyrics, music, originality, and production. The collaborative filtering algorithms STI Pearson, STIN2, and the Per Item Average algorithms are then employed together with RuleML-based rules to recommend music objects that best match user queries. RACOFI has been on-line since August 2003 at http://racofi.elg.ca.

CogPrints Cognitive Sciences Eprint Archive