684 research outputs found

    A Comprehensive and Modularized Platform for Time Series Forecast and Analytics

    Users who work with time series data typically disaggregate time series problems into various isolated tasks and use specific libraries, packages, tools, and services for each individual task. However, the tools used are often fragmented: analysts have to load different packages for common tasks such as data preprocessing, clustering, feature extraction, forecasting, hierarchical reconciliation, evaluation, and visualization. This disclosure describes a reliable, scalable infrastructure that meets the varied needs of time series practitioners without adding engineering overhead. The infrastructure is modularized, and the modules are connected via a flow-type declarative language, which makes the infrastructure extensible and future-proof. Practitioners can use the entire infrastructure or only certain modules, while performing other operations using first- or third-party libraries or pipelines.
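    The modular, declaratively connected pipeline the abstract describes can be illustrated with a minimal sketch. All module names and the `FLOW` declaration below are hypothetical illustrations, not the platform's actual API:

```python
# Minimal sketch of a declarative, modular time-series pipeline.
# Module names and the FLOW declaration are hypothetical, not the
# disclosed platform's actual API.

def preprocess(series):
    # Fill gaps by carrying the last observation forward.
    filled, last = [], None
    for x in series:
        last = x if x is not None else last
        filled.append(last)
    return filled

def moving_average_forecast(series, window=3, horizon=2):
    # Forecast each future step as the mean of the trailing window.
    history = list(series)
    out = []
    for _ in range(horizon):
        pred = sum(history[-window:]) / window
        history.append(pred)
        out.append(pred)
    return out

# "Flow"-style declaration: modules are connected simply by listing
# them in order, so any step can be swapped for a first- or
# third-party implementation.
FLOW = [preprocess,
        lambda s: moving_average_forecast(s, window=3, horizon=2)]

def run(flow, data):
    for step in flow:
        data = step(data)
    return data

print(run(FLOW, [1.0, None, 3.0, 4.0, 5.0]))
```

    Because each step is an ordinary callable, the same `run` driver works whether a module comes from the platform or from an external library.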

    A Service-Oriented Approach for Network-Centric Data Integration and Its Application to Maritime Surveillance

    Maritime-surveillance operators still demand an integrated maritime picture that better supports international coordination of their operations, as sought in the European area. In this area, many past data-integration efforts were framed as the problem of designing, building, and maintaining huge centralized repositories. Current research activities instead leverage service-oriented principles to achieve more flexible and network-centric solutions to systems and data integration. In this direction, this article reports on the design of a SOA platform, the Service and Application Integration (SAI) system, targeting novel approaches for legacy data and systems integration in the maritime surveillance domain. We have developed a proof-of-concept of the main system capabilities to assess the feasibility of our approach and to evaluate how the SAI middleware architecture can fit application requirements for dynamic data search, aggregation, and delivery in the distributed maritime domain.

    Advanced grouping and aggregation for data integration


    Genomic data integration and user-defined sample-set extraction for population variant analysis

    Population variant analysis is of great importance for gathering insights into the links between human genotype and phenotype. The 1000 Genomes Project established a valuable reference for human genetic variation; however, the integrative use of the corresponding data with other datasets within existing repositories and pipelines is not fully supported. In particular, there is a pressing need for flexible and fast selection of population partitions based on their variant- and metadata-related characteristics.
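    The "selection of population partitions based on metadata" that the abstract calls for can be sketched as a simple metadata filter. The record fields and sample entries below are illustrative stand-ins, not the actual 1000 Genomes metadata schema:

```python
# Hedged sketch: selecting a population partition from sample metadata.
# Field names and sample records are illustrative, not the actual
# 1000 Genomes metadata schema.

samples = [
    {"id": "HG00096", "population": "GBR", "super_pop": "EUR", "sex": "male"},
    {"id": "HG00171", "population": "FIN", "super_pop": "EUR", "sex": "female"},
    {"id": "NA18525", "population": "CHB", "super_pop": "EAS", "sex": "female"},
]

def select_partition(samples, **criteria):
    # Keep only samples whose metadata match every requested criterion.
    return [s for s in samples
            if all(s.get(k) == v for k, v in criteria.items())]

print([s["id"] for s in select_partition(samples, super_pop="EUR")])
# → ['HG00096', 'HG00171']
```

    Arbitrary metadata criteria compose by keyword argument, which is the kind of flexible partition selection the abstract identifies as missing from existing pipelines.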

    Dynamic Integration of Evolving Distributed Databases using Services

    This thesis investigates the integration of many separate, heterogeneous, and distributed existing databases which, due to organizational changes, must be merged and appear as one database. A solution to some database evolution problems is presented. The thesis presents an Evolution Adaptive Service-Oriented Data Integration Architecture (EA-SODIA) to dynamically integrate heterogeneous and distributed source databases, aiming to minimize the maintenance cost caused by database evolution. An algorithm, named Relational Schema Mapping by Views (RSMV), is designed to integrate source databases that are exposed as services into a pre-designed global schema held in a data-integrator service. Instead of producing hard-coded programs, views are built using relational algebra operations to eliminate the heterogeneities among the source databases. More importantly, the definitions of those views are represented and stored in a meta-database, together with constraints to test their validity. Consequently, a method called Evolution Detection can identify in the meta-database the views affected by an evolution and modify them automatically. An evaluation is presented using a case study. Firstly, it shows that most types of heterogeneity defined in the thesis can be eliminated by RSMV, with the exception of semantic conflicts. Secondly, it shows that little manual modification of the system is required as long as the evolutions follow the rules; human intervention is required for only three types of database evolution, in which case some existing views are discarded. Thirdly, the computational cost of the automatic modification shows slow linear growth in the number of source databases. Other characteristics addressed include EA-SODIA's scalability, domain independence, autonomy of source databases, and potential to involve other data sources (e.g. XML). Finally, a descriptive comparison with other data integration approaches shows that, although other approaches may provide better query-processing performance in some circumstances, the service-oriented architecture provides better autonomy, flexibility, and capability of evolution.
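    The core RSMV idea, i.e. resolving source heterogeneity through view definitions rather than hard-coded programs, can be sketched with two dissimilar source tables unified behind one view. The table names, column names, and data below are hypothetical, not the thesis's actual schemas:

```python
# Hedged sketch of view-based schema mapping: two heterogeneous source
# tables are exposed through a view conforming to a global schema, so
# when a source evolves only the view definition needs to change.
# Table/column names and rows are hypothetical.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE src_a_customers (cust_id INTEGER, full_name TEXT);
    CREATE TABLE src_b_clients   (id INTEGER, fname TEXT, lname TEXT);
    INSERT INTO src_a_customers VALUES (1, 'Ada Lovelace');
    INSERT INTO src_b_clients   VALUES (2, 'Alan', 'Turing');

    -- The global schema sees one uniform relation; the heterogeneity
    -- (different key names, split vs. combined names) is resolved
    -- entirely inside the view definition.
    CREATE VIEW global_customer AS
        SELECT cust_id AS id, full_name AS name FROM src_a_customers
        UNION ALL
        SELECT id, fname || ' ' || lname FROM src_b_clients;
""")
print(con.execute("SELECT id, name FROM global_customer ORDER BY id").fetchall())
# → [(1, 'Ada Lovelace'), (2, 'Alan Turing')]
```

    If `src_b_clients` later renames or splits a column, only the view's `SELECT` clause is rewritten; queries against `global_customer` are untouched, which is the maintenance saving the thesis attributes to storing view definitions in a meta-database.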

    GEM: requirement-driven generation of ETL and multidimensional conceptual designs

    Technical report. At the early stages of a data warehouse design project, the main objective is to collect the business requirements and needs and translate them into an appropriate conceptual, multidimensional design. Typically, this task is performed manually through a series of interviews involving two different parties: the business analysts and the technical designers. Producing an appropriate conceptual design is an error-prone task that undergoes several rounds of reconciliation and redesign until the business needs are satisfied. It is of great importance for the business of an enterprise to facilitate and automate such a process. The goal of our research is to provide designers with a semi-automatic means for producing conceptual multidimensional designs and also conceptual representations of the extract-transform-load (ETL) processes that orchestrate the data flow from the operational sources to the data warehouse constructs. In particular, we describe a method that combines information about the data sources with the business requirements to validate and complete, if necessary, these requirements, produce a multidimensional design, and identify the ETL operations needed. We present our method in terms of the TPC-DS benchmark and show its applicability and usefulness.

    Synthesizing System Integration Requirements Model Fragments

    Systems integration is an enduring issue in organizations. Many organizations have been faced with the predicament of managing large and complex IT infrastructures accumulated over the years. Before proposing a suitable integration architecture and selecting appropriate implementation solutions, a holistic and clear understanding of the enterprise-wide integration requirements among various internal and external systems is needed. This paper builds on prior literature on conceptual modelling of integration requirements to present an algorithm that synthesizes model fragments, i.e., piecemeal sections of the integration requirements. The details of the algorithm, which synthesizes two or more model fragments into a single integration requirements model, are presented in this paper. An empirical assessment of the algorithm's generated integration solution is made by comparing it against one produced manually.