14 research outputs found

    A BPMN-Based Design and Maintenance Framework for ETL Processes

    Business Intelligence (BI) applications require the design, implementation, and maintenance of processes that extract, transform, and load suitable data for analysis. The development of these processes (known as ETL) is an inherently complex problem that is typically costly and time-consuming. In a previous work, we proposed a vendor-independent language for reducing the design complexity caused by disparate ETL languages tailored to specific design tools with steep learning curves. Nevertheless, the designer still faces two major issues during the development of ETL processes: (i) how to implement the designed processes in an executable language, and (ii) how to maintain the implementation when the organization's data infrastructure evolves. In this paper, we propose a model-driven framework that adds automatic code generation and improved maintenance support to our ETL language. We present a set of model-to-text transformations able to produce code for different commercial ETL tools, as well as model-to-model transformations that automatically update the ETL models in order to keep the generated code consistent with evolving data sources. A demonstration using an example is conducted as an initial validation to show that the framework, covering modeling, code generation, and maintenance, could be used in practice.
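The model-to-text idea can be sketched in a few lines. The model classes and the SQL template below are hypothetical illustrations, not the paper's actual metamodel or target language:

```python
from dataclasses import dataclass

@dataclass
class ExtractStep:          # abstract "extract" element of the ETL model
    source_table: str
    columns: list

@dataclass
class LoadStep:             # abstract "load" element of the ETL model
    target_table: str

def generate_sql(extract: ExtractStep, load: LoadStep) -> str:
    """Model-to-text: render the abstract ETL model as vendor code (here, SQL)."""
    cols = ", ".join(extract.columns)
    return (f"INSERT INTO {load.target_table} ({cols})\n"
            f"SELECT {cols} FROM {extract.source_table};")

print(generate_sql(ExtractStep("sales_raw", ["id", "amount"]),
                   LoadStep("sales_dw")))
```

Swapping the template while keeping the same abstract model is what makes this style of generation vendor-independent.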

    The Specification of ETL Transformation Operations based on Weaving Models

    In the ETL process, the transformation of data is achieved through the execution of a set of transformation operations. The realization of this process (the order in which the transformation operations must be executed) should be preceded by a specification of the transformation process at a higher level of abstraction. The specification is given through mappings representing abstract operations specific to the transformation process. These mappings are defined through weaving models and metamodels. A generated weaving metamodel (GWMM) is proposed that gives the complete mapping semantics through specific link types (representing the abstract operations) and appropriate OCL constraints. Weaving models specifying the actual mappings must conform to this proposed GWMM.
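A weaving model of this kind can be pictured as a set of typed links between source and target elements; the link types and the constraint below are invented for illustration, standing in for the GWMM's link types and OCL constraints:

```python
# Each link carries an abstract operation type plus the source and target
# elements it connects (hypothetical names, not the paper's metamodel).
links = [
    {"type": "Rename", "source": ["cust_name"],      "target": ["customer_name"]},
    {"type": "Merge",  "source": ["first", "last"],  "target": ["full_name"]},
]

def check_links(links):
    """Mimic an OCL well-formedness constraint: a Merge link must
    connect at least two source elements to its target."""
    for link in links:
        if link["type"] == "Merge" and len(link["source"]) < 2:
            return False
    return True

print(check_links(links))  # -> True
```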

    Methodological challenges and analytic opportunities for modeling and interpreting Big Healthcare Data

    Managing, processing, and understanding big healthcare data is challenging, costly, and demanding. Without a robust fundamental theory for representation, analysis, and inference, a roadmap for the uniform handling and analysis of such complex data remains elusive. In this article, we outline various big data challenges, opportunities, modeling methods, and software techniques for blending complex healthcare data, advanced analytic tools, and distributed scientific computing. Using imaging, genetic, and healthcare data, we provide examples of processing heterogeneous datasets using distributed cloud services, automated and semi-automated classification techniques, and open-science protocols. Despite substantial advances, innovative new technologies need to be developed that enhance, scale, and optimize the management and processing of large, complex, and heterogeneous data. Stakeholder investments in data acquisition, research and development, computational infrastructure, and education will be critical to realize the huge potential of big data, to reap the expected information benefits, and to build lasting knowledge assets. Multi-faceted proprietary, open-source, and community developments will be essential to enable broad, reliable, sustainable, and efficient data-driven discovery and analytics. Big data will affect every sector of the economy, and its hallmark will be ‘team science’.

    Revisión sistemática de la integración de modelos de desarrollo de software dirigido por modelos y metodologías ágiles

    Currently, parts of the software development industry still rely on manual activities and/or heavyweight methodologies that are often cumbersome and inefficient. This situation raises several issues related to the difficulty of producing software in a timely, agile manner, at low cost, and with a high level of quality. One way to improve this situation is to incorporate into the software development process the formalism and abstraction needed to automate and optimize the most critical tasks defined by the methodologies used in software companies, starting from an agile approach. This would add value to the business and would significantly improve the software process. In this sense, in order to make known the benefits of agile approaches and model-driven development environments, a systematic literature review has been conducted on the projects worldwide where these approaches have been integrated. In addition, it has been possible to identify some benefits reported by different studies.

    DSS from an RE perspective: A systematic mapping

    Decision support systems (DSS) provide a unified analytical view of business data to better support decision-making processes. Such systems have shown a high level of user satisfaction and return on investment. However, several surveys stress the high failure rate of DSS projects. This problem results from setting the wrong requirements by approaching DSS in the same way as operational systems, whereas a specific approach is needed. Although this is well known, there is still a surprising gap on how to address requirements engineering (RE) for DSS. To overcome this problem, we conducted a systematic mapping study to identify and classify the literature on DSS from an RE perspective. Twenty-seven primary studies that addressed the main stages of RE were selected, mapped, and classified into 39 models, 27 techniques, and 54 items of guidance. We have also identified a gap in the literature on how to design the main DSS constructs (typically, the data warehouse and data flows) in a methodological manner from the business needs. We believe this study will help practitioners better address the RE stages of DSS projects.

    TPVS: treasury product valoration system

    This document presents the TPVS project, its conception and development. TPVS is a system that emerged from a specific need raised by the company Management Solutions Colombia: to update and automate its data analysis process, which is well defined at the enterprise level but not at the technological level. This is mainly because the treasury area lacks software solutions that fit the company's specific needs. A system has therefore been implemented that meets the user's requirements while also serving as the basis of a software product that can be explored, built upon, and extended by those who wish to make use of this project.

    I2ECR: Integrated and Intelligent Environment for Clinical Research

    Clinical trials are designed to produce new knowledge about a certain disease, drug, or treatment. During these studies, a huge amount of data is collected about participants, therapies, clinical procedures, outcomes, adverse events, and so on. A multicenter, randomized, phase III clinical trial in hematology enrolls up to hundreds of subjects and evaluates post-treatment outcomes on stratified subgroups of subjects for a period of many years. Data collection in clinical trials is therefore becoming complex, with a huge number of clinical and biological variables. Outside the medical field, data warehouses (DWs) are widely employed. A data warehouse is a “collection of integrated, subject-oriented databases designed to support the decision-making process”. To verify whether DWs might be useful for data quality and association analysis, a team of biomedical engineers, clinicians, biologists, and statisticians developed the “I2ECR” project. I2ECR is an Integrated and Intelligent Environment for Clinical Research where clinical and omics data stand together for clinical use (reporting) and for the generation of new clinical knowledge. I2ECR has been built from the “MCL0208” phase III, prospective clinical trial, sponsored by the Fondazione Italiana Linfomi (FIL); this is a translational study, accounting for many clinical data, along with several clinical prognostic indexes (e.g. MIPI - Mantle Cell Lymphoma International Prognostic Index), pathological information, treatment and outcome data, biological assessments of disease (MRD - Minimal Residual Disease), as well as many ancillary biological studies, such as mutational analysis, Gene Expression Profiling (GEP), and pharmacogenomics. Forty-eight Italian medical centers were actively involved in this trial, for a total of 300 enrolled subjects. The main objectives of I2ECR are:
    • to propose an integration project based on clinical and molecular data quality concepts, applying clear raw-data analysis as well as clinical trial monitoring strategies to implement a digital platform where clinical, biological, and “omics” data are imported from different sources and well integrated in a data warehouse;
    • to be a dynamic repository of data congruency quality rules. I2ECR makes it possible to monitor, in a semi-automatic manner, the quality of the data, in relation to the clinical data imported from eCRFs (electronic Case Report Forms) and to the biological and mutational datasets edited internally by local laboratories. I2ECR is therefore able to detect missing data and mistakes arising from non-conventional data-entry activities at the centers;
    • to provide clinical stakeholders with a platform where they can easily design statistical and data mining analyses. The term Data Mining (DM) identifies a set of tools for searching for hidden patterns of interest in large and multivariate datasets. The applications of DM techniques in the medical field range from outcome prediction and patient classification to genomic medicine and molecular biology. I2ECR allows clinical stakeholders to propose innovative methods of supervised and unsupervised feature extraction, data classification, and statistical analysis on heterogeneous datasets associated with the MCL0208 clinical trial.
    Although the MCL0208 study is the first example of data population of I2ECR, the environment will also be able to import data from clinical studies designed for other onco-hematologic diseases.
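Data congruency rules of the kind described above can be pictured as simple record-level checks; the field names, value range, and rules below are hypothetical, not I2ECR's actual rule set:

```python
# Hypothetical eCRF-style records; None stands for a missing entry.
records = [
    {"patient_id": "P01", "mipi_score": 5.8,  "mrd_status": "negative"},
    {"patient_id": "P02", "mipi_score": None, "mrd_status": "positive"},
]

def find_violations(records):
    """Semi-automatic quality check: flag missing values and
    out-of-range scores, one issue per offending record."""
    issues = []
    for rec in records:
        if rec["mipi_score"] is None:
            issues.append((rec["patient_id"], "missing mipi_score"))
        elif not (0 <= rec["mipi_score"] <= 12):   # illustrative range
            issues.append((rec["patient_id"], "mipi_score out of range"))
    return issues

print(find_violations(records))  # -> [('P02', 'missing mipi_score')]
```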

    Geração de esqueletos para sistemas de ETL a partir de redes de Petri colorida

    Coloured Petri Nets are a graphical language with a well-defined semantics that allows the design, specification, simulation, and validation of systems whose processes require specific characteristics of communication, concurrency, and synchronization. At the application level, Coloured Petri Nets are used in widely different areas, such as the specification of communication protocols, control systems, hardware systems, or software systems. Due to their characteristics, Coloured Petri Nets were also adopted for modeling ETL (Extract-Transformation-Load) systems. Meta-tasks such as Change Data Capture or Surrogate Key Pipelining, frequently found in conventional ETL systems, have been modeled and validated using Coloured Petri Nets. This supports, quite effectively, the main purpose of this dissertation: to develop and implement a system for generating skeletons for ETL systems from the corresponding Coloured Petri Net.
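The skeleton-generation idea can be sketched as a traversal of the net that emits one stub per transition; the net structure below is a simplified stand-in for a real Coloured Petri Net model, not the dissertation's actual implementation:

```python
cpn = {  # simplified stand-in for a CPN: just the transition names
    "transitions": ["change_data_capture", "surrogate_key_pipelining", "load"],
}

def generate_skeleton(cpn):
    """Emit one Python stub per CPN transition, to be filled in by hand."""
    lines = []
    for name in cpn["transitions"]:
        lines += [f"def {name}(tokens):",
                  f"    # TODO: implement the '{name}' meta-task",
                  "    return tokens",
                  ""]
    return "\n".join(lines)

print(generate_skeleton(cpn))
```

Each stub corresponds to a meta-task of the net, so the generated skeleton mirrors the validated model's structure while leaving the task bodies to the developer.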

    Descubrimiento de conocimientos en la base de datos académica de la Universidad Autónoma de Manizales aplicando redes neuronales

    Context: Higher education in Colombia is a right of all citizens, and the Ministerio de Educación Nacional is responsible for guaranteeing it. However, multiple problems make this right hard to realize in practice. Beyond the education system's own problems, such as low quality, lack of relevance, and low coverage rates, there are others, such as student dropout and lack of vocation, generated both by factors within the higher education system and by external factors related to the students and their social environment. Objective: This project aims to generate useful knowledge for finding possible causes of the student dropout problem at the Universidad Autónoma de Manizales, starting from the large amounts of academic information generated by the university's transactional systems. Methodology: The first phase of the project reviews previous research on the problem of academic dropout and other problems associated with higher education at the national and international levels. The second phase carries out the extraction of academic information from the transactional systems of the Universidad Autónoma de Manizales. In the final phase, the information is analyzed with data mining techniques, applied according to the analysis performed and the techniques defined after the extraction process. Results: This project intends to produce a consolidated and normalized data source of academic information of the Universidad Autónoma de Manizales, usable both during this project and in future data mining and business intelligence projects; a data mining framework with a basic implementation for this project but extensible to a wide variety of new problems and techniques; and, finally, a set of conclusions about the dropout problem drawn from the academic information and the data mining techniques applied.
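A deliberately tiny sketch of the kind of model such a project might apply is a single-neuron (logistic regression) classifier; the student features, data, and learning settings below are invented for illustration, not the project's actual dataset or network:

```python
import math
import random

random.seed(0)
# features: [avg_grade, attendance_rate]; label: 1 = dropped out (synthetic)
data = [([2.0, 0.4], 1), ([4.5, 0.9], 0), ([2.5, 0.5], 1), ([4.0, 0.8], 0)]

w, b, lr = [0.0, 0.0], 0.0, 0.5

def predict(x):
    """Sigmoid of the weighted sum: probability of dropout."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 / (1 + math.exp(-z))

# plain gradient descent on the log loss, one sample at a time
for _ in range(2000):
    for x, y in data:
        err = predict(x) - y
        for i in range(len(w)):
            w[i] -= lr * err * x[i]
        b -= lr * err

print(round(predict([2.2, 0.45])))  # at-risk profile -> 1
```

A real neural-network study would use a multi-layer model and far richer academic features, but the training loop above shows the core mechanism.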