38 research outputs found
Proceedings of the Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015) Krakow, Poland
Proceedings of: Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015). Krakow (Poland), September 10-11, 2015
Ontology based data warehousing for mining of heterogeneous and multidimensional data sources
Heterogeneous and multidimensional big-data sources are virtually prevalent in all business environments. System and data analysts are unable to fast-track and access big-data sources. A robust and versatile data warehousing system is developed, integrating domain ontologies from multidimensional data sources. For example, petroleum digital ecosystems and digital oil field solutions, derived from big-data petroleum (information) systems, are in increasing demand in multibillion dollar resource businesses worldwide. This work is recognized by Industrial Electronic Society of IEEE and appeared in more than 50 international conference proceedings and journals
30th International Conference on Condition Monitoring and Diagnostic Engineering Management (COMADEM 2017)
Proceedings of COMADEM 201
Active provenance for data intensive research
The role of provenance information in data-intensive research is a significant topic of
discussion among technical experts and scientists. Typical use cases addressing traceability,
versioning and reproducibility of the research findings are extended with more
interactive scenarios in support, for instance, of computational steering and results
management. In this thesis we investigate the impact that lineage records can have on
the early phases of the analysis, for instance performed through near-real-time systems
and Virtual Research Environments (VREs) tailored to the requirements of a specific
community. By positioning provenance at the centre of the computational research
cycle, we highlight the importance of having mechanisms at the data-scientists’ side
that, by integrating with the abstractions offered by the processing technologies, such
as scientific workflows and data-intensive tools, facilitate the experts’ contribution to
the lineage at runtime. Ultimately, by encouraging tuning and use of provenance for
rapid feedback, the thesis aims at improving the synergy between different user groups
to increase productivity and understanding of their processes.
We present a model of provenance, called S-PROV, that uses and further extends
PROV and ProvONE. The relationships and properties characterising the workflow’s
abstractions and their concrete executions are re-elaborated to include aspects related
to delegation, distribution and steering of stateful streaming operators. The model is
supported by the Active framework for tuneable and actionable lineage ensuring the
user’s engagement by fostering rapid exploitation. Here, concepts such as provenance
types, configuration and explicit state management allow users to capture complex
provenance scenarios and activate selective controls based on domain and user-defined
metadata. We outline how the traces are recorded in a new comprehensive system,
called S-ProvFlow, enabling different classes of consumers to explore the provenance
data with services and tools for monitoring, in-depth validation and comprehensive
visual-analytics. The work of this thesis will be discussed in the context of an existing
computational framework and the experience matured in implementing provenance-aware
tools for seismology and climate VREs. It will continue to evolve through
newly funded projects, thereby providing generic and user-centred solutions for data-intensive
research
Aplicaciones e innovación de la ingenierÃa en ciencia y tecnologÃa
El mundo ha avanzado con la llegada de la ciencia y tecnologÃa desde los diversos campos que la conforman con una visión de
innovación involucrando a la sociedad y asà satisfacer las necesidades que se han convertido en una problemática para el campo cientÃfico.
El camino para llegar a un concepto de ciudades inteligentes, por ejemplo, puede conjugar varias aristas que dan cuenta de
un aporte de diversas competencias y destrezas por parte de la comunidad cientÃfica. De esta manera, podemos encontrar
aportes en redes eléctricas inteligentes, servicios de comunicación masiva, aprovechamiento de los recursos hÃdricos, análisis
de ondas sÃsmicas, manejo de datos en la nube o la interpretación de imagen para aplicaciones médicas, cumpliendo asÃ
una vasta demanda de oportunidades para la generación de nuevo conocimiento que aplica la ciencia y tecnologÃa en favor
de la sociedad.
Este libro es una recopilación de artÃculos cientÃficos del área de Ciencia y TecnologÃa de la Universidad Politécnica Salesiana,
trabajo presentado desde las ingenierÃas: Civil, Electricidad, Electrónica y Automatización, Computación, Telecomunicaciones
y Mecatrónica
Scheduling for Large Scale Distributed Computing Systems: Approaches and Performance Evaluation Issues
Although our everyday life and society now depends heavily oncommunication infrastructures and computation infrastructures,scientists and engineers have always been among the main consumers ofcomputing power. This document provides a coherent overview of theresearch I have conducted in the last 15 years and which targets themanagement and performance evaluation of large scale distributedcomputing infrastructures such as clusters, grids, desktop grids,volunteer computing platforms, ... when used for scientific computing.In the first part of this document, I present how I have addressedscheduling problems arising on distributed platforms (like computinggrids) with a particular emphasis on heterogeneity and multi-userissues, hence in connection with game theory. Most of these problemsare relaxed from a classical combinatorial optimization formulationinto a continuous form, which allows to easily account for keyplatform characteristics such as heterogeneity or complex topologywhile providing efficient practical and distributed solutions.The second part presents my main contributions to the SimGrid project,which is a simulation toolkit for building simulators of distributedapplications (originally designed for scheduling algorithm evaluationpurposes). It comprises a unified presentation of how the questions ofvalidation and scalability have been addressed in SimGrid as well asthoughts on specific challenges related to methodological aspects andto the application of SimGrid to the HPC context
Shortest Route at Dynamic Location with Node Combination-Dijkstra Algorithm
Abstract— Online transportation has become a basic
requirement of the general public in support of all activities to go
to work, school or vacation to the sights. Public transportation
services compete to provide the best service so that consumers
feel comfortable using the services offered, so that all activities
are noticed, one of them is the search for the shortest route in
picking the buyer or delivering to the destination. Node
Combination method can minimize memory usage and this
methode is more optimal when compared to A* and Ant Colony
in the shortest route search like Dijkstra algorithm, but can’t
store the history node that has been passed. Therefore, using
node combination algorithm is very good in searching the
shortest distance is not the shortest route. This paper is
structured to modify the node combination algorithm to solve the
problem of finding the shortest route at the dynamic location
obtained from the transport fleet by displaying the nodes that
have the shortest distance and will be implemented in the
geographic information system in the form of map to facilitate
the use of the system.
Keywords— Shortest Path, Algorithm Dijkstra, Node
Combination, Dynamic Location (key words