650 research outputs found

    The impact of spatial data redundancy on SOLAP query performance

    Geographic Data Warehouses (GDW) are one of the main technologies used in decision-making and spatial analysis, and the literature proposes several conceptual and logical data models for them. However, little effort has been devoted to studying how spatial data redundancy affects SOLAP (Spatial On-Line Analytical Processing) query performance over a GDW. In this paper, we investigate this issue. First, we compare redundant and non-redundant GDW schemas and conclude that redundancy causes severe performance losses. We then analyze indexing as a means of improving SOLAP query performance on a redundant GDW. Comparisons of the SB-index approach, the star-join aided by an R-tree, and the star-join aided by GiST show that the SB-index reduces query elapsed time by 25% to 99% for SOLAP queries defined over the spatial predicates of intersection, enclosure and containment and applied to roll-up and drill-down operations. We also investigate the impact of increased data volume on performance: the increase did not impair the SB-index, which still greatly reduced query elapsed time. Performance tests also show that the SB-index is far more compact than the star-join, requiring at most 0.20% of its volume. Moreover, we propose a specific enhancement of the SB-index to deal with spatial data redundancy; this enhancement improved performance by 80% to 91% on redundant GDW schemas. Funding: FAPESP, CNPq, Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior (CAPES), INEP, FINE.
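The core idea of the SB-index described above can be sketched in a few lines: instead of joining the fact table with stored geometries, a compact, sequentially scanned list of (surrogate key, minimum bounding rectangle) entries answers the spatial predicate first, and only the surviving keys filter the fact table. All names and data below are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of the SB-index idea: a compact sequential index of
# (surrogate key, MBR) entries replaces a spatial join when answering a
# SOLAP intersection predicate. Data is invented for illustration.

def mbr_intersects(a, b):
    """True if two MBRs (xmin, ymin, xmax, ymax) overlap."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2

# Index: one entry per spatial dimension member.
sb_index = [
    (1, (0, 0, 2, 2)),   # city 1
    (2, (5, 5, 8, 8)),   # city 2
    (3, (1, 1, 3, 3)),   # city 3
]

# Fact rows reference the spatial key; no geometry is stored here.
facts = [
    {"city_key": 1, "sales": 100},
    {"city_key": 2, "sales": 250},
    {"city_key": 3, "sales": 40},
]

def solap_filter(query_window):
    """Sequentially scan the index, collect keys whose MBR satisfies the
    intersection predicate, then filter the fact table by key."""
    keys = {k for k, mbr in sb_index if mbr_intersects(mbr, query_window)}
    return [row for row in facts if row["city_key"] in keys]

result = solap_filter((0, 0, 4, 4))
total = sum(r["sales"] for r in result)  # cities 1 and 3 qualify -> 140
```

Because the index stores only keys and MBRs, its footprint stays a small fraction of a full star-join structure, which is consistent with the compactness result reported above.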

    A Quantitative Framework for Assessing Vulnerability and Redundancy of Freight Transportation Networks

    Freight transportation networks are an important component of everyday life in modern society. Disruption to these networks can make people's daily lives extremely difficult as well as seriously cripple economic productivity. This dissertation develops a quantitative framework for assessing the vulnerability and redundancy of freight transportation networks. The framework consists of three major contributions: (1) a two-stage approach for estimating a statewide truck origin-destination (O-D) trip table, (2) a decision support tool for assessing the vulnerability of freight transportation networks, and (3) a quantitative approach for measuring the redundancy of freight transportation networks. The dissertation first proposes a two-stage approach to estimate a statewide truck O-D trip table: the first stage estimates a commodity-based truck O-D trip table using the commodity flows derived from the Freight Analysis Framework (FAF) database, and the second stage uses the path flow estimator (PFE) concept to refine that table with truck counts from the statewide truck count program. The model allows great flexibility in incorporating data at different spatial levels. The results from the second stage provide a better understanding of truck flows on statewide truck routes and corridors, and allow the anticipated impacts of network disruptions to be better managed. A decision support tool is developed to facilitate decision making through its database management capabilities, graphical user interface, GIS-based visualization, and transportation network vulnerability analysis. The vulnerability assessment focuses on evaluating statewide truck-freight bottlenecks/chokepoints.
The dissertation proposes two quantitative measures: O-D connectivity (or detour route) in terms of distance, and freight flow pattern change in terms of vehicle miles traveled (VMT). The case study adopts a "what-if" analysis approach by generating disruption scenarios for the structurally deficient bridges in Utah under earthquakes. In addition, the potential impacts of disruptions to multiple bridges in both rural and urban areas are evaluated and compared to single-bridge failure scenarios. The dissertation also proposes an approach to measure the redundancy of freight transportation networks along two main dimensions: route diversity and network spare capacity. The route diversity dimension evaluates the existence of multiple efficient routes available to users, or the degree of connection between a specific O-D pair. The network spare capacity dimension quantifies the network-wide spare capacity with explicit consideration of congestion effects. The two dimensions complement each other by providing a two-dimensional characterization of freight transportation network redundancy. Case studies of the Utah statewide transportation network and a coal multimodal network demonstrate the features of the vulnerability and redundancy measures and the applicability of the quantitative assessment methodology.
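The route-diversity dimension described above can be illustrated with a toy computation: count the simple routes between an O-D pair whose length falls within a tolerance of the shortest-path distance. The graph, weights, and 1.2 tolerance below are invented assumptions, not the dissertation's data.

```python
# Illustrative sketch of a route-diversity measure: the number of simple
# O-D routes no longer than (tolerance * shortest distance).
import heapq

graph = {  # node -> {neighbor: distance}
    "O": {"A": 2, "B": 3},
    "A": {"D": 3},
    "B": {"D": 2},
    "D": {},
}

def shortest(graph, src, dst):
    """Plain Dijkstra shortest-path distance."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return d
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return float("inf")

def route_diversity(graph, src, dst, tolerance=1.2):
    """Count simple paths whose length is within tolerance of the optimum."""
    limit = tolerance * shortest(graph, src, dst)
    count = 0
    def dfs(u, length, visited):
        nonlocal count
        if length > limit:
            return
        if u == dst:
            count += 1
            return
        for v, w in graph[u].items():
            if v not in visited:
                dfs(v, length + w, visited | {v})
    dfs(src, 0, {"O"} | {src} - {"O"} | {src})
    return count

# O->A->D = 5 and O->B->D = 5; both lie within 1.2 * 5, so diversity is 2.
diversity = route_diversity(graph, "O", "D")
```

A network with diversity 1 for a critical O-D pair would be flagged as low-redundancy even if its spare capacity is high, which is why the two dimensions are reported together.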

    Wireless sensor data processing for on-site emergency response

    This thesis is concerned with the problem of processing data from Wireless Sensor Networks (WSNs) to meet the requirements of emergency responders (e.g. Fire and Rescue Services). A WSN typically consists of spatially distributed sensor nodes that cooperatively monitor physical or environmental conditions. Sensor data about these conditions can then be used as part of the input to predict, detect, and monitor emergencies. Although WSNs have demonstrated great potential for facilitating Emergency Response, sensor data cannot be interpreted directly due to its large volume, noise, and redundancy. In addition, emergency responders are not interested in raw data; they are interested in the meaning it conveys. This thesis presents research on processing and combining data from multiple types of sensors, and on combining sensor data with other relevant data, for the purpose of obtaining data of greater quality and information of greater relevance to emergency responders. The current theory and practice in Emergency Response and the existing technology aids were reviewed to identify the requirements from both application and technology perspectives (Chapter 2). The process of information extraction from sensor data and sensor data fusion techniques were reviewed to identify what constitutes suitable sensor data fusion techniques and the challenges presented in sensor data processing (Chapter 3). A study of Incident Commanders' requirements utilised a goal-driven task analysis method to identify gaps in the current means of obtaining relevant information during response to fire emergencies, and a list of opportunities for WSN technology to fill those gaps (Chapter 4). A high-level Emergency Information Management System Architecture was proposed, including the main components needed, the interaction between components, and the system function specification at different incident stages (Chapter 5).
A set of state-awareness rules was proposed and integrated with a Kalman filter to improve filtering performance. The proposed data pre-processing approach achieved both improved outlier removal and quick detection of real events (Chapter 6). A data storage mechanism was proposed to support timely response to queries regardless of the increase in data volume (Chapter 7). What can be considered as "meaning" (e.g. events) for emergency responders was identified, and a generic emergency event detection model was proposed to identify patterns present in sensor data and associate those patterns with events (Chapter 8). In conclusion, the added benefits that the technical work can provide to current Emergency Response are discussed, and specific contributions and future work are highlighted (Chapter 9).
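The pre-processing idea above — a Kalman filter whose innovation is gated by state-awareness rules so that isolated spikes are rejected while a persistent jump is accepted as a real event — can be sketched as follows. The thresholds and the two-strike rule are illustrative assumptions, not the thesis's exact rules.

```python
# A minimal 1-D Kalman filter with innovation gating: a single out-of-gate
# reading is treated as an outlier and dropped; two in a row are treated
# as a real event, and the filter re-initialises on the new level.

def filter_series(measurements, q=0.01, r=1.0, gate=3.0):
    x, p = measurements[0], 1.0   # state estimate and its variance
    suspect = False               # one prior out-of-gate reading seen?
    out = []
    for z in measurements:
        p += q                                    # predict step
        innovation = z - x
        if abs(innovation) > gate * (p + r) ** 0.5:
            if not suspect:                       # first outlier: reject
                suspect = True
                out.append(x)
                continue
            x, p = z, 1.0                         # second in a row: event
        else:
            suspect = False
        k = p / (p + r)                           # update step
        x += k * (z - x)
        p *= (1 - k)
        out.append(x)
    return out

# A lone spike at index 2 is rejected; the sustained rise at the end
# (e.g. a temperature sensor near a fire) is accepted as a real event.
readings = [20.0, 20.1, 35.0, 20.2, 20.1, 34.0, 34.5, 34.2]
smoothed = filter_series(readings)
```

The gate width is expressed in units of the innovation standard deviation, so the same rule adapts as the filter's own uncertainty grows or shrinks.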

    A cooperative framework for molecular biology database integration using image object selection.

    The theme and concept of 'Molecular Biology Database Integration', and the problems associated with it, initiated the idea for this Ph.D research. The available technologies make it possible to analyse data independently and discretely, but they fail to integrate the data resources into more meaningful information. This, along with the integration issues, created the scope for this Ph.D research. The research has reviewed the 'database interoperability' problems and suggested a framework for integrating molecular biology databases. The framework proposes a cooperative environment in which molecular biology databases share information on the basis of common purpose. The research has also reviewed other implementation and interoperability issues for laboratory-based, dedicated and target-specific databases. The research addresses the following issues: the diversity of molecular biology database schemas, schema constructs and schema implementations; multi-database query using image object keying; database integration technologies using a context graph; and automated navigation among these databases. This thesis introduces a new approach for database implementation: an interoperable component database concept to initiate multidatabase queries on gene mutation data. A number of data models are proposed for gene mutation data, which form the basis for integrating the target-specific component database with the federated information system. The proposed data models cover genetic trait analysis, classification of gene mutation data, pathological lesion data, and laboratory data. The main feature of this component database is non-overlapping attributes, and it follows a non-redundant integration approach as explained in the thesis.
This is achieved by storing only attributes that have no union or intersection with attributes existing in public domain molecular biology databases. Unlike the data warehousing technique, this feature is quite unique and novel. The component database is integrated with other biological data sources for sharing information in a cooperative environment. This involves developing new tools, and the thesis explains the role of these tools: a metadata extractor, a mapping linker, a query generator and a result interpreter. They are used for transparent integration without creating any global schema of the participating databases. The thesis also establishes the concept of image object keying for multidatabase query and proposes a relevant algorithm for matching protein spots in gel electrophoresis images. An object spot in a gel electrophoresis image initiates the query when it is selected by the user; the system then matches the selected spot with similar spots in other resource databases. This image object keying method is an alternative to conventional multidatabase query, which requires writing complex SQL scripts, and it also resolves the semantic conflicts that exist among molecular biology databases. The research proposes a new framework based on the context of web data for interactions with different biological data resources. A formal description of the resource context is given in the thesis. Implementing the context in the Resource Description Framework (RDF) increases interoperability by providing descriptions of the resources and a navigation plan for accessing the web-based databases. Higher-level constructs (has, provide and access) are developed to implement the context in RDF for web interactions. Interactions within the resources are achieved by utilising an integration domain to extract the required information in a single instance and without writing any query scripts.
The integration domain allows the user to navigate and execute the query plan within the resource databases. An extractor module collects elements from different target webs and unifies them into a whole object in a single page. The proposed framework is tested by finding specific information, e.g. information on Alzheimer's disease, from public domain biology resources such as the Protein Data Bank, the Genome Data Bank, Online Mendelian Inheritance in Man, and a local database. Finally, the thesis proposes further propositions and plans for future work.
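The non-redundant integration rule described above — the component database stores only attributes absent from every federated public-domain schema — reduces to simple set operations. The schema contents below are invented for illustration; they are not the actual PDB or GDB schemas.

```python
# A small sketch of the non-overlapping-attributes rule: before admitting
# an attribute into the component database, verify it appears in no
# public-domain schema already federated. Schemas are illustrative.

public_schemas = {
    "PDB": {"structure_id", "resolution", "sequence"},
    "GDB": {"gene_symbol", "locus", "sequence"},
}

def admissible(candidate_attrs):
    """Return the candidate attributes that overlap no public schema,
    i.e. the set the component database may store non-redundantly."""
    taken = set().union(*public_schemas.values())
    return candidate_attrs - taken

# "sequence" already exists publicly, so only the two new attributes remain.
component = admissible({"lesion_grade", "trait_score", "sequence"})
```

Because admission is decided attribute by attribute, the component schema stays disjoint from the federation by construction, which is what makes global-schema-free integration via the mapping linker possible.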

    Multimodal and multidimensional geodata interaction and visualization

    This PhD thesis proposes the development of a Science Data Visualization System (SdVS) that analyzes and presents different techniques for visualizing and interacting with Geo-data, in order to convey knowledge about Geo-data using Google Earth. We then apply archaeological data as a case study and, as a result, develop the Archaeological Visualization System (ArVS), using new visualization paradigms and Human-Computer Interaction techniques based on SdVS. Furthermore, SdVS provides guidelines for developing other visualization and interaction applications in the future, and shows how users can employ the system to enhance the understanding and dissemination of knowledge.
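Handing Geo-data to Google Earth, as described above, is typically done by emitting KML. The sketch below generates a minimal KML placemark for one site; the site name and coordinates are invented, and SdVS's actual output format is not specified in this abstract.

```python
# A hedged sketch of a Geo-data hand-off to Google Earth: build a minimal
# KML document containing a single placemark. Names and coordinates are
# illustrative assumptions.

def placemark_kml(name, lon, lat, description=""):
    """Return a minimal KML 2.2 document with one point placemark.
    KML expects longitude first, then latitude."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "  <Placemark>\n"
        f"    <name>{name}</name>\n"
        f"    <description>{description}</description>\n"
        f"    <Point><coordinates>{lon},{lat},0</coordinates></Point>\n"
        "  </Placemark>\n"
        "</kml>\n"
    )

doc = placemark_kml("Excavation site A", -3.7038, 40.4168, "trench 3 finds")
```

A file saved with this content and a `.kml` extension can be opened directly in Google Earth, which is the interaction channel the system above builds on.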

    Disaster Relief Network Design: Investigating the Effects of Physical Barriers and Information Sharing

    Planning, organizing, and managing logistics activities by humanitarian organizations before and after a disaster, such as a flood, plays an important role in minimizing public suffering. This thesis investigates two crucial issues that define disaster relief network designs: the presence of physical barriers, such as flooded regions of differing impact, and the effect of information sharing or the lack of it. Natural and/or man-made disasters commonly cause major disruptions to critical infrastructure, and the availability and proper dissemination of information among key players enables efficient operations, which in turn minimize suffering. The integrated model analyzes six barrier/information-sharing scenarios using modern decision support tools, such as geographic information systems and optimization tools. Montreal districts' populations and road network map are used for the investigation. First, demand is forecasted based on flood damage estimates; central warehouses are then located, regional warehouses are allocated, and finally routing solutions are computed. Both the location-allocation and routing integrated models take capacity into consideration. The findings show that the lack of information sharing and the presence of barriers increase travel distance compared with full information disclosure and no barriers. The total distance traveled in the presence of scaled-cost barriers was greater than with forbidden-zone barriers or no barriers.
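The barrier scenarios compared above can be sketched with one shortest-path routine run three ways: with no barriers, with a scaled-cost barrier that multiplies edge costs inside the flooded zone, and with a forbidden zone that removes those edges entirely. The network and scaling factor are toy assumptions, not the Montreal data used in the thesis.

```python
# Toy comparison of barrier treatments in relief routing: scale flooded
# edges' costs, or forbid them outright, and observe the travel distance
# from a warehouse W to a demand point D.
import heapq

edges = [  # (u, v, km, in_flood_zone)
    ("W", "A", 4, True),   # direct corridor through the flooded district
    ("A", "D", 4, True),
    ("W", "B", 6, False),  # longer ring road around it
    ("B", "D", 7, False),
]

def shortest(src, dst, flood_factor=None):
    """Dijkstra distance. flood_factor=None ignores flooding; a number
    scales flooded edges; float('inf') effectively forbids them."""
    adj = {}
    for u, v, km, flooded in edges:
        cost = km * (flood_factor if (flooded and flood_factor) else 1)
        adj.setdefault(u, []).append((v, cost))
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            return d
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return float("inf")

no_barrier = shortest("W", "D")                             # 8, corridor
scaled     = shortest("W", "D", flood_factor=3)             # 13, ring road
forbidden  = shortest("W", "D", flood_factor=float("inf"))  # 13, ring road
```

Even in this toy network, both barrier treatments push traffic onto the detour and raise travel distance over the no-barrier baseline, mirroring the scenario comparison reported in the thesis.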

    Using data analysis and Information visualization techniques to support the effective analysis of large financial data sets

    There have been a number of technological advances in the last ten years, which have resulted in the amount of data generated in organisations increasing by more than 200% over this period. This rapid increase means that if financial institutions are to derive significant value from this data, they need to identify new ways to analyse it effectively. Due to the considerable size of the data, financial institutions also need to consider how to visualise it effectively. Traditional tools such as relational database management systems have problems processing large amounts of data due to memory constraints, latency issues and the presence of both structured and unstructured data. The aim of this research was to use data analysis and information visualisation (IV) techniques to support the effective analysis of large financial data sets. In order to visually analyse the data effectively, the underlying data model must produce reliable results. A large financial data set was identified and used to demonstrate that IV techniques can support its effective analysis. A review of the literature on large financial data sets, visual analytics, and existing data management and data visualisation tools identified the shortcomings of existing tools, which led to the requirements for the data management tool and the IV tool. The data management tool identified was a data warehouse, and the IV toolkit identified was Tableau. The IV techniques identified included the Overview, Dashboards and Colour Blending. The IV tool was implemented and published online, and can be accessed through a web browser interface. The data warehouse and the IV tool were evaluated to determine their accuracy and effectiveness in supporting the effective analysis of the large financial data set.
The experiment used to evaluate the data warehouse yielded positive results, showing that only about 4% of the records contained incorrect data. The results of the user study were also positive, and no major usability issues were identified. The participants found the IV techniques effective for analysing the large financial data set.
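A data-quality figure like the "about 4% incorrect records" reported above typically comes from a validation pass over the warehouse: each record is checked against simple field rules and the failure rate is reported. The rules and records below are invented for illustration; the thesis's actual validation criteria are not given in this abstract.

```python
# A hedged sketch of a record-validation pass: count records that break
# any field rule and report the error rate. Data and rules are invented.

records = [
    {"account": "ACC1", "amount": 120.0, "currency": "ZAR"},
    {"account": "ACC2", "amount": -5.0,  "currency": "ZAR"},  # negative amount
    {"account": "",     "amount": 80.0,  "currency": "ZAR"},  # missing account
    {"account": "ACC4", "amount": 60.0,  "currency": "ZAR"},
]

def is_valid(rec):
    """A record passes if it has an account, a non-negative amount,
    and the expected currency code."""
    return (bool(rec["account"])
            and rec["amount"] >= 0
            and rec["currency"] == "ZAR")

bad = sum(1 for r in records if not is_valid(r))
error_rate = bad / len(records)   # 2 of 4 toy records fail -> 0.5 here
```

Run over the full warehouse instead of four toy rows, the same counter yields the percentage of incorrect records that the evaluation above reports.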