Growth of relational model: Interdependence and complementary to big data
A database management system is an enduring application of computer science that provides a platform for the creation, movement, and use of voluminous data. The area has witnessed a series of developments and technological advancements, from the conventional structured database to the recent buzzword, big data. This paper aims to provide a complete model of the relational database, which is still widely used because of its well-known ACID properties, namely atomicity, consistency, isolation, and durability. Specifically, the objective of this paper is to highlight the adoption of relational model approaches by big data techniques. To address the reasons for this incorporation, this paper qualitatively studies the advancements made over time to the relational data model. First, the variations in data storage layout are illustrated based on the needs of the application. Second, quick data retrieval techniques such as indexing, query processing, and concurrency control methods are reviewed. The paper provides vital insights for appraising the efficiency of the structured database in the unstructured environment, particularly when both consistency and scalability become an issue in the working of a hybrid transactional and analytical database management system.
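The ACID guarantees referenced above can be illustrated with a minimal, hypothetical example that is not drawn from the paper: using Python's built-in sqlite3 module, a two-step transfer either commits in full or rolls back entirely, demonstrating atomicity and consistency.

```python
import sqlite3

# Illustrative sketch: a two-step transfer that must be atomic.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL CHECK (balance >= 0))")
con.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 20)")
con.commit()

def transfer(con, src, dst, amount):
    """Move `amount` from src to dst; roll back both updates if either fails."""
    try:
        with con:  # the connection context manager commits on success, rolls back on error
            con.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
            con.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
    except sqlite3.IntegrityError:
        print("transfer rejected, no partial update persisted")

transfer(con, "alice", "bob", 150)   # violates the CHECK constraint, so it is rolled back
print(con.execute("SELECT name, balance FROM accounts ORDER BY name").fetchall())
# [('alice', 100), ('bob', 20)]  -- balances unchanged, the failed transfer left no trace
```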
Handbook of Computational Intelligence in Manufacturing and Production Management
Artificial intelligence (AI) is simply a way of enabling a computer or machine to think intelligently like human beings. Since human intelligence is a complex abstraction, scientists have only recently begun to understand it, making certain assumptions about how people think and applying these assumptions to the design of AI programs. It is a vast, knowledge-based discipline that covers reasoning, machine learning, planning, intelligent search, and perception building. Traditional AI had limitations in meeting the increasing demands of search, optimization, and machine learning in the areas of large biological and commercial database information systems and the management of factory automation for industries such as power, automobile, aerospace, and chemical plants. The drawbacks of classical AI became more pronounced with the successive failures of the decade-long Japanese project on fifth-generation computing machines. The limitations of traditional AI gave rise to the development of new computational methods for various engineering and management problems. As a result, these computational techniques emerged as a new discipline called computational intelligence (CI).
Arc Marine as a spatial data infrastructure: a marine data model case study in whale tracking by satellite telemetry
The Arc Marine data model is a generalized template to guide the implementation of geographic information systems (GIS) projects in the marine environment. Arc Marine developed out of a collaborative process involving research and industry stakeholders in coastal and marine research. This template models and attempts to standardize common marine data types to facilitate data sharing and analytical tool development. The next step in the development of Arc Marine is adaptation to the problems of specific research communities, and specific programs, under the broad umbrella of coastal and marine research, through community-specific customization of Arc Marine. In this study, Arc Marine was customized from its core model to fit the research goals of the whale satellite telemetry tagging program of the Oregon State University Marine Mammal Institute (MMI). This customization serves as a case study of the ability of Arc Marine to achieve its six primary objectives in the context of the marine animal tracking community. These objectives are: 1) to create a common model for assembling, managing, and publishing tracking data sets; 2) to produce, share, and exchange these tracking data in a similar format and standard structure; 3) to provide a unified approach for community software developers extending the capabilities of ArcGIS; 4) to extend the power of marine geospatial analysis through a framework for incorporating object-oriented behaviors and for dealing with scale dependencies; 5) to provide a mechanism for the implementation of data content standards; and 6) to aid researchers in a fuller understanding of object-oriented GISs and the power of advanced spatial data structures. The primary question examined in this thesis is: How can the Arc Marine data model be customized to best meet the research objectives of the OSU MMI and the marine mammal tracking community, in order to explore the relationship of the distribution and movement of endangered marine mammal species to underlying physical and biological oceanographic processes? The MMI customization of Arc Marine is focused on the use of Argos satellite telemetry tagging. The customized database schema was described in the Unified Modeling Language (UML) by modification of the core Arc Marine data model in Microsoft Visio 2003 and implemented as an ArcGIS 9.2 geodatabase (personal, file, and ArcSDE). Tool development and scripting were carried out predominantly in Python 2.4. The two major schema modifications of the MMI customization were the implementation of the Animal and AnimalEvent object classes. The Animal class is a subclass of Vehicle and models the tagged animal as a tracked instrument platform carrying an array of sensors to measure its environment. The AnimalEvent class represents interactions in time between the Animal and an open-ended range of event types, including field observations, tagging, sensor measurements, and satellite geolocating. A programming interface is described for AnimalEvent (AnimalEventUI) and for the InstantaneousPoint feature class (InstantaneousPointUI) that represents observed animal locations. Further customization came through the development of a comprehensive development framework for animal tracking in Arc Marine. This framework implements front-end analysis tools through Python scripting, ArcGIS extensions, or standalone applications developed in VB.NET. Back-end database loading is implemented in Python through the ArcGIS geoprocessing object and the DB-API 2.0 database abstraction layer.
Through a description of the multidimensional data cube model of Arc Marine, Arc Marine and the MMI customization are demonstrated to be foundation schemas for a relational database management system (RDBMS), an object-relational database management system (ORDBMS), or an enterprise spatial data warehouse. This modeling method shows that Arc Marine is built upon atomic measures (scalar quantities, vector quantities, points, lines, and polygons) that are described by related dimensional tables (such as time, data parameters, tagged animal, or species) and concept hierarchies of different levels of generalization (for example, tag < animal < social group < population < species). This data cube structure further shows that Arc Marine is an appropriate target schema for the application of on-line analytical processing (OLAP) tools, data mining, and spatial data mining to satellite telemetry tracking datasets. In this customization case study, Arc Marine partially meets each of its six major goals. In particular, the development of the MMI application development platform demonstrates full implementation of a unified approach for community software developers. Meanwhile, the data cube model of Arc Marine for OLAP demonstrates a successful extension of marine geospatial analysis to deal more effectively with scale dependencies and a mechanism for expanding researchers' understanding of high-power analytical methods.
Keywords: geographic information systems, data model, cetacean, satellite telemetry, Argos, GIS, Arc Marine, on-line analytical processing
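As a rough illustration of the schema relationships described above, and not the thesis's actual geodatabase definition, the following Python sketch models the Animal and AnimalEvent classes; only the class names and their roles are taken from the abstract, and every field name here is a hypothetical stand-in.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class Vehicle:
    """Generic tracked instrument platform (core Arc Marine concept)."""
    platform_id: int
    sensors: List[str] = field(default_factory=list)

@dataclass
class Animal(Vehicle):
    """A tagged animal modeled as a tracked platform carrying sensors (MMI customization)."""
    species: str = ""
    tag_id: str = ""

@dataclass
class AnimalEvent:
    """A time-stamped interaction with an Animal: tagging, field observation,
    sensor measurement, or satellite geolocation (the event type is open-ended)."""
    animal: Animal
    event_type: str               # e.g. "tagging", "observation", "geolocation"
    timestamp: datetime
    lon: Optional[float] = None   # populated for geolocation events (InstantaneousPoint analogue)
    lat: Optional[float] = None

# Hypothetical usage: one tagged whale and two events in its track.
whale = Animal(platform_id=1, sensors=["depth", "temperature"],
               species="Balaenoptera musculus", tag_id="A-831")
events = [
    AnimalEvent(whale, "tagging", datetime(2007, 8, 1, 9, 30)),
    AnimalEvent(whale, "geolocation", datetime(2007, 8, 2, 14, 5), lon=-124.3, lat=44.6),
]
```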
Integrating analytics with relational databases
The database research community has made tremendous strides in developing powerful database engines that allow for efficient analytical query processing. However, these powerful systems have gone largely unused by analysts and data scientists. This poor adoption is caused primarily by the state of database-client integration. In this thesis we attempt to overcome this challenge by investigating how we can facilitate efficient and painless integration of analytical tools and relational database management systems. We focus our investigation on the three primary methods for database-client integration: client-server connections, in-database processing, and embedding the database inside the client application.
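The third integration method, embedding the database inside the client application, can be sketched generically; the snippet below is not taken from the thesis and simply uses sqlite3 and pandas to show a query executing in-process, with results landing directly in the analysis tool without a client-server round trip.

```python
import sqlite3
import pandas as pd

# Sketch of the "embedded database" integration method: the engine runs inside the
# client process, so query results move into the analysis tool without socket
# transfer or client-server serialization. sqlite3 stands in for any embeddable
# engine; the thesis itself is not tied to this particular library.
con = sqlite3.connect(":memory:")

# Load an analysis dataset straight from a DataFrame into the embedded engine.
sales = pd.DataFrame({
    "region": ["north", "north", "south", "south"],
    "amount": [120.0, 80.0, 200.0, 50.0],
})
sales.to_sql("sales", con, index=False)

# Run an analytical query in-process and get the result back as a DataFrame.
result = pd.read_sql_query(
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region",
    con,
)
print(result)
```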
Weiterentwicklung analytischer Datenbanksysteme (Advancement of Analytical Database Systems)
This thesis contributes to the state of the art in analytical database systems. First, we identify and explore extensions to better support analytics on event streams. Second, we propose a novel polygon index to enable efficient geospatial data processing in main memory. Third, we contribute a new deep learning approach to cardinality estimation, which is the core problem in cost-based query optimization.
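The polygon index itself is not specified in this abstract; as a minimal sketch of the geometric primitive such a main-memory index ultimately accelerates, here is a standard ray-casting point-in-polygon test in Python (an index would avoid running this test against every stored polygon).

```python
def point_in_polygon(x, y, polygon):
    """Standard ray-casting test: count crossings of a ray from (x, y) going right.
    `polygon` is a list of (x, y) vertices in order; an odd crossing count means inside."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does the edge (x1, y1)-(x2, y2) cross the horizontal line at y, to the right of x?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside

# Unit square, tested with one inside and one outside point.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(point_in_polygon(0.5, 0.5, square))  # True
print(point_in_polygon(1.5, 0.5, square))  # False
```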
Just-in-time Analytics Over Heterogeneous Data and Hardware
Industry and academia are continuously becoming more data-driven and data-intensive, relying on the analysis of a wide variety of datasets to gain insights. At the same time, data variety increases continuously across multiple axes. First, data comes in multiple formats, such as the binary tabular data of a DBMS, raw textual files, and domain-specific formats. Second, different datasets follow different data models, such as the relational and the hierarchical one. Data location also varies: Some datasets reside in a central "data lake", whereas others lie in remote data sources. In addition, users execute widely different analysis tasks over all these data types. Finally, the process of gathering and integrating diverse datasets introduces several inconsistencies and redundancies in the data, such as duplicate entries for the same real-world concept. In summary, heterogeneity significantly affects the way data analysis is performed. In this thesis, we aim for data virtualization: Abstracting data out of its original form and manipulating it regardless of the way it is stored or structured, without a performance penalty. To achieve data virtualization, we design and implement systems that i) mask heterogeneity through the use of heterogeneity-aware, high-level building blocks and ii) offer fast responses through on-demand adaptation techniques. Regarding the high-level building blocks, we use a query language and algebra to handle multiple collection types, such as relations and hierarchies, express transformations between these collection types, as well as express complex data cleaning tasks over them. In addition, we design a location-aware compiler and optimizer that masks away the complexity of accessing multiple remote data sources. Regarding on-demand adaptation, we present a design to produce a new system per query. The design uses customization mechanisms that trigger runtime code generation to mimic the system most appropriate to answer a query fast: Query operators are thus created based on the query workload and the underlying data models; the data access layer is created based on the underlying data formats. In addition, we exploit emerging hardware by customizing the system implementation based on the available heterogeneous processors (CPUs and GPGPUs). We thus pair each workload with its ideal processor type. The end result is a just-in-time database system that is specific to the query, data, workload, and hardware instance. This thesis redesigns the data management stack to natively cater for data heterogeneity and exploit hardware heterogeneity. Instead of centralizing all relevant datasets, converting them to a single representation, and loading them in a monolithic, static, suboptimal system, our design embraces heterogeneity. Overall, our design decouples the type of performed analysis from the original data layout; users can perform their analysis across data stores, data models, and data formats, but at the same time experience the performance offered by a custom system that has been built on demand to serve their specific use case.
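The on-demand adaptation described above can be reduced to a toy illustration: generate a query-specific operator at runtime instead of interpreting a generic plan. The Python sketch below is hypothetical and stands in for the thesis's engine, which generates low-level code and also targets CPUs and GPGPUs; none of that machinery is reproduced here.

```python
# Toy sketch of per-query operator specialization via runtime code generation:
# given a query's predicate and projection, emit and compile a scan function
# specialized to exactly that query. Names and structure are illustrative only.

def generate_scan(predicate_src, projected_cols):
    """Generate a scan operator specialized for one query."""
    proj = ", ".join(f"row[{c!r}]" for c in projected_cols)
    src = (
        "def scan(table):\n"
        "    out = []\n"
        "    for row in table:\n"
        f"        if {predicate_src}:\n"
        f"            out.append(({proj},))\n"
        "    return out\n"
    )
    namespace = {}
    exec(compile(src, "<generated-scan>", "exec"), namespace)  # JIT-style specialization
    return namespace["scan"]

table = [
    {"name": "a.csv", "format": "csv", "size": 120},
    {"name": "b.json", "format": "json", "size": 300},
    {"name": "c.csv", "format": "csv", "size": 512},
]

# The "query" arrives, and a scan tailored to it is generated on demand.
scan_csv_large = generate_scan("row['format'] == 'csv' and row['size'] > 200", ["name", "size"])
print(scan_csv_large(table))  # [('c.csv', 512)]
```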
An expert advisor system for college management.
This thesis has explored the economic, political, legal and technological changes that have an impact on decision support requirements in many organisations. It has looked particularly at the Public Sector and the FE Sector and has established the need for an intelligent decision support system. Critical Success Factors have been identified that have influenced the design of a specific Expert Advisor System experimental prototype, the development of which has been central to the research. A range of system development methodologies has been reviewed, and justification has been provided for the selection of the CommonKADS methodology. Technologies and techniques to be used in the development of the Expert Advisor System have also been reviewed and justification has been provided for incorporating Case-Based Reasoning and Data Warehousing. The analysis, design and development of the system have been strongly influenced by the Critical Success Factors that were identified to ensure the system met the decision support needs. The experimental prototype has been developed specifically to assist Senior Managers at an FE college with the decision making that is used to complete ISR Funding Returns. The system gives access to historic data, provides auditable data trails to substantiate decisions and facilitates what-if projections. Case-based knowledge discovery, data-based knowledge discovery, graph-based knowledge discovery and projection-based knowledge discovery have been achieved through the use of the prototype. An important part of the development process was the adaptation of cases and the adaptation of queries that extracted and aggregated data to provide system adaptation. The research has focused on addressing two research hypotheses, and evidence has been provided to show that both have been addressed. This demonstrates that (hypothesis 1) CommonKADS Models are well suited to providing a template for the design and documentation of Decision Support Systems that need to operate in rapidly changing domains. Justification has also been given to show that (hypothesis 2) CBR principles can be used together with other knowledge discovery techniques to provide useful adaptive systems. The research concludes by looking at how new technologies could be incorporated in later versions of the Expert Advisor System.
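The prototype's CBR component is not specified in the abstract; as a minimal, generic sketch of the retrieval step that case-based reasoning relies on, a weighted nearest-case lookup might look like the following in Python. All attribute names and cases below are hypothetical.

```python
# Minimal illustration of the retrieval step in case-based reasoning (CBR):
# find the stored case most similar to a new problem, then reuse its solution.

def similarity(case_a, case_b, weights):
    """Weighted similarity over numeric attributes scaled to [0, 1] (1.0 = identical)."""
    score, total = 0.0, sum(weights.values())
    for attr, weight in weights.items():
        diff = abs(case_a[attr] - case_b[attr])
        score += weight * (1.0 - min(diff, 1.0))
    return score / total

def retrieve(case_base, query, weights):
    """Return (similarity score, most similar past case)."""
    return max(((similarity(c["problem"], query, weights), c) for c in case_base),
               key=lambda pair: pair[0])

case_base = [
    {"problem": {"enrolment": 0.7, "retention": 0.6}, "solution": "increase evening provision"},
    {"problem": {"enrolment": 0.3, "retention": 0.9}, "solution": "consolidate course groups"},
]
weights = {"enrolment": 2.0, "retention": 1.0}

score, best = retrieve(case_base, {"enrolment": 0.65, "retention": 0.55}, weights)
print(round(score, 3), "->", best["solution"])  # 0.95 -> increase evening provision
```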