
    Layout Optimization for Distributed Relational Databases Using Machine Learning

    A common problem when running Web-based applications is how to scale up the database. The solution usually involves a skilled database administrator deciding how to spread the database tables across computers that work in parallel. Laying out database tables over multiple machines so that they act together as a single efficient database is hard, and automated methods are needed to eliminate the time administrators spend creating optimal configurations. We consider four operators that generate a search space of possible database layouts: 1) denormalizing, 2) horizontally partitioning, 3) vertically partitioning, and 4) fully replicating. Textbooks offer general advice that is useful for extreme cases; for instance, fully replicate a table if the ratio of inserts to selects is close to zero. But even this seemingly obvious rule does not necessarily lead to a speedup once you account for the possibility that some nodes become a bottleneck. Complex interactions between the four operators make it even more difficult to predict the best course of action. Instead of relying on best practices for database layout, we need a system that collects empirical data on when each operator is effective. We implemented a state-based search that tries different operators and then uses empirically measured data to determine whether any speedup occurred. The cost of creating each physical database layout is potentially large, but it is necessary because we want ground truth about what is effective and under what conditions. After creating a dataset in which the four operators have been applied to produce different databases, we can employ machine learning to induce rules that govern the physical design of the database across an arbitrary number of computer nodes. This learning process, in turn, allows the placement algorithm to improve over time as it trains on a growing set of examples. The algorithm addresses two questions: 1) What is a good database layout for a particular application, given a query workload? and 2) Can the algorithm automatically improve its recommendations by using machine-learned rules that generalize when it makes sense to apply each operator? Considerable research has been done on parallelizing databases where large amounts of data are shipped from one node to another to answer a single query. Because the cost of shipping data back and forth can be high, in this work we assume it may be more efficient to create a database layout in which each query can be answered by a single node. This assumption requires that all incoming query templates are known beforehand, a requirement easily satisfied by Web-based applications because users typically interact with the system through a web interface such as web forms. In this setting, unseen queries are not necessarily answerable without first reconstructing the data on a single machine. Prior knowledge of the exact query templates lets us select the best possible database table placements across multiple nodes. In the case of a Web-based application, a site provider may be willing to accept the inconvenience of not being able to answer an arbitrary query in exchange for a system that runs more efficiently.
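    To make the search concrete, here is a minimal Python sketch of the state-based layout search described above. It is an illustrative assumption, not the paper's implementation: states are candidate layouts, the four operators generate neighboring states, and a user-supplied benchmark (the stub `measure_workload_time`) provides the empirical ground truth; every measured (layout, time) pair doubles as a training example for the later rule induction.

```python
import random
from typing import Callable, Dict, List, Tuple

# A layout maps each table to a placement descriptor. The operator names
# mirror the four operators in the abstract; their effects here are
# simplified stand-ins, not the paper's actual physical design actions.
Layout = Dict[str, str]
OPERATORS = ["denormalized", "h-partitioned", "v-partitioned", "replicated"]

def apply_operator(layout: Layout, table: str, op: str) -> Layout:
    """Return a new candidate layout with `op` applied to `table`."""
    candidate = dict(layout)
    candidate[table] = op
    return candidate

def greedy_layout_search(
    tables: List[str],
    measure_workload_time: Callable[[Layout], float],
    max_steps: int = 10,
) -> Tuple[Layout, List[Tuple[Layout, float]]]:
    """Greedy state-based search: benchmark every one-operator neighbor,
    keep the best improvement, and record every empirical sample so it
    can later serve as training data for machine-learned layout rules."""
    current = {t: "plain" for t in tables}
    best_time = measure_workload_time(current)
    samples = [(current, best_time)]
    for _ in range(max_steps):
        improved = False
        for table in tables:
            for op in OPERATORS:
                cand = apply_operator(current, table, op)
                t = measure_workload_time(cand)  # empirical ground truth
                samples.append((cand, t))
                if t < best_time:
                    current, best_time, improved = cand, t, True
        if not improved:
            break  # local optimum under the four operators
    return current, samples

if __name__ == "__main__":
    # Toy benchmark: random timings stand in for real workload runs.
    best, data = greedy_layout_search(["users", "orders"],
                                      lambda layout: random.random())
    print(best, len(data))
```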

    Universal Workload-based Graph Partitioning and Storage Adaption for Distributed RDF Stores

    The publication of machine-readable information has been increasing significantly, both in magnitude and in the complexity of the embedded relations. The Resource Description Framework (RDF) plays a big role in modeling and linking web data and their relations. In line with that important role, dedicated systems were designed to store and query RDF data using a special query language called SPARQL, similar to classic SQL. However, because of the sheer size of the data, several federated working nodes are used to host a distributed RDF store. The data needs to be partitioned, assigned, and stored on each working node. After partitioning, some of the data needs to be replicated in order to avoid communication costs and balance the load for better system throughput. Since replication requires more storage space, two important questions arise: what data should be replicated, and how much? The answer to the second question depends on the other storage-space requirements at each working node, such as indexes and cache. In order to answer SPARQL queries efficiently, each working node needs to put its share of the data into multiple indexes. Those indexes span the whole data and consume a considerable amount of storage space, so the same two questions raised about replication also apply to indexes. The third storage-consuming structure is the join cache, a special index in which frequent join results are cached, saving a considerable amount of running time at the cost of high storage-space consumption. Again, the same two questions apply to the join cache. In this thesis, we present a universal adaption approach to the storage of a distributed RDF store. The system aims to find optimal data assignments to the different indexes, replications, and the join cache within the limited storage space. To achieve this, we present a cost model based on the workload, which often contains frequent patterns. The workload is dynamically analyzed to evaluate predefined rules that tell the system the benefits and costs of assigning which data to which structure, with the objective of better query execution time. Besides the storage adaption, the system adapts its processing resources to the queries' arrival rate, aiming for better parallelization per query while still providing high system throughput.
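    The storage-adaption decision can be illustrated with a small sketch. Assuming (hypothetically) that the workload analysis reduces each candidate structure (a replica, an index, or a join-cache entry) to an estimated benefit and size, a greedy benefit-per-byte heuristic under the node's storage budget captures the flavor of the problem; the thesis's actual cost model and rules are richer than this.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    name: str        # e.g. "replica:partition3", "index:POS", "cache:join17"
    size_bytes: int  # storage the structure would consume on a working node
    benefit: float   # estimated query time saved, from workload analysis

def choose_structures(candidates: List[Candidate],
                      budget_bytes: int) -> List[Candidate]:
    """Greedy selection by benefit per byte under a storage budget."""
    chosen, used = [], 0
    ranked = sorted(candidates, key=lambda c: c.benefit / c.size_bytes,
                    reverse=True)
    for cand in ranked:
        if used + cand.size_bytes <= budget_bytes:
            chosen.append(cand)
            used += cand.size_bytes
    return chosen

if __name__ == "__main__":
    # Illustrative candidates with made-up sizes and benefits.
    cands = [Candidate("replica:partition3", 4_000_000, 120.0),
             Candidate("index:POS", 9_000_000, 300.0),
             Candidate("cache:join17", 2_000_000, 90.0)]
    for c in choose_structures(cands, budget_bytes=10_000_000):
        print(c.name)
```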

    SwiftSpatial: Spatial Joins on Modern Hardware

    Spatial joins are among the most time-consuming queries in spatial data management systems. In this paper, we propose SwiftSpatial, a specialized accelerator architecture tailored for spatial joins. SwiftSpatial contains multiple high-performance join units with innovative hybrid parallelism, several efficient memory management units, and an integrated on-chip join scheduler. We prototype SwiftSpatial on an FPGA and incorporate the R-tree synchronous traversal algorithm as the control flow. Benchmarked against various CPU- and GPU-based spatial data processing systems, SwiftSpatial demonstrates a latency reduction of up to 5.36x relative to the best-performing baseline, while requiring 6.16x less power. The remarkable performance and energy efficiency of SwiftSpatial lay a solid foundation for its future integration into spatial data management systems, both in data centers and at the edge.
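    The R-tree synchronous traversal used as the control flow can be sketched in a few lines. The following Python version is illustrative only; the node layout and recursive formulation are assumptions, not the accelerator's hardware design. Both trees are descended in lockstep, and child pairs whose minimum bounding rectangles (MBRs) do not overlap are pruned.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)

def intersects(a: Rect, b: Rect) -> bool:
    """Axis-aligned bounding-box overlap test."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

@dataclass
class Node:
    mbr: Rect
    children: List["Node"] = field(default_factory=list)  # empty => leaf
    obj_id: Optional[int] = None                          # set on leaves

def sync_traverse(r: Node, s: Node, out: List[Tuple[int, int]]) -> None:
    """Descend both R-trees in lockstep, pruning child pairs whose MBRs
    do not overlap; emit candidate object-id pairs at the leaves."""
    if not intersects(r.mbr, s.mbr):
        return
    if not r.children and not s.children:
        out.append((r.obj_id, s.obj_id))
        return
    # If one side is already a leaf, keep it fixed and descend the other.
    for rc in (r.children or [r]):
        for sc in (s.children or [s]):
            sync_traverse(rc, sc, out)

if __name__ == "__main__":
    a = Node((0, 0, 10, 10), [Node((0, 0, 4, 4), obj_id=1),
                              Node((6, 6, 9, 9), obj_id=2)])
    b = Node((0, 0, 10, 10), [Node((3, 3, 7, 7), obj_id=7)])
    pairs: List[Tuple[int, int]] = []
    sync_traverse(a, b, pairs)
    print(pairs)  # [(1, 7), (2, 7)]
```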

    Towards Next Generation Business Process Model Repositories – A Technical Perspective on Loading and Processing of Process Models

    Business process management repositories manage large collections of process models, often numbering in the thousands. In addition, they provide management functions such as mining, querying, merging, and variants management for process models. However, most current business process management repositories are built on top of relational database management systems (RDBMS), although this leads to performance issues. These issues stem from the relational algebra, the mismatch between relational tables and object-oriented programming (the impedance mismatch), and the technological developments of the last 30 years, such as abundant and cheap disk and memory space, clusters, and clouds. The goal of this paper is to present current paradigms for overcoming the performance problems inherent in RDBMS. To that end, we fuse research on data modeling and database technologies with research on algorithm design and parallelization for today's technology paradigms. Based on these research streams, we show how the performance of business process management repositories can be improved, both in the loading performance of process models (e.g., from disk) and in the computation of management techniques, which in turn allows such techniques to be applied even faster. Exemplary applications of the compiled paradigms are presented to demonstrate their applicability.

    Physical database design in document stores

    Thesis under joint supervision (cotutelle), Universitat PolitĂšcnica de Catalunya and UniversitĂ© libre de Bruxelles.
    NoSQL is an umbrella term used to classify alternative storage systems to the traditional Relational Database Management Systems (RDBMSs). Among these, document stores have gained popularity mainly due to their semi-structured data storage model and rich query capabilities. They encourage users to take a data-first approach as opposed to a design-first one. Database design on document stores is mainly carried out in a trial-and-error or ad-hoc, rule-based manner instead of through a formal process such as normalization in an RDBMS. However, these approaches can easily lead to a non-optimal design, resulting in additional costs in the long run. This PhD thesis aims to provide a novel multi-criteria approach to database design in document stores. Most existing approaches optimize only query performance; other factors include the storage requirement and the complexity of the stored documents, which are specific to each use case. There is a large solution space of alternative designs due to the different combinations of referencing and nesting of data; thus, we believe multi-criteria optimization is ideal for this problem. To apply it, we need to address several issues. First, we evaluate the impact of alternative storage representations of semi-structured data. There are multiple, equivalent ways to physically represent semi-structured data, but there is a lack of evidence about the potential impact on space and query performance, so we set out to quantify it precisely for document stores. We empirically compare multiple ways of representing semi-structured data, allowing us to derive a set of guidelines for efficient physical database design that considers both JSON and relational options in the same palette. Second, we need a formal canonical model that can represent alternative designs. We propose a hypergraph-based approach for representing heterogeneous data store designs: we extend and formalize an existing common programming interface to NoSQL systems as hypergraphs, define design constraints and query transformation rules for representative data store types, propose a simple query rewriting algorithm, and provide a prototype implementation together with a storage statistics estimator. Third, we require a formal query cost model to estimate and evaluate query performance on alternative document store designs. Document stores use primitive approaches to query processing, such as relying on the end user to specify the usage of indexes, rather than a formal cost model; we, however, need a reliable way to compare how alternative designs perform on a specific query. For this, we define a generic storage and query cost model based on disk access and memory allocation. As document stores carry out data operations in memory, we first estimate memory usage by considering the characteristics of the stored documents, their access patterns, and the memory management algorithms; then, using this estimation and the metadata storage size, we introduce a cost model for random-access queries. We validate our work on two well-known document store implementations, MongoDB and Couchbase. The results show that the memory usage estimates have an average precision of 91% and that predicted costs are highly correlated with actual execution times. During this work, we also managed to suggest several improvements to document stores. Finally, we implement the automated database design solution using multi-criteria optimization. We introduce an algebra of transformations that can systematically modify a design in our canonical representation, and, using it, we implement a local search algorithm driven by a loss function that can propose near-optimal designs with high probability. We compare our prototype against an existing document store data design solution; our proposed designs perform better and are more compact, with less redundancy.
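    The final design step, local search over an algebra of transformations driven by a loss function, admits a compact sketch. The transformation set, the weighted scalarization of the criteria (query cost, storage, document complexity), and all names below are hypothetical placeholders rather than the thesis's actual algebra or cost model.

```python
import random
from typing import Callable, Dict, List

# A design is an abstract mapping (here: entity -> representation choice).
# The transformations stand in for the thesis's algebra of design
# transformations; the weighted loss stands in for its multi-criteria
# objective. All of this is a placeholder illustration.
Design = Dict[str, str]
Transform = Callable[[Design], Design]

def multi_criteria_loss(query_cost: float, storage_cost: float,
                        complexity: float,
                        weights=(0.6, 0.3, 0.1)) -> float:
    """Scalarize the three criteria named in the abstract into one loss."""
    return (weights[0] * query_cost
            + weights[1] * storage_cost
            + weights[2] * complexity)

def local_search(initial: Design, transforms: List[Transform],
                 score: Callable[[Design], float], steps: int = 200) -> Design:
    """Hill-climbing over the design space: apply a random transformation
    and keep the result whenever the loss does not increase."""
    current, current_loss = initial, score(initial)
    for _ in range(steps):
        candidate = random.choice(transforms)(current)
        candidate_loss = score(candidate)
        if candidate_loss <= current_loss:
            current, current_loss = candidate, candidate_loss
    return current

if __name__ == "__main__":
    # Toy setup: one decision (nest vs. reference 'orders'), toy costs.
    transforms = [lambda d: {**d, "orders": "nested"},
                  lambda d: {**d, "orders": "referenced"}]
    score = lambda d: multi_criteria_loss(
        query_cost=1.0 if d["orders"] == "nested" else 3.0,
        storage_cost=2.0 if d["orders"] == "nested" else 1.0,
        complexity=1.0)
    print(local_search({"orders": "referenced"}, transforms, score))
```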

    Domain-Extendable 3D City Models – Management, Visualization, and Interaction

    Domain-extendable semantic 3D city models are complex mappings and inventories of the urban environment which can be utilized as an integrative information backbone to facilitate a range of application fields such as urban planning, environmental simulation, disaster management, and energy assessment. Today, more and more countries and cities worldwide are creating their own 3D city models based on the CityGML specification, an international standard issued by the Open Geospatial Consortium (OGC) that provides an open data model and XML-based format for describing the relevant urban objects with regard to their 3D geometry, topology, semantics, and appearance. It notably provides a flexible and systematic extension mechanism called the "Application Domain Extension (ADE)", which allows third parties to dynamically extend the existing CityGML definitions with additional information models from different application domains in order to represent the extended or newly introduced geographic object types within a common framework. However, due to the resulting large size and high model complexity, the practical utilization of country-wide CityGML datasets poses a tremendous challenge for setting up an extensive application system that supports efficient data storage, analysis, management, interaction, and visualization. These requirements have been partly met by the existing free 3D geo-database solution '3D City Database (3DCityDB)', which offers a rich set of functionalities for dealing with standard CityGML data models but lacks support for CityGML ADEs. The key motivation of this thesis is to develop a reliable approach for extending the existing database solution to support the efficient management, visualization, and interaction of large geospatial datasets of arbitrary CityGML ADEs. Emphasis is first placed on answering the question of how to dynamically extend the relational database schema by parsing and interpreting the XML schema files of an ADE and creating new database tables accordingly. Based on a comprehensive survey of the related work, a new graph-based framework is proposed which uses typed and attributed graphs to semantically represent the object-oriented data models of CityGML ADEs and utilizes graph transformation systems to automatically generate compact table structures extending the 3DCityDB. The transformation process is performed by applying a series of fine-grained graph transformation rules which allow users to declaratively describe the complex mapping rules, including the optimization concepts employed in the development of the 3DCityDB database schema. The second major contribution of this thesis is the development of a new multi-level system which serves as a complete and integrative platform for facilitating the various analysis, simulation, and modification operations on the complex-structured 3D city models based on CityGML and 3DCityDB. It introduces an additional application level based on a so-called 'app concept' that allows for constructing a lightweight web application that strikes a good balance between the high data model complexity and the specific application requirements of the end users. Each application can easily be built on top of the developed 3D web client, whose functionalities go beyond efficient 3D geo-visualization and interactive exploration and also allow for collaborative modification and analysis of 3D city models by taking advantage of Cloud Computing technology. This multi-level system, along with the extended 3DCityDB, has been successfully utilized and evaluated in many practical projects.
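    The schema-generation idea, rewriting a typed, attributed graph of ADE classes into compact relational tables via transformation rules, can be sketched as follows. The node structure and the single rule shown are deliberately simplified assumptions for illustration, not 3DCityDB's actual rule set.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class ClassNode:
    """A node of the typed, attributed graph representing one ADE class."""
    name: str
    attributes: Dict[str, str]      # attribute name -> SQL type
    parent: Optional[str] = None    # superclass, if any

def class_to_table(node: ClassNode) -> str:
    """One fine-grained rule: rewrite a class node into a CREATE TABLE
    statement whose key also links to the superclass table, keeping the
    generated structure compact."""
    cols = ["id INTEGER PRIMARY KEY"]
    cols += [f"{name} {sqltype}" for name, sqltype in node.attributes.items()]
    if node.parent:
        cols.append(f"FOREIGN KEY (id) REFERENCES {node.parent} (id)")
    return f"CREATE TABLE {node.name} (\n  " + ",\n  ".join(cols) + "\n);"

if __name__ == "__main__":
    # Hypothetical ADE class extending a CityGML building installation.
    solar = ClassNode("SolarPanel",
                      {"module_area": "REAL", "efficiency": "REAL"},
                      parent="BuildingInstallation")
    print(class_to_table(solar))
```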