    Thinking spatial

    The systems community in both academia and industry has had tremendous success in building widely used general-purpose systems for various types of data and applications. Examples include database systems, big data systems, data streaming systems, and machine learning systems. The vast majority of these systems are ill-equipped to support spatial data. The main reason is that system builders mostly think of spatial data as just one more type of data. Spatial support is treated as an afterthought that can be provided via on-top functions or spatial cartridges added to the already built systems. This article advocates that spatial data and applications need to be natively supported in special-purpose systems, where spatial data is treated as a first-class citizen and spatial operations are built inside the engine rather than on top of it. System builders should consider spatial data while building their systems. The article gives examples of five categories of systems, namely database systems, big data systems, machine learning systems, recommender systems, and social network systems, that would benefit tremendously, in terms of both accuracy and performance, from treating spatial data as an integral part of the system engine.

    A Analysis of Different Type of Advance database System For Data Mining Based on Basic Factor

    Conventional databases cannot handle such a wide range and large volume of data. Databases are therefore needed that support the creation, storage, indexing and retrieval of large and varied data for mining. This paper presents different approaches to data mining for advanced data such as multimedia, spatial, time-series and heterogeneous data, and describes database management support for creating, storing, indexing and retrieving such data. This includes advanced data structures and the use of metadata to store advanced data such as multimedia, spatial, time-series and heterogeneous data. The paper argues that database management systems should be extended to accommodate new types of data and to enable content-based search. Media, geometry, time, calendar and all other object types are modeled as attributes of abstract data types. The paper describes multimedia, spatial, time-series and heterogeneous databases in terms of data mining methods, database management techniques, data types and applications for advanced databases. DOI: 10.17762/ijritcc2321-8169.15020

    Spatial Databases

    Databases are computer systems designed to store information in a systematic way so that their contents can be easily accessed, managed, changed, and augmented. Spatial databases are such systems designed specifically to include data with spatial attributes, such as geographical location, distance, and extent. The software used to manage and query a database is known as a database management system (DBMS). The most prevalent type of database is the relational database, a tabular scheme in which data are defined in terms of simple data types and operations so that they can be reorganized and accessed in a number of different ways. Most spatial DBMSs extend the relational model to include more complex spatial data types and operations. All geographic information systems (GISs) use a spatial database with, in addition, functions for presentation, visualization, and analysis of the spatially referenced data.
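
As a rough illustration of what "extending the relational model with spatial data types and operations" can mean, the following minimal sketch adds a spatial column to an ordinary table of rows. The names `Point`, `distance`, `within` and the sample `cities` table are hypothetical and chosen only for this example; they are not taken from any particular spatial DBMS.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """A hypothetical spatial data type: a 2-D location."""
    x: float
    y: float


def distance(a: Point, b: Point) -> float:
    """Spatial operation: Euclidean distance between two points."""
    return math.hypot(a.x - b.x, a.y - b.y)


def within(rows, center: Point, radius: float):
    """Select the rows whose 'location' lies within `radius` of `center`."""
    return [row for row in rows if distance(row["location"], center) <= radius]


# A tiny "table": ordinary columns plus one spatial column (made-up data).
cities = [
    {"name": "A", "population": 1000, "location": Point(0.0, 0.0)},
    {"name": "B", "population": 2500, "location": Point(3.0, 4.0)},
    {"name": "C", "population": 400, "location": Point(10.0, 10.0)},
]

nearby = within(cities, Point(0.0, 0.0), 5.0)
print([row["name"] for row in nearby])  # ['A', 'B']
```

A real spatial DBMS would also index the spatial column, so a query like `within` would not have to scan every row as this sketch does.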

    Performance comparison of point and spatial access methods

    In the past few years a large number of multidimensional point access methods, also called multiattribute index structures, have been suggested, all of them claiming good performance. Since no performance comparison of these structures under arbitrary (strongly correlated, nonuniform; in short, "ugly") data distributions and under various types of queries has been performed, database researchers and designers were hesitant to use any of these new point access methods. As shown in a recent paper, such point access methods are not only important in traditional database applications. In new applications such as CAD/CIM and geographic or environmental information systems, access methods for spatial objects are needed. As recently shown, such access methods are based on point access methods in terms of functionality and performance. Our performance comparison naturally consists of two parts. In part I we compare multidimensional point access methods, whereas in part II spatial access methods for rectangles are compared. In part I we present a survey and classification of existing point access methods. Then we carefully select the following four methods for implementation and performance comparison under seven different data files (distributions) and various types of queries: the 2-level grid file, the BANG file, the hB-tree and a new scheme, called the BUDDY hash tree. We were surprised to see one method emerge as the clear winner: the BUDDY hash tree. It exhibits at least 20% better average performance than its competitors and is robust under ugly data and queries. In part II we compare spatial access methods for rectangles. After presenting a survey and classification of existing spatial access methods, we carefully selected the following four methods for implementation and performance comparison under six different data files (distributions) and various types of queries: the R-tree, the BANG file, PLOP hashing and the BUDDY hash tree. The result presented two winners: the BANG file and the BUDDY hash tree. This comparison is a first step towards a standardized testbed or benchmark. We offer our data and query files to each designer of a new point or spatial access method so that they can run their implementation in our testbed.
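
To make the notion of a point access method concrete, here is a minimal sketch of a uniform, fixed-cell grid that answers rectangular range queries over 2-D points. It is deliberately simpler than any structure compared in the abstract (it is not the 2-level grid file, the BANG file, the hB-tree or the BUDDY hash tree); the class name `UniformGrid`, the cell size and the sample points are assumptions made for this example.

```python
from collections import defaultdict


class UniformGrid:
    """Fixed-cell grid over 2-D points (illustrative only)."""

    def __init__(self, cell_size: float):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (col, row) -> list of (x, y) points

    def _cell(self, x: float, y: float):
        """Map a coordinate to the integer index of its grid cell."""
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, x: float, y: float):
        self.cells[self._cell(x, y)].append((x, y))

    def range_query(self, x_min, y_min, x_max, y_max):
        """Return all points inside the closed query rectangle."""
        c_min, r_min = self._cell(x_min, y_min)
        c_max, r_max = self._cell(x_max, y_max)
        hits = []
        # Visit only the cells overlapping the rectangle, then filter exactly.
        for col in range(c_min, c_max + 1):
            for row in range(r_min, r_max + 1):
                for (x, y) in self.cells.get((col, row), []):
                    if x_min <= x <= x_max and y_min <= y <= y_max:
                        hits.append((x, y))
        return hits


grid = UniformGrid(cell_size=10.0)
for p in [(1.0, 1.0), (12.0, 3.0), (25.0, 25.0), (9.5, 9.5)]:
    grid.insert(*p)

result = sorted(grid.range_query(0.0, 0.0, 15.0, 10.0))
print(result)  # [(1.0, 1.0), (9.5, 9.5), (12.0, 3.0)]
```

A fixed grid like this degrades when points cluster in a few cells, which is one reason comparisons under strongly nonuniform data distributions matter.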

    Similarity search and data mining techniques for advanced database systems.

    Modern automated methods for measurement, collection, and analysis of data in industry and science are providing more and more data with drastically increasing structure complexity. On the one hand, this growing complexity is justified by the need for a richer and more precise description of real-world objects, on the other hand it is justified by the rapid progress in measurement and analysis techniques that allow the user a versatile exploration of objects. In order to manage the huge volume of such complex data, advanced database systems are employed. In contrast to conventional database systems that support exact match queries, the user of these advanced database systems focuses on applying similarity search and data mining techniques. Based on an analysis of typical advanced database systems — such as biometrical, biological, multimedia, moving, and CAD-object database systems — the following three challenging characteristics of complexity are detected: uncertainty (probabilistic feature vectors), multiple instances (a set of homogeneous feature vectors), and multiple representations (a set of heterogeneous feature vectors). Therefore, the goal of this thesis is to develop similarity search and data mining techniques that are capable of handling uncertain, multi-instance, and multi-represented objects. The first part of this thesis deals with similarity search techniques. Object identification is a similarity search technique that is typically used for the recognition of objects from image, video, or audio data. Thus, we develop a novel probabilistic model for object identification. Based on it, two novel types of identification queries are defined. In order to process the novel query types efficiently, we introduce an index structure called Gauss-tree. In addition, we specify further probabilistic models and query types for uncertain multi-instance objects and uncertain spatial objects. 
Based on the index structure, we develop algorithms for an efficient processing of these query types. Practical benefits of using probabilistic feature vectors are demonstrated on a real-world application for video similarity search. Furthermore, a similarity search technique is presented that is based on aggregated multi-instance objects, and that is suitable for video similarity search. This technique takes multiple representations into account in order to achieve better effectiveness. The second part of this thesis deals with two major data mining techniques: clustering and classification. Since privacy preservation is a very important demand of distributed advanced applications, we propose using uncertainty for data obfuscation in order to provide privacy preservation during clustering. Furthermore, a model-based and a density-based clustering method for multi-instance objects are developed. Afterwards, original extensions and enhancements of the density-based clustering algorithms DBSCAN and OPTICS for handling multi-represented objects are introduced. Since several advanced database systems like biological or multimedia database systems handle predefined, very large class systems, two novel classification techniques for large class sets that benefit from using multiple representations are defined. The first classification method is based on the idea of a k-nearest-neighbor classifier. It employs a novel density-based technique to reduce training instances and exploits the entropy impurity of the local neighborhood in order to weight a given representation. The second technique addresses hierarchically-organized class systems. It uses a novel hierarchical, supervised method for the reduction of large multi-instance objects, e.g. audio or video, and applies support vector machines for efficient hierarchical classification of multi-represented objects. 
User benefits of this technique are demonstrated by a prototype that performs a classification of large music collections. The effectiveness and efficiency of all proposed techniques are discussed and verified by comparison with conventional approaches in versatile experimental evaluations on real-world datasets.
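
For orientation, the following is a minimal sketch of the standard DBSCAN density-based clustering algorithm that the abstract names as a starting point. It is the plain textbook version for single-representation points, not the thesis's extensions for multi-represented or multi-instance objects; the parameter names `eps` and `min_pts` and the sample data are assumptions for this example.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # core point: grow the cluster
                seeds.extend(j_neighbors)
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

The linear-scan `region_query` keeps the sketch short; in practice a spatial index would typically serve these neighborhood queries.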

    IMPLEMENTATION OF SPATIAL DATABASES

    To optimize the storage, editing, display and analysis of spatial data, existing object-relational database management systems need to be extended with spatial functionality. In this paper we present an overview of technologies for implementing spatial databases, with a focus on the PostGIS extension for the PostgreSQL relational database management system. The paper lists the advantages of using this technology and explains the techniques and algorithms used for building spatial indexes, along with the data types supported by the system. At the end of the paper we describe our own configuration of a spatial database with support for multiuser access, which we demonstrate through a simple but efficient application for managing spatial data.

    Land surveying plans as part of the GIS data base

    Geographic Information Systems can support businesses and organizations in the storage and processing of spatial and other related data. The present work defines storage in a single information system, together with existing data obtainable from the surveying and mapping authorities. The logical and physical structure of files is defined, together with the links between the types of objects in the databases (the Register of Geographical Names, the Register of Spatial Units (RPE) and the Consolidated Register of Public Infrastructure), along with topographic symbology and the processes for maintaining the database. For the definition of metadata schemes and data quality, the SIST EN ISO 19113 and SIST EN ISO 19115 standards and the INSPIRE Directive are considered. A practical example illustrates the theoretical definition of a database with graphically presented spatial data.

    KNOWLEDGE SHARING AND NEGOTIATION SUPPORT IN MULTIPERSON DECISION SUPPORT SYSTEMS

    A number of DSS for supporting decisions by more than one person have been proposed. These can be categorized by spatial distance (local vs. remote), temporal distance (meeting vs. mailing), commonality of goals (cooperation vs. bargaining), and control (democratic vs. hierarchical). Existing frameworks for model management in single-user DSS seem insufficient for such systems. This paper views multiperson DSS as a loosely coupled system of model and data bases which may be human (the DSS builders and users) or computerized. The system's components have different knowledge bases and may have different interests. Their interaction is characterized by knowledge sharing for uncertainty reduction and cooperative problem-solving, and by negotiation for view integration, consensus-seeking, and compromise. Requirements for the different types of multiperson DSS can be formalized as application-level communications protocols. Based on a literature review and recent experience with a number of multiperson DSS prototypes, artificial intelligence-based message-passing protocols are compared with database-centered approaches and model-based techniques, such as multicriteria decision making. Information Systems Working Papers Serie