6 Access Methods and Query Processing Techniques
The performance of a database management system (DBMS) is fundamentally dependent on the access methods and query processing techniques available to the system. Traditionally, relational DBMSs have relied on well-known access methods, such as the ubiquitous B+-tree, hashing with chaining, and, in som
Access methods and query processing in temporal and spatiotemporal databases
Time is a very important concept related to almost all phenomena of the real world. Information and data correspond to specific time-points and usually change over time. One of the roles of databases is to support the time-evolving nature of the phenomena they model. This ability is of fundamental importance in many applications, such as accounting, banking, legal, medical, commercial, econometric, land and cartographic applications. Temporal and spatio-temporal databases are two categories of databases that both deal with the concept of time but serve different types of applications. Conventional databases have been designed to maintain only the most recently stored information, that is, the current information. As this information is updated, the database content is modified and the previously stored information is removed from the database. Therefore, the only retained version of the database is the current one. Temporal databases, on the other hand, support the maintenance of time-evolving data and the satisfaction of specialized queries that are related to three notions of time for these data: the past, the present and the future. Traditional spatial databases are restricted to representing, storing and manipulating only static spatial data, such as points, lines, surfaces, volumes and hyper-volumes in multi-dimensional space. However, there are many applications that demand the storage and retrieval of continuously changing spatial information. Geographical information systems, image and multimedia databases, urban planning, transportation, mobile communications, computer-aided design and medical databases are only some of the applications that would benefit from the management of this type of dynamically changing spatial information. Spatio-temporal databases manipulate spatial data, the geometry of which changes dynamically.
They provide the chronological framework for the efficient storage and retrieval of all the states of a spatial database over time. This includes the current and past states and the support of spatial queries that refer to present and past time-points as well. In this doctoral dissertation, the research on temporal and spatio-temporal databases focuses on data that are indexed according to transaction time. More specifically, with regard to spatio-temporal databases, the present research focuses on time-evolving regional data. Real-world examples of such applications include the storage and manipulation of data on meteorological phenomena (e.g. atmospheric pressure zones; icebergs as they change and move over time), faunal phenomena (e.g. movements of populations of animals, birds or fish), urban phenomena (e.g. traffic jams or traffic networks in big cities; city planning events such as construction and demolition), natural catastrophes (e.g. fires; hurricanes; oil slicks; floods; pollution clouds) etc. In particular, the focus of the present dissertation is on designing efficient access methods and query processing algorithms for transaction-time databases and databases for time-evolving regional data. This contribution is considered to be of particular importance because access methods play a very important role in the development of efficient database management systems. One access method for transaction-time data and four access methods for time-evolving regional data are designed and implemented. Efficient algorithms are also implemented for the processing of three queries for temporal databases and five new queries for spatio-temporal databases. These queries exploit the properties of the new access methods. The first generator in the literature for synthetic time-evolving regional data is also introduced.
Finally, an extensive experimental performance evaluation and comparison of the four new access methods for time-evolving regional data is presented. Because of the lack of real benchmark data, the regional data sets used in the experiments were synthetic raster images with real-world semantics that were generated by the new synthetic data generator. The comparison is made under a common and flexible benchmarking environment in order to make it possible to choose the best technique depending on the application and on the characteristics of the manipulated images.
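The contrast drawn above between conventional databases, which overwrite old values, and transaction-time databases, which retain every stored state, can be illustrated with a minimal sketch. The class `TransactionTimeStore`, its methods and the logical clock are hypothetical names chosen for this illustration; they are not the access methods designed in the dissertation.

```python
from bisect import bisect_right


class TransactionTimeStore:
    """Append-only key/value store that keeps every past version.

    Hypothetical illustration only; not an access method from the dissertation.
    """

    def __init__(self):
        self._times = {}   # key -> ascending list of transaction times
        self._values = {}  # key -> values, parallel to self._times[key]
        self._clock = 0    # logical transaction-time counter (an assumption)

    def put(self, key, value):
        """Record a new version instead of overwriting the old one."""
        self._clock += 1
        self._times.setdefault(key, []).append(self._clock)
        self._values.setdefault(key, []).append(value)
        return self._clock

    def get_as_of(self, key, t):
        """Return the value that was current at transaction time t, or None."""
        times = self._times.get(key, [])
        i = bisect_right(times, t)  # number of versions stored at or before t
        return self._values[key][i - 1] if i else None

    def get_current(self, key):
        """Return the latest value, as a conventional database would."""
        return self.get_as_of(key, self._clock)


store = TransactionTimeStore()
t1 = store.put("balance", 100)
t2 = store.put("balance", 250)
past = store.get_as_of("balance", t1)   # value as stored at time t1
now = store.get_current("balance")      # most recent value
```

A conventional store would answer only the `get_current` call; the per-key version lists are what make the past-state query possible.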
A robust gender inference model for online social networks and its application to LinkedIn and Twitter
Online social networking services have come to dominate the dot-com world: countless online communities coexist on the social Web. Some characteristic user attributes, such as gender, age group and sexual orientation, are not automatically part of the profile information. In some cases user attributes can even be deliberately and maliciously falsified. This paper examines automated inference of gender on online social networks by analyzing written text with a combination of natural language processing and classification techniques. Extensive experimentation on LinkedIn and Twitter has yielded an accuracy of up to 98.4 percent for this gender identification technique.
Processing of Spatio-Temporal Queries in Image Databases
Overlapping Linear Quadtrees is a structure suitable for storing consecutive raster images according to transaction time (a database of evolving images). This structure saves considerable space without sacrificing time performance in accessing every single image. Moreover, it can be used for efficiently answering window queries over a number of consecutive images (spatio-temporal queries). In this paper, we present three such temporal window queries: strict containment, border intersect and cover. In addition, based on a method of producing synthetic pairs of evolving images (random images with specified aggregation), we present empirical results on the I/O performance of these queries.
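As a rough illustration of a window query over consecutive raster images, the sketch below builds a plain linear quadtree per image (the maximal black blocks, sorted by Morton code) and reports the time points at which a window contains any black pixel. It is a simplification under stated assumptions: it does not share common subtrees between consecutive images as Overlapping Linear Quadtrees do, and the simple intersection test is not one of the three queries named in the abstract, whose definitions are not given here. All function names are hypothetical, and images are assumed to be square with a power-of-two side.

```python
def morton(x, y, bits):
    """Interleave the bits of x and y into a Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def linear_quadtree(image):
    """Return sorted (code, x, y, size) tuples for the maximal black blocks.

    `image` is a list of rows of 0/1 values, indexed as image[y][x].
    """
    n = len(image)
    bits = n.bit_length() - 1
    blocks = []

    def visit(x, y, size):
        cells = [image[r][c] for r in range(y, y + size)
                 for c in range(x, x + size)]
        if all(cells):                       # uniformly black: store one block
            blocks.append((morton(x, y, bits), x, y, size))
        elif any(cells):                     # mixed: split into four quadrants
            half = size // 2
            for dy in (0, half):
                for dx in (0, half):
                    visit(x + dx, y + dy, half)
        # uniformly white quadrants are not stored

    visit(0, 0, n)
    return sorted(blocks)


def intersects(block, window):
    """True if a black block overlaps the window (wx, wy, width, height)."""
    _, x, y, size = block
    wx, wy, ww, wh = window
    return x < wx + ww and wx < x + size and y < wy + wh and wy < y + size


def window_query(trees, window, t_start, t_end):
    """Time points in [t_start, t_end] whose image has black inside the window."""
    return [t for t in range(t_start, t_end + 1)
            if any(intersects(b, window) for b in trees[t])]


img0 = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 1]]
img1 = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
trees = [linear_quadtree(img0), linear_quadtree(img1)]
hits = window_query(trees, (2, 2, 2, 2), 0, 1)
```

In this example the two images share the 2x2 black block in the top-left quadrant; an overlapping structure would store that common part once, which is the space saving the abstract refers to.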