
    Semantic Similarity of Spatial Scenes

    The formalization of similarity in spatial information systems can unlock their functionality and yield technology that is not only useful, but also desirable to broad groups of users. As a paradigm for information retrieval, similarity supersedes tedious querying techniques and opens novel modes of user-system interaction, naturally supporting modalities such as speech and sketching. As a tool within the scope of a broader objective, it can facilitate tasks as diverse as data integration, landmark determination, and prediction. This potential has motivated the development of several similarity models within the geospatial and computer science communities. Despite the merit of these studies, their cognitive plausibility is often limited because they neglect well-established psychological principles about the properties and behavior of similarity. Moreover, such approaches are typically guided by experience, intuition, and observation, and therefore often rely on narrow perspectives or restrictive assumptions that produce inflexible and incompatible measures. This thesis consolidates these fragmentary efforts and integrates them, along with novel formalisms, into a scalable, comprehensive, and cognitively sensitive framework for similarity queries in spatial information systems. Three conceptually different similarity queries, at the levels of attributes, objects, and scenes, are distinguished. An analysis of the relationship between similarity and change provides a unifying basis for the approach and a theoretical foundation for measures satisfying important similarity properties such as asymmetry and context dependence. The classification of attributes into categories with common structural and cognitive characteristics drives the implementation of a small core of generic functions able to perform any type of attribute value assessment. Appropriate techniques combine such atomic assessments to compute similarities at the object level and to handle more complex inquiries with multiple constraints. These techniques, along with a solid graph-theoretical methodology adapted to the particularities of the geospatial domain, provide the foundation for reasoning about scene similarity queries. Provisions are made so that all methods comply with major psychological findings about people's perception of similarity. An experimental evaluation supplies the main result of this thesis, which separates psychological findings with a major impact on the results from those that can be safely incorporated into the framework through computationally simpler alternatives.
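
    Asymmetry and context dependence are hallmarks of Tversky's feature-based account of similarity, which the abstract alludes to. As a hedged illustration (not the thesis's actual formalism; the feature sets and weights below are invented for the example), a minimal Python sketch of such an asymmetric measure:

```python
def tversky_similarity(a: set, b: set, alpha: float = 0.8, beta: float = 0.2) -> float:
    """Feature-based similarity in the spirit of Tversky's ratio model.

    With alpha != beta the measure is asymmetric: sim(a, b) != sim(b, a),
    matching the psychological finding that similarity judgments depend
    on which scene plays the role of referent.
    """
    common = len(a & b)
    only_a = len(a - b)  # distinctive features of the subject
    only_b = len(b - a)  # distinctive features of the referent
    denom = common + alpha * only_a + beta * only_b
    return common / denom if denom else 1.0

# Example: a sketched scene vs. a stored scene, described by qualitative features.
sketch = {"road", "river", "bridge_over_river"}
stored = {"road", "river", "forest", "building"}
print(tversky_similarity(sketch, stored))  # 0.625  (subject = sketch)
print(tversky_similarity(stored, sketch))  # ~0.526 (asymmetric in general)
```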

    The Internet of Things as a Privacy-Aware Database Machine

    Instead of using a computer cluster with homogeneous nodes and very fast, high-bandwidth interconnects, we present the vision of using the Internet of Things (IoT) as a database machine. This is, among other things, a key factor for smart (assistive) systems in apartments (AAL, ambient assisted living), offices (AAW, ambient assisted working), Smart Cities, and factories (IIoT, Industry 4.0). It is important to massively distribute the computation of analysis results across sensor nodes and other low-resource appliances in the environment, not only for reasons of performance, but also for reasons of privacy and the protection of corporate knowledge. Thus, functions crucial for assistive systems, such as situation, activity, and intention recognition, are to be automatically transformed not only into database queries, but also into computations on local nodes of lower performance. From a database-specific perspective, analysis operations on large quantities of distributed sensor data, currently based on classical big-data techniques and executed on large, homogeneously equipped parallel computers, have to be automatically transformed to run on billions of processors with energy and capacity restrictions. In this visionary paper, we focus on the database-specific perspective and the fundamental research questions in the underlying database theory.
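
    To make the envisioned operator push-down concrete, here is a minimal Python sketch (function names and data are illustrative, not an API from the paper) that splits an average over sensor readings into local partial aggregates computed on each node and a small merge step at the query site:

```python
from typing import Iterable, Tuple

def local_partial_avg(readings: Iterable[float]) -> Tuple[float, int]:
    """Runs on a low-resource sensor node: ship only (sum, count),
    never the raw readings -- cheaper and more privacy-friendly."""
    total, count = 0.0, 0
    for r in readings:
        total += r
        count += 1
    return total, count

def merge_avg(partials: Iterable[Tuple[float, int]]) -> float:
    """Runs wherever the query is answered: combine the partial states."""
    pairs = list(partials)
    total = sum(t for t, _ in pairs)
    count = sum(c for _, c in pairs)
    return total / count if count else float("nan")

# Three hypothetical nodes, each aggregating its own temperature samples locally.
nodes = [[21.0, 21.5], [20.2, 20.4, 20.6], [22.1]]
print(merge_avg(local_partial_avg(n) for n in nodes))  # global average
```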

    On the Discovery of Semantically Meaningful SQL Constraints from Armstrong Samples: Foundations, Implementation, and Evaluation

    A database is said to be C-Armstrong for a finite set Σ of data dependencies in a class C if the database satisfies all data dependencies in Σ and violates all data dependencies in C that are not implied by Σ. Therefore, Armstrong databases are concise, user-friendly representations of abstract data dependencies that can be used to judge, justify, convey, and test the understanding of database design choices. Indeed, an Armstrong database satisfies exactly those data dependencies that are considered meaningful by the current design choice Σ. Structural and computational properties of Armstrong databases have been deeply investigated in Codd's Turing Award winning relational model of data. Armstrong databases have been incorporated in approaches towards relational database design. They have also been found useful for the elicitation of requirements, the semantic sampling of existing databases, and the specification of schema mappings. This research establishes a toolbox of Armstrong databases for SQL data. This is challenging, as SQL data can contain null marker occurrences in columns declared NULL, and may contain duplicate rows. Thus, the existing theory of Armstrong databases only applies to idealized instances of SQL data, that is, instances without null marker occurrences and without duplicate rows. For the thesis, two popular interpretations of null markers are considered: the "no information" interpretation used in SQL, and the "exists but unknown" interpretation by Codd. Furthermore, the study is limited to the popular class C of functional dependencies. However, the presence of duplicate rows means that the class of uniqueness constraints is no longer subsumed by the class of functional dependencies, in contrast to the relational model of data. As a first contribution, a provably correct algorithm is developed that computes Armstrong databases for an arbitrarily given finite set of uniqueness constraints and functional dependencies. This contribution is based on axiomatic, algorithmic and logical characterizations of the associated implication problem that are also established in this thesis. While the problem of deciding whether a given database is Armstrong for a given set of such constraints is precisely exponential, our algorithm computes an Armstrong database with a number of rows that is at most quadratic in the number of rows of a minimum-sized Armstrong database. As a second contribution, the algorithms are implemented in the form of a design tool. Users of the tool can therefore inspect Armstrong databases to analyze their current design choice Σ. Intuitively, Armstrong databases are useful for the acquisition of semantically meaningful constraints if the users can recognize the actual meaningfulness of constraints that they incorrectly perceived as meaningless before the inspection of an Armstrong database. As a final contribution, measures are introduced that formalize the term "useful", and it is shown by some detailed experiments that Armstrong tables, as computed by the tool, are indeed useful. In summary, this research establishes a toolbox of Armstrong databases that can be applied by database designers to concisely visualize constraints on SQL data. Such support can lead to database designs that guarantee efficient data management in practice.
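
    The observation that duplicate rows break the subsumption of uniqueness constraints under functional dependencies can be made concrete with a small sketch. The following Python code (illustrative only; it ignores the null-marker interpretations the thesis treats in depth) checks both kinds of constraints and exhibits a table of duplicate rows that satisfies every functional dependency yet violates a uniqueness constraint:

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]

def satisfies_fd(rows: List[Row], lhs: Tuple[str, ...], rhs: Tuple[str, ...]) -> bool:
    """Classic FD check: any two rows agreeing on lhs must agree on rhs."""
    seen: Dict[tuple, tuple] = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False
        seen.setdefault(key, val)
    return True

def satisfies_unique(rows: List[Row], cols: Tuple[str, ...]) -> bool:
    """Uniqueness constraint: no two rows (not even duplicates) share cols."""
    keys = [tuple(row[c] for c in cols) for row in rows]
    return len(keys) == len(set(keys))

# Two duplicate rows: every FD holds, yet unique(emp) fails -- so uniqueness
# constraints are not subsumed by FDs once duplicate rows are allowed.
t = [{"emp": 1, "dept": "A"}, {"emp": 1, "dept": "A"}]
print(satisfies_fd(t, ("emp",), ("dept",)))  # True
print(satisfies_unique(t, ("emp",)))         # False
```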

    FCAIR 2012 Formal Concept Analysis Meets Information Retrieval Workshop co-located with the 35th European Conference on Information Retrieval (ECIR 2013) March 24, 2013, Moscow, Russia

    Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification. The area came into being in the early 1980s and has since spawned over 10,000 scientific publications and a variety of practically deployed tools. FCA allows one to build, from a data table with objects in rows and attributes in columns, a taxonomic data structure called a concept lattice, which can be used for many purposes, especially for Knowledge Discovery and Information Retrieval. The Formal Concept Analysis Meets Information Retrieval (FCAIR) workshop, co-located with the 35th European Conference on Information Retrieval (ECIR 2013), was intended, on the one hand, to attract researchers from the FCA community to a broad discussion of FCA-based research on information retrieval, and, on the other hand, to promote the ideas, models, and methods of FCA in the Information Retrieval community.
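
    The derivation operators behind concept lattices are compact enough to sketch. The following Python example (with a toy context invented for illustration, not taken from the workshop proceedings) computes the common attributes of a set of objects and verifies the closure condition that makes the pair a formal concept:

```python
from typing import Dict, FrozenSet, Set

# Toy formal context: objects mapped to the attributes they have.
context: Dict[str, Set[str]] = {
    "duck": {"swims", "flies", "has_feathers"},
    "swan": {"swims", "flies", "has_feathers"},
    "dog":  {"runs"},
}

def intent(objects: FrozenSet[str]) -> FrozenSet[str]:
    """Attributes common to all given objects (the derivation operator ')."""
    attr_sets = [context[o] for o in objects]
    if not attr_sets:  # convention: the empty object set yields all attributes
        return frozenset(a for attrs in context.values() for a in attrs)
    return frozenset(set.intersection(*attr_sets))

def extent(attributes: FrozenSet[str]) -> FrozenSet[str]:
    """Objects possessing all given attributes (the dual derivation operator)."""
    return frozenset(o for o, attrs in context.items() if attributes <= attrs)

# A formal concept is a fixed point of the two operators: extent(intent(A)) == A.
a = frozenset({"duck", "swan"})
b = intent(a)
print(b, extent(b) == a)  # ({swims, flies, has_feathers}, True)
```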

    Content warehouses

    Nowadays, content management systems are an established technology. Based on experience from several application scenarios, we discuss the points of contact between content management systems and other disciplines of information systems engineering, such as data warehouses, data mining, and data integration. We derive a system architecture called a "content warehouse" that integrates these technologies and defines a more general and more sophisticated view of content management. As an example, a system for the collection, maintenance, and evaluation of biological content, such as survey data and multimedia resources, is presented as a case study.

    Contextualized and personalized location-based services

    Advances in the technologies of smart mobile devices and tiny sensors, together with the increase in the number of web resources, open up a plethora of new mobile information services where people can acquire and disseminate information at any place and any time. Location-based services (LBS) are characterized by providing users with useful local information, i.e. information that belongs to a particular domain of interest to the user and can be of use while the user remains in a particular area. In addition, LBS need to take into account the interactions and dependencies between services, users and contexts for information filtering and delivery in order to fulfill the needs and constraints of mobile users. We argue that this raises a series of technical challenges in terms of data semantics and infrastructure, context awareness and personalization, and query formulation and answering. These challenges cannot be met by simply extending traditional data management strategies; they call for new solutions. Firstly, we propose a semantic LBS infrastructure on the basis of the modularized-ontologies approach. We elaborate a core ontology which is mainly composed of three modules describing the services, users and contexts. The core ontology aims at presenting an abstract view (a model) of all information in LBS. In contrast, data describing the instances (of services, users and actual contexts) are stored in three independent data stores, called the service profiles, user profiles and context profiles. These data are semantically aligned with the concepts in the core ontology through a set of mappings. This approach enables the distributed data sources to be maintained in an autonomous manner, which is well adapted to the high dynamics and mobility of the data sources. Secondly, we separately address the function, features, and our modelling approach for the three major players in LBS, i.e. services, contexts and users. Then, we define a set of constructs to represent their interactions and inter-dependencies and illustrate how these semantic constructs can contribute to personalized and contextualized query processing. Service classes are organized in a taxonomy, which distinguishes the services by their business functions. This concept hierarchy helps to analyze and reformulate the users' queries. We introduce three new kinds of relationships in the service module to enhance the semantics of interactions and dependencies between services. We identify five key components of contexts in LBS and regard them as a semantic contextual basis for LBS. Component contexts are related together by specific composition relationships that can describe spatio-temporal constraints. A user profile contains personal information about a given user and possibly a set of self-defined rules, which offer hints on what the user likes or dislikes, and what could attract him or her. In the core ontology, clustering users with common features can support cooperative query answering. Each of the three modules of the core ontology is an ontology in itself. They are inter-related by relationships that link concepts belonging to two different modules. The LBS infrastructure fully benefits from the modularized structure of the core ontology: it allows the search space to be restricted and facilitates the maintenance of each module. Finally, we study query reformulation and processing issues in LBS. How to make the query interface tangible and how to provide rapid and relevant answers are typical concerns in all information services. Our query format not only fully obeys the "simple, tangible and effective" golden rules of user-interface design, but also satisfies the need for a domain-independent interface and emphasizes the importance of spatio-temporal constraints in LBS. With pre-defined spatio-temporal operators, users can easily specify in their queries the spatio-temporal availability they need for the services they are looking for. This allows eliminating most of the irrelevant answers that are usually generated by keyword-based approaches. Constraints in the various dimensions (what, when, where and what-else) can be expressed by a conjunctive query, and then be smoothly translated to RDF patterns. We illustrate our query answering strategy using the SPARQL syntax, and explain how relaxation can be done with rules specified in the query relaxation profile.
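
    As a hedged illustration of the described translation from a conjunctive what/when/where query to an RDF pattern, the following Python sketch assembles a SPARQL query; the lbs: vocabulary and property names are invented for the example and are not the thesis's actual ontology:

```python
# Sketch: translate a conjunctive what/when/where LBS query into a SPARQL
# graph pattern. The lbs: properties below are hypothetical placeholders.
def to_sparql(what: str, when: str, where_box: tuple) -> str:
    min_lat, min_lon, max_lat, max_lon = where_box
    return f"""
PREFIX lbs: <http://example.org/lbs#>
SELECT ?service WHERE {{
  ?service a lbs:{what} ;              # what:  service category
           lbs:openDuring "{when}" ;   # when:  temporal availability
           lbs:lat ?lat ;
           lbs:lon ?lon .
  FILTER (?lat >= {min_lat} && ?lat <= {max_lat} &&
          ?lon >= {min_lon} && ?lon <= {max_lon})   # where: bounding box
}}"""

# Usage: restaurants open at a given time within a bounding box.
print(to_sparql("Restaurant", "2024-05-01T19:00", (46.5, 6.5, 46.6, 6.7)))
```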