Processing transitive nearest-neighbor queries in multi-channel access environments

Author: LEE Wang-Chien
MITRA Prasnjit
ZHANG Xiao
ZHENG Baihua
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2008

Wireless broadcast is an efficient way to disseminate information due to its good scalability [10]. Existing works typically assume that mobile devices, such as cell phones and PDAs, can access only one channel at a time. In this paper, we consider a near-future scenario in which a mobile device is able to process queries using information received simultaneously from multiple channels. We focus on the query processing of the transitive nearest neighbor (TNN) search [19]. Two TNN algorithms developed for a single broadcast channel environment are adapted to our new broadcast environment. Based on the obtained insights, we propose two new algorithms, namely the Double-NN-Search and Hybrid-NN-Search algorithms. Further, we develop an optimization technique, called approximate-NN (ANN), to reduce the energy consumption of mobile devices. Finally, we conduct a comprehensive set of experiments to validate our proposals. The results show that our new algorithms provide better performance than the existing ones and that the optimization technique efficiently reduces energy consumption. Keywords: multi-channel access, transitive nearest neighbor, query processing

CiteSeerX

Crossref

Institutional Knowledge at Singapore Management University

Searching and mining in enriched geo-spatial data

Author: Schmid Klaus Arthur
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 09/12/2016

The emergence of new data collection mechanisms in geo-spatial applications paired with a heightened tendency of users to volunteer information provides an ever-increasing flow of data of high volume, complex nature, and often associated with inherent uncertainty. Such mechanisms include crowdsourcing, automated knowledge inference, tracking, and social media data repositories. Such data bearing additional information from multiple sources like probability distributions, text or numerical attributes, social context, or multimedia content can be called multi-enriched. Searching and mining this abundance of information holds many challenges, if all of the data's potential is to be released. This thesis addresses several major issues arising in that field, namely path queries using multi-enriched data, trend mining in social media data, and handling uncertainty in geo-spatial data. In all cases, the developed methods have made significant contributions and have appeared in or were accepted into various renowned international peer-reviewed venues. A common use of geo-spatial data is path queries in road networks where traditional methods optimise results based on absolute and ofttimes singular metrics, i.e., finding the shortest paths based on distance or the best trade-off between distance and travel time. Integrating additional aspects like qualitative or social data by enriching the data model with knowledge derived from sources as mentioned above allows for queries that can be issued to fit a broader scope of needs or preferences. This thesis presents two implementations of incorporating multi-enriched data into road networks. In one case, a range of qualitative data sources is evaluated to gain knowledge about user preferences which is subsequently matched with locations represented in a road network and integrated into its components. Several methods are presented for highly customisable path queries that incorporate a wide spectrum of data. 
In a second case, a framework is described for resource distribution with reappearance in road networks to serve one or more clients, resulting in paths that provide maximum gain based on a probabilistic evaluation of available resources. Applications for this include finding parking spots. Social media trends are an emerging research area giving insight into user sentiment and important topics. Such trends consist of bursts of messages concerning a certain topic within a time frame, significantly deviating from the average appearance frequency of the same topic. By investigating the dissemination of such trends in space and time, this thesis presents methods to classify trend archetypes to predict future dissemination of a trend. Processing and querying uncertain data is particularly demanding given the additional knowledge required to yield results with probabilistic guarantees. Since such knowledge is not always available and queries are not easily scaled to larger datasets due to the #P-complete nature of the problem, many existing approaches reduce the data to a deterministic representation of its underlying model to eliminate uncertainty. However, data uncertainty can also provide valuable insight into the nature of the data that cannot be represented in a deterministic manner. This thesis presents techniques for clustering uncertain data and for query processing that take the additional information from uncertainty models into account while preserving scalability through a sampling-based approach, whereas previous approaches could provide only one of the two. 
The given solutions enable the application of various existing clustering techniques or query types to a framework that manages the uncertainty.

Scalable And Secure Provenance Querying For Scientific Workflows And Its Application In Autism Study

Author: Bhuyan Fahima Amin
Publication venue: DigitalCommons@WayneState
Publication date: 01/01/2018

In the era of big data, scientific workflows have become essential to automate scientific experiments and guarantee repeatability. As both data and workflows increase in scale, a data lineage management system commensurate with the complexity of the workflow becomes necessary, calling for new scalable storage, query, and analytics infrastructure. Such a system, which manages and preserves the derivation history and morphosis of data and is known as a provenance system, is essential for maintaining the quality and trustworthiness of data products and ensuring the reproducibility of scientific discoveries. With a flurry of research and increased adoption of scientific workflows in processing sensitive data, e.g., in the health and medication domain, securing information flow and instrumenting access privileges in the system have become a fundamental precursor to deploying large-scale scientific workflows. This has become more important now that teams of scientists around the world can collaborate on experiments using globally distributed sensitive data sources. Hence, it has become imperative to augment scientific workflow systems, as well as the underlying provenance management systems, with data security protocols. Provenance systems devoid of data security protocols are vulnerable. In this dissertation research, we delineate how scientific workflows can improve therapeutic practices in autism spectrum disorders. The data-intensive computation inherent in these workflows and the sensitive nature of the data necessitate support for scalable, parallel and robust provenance queries and a secured view of the data. With that in perspective, we propose OPQL^{Pig}, a parallel, robust, reliable and scalable provenance query language, and introduce the concept of access privilege inheritance in provenance systems. We characterize desirable properties of a role-based access control protocol in scientific workflows and demonstrate how these qualities are integrated into workflow provenance systems as well. Finally, we describe how these concepts fit within the DATAVIEW workflow management system

Digital Commons@Wayne State University

Incremental Processing and Optimization of Update Streams

Author: Liu Mengmeng
Publication venue: ScholarlyCommons
Publication date: 01/01/2016

In recent years, we have seen an increasing number of applications in networking, sensor networks, cloud computing, and environmental monitoring that monitor, plan, control, and make decisions over data streams from multiple sources. We are interested in extending traditional stream processing techniques to meet the new challenges of these applications. Generally, in order to support genuine continuous query optimization and processing over data streams, we need to systematically understand how to address incremental optimization and processing of update streams for a rich class of queries commonly used in these applications. Our general thesis is that efficient incremental processing and re-optimization of update streams can be achieved by various incremental view maintenance techniques if we cast the problems as incremental view maintenance problems over data streams. We focus on two challenges in the incremental processing of update streams that are not addressed in existing work on stream query processing: incremental processing of transitive closure queries over data streams, and incremental re-optimization of queries. In addition to addressing these specific challenges, we also develop a working prototype system, Aspen, which serves as an end-to-end stream processing system and has been deployed as the foundation for a case study of our SmartCIS application. We validate our solutions both analytically and empirically on top of our prototype system Aspen, over a variety of benchmark workloads such as the TPC-H and LinearRoad benchmarks

ScholarlyCommons@Penn

Searching and mining in enriched geo-spatial data

Author: Schmid Klaus Arthur
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 09/12/2016

Digitale Hochschulschriften der LMU

Data Management for Dynamic Multimedia Analytics and Retrieval

Author: Gasser Ralph Marc Philipp
Publication date: 01/01/2023

Multimedia data in its various manifestations poses a unique challenge from a data storage and data management perspective, especially if search, analysis and analytics in large data corpora are considered. The inherently unstructured nature of the data itself and the curse of dimensionality that afflicts the representations we typically work with in its stead cause a broad range of issues that require sophisticated solutions at different levels. This has given rise to a huge corpus of research that focuses on techniques that allow for effective and efficient multimedia search and exploration. Many of these contributions have led to an array of purpose-built multimedia search systems. However, recent progress in multimedia analytics and interactive multimedia retrieval has demonstrated that several of the assumptions usually made for such multimedia search workloads do not hold once a session has a human user in the loop. Firstly, many of the required query operations cannot be expressed by mere similarity search, and since the concrete requirement cannot always be anticipated, one needs a flexible and adaptable data management and query framework. Secondly, the widespread notion of staticity of data collections does not hold if one considers analytics workloads, whose purpose is to produce and store new insights and information. And finally, it is impossible even for an expert user to specify exactly how a data management system should produce and arrive at the desired outcomes of the potentially many different queries. Guided by these shortcomings and motivated by the fact that similar questions have once been answered for structured data in classical database research, this thesis presents three contributions that seek to mitigate the aforementioned issues. We present a query model that generalises the notion of proximity-based query operations and formalises the connection between those queries and high-dimensional indexing. 
We complement this by a cost-model that makes the often implicit trade-off between query execution speed and results quality transparent to the system and the user. And we describe a model for the transactional and durable maintenance of high-dimensional index structures. All contributions are implemented in the open-source multimedia database system Cottontail DB, on top of which we present an evaluation that demonstrates the effectiveness of the proposed models. We conclude by discussing avenues for future research in the quest for converging the fields of databases on the one hand and (interactive) multimedia retrieval and analytics on the other

edoc

Design and Implementation of Hardware Accelerators Using Multi-Level Parallelization and Application-Oriented Data Layout

Author: Koizumi Kenichi
小泉賢一
Publication venue: Graduate School of Information Science and Technology, Department of Creative Informatics
Publication date: 22/03/2018

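Degree type: doctorate by coursework. Examination committee: (chair) Professor 稲葉雅幸, University of Tokyo; Professor 須田礼仁, University of Tokyo; Professor 五十嵐健夫, University of Tokyo; Professor 山西健司, University of Tokyo; Associate Professor 稲葉真理, University of Tokyo; Lecturer 中山英樹, University of Tokyo. University of Tokyo (東京大学)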

Interoperability of Traffic Infrastructure Planning and Geospatial Information Systems

Author: Nejatbakhsh Esfahani Nazereh
Publication date: 01/10/2018

Building Information Modelling (BIM), as a model-based design approach, makes it possible to investigate multiple solutions in the infrastructure planning process. The most important reason for implementing model-based design is to help designers and to increase communication between different design parties. It decentralizes and coordinates team collaboration and facilitates faster and lossless project data exchange and management across extended teams and external partners over the project lifecycle. Infrastructure comprises the fundamental facilities, services, and installations needed for the functioning of a community or society, such as transportation, roads, communication systems, water and power networks, as well as power plants. Geospatial Information Systems (GIS), as the digital representation of the world, are systems for maintaining, managing, modelling, analyzing, and visualizing world data, including infrastructure. High-level infrastructure suites mostly facilitate analyzing the infrastructure design based on international or user-defined standards. Called regulation-based design, this minimizes errors, reduces costly design conflicts, increases time savings and provides consistent project quality, yet mostly in standalone solutions. Infrastructure tasks usually require both model-based and regulation-based design packages. Infrastructure tasks deal with cross-domain information. However, the corresponding data is split across several domain models. Besides, infrastructure projects demand a lot of decision making at the governmental as well as the private level, considering different data models. Therefore, a lossless flow of project data, as well as of documents such as regulations, across the project team, stakeholders, and the governmental and private levels is highly important. Yet infrastructure projects have largely been absent from product modelling discourses for a long time. Thus, as will be explained in chapter 2, interoperability is needed in infrastructure processes. 
Multimodel (MM) is one of the interoperability methods that enable heterogeneous data models from various domains to be bundled together into a container while keeping their original format. Existing interoperability methods, including existing MM solutions, cannot satisfactorily fulfill the typical demands of infrastructure information processes, such as dynamic data resources and a huge number of inter-model relations. Therefore, chapter 3, on the concept of infrastructure information modelling, investigates a method for loose and rule-based coupling of exchangeable heterogeneous information spaces. This hypothesis extends the existing MM to a rule-based Multimodel, named extended Multimodel (eMM), with semantic rules instead of static links. The semantic rules are used to describe relations between data elements of various models dynamically in a link database. Most of the confusion about geospatial data models arises from their diversity. In some of these data models, spatial IDs are the basic identities of entities, while other data models have no IDs. That is why, in geospatial data, data structure is more important than data models. There are always spatial indexes that enable access to the geodata. The most important unifying property of the data models involved in infrastructure projects is spatiality. As explained in chapter 4, the method of infrastructure information modelling for interoperation in spatial domains generates interlinks through the spatial identity of entities. Match finding through spatial links enables any data models that share a spatial property to be interlinked. Through such spatial links, each entity receives spatial information from other data models that is related to the target entity because they share an equivalent spatial index. This information becomes the virtual properties of the object. The thesis uses a Nearest Neighborhood algorithm for spatial match finding and performs filtering and refining approaches. 
For the abstraction of the spatial matching results, hierarchical filtering techniques are used to refine the virtual properties. These approaches focus on two main application areas: product model and Level of Detail (LoD). For the eMM suggested in this thesis, a rule-based interoperability method between arbitrary data models of the spatial domain has been developed. The implementation of this method enables transactions of data in spatial domains to run losslessly. The system architecture and the implementation, which has been applied to the case study of this thesis, namely infrastructure and geospatial data models, are described in chapter 5. Achieving the aforementioned aims reduces the whole project lifecycle costs, increases the reliability of the comprehensive fundamental information, and consequently results in independent, cost-effective, aesthetically pleasing, and environmentally sensitive infrastructure design.

Table of contents:
ABSTRACT; KEYWORDS; TABLE OF CONTENT; LIST OF FIGURES; LIST OF TABLES; LIST OF ABBREVIATION
INTRODUCTION: 1.1. A GENERAL VIEW; 1.2. PROBLEM STATEMENT; 1.3. OBJECTIVES; 1.4. APPROACH; 1.5. STRUCTURE OF THESIS
INTEROPERABILITY IN INFRASTRUCTURE ENGINEERING: 2.1. STATE OF INTEROPERABILITY; 2.1.1. Interoperability of GIS and BIM; 2.1.2. Interoperability of GIS and Infrastructure; 2.2. MAIN CHALLENGES AND RELATED WORK; 2.3. INFRASTRUCTURE MODELING IN GEOSPATIAL CONTEXT; 2.3.1. LandXML: Infrastructure Data Standards; 2.3.2. CityGML: Geospatial Data Standards; 2.3.3. LandXML and CityGML; 2.4. INTEROPERABILITY AND MULTIMODEL TECHNOLOGY; 2.5. LIMITATIONS OF EXISTING APPROACHES
INFRASTRUCTURE INFORMATION MODELLING: 3.1. MULTI MODEL FOR GEOSPATIAL AND INFRASTRUCTURE DATA MODELS; 3.2. LINKING APPROACH, QUERYING AND FILTERING; 3.2.1. Virtual Properties via Link Model; 3.3. MULTI MODEL AS AN INTERDISCIPLINARY METHOD; 3.4. USING LEVEL OF DETAIL (LOD) FOR FILTERING
SPATIAL MODELLING AND PROCESSING: 4.1. SPATIAL IDENTIFIERS; 4.1.1. Spatial Indexes; 4.1.2. Tree-Based Spatial Indexes; 4.2. NEAREST NEIGHBORHOOD AS A BASIC LINK METHOD; 4.3. HIERARCHICAL FILTERING; 4.4. OTHER FUNCTIONAL LINK METHODS; 4.5. ADVANCES AND LIMITATIONS OF FUNCTIONAL LINK METHODS
IMPLEMENTATION OF THE PROPOSED IIM METHOD: 5.1. IMPLEMENTATION; 5.2. CASE STUDY
CONCLUSION: 6.1. SUMMARY; 6.2. DISCUSSION OF RESULTS; 6.3. FUTURE WORK
BIBLIOGRAPHY: 7.1. BOOKS AND PAPERS; 7.2. WEBSITES

Technische Universität Dresden: Qucosa

Learning From Multi-Frame Data

Author: Wieschollek Patrick
Publication venue: Universität Tübingen
Publication date: 01/01/2019

Multi-frame data-driven methods bear the promise that aggregating multiple observations leads to better estimates of target quantities than a single (still) observation. This thesis examines how data-driven approaches such as deep neural networks should be constructed to improve over single-frame-based counterparts. Besides algorithmic changes, as for example in the design of artificial neural network architectures or the algorithm itself, such an examination is inextricably linked with the consideration of the synthesis of synthetic training data in meaningful size (even if no annotations are available) and quality (if real ground-truth acquisition is not possible), which capture all temporal effects with high fidelity. We start with the introduction of a new algorithm to accelerate a nonparametric learning algorithm by using a GPU adapted implementation to search for the nearest neighbor. While the approaches known so far are clearly surpassed, this empirically reveals that the data generated can be managed within a reasonable time and that several inputs can be processed in parallel even under hardware restrictions. Based on a learning-based solution, we introduce a novel training protocol to bridge the need for carefully curated training data and demonstrate better performance and robustness than a non-parametric search for the nearest neighbor via temporal video alignments. Effective learning in the absence of labels is required when dealing with larger amounts of data that are easy to capture but not feasible or at least costly to label. In addition, we show new ways to generate plausible and realistic synthesized data and their inevitability when it comes to closing the gap to expensive and almost infeasible real-world acquisition. These eventually achieve state-of-the-art results in classical image processing tasks such as reflection removal and video deblurring

Publikationsserver der Universität Tübingen

MPG.PuRe

Routing Protocols in Modern IP Networks

Author: ΟΙΚΟΝΟΜΟΠΟΥΛΟΣ ΧΡΥΣΟΒΑΛΑΝΤΗΣ
Publication date: 01/01/2019

Modern IP networks are continuously evolving and growing. The fact that more and more devices become “smart” and are able to connect to an IP network means that network engineers encounter, on a daily basis, a variety of network topologies interconnecting hundreds or thousands of different subnets. IP routing is the key link between these subnets. The purpose of this thesis is to serve as a reference tool on routing protocols for students and engineers whose main responsibility is the management or administration of core routing technologies in IP networks

Pergamos : Unified Institutional Repository / Digital Library Platform of the National and Kapodistrian University of Athens