Search CORE

464 research outputs found

Warped Gaussian Processes Occupancy Mapping with Uncertain Inputs

Author: Dissanayake G
Ghaffari Jadidi M
Miro JV
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 04/01/2017
Field of study

© 2017 IEEE. In this paper, we study extensions to the Gaussian processes (GPs) continuous occupancy mapping problem. There are two classes of occupancy mapping problems that we particularly investigate. The first problem is related to mapping under pose uncertainty and how to propagate pose estimation uncertainty into the map inference. We develop expected kernel and expected submap notions to deal with uncertain inputs. In the second problem, we account for the complication of the robot's perception noise using warped Gaussian processes (WGPs). This approach allows for non-Gaussian noise in the observation space and captures the possible nonlinearity in that space better than standard GPs. The developed techniques can be applied separately or concurrently to a standard GP occupancy mapping problem. According to our experimental results, although taking into account pose uncertainty leads, as expected, to more uncertain maps, by modeling the nonlinearities present in the observation space WGPs improve the map quality

arXiv.org e-Print Archive

OPUS - University of Technology Sydney

A Survey on Intent-based Diversification for Fuzzy Keyword Search

Author: Champa H.N.
Sijin P.
Venugopal K.R.
Publication venue: IJCSIT
Publication date: 01/01/2017
Field of study

Keyword search is an interesting phenomenon, it is the process of finding important and relevant information from various data repositories. Structured and semistructured data can precisely be stored. Fully unstructured documents can annotate and be stored in the form of metadata. For the total web search, half of the web search is for information exploration process. In this paper, the earlier works for semantic meaning of keywords based on their context in the specified documents are thoroughly analyzed. In a tree data representation, the nodes are objects and could hold some intention. These nodes act as anchors for a Smallest Lowest Common Ancestor (SLCA) based pruning process. Based on their features, nodes are clustered. The feature is a distinctive attribute, it is the quality, property or traits of something. Automatic text classification algorithms are the modern way for feature extraction. Summarization and segmentation produce n consecutive grams from various forms of documents. The set of items which describe and summarize one important aspect of a query is known as the facet. Instead of exact string matching a fuzzy mapping based on semantic correlation is the new trend, whereas the correlation is quantified by cosine similarity. Once the outlier is detected, nearest neighbors of the selected points are mapped to the same hash code of the intend nodes with high probability. These methods collectively retrieve the relevant data and prune out the unnecessary data, and at the same time create a hash signature for the nearest neighbor search. This survey emphasizes the need for a framework for fuzzy oriented keyword search

ePrints@Bangalore University

A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge

Author: Han Yikun
Liu Chunjiang
Wang Pengfei
Publication venue
Publication date: 18/10/2023
Field of study

A vector database is used to store high-dimensional data that cannot be characterized by traditional DBMS. Although there are not many articles describing existing or introducing new vector database architectures, the approximate nearest neighbor search problem behind vector databases has been studied for a long time, and considerable related algorithmic articles can be found in the literature. This article attempts to comprehensively review relevant algorithms to provide a general understanding of this booming research area. The basis of our framework categorises these studies by the approach of solving ANNS problem, respectively hash-based, tree-based, graph-based and quantization-based approaches. Then we present an overview of existing challenges for vector databases. Lastly, we sketch how vector databases can be combined with large language models and provide new possibilities

arXiv.org e-Print Archive

Efficient Analysis in Multimedia Databases

Author: Kunath Peter
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 01/01/2006
Field of study

The rapid progress of digital technology has led to a situation where computers have become ubiquitous tools. Now we can find them in almost every environment, be it industrial or even private. With ever increasing performance computers assumed more and more vital tasks in engineering, climate and environmental research, medicine and the content industry. Previously, these tasks could only be accomplished by spending enormous amounts of time and money. By using digital sensor devices, like earth observation satellites, genome sequencers or video cameras, the amount and complexity of data with a spatial or temporal relation has gown enormously. This has led to new challenges for the data analysis and requires the use of modern multimedia databases. This thesis aims at developing efficient techniques for the analysis of complex multimedia objects such as CAD data, time series and videos. It is assumed that the data is modeled by commonly used representations. For example CAD data is represented as a set of voxels, audio and video data is represented as multi-represented, multi-dimensional time series. The main part of this thesis focuses on finding efficient methods for collision queries of complex spatial objects. One way to speed up those queries is to employ a cost-based decompositioning, which uses interval groups to approximate a spatial object. For example, this technique can be used for the Digital Mock-Up (DMU) process, which helps engineers to ensure short product cycles. This thesis defines and discusses a new similarity measure for time series called threshold-similarity. Two time series are considered similar if they expose a similar behavior regarding the transgression of a given threshold value. Another part of the thesis is concerned with the efficient calculation of reverse k-nearest neighbor (RkNN) queries in general metric spaces using conservative and progressive approximations. The aim of such RkNN queries is to determine the impact of single objects on the whole database. At the end, the thesis deals with video retrieval and hierarchical genre classification of music using multiple representations. The practical relevance of the discussed genre classification approach is highlighted with a prototype tool that helps the user to organize large music collections. Both the efficiency and the effectiveness of the presented techniques are thoroughly analyzed. The benefits over traditional approaches are shown by evaluating the new methods on real-world test datasets

CiteSeerX

Digitale Hochschulschriften der LMU

27th Annual European Symposium on Algorithms: ESA 2019, September 9-11, 2019, Munich/Garching, Germany

Author: ESA <27. 2019, München>
Publication venue: Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
Publication date: 01/09/2019
Field of study

Digitale Bibliothek Thüringen

Machine learning approximation techniques using dual trees

Author: Ergashbaev Denis
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2015
Field of study

This master thesis explores a dual-tree framework as applied to a particular class of machine learning problems that are collectively referred to as generalized n-body problems. It builds a new algorithm on top of it and improves existing Boosted OGE classifier

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Predictive maintenance of electrical grid assets: internship at EDP Distribuição - Energia S.A

Author: Gameiro Carlos Filipe Teixeira
Publication venue
Publication date: 04/02/2021
Field of study

Internship Report presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business IntelligenceThis report will describe the activities developed during an internship at EDP Distribuição, focusing on a Predictive Maintenance analytics project directed at high voltage electrical grid assets including Overhead Lines, Power Transformers and Circuit Breakers. The project’s main goal is to support EDP’s asset management processes by improving maintenance and investing planning. The project’s main deliverables are the Probability of Failure metric that forecast asset failures 15 days ahead of time, estimated through supervised machine learning models; the Health Index metric that indicates asset’s current state and condition, implemented though the Ofgem methodology; and two asset management dashboards. The project was implemented by an external service provider, a consultant company, and during the internship it was possible to integrate the team, and participate in the development activities

Repositório da Universidade Nova de Lisboa

LIPIcs, Volume 244, ESA 2022, Complete Volume

Author: Chechik Shiri
Herman Grzegorz
Navarro Gonzalo
Rotenberg Eva
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th Annual European Symposium on Algorithms (ESA 2022)
Publication date: 01/01/2022
Field of study

LIPIcs, Volume 244, ESA 2022, Complete Volum

Dagstuhl Research Online Publication Server