732 research outputs found

Incremental file reorganization schemes

Author: Omiecinski Edward Robert
Publication venue: Georgia Institute of Technology
Publication date: 01/01/1986

Issued as Final project report, Project no. G-36-66

Scholarly Materials And Research @ Georgia Tech

HD-Index: Pushing the Scalability-Accuracy Boundary for Approximate kNN Search in High-Dimensional Spaces

Authors: Arora Akhil, Bhattacharya Arnab, Kumar Piyush, Sinha Sakshi
Publication venue: VLDB Endowment
Publication date: 23/04/2018

Nearest neighbor searching of large databases in high-dimensional spaces is inherently difficult due to the curse of dimensionality. A flavor of approximation is, therefore, necessary to practically solve the problem of nearest neighbor search. In this paper, we propose a novel yet simple indexing scheme, HD-Index, to solve the problem of approximate k-nearest neighbor queries in massive high-dimensional databases. HD-Index consists of a set of novel hierarchical structures called RDB-trees built on Hilbert keys of database objects. The leaves of the RDB-trees store distances of database objects to reference objects, thereby allowing efficient pruning using distance filters. In addition to triangular inequality, we also use Ptolemaic inequality to produce better lower bounds. Experiments on massive (up to billion scale) high-dimensional (up to 1000+) datasets show that HD-Index is effective, efficient, and scalable. Comment: PVLDB 11(8):906-919, 201
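
The distance filter mentioned in this abstract can be illustrated with a minimal sketch. It assumes Euclidean distance and uses only the triangle-inequality bound |d(q, r) - d(o, r)| <= d(q, o); the Ptolemaic bound, the RDB-tree, and the Hilbert keys are not modeled. All function names are hypothetical and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triangle_lower_bound(q_to_refs, o_to_refs):
    """Largest |d(q, r) - d(o, r)| over all reference objects r.

    By the triangle inequality this never exceeds the true d(q, o).
    """
    return max(abs(dq - do) for dq, do in zip(q_to_refs, o_to_refs))


def filter_candidates(query, objects, refs, radius):
    """Return indices of objects whose lower bound does not exceed radius."""
    q_to_refs = [euclidean(query, r) for r in refs]
    survivors = []
    for idx, obj in enumerate(objects):
        # Assumption: a real index would store these distances up front
        # instead of recomputing them for every query.
        o_to_refs = [euclidean(obj, r) for r in refs]
        if triangle_lower_bound(q_to_refs, o_to_refs) <= radius:
            survivors.append(idx)
    return survivors
```

Only the surviving candidates need an exact distance computation, which is where a filter of this kind saves work.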

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Tensorizing Neural Networks

Authors: Novikov Alexander, Osokin Anton, Podoprikhin Dmitry, Vetrov Dmitry
Publication date: 07/12/2015

Deep neural networks currently demonstrate state-of-the-art performance in several domains. At the same time, models of this class are very demanding in terms of computational resources. In particular, a large amount of memory is required by commonly used fully-connected layers, making it hard to use the models on low-end devices and stopping the further increase of the model size. In this paper we convert the dense weight matrices of the fully-connected layers to the Tensor Train format such that the number of parameters is reduced by a huge factor and at the same time the expressive power of the layer is preserved. In particular, for the Very Deep VGG networks we report the compression factor of the dense weight matrix of a fully-connected layer up to 200000 times leading to the compression factor of the whole network up to 7 times.
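
To give a feel for where the parameter reduction comes from, the following sketch counts the parameters of a matrix stored in Tensor Train format, where core k has shape r[k] x m[k] x n[k] x r[k+1] and the boundary ranks are 1. The mode sizes and ranks in the usage note are illustrative assumptions and are not the configurations reported in the paper; the function names are hypothetical.

```python
from math import prod


def tt_matrix_params(row_modes, col_modes, ranks):
    """Number of parameters in a TT-matrix.

    row_modes, col_modes: factorizations of the matrix height and width.
    ranks: TT-ranks r[0..d], with r[0] == r[d] == 1.
    """
    d = len(row_modes)
    if len(col_modes) != d or len(ranks) != d + 1:
        raise ValueError("need d row modes, d column modes and d + 1 ranks")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary TT-ranks must be 1")
    return sum(
        ranks[k] * row_modes[k] * col_modes[k] * ranks[k + 1] for k in range(d)
    )


def dense_params(row_modes, col_modes):
    """Number of parameters in the equivalent dense matrix."""
    return prod(row_modes) * prod(col_modes)
```

For example, a 1024 x 1024 matrix factored with modes (4, 4, 4, 4, 4) on both sides and all inner ranks equal to 8 needs 3328 parameters in this format, against 1048576 for the dense matrix.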

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Analysis and conception of tuple spaces in the eye of scalability

Author: Obreiter Philipp
Publication date: 02/08/2007

Applications in the emerging fields of eCommerce and Ubiquitous Computing are composed of heterogeneous systems that have been designed separately. Hence, these systems are loosely coupled and require a coordination mechanism that is able to bridge spatial and temporal remoteness. The use of tuple spaces for data-driven coordination of these systems has been proposed in the past. In addition, applications of eCommerce and Ubiquitous Computing are not bound to a predefined size, so the underlying coordination mechanism has to be highly scalable. However, it seems to be difficult to conceive a scalable tuple space. This report is an English version of the author's diploma thesis. It comprises chapters two, three, four, and five. Consequently, the design and the implementation of the proposed tuple space are not part of this report.

KITopen

Error minimising gradients for improving cerebellar model articulation controller performance

Author: Scarfe Peter Craig
Publication venue: Curtin University
Publication date: 01/01/2009

In motion control applications where the desired trajectory velocity exceeds an actuator’s maximum velocity limitations, large position errors will occur between the desired and actual trajectory responses. In these situations standard control approaches cannot predict the output saturation of the actuator and thus the associated error summation cannot be minimised. An adaptive feedforward control solution such as the Cerebellar Model Articulation Controller (CMAC) is able to provide an inherent level of prediction for these situations, moving the system output in the direction of the excessive desired velocity before actuator saturation occurs. However, the pre-empting level of a CMAC is not adaptive, and thus the optimal point in time to start moving the system output in the direction of the excessive desired velocity remains unsolved. While the CMAC can adaptively minimise an actuator’s position error, the minimisation of the summation of error over time created by the divergence of the desired and actual trajectory responses requires an additional adaptive level of control. This thesis presents an improved method of training CMACs to minimise the summation of error over time created when the desired trajectory velocity exceeds the actuator’s maximum velocity limitations. This improved method, called the Error Minimising Gradient Controller (EMGC), is able to adaptively modify a CMAC’s training signal so that the CMAC will start to move the output of the system in the direction of the excessive desired velocity with an optimised pre-empting level. The EMGC was originally created to minimise the loss of linguistic information conveyed through an actuated series of concatenated hand sign gestures reproducing deafblind sign language. The EMGC concept, however, can be implemented on any system where the error summation associated with excessive desired velocities needs to be minimised, with the EMGC producing an improved output approximation over using a CMAC alone. In this thesis, the EMGC was tested and benchmarked against a combined feedforward / feedback controller using a CMAC and a PID controller. The EMGC was tested on an air-muscle actuator for a variety of situations comprising a position discontinuity in a continuous desired trajectory. Tested situations included various discontinuity magnitudes together with varying approach and departure gradient profiles. Testing demonstrated that the addition of an EMGC can reduce a situation’s error summation magnitude if the base CMAC controller has not already provided a sufficiently early pre-empting output in the direction of the situation. The addition of an EMGC to a CMAC produces an improved approximation of reproduced motion trajectories, not only minimising position error for a single sampling instance, but also over time for periodic signals.

espace@Curtin

Research Interests Databases

Author: George Kollios

CiteSeerX

Towards multi-purpose main-memory storage structures: Exploiting sub-space distance equalities in totally ordered data sets for exact knn queries

Authors: Broneske David, Köppen Veit, Saake Gunter, Schäler Martin, Tex Christine
Publication venue: Pergamon
Publication date: 14/06/2021
Field of study

Efficient knn computation for high-dimensional data is an important yet challenging task. Today, most information systems use a column-store back-end for relational data. For such systems, multi-dimensional indexes accelerating selections are known. However, they cannot be used to accelerate knn queries. Consequently, one relies on sequential scans or specialized knn indexes, or trades result quality for speed. To avoid storing one specialized index per query type, we envision multi-purpose indexes that efficiently compute multiple query types. In this paper, we focus on additionally supporting knn queries as a first step towards this goal. To this end, we study how to exploit total orders for accelerating knn queries based on the sub-space distance equalities observation: non-equal points in the full space that are projected to the same point in a sub-space have the same distance to every other point in this sub-space. If one can easily find these equalities and tune storage structures towards them, this offers two effects one can exploit to accelerate knn queries. The first effect allows pruning of point groups based on a cascade of lower bounds. The second allows re-use of previously computed sub-space distances between point groups. This results in a worst-case execution bound that is independent of the distance function. We present knn algorithms exploiting both effects and show how to tune a storage structure already known to work well for multi-dimensional selections. Our investigations reveal that the effects are robust to increasing, e.g., the dimensionality, suggesting generally good knn performance. Comparing our knn algorithms to well-known competitors reveals large performance improvements of up to one order of magnitude. Furthermore, the algorithms deliver performance at least comparable to that of the next fastest competitor, suggesting that the algorithms are only marginally affected by the curse of dimensionality.
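
The two effects described in this abstract can be sketched in a few lines. The sketch assumes squared Euclidean distance, a single sub-space made of the first sub_dims coordinates, and points given as tuples; the paper's storage structure, its cascade of several lower bounds, and its support for other distance functions are not modeled. The function names are hypothetical.

```python
import heapq
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_with_subspace_groups(points, query, k, sub_dims):
    """Exact k nearest neighbours as sorted (squared distance, index) pairs."""
    # Points with the same sub-space projection share one sub-space distance.
    groups = defaultdict(list)
    for idx, p in enumerate(points):
        groups[p[:sub_dims]].append(idx)

    # One distance computation per group; it is also a lower bound on the
    # full-space distance of every member of that group.
    bounds = sorted((sq_dist(key, query[:sub_dims]), key) for key in groups)

    best = []  # max-heap of the current k best, stored as (-distance, index)
    for lower_bound, key in bounds:
        if len(best) == k and lower_bound >= -best[0][0]:
            break  # no remaining group can improve the result
        for idx in groups[key]:
            # Re-use the group's sub-space distance for each member.
            d = lower_bound + sq_dist(points[idx][sub_dims:], query[sub_dims:])
            if len(best) < k:
                heapq.heappush(best, (-d, idx))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, idx))
    return sorted((-neg_d, idx) for neg_d, idx in best)
```

Sorting the groups by their lower bound lets the scan stop as soon as the next group's bound is no better than the current k-th best distance.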

KITopen