Document Retrieval on Repetitive Collections
Document retrieval aims at finding the most important documents where a
pattern appears in a collection of strings. Traditional pattern-matching
techniques yield brute-force document retrieval solutions, which has motivated
the research on tailored indexes that offer near-optimal performance. However,
an experimental study establishing which alternatives are actually better than
brute force, and which perform best depending on the collection
characteristics, has not been carried out. In this paper we address this
shortcoming by exploring the relationship between the nature of the underlying
collection and the performance of current methods. Via extensive experiments we
show that established solutions are often beaten in practice by brute-force
alternatives. We also design new methods that offer superior time/space
trade-offs, particularly on repetitive collections.
Comment: Accepted to ESA 2014. Implementation and experiments at
http://www.cs.helsinki.fi/group/suds/rlcsa
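To make the brute-force baseline mentioned in this abstract concrete, the following is a minimal Python sketch of top-k document retrieval ranked by pattern frequency. The function names `count_occurrences` and `top_k_documents` and the toy collection are assumptions made for illustration only; the paper itself works with compressed indexes rather than plain string scans.

```python
from typing import List, Tuple


def count_occurrences(text: str, pattern: str) -> int:
    """Count possibly overlapping occurrences of pattern in text."""
    if not pattern:
        return 0
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def top_k_documents(docs: List[str], pattern: str, k: int) -> List[Tuple[int, int]]:
    """Return up to k (doc_id, frequency) pairs, most frequent first.

    Brute force (hypothetical helper): every document is scanned in full.
    """
    scored = [(doc_id, count_occurrences(doc, pattern))
              for doc_id, doc in enumerate(docs)]
    hits = [(doc_id, freq) for doc_id, freq in scored if freq > 0]
    # Sort by descending frequency, breaking ties by document id.
    hits.sort(key=lambda pair: (-pair[1], pair[0]))
    return hits[:k]


if __name__ == "__main__":
    collection = ["abracadabra", "banana bandana", "cabana"]
    print(top_k_documents(collection, "ana", 2))
```

The scan costs time proportional to the total collection size per query, which is the cost that tailored document retrieval indexes aim to avoid.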
Automated, unsupervised inversion of multiwavelength lidar data with TiARA : Assessment of retrieval performance of microphysical parameters using simulated data
We evaluate the retrieval performance of the automated, unsupervised inversion algorithm, Tikhonov Advanced Regularization Algorithm (TiARA), which is used for the autonomous retrieval of microphysical parameters of anthropogenic and natural pollution particles. TiARA (version 1.0) has been developed in the past 10 years and builds on the legacy of a data-operator-controlled inversion algorithm used since 1998 for the analysis of data from multiwavelength Raman lidar. The development of TiARA has been driven by the need to analyze in (near) real time large volumes of data collected with NASA Langley Research Center's high-spectral-resolution lidar (HSRL-2). HSRL-2 was envisioned as part of the NASA Aerosols-Clouds-Ecosystems mission in response to the National Academy of Sciences (NAS) Decadal Study mission recommendations 2007. TiARA could thus also serve as an inversion algorithm in the context of a future space-borne lidar. We summarize key properties of TiARA on the basis of simulations with monomodal logarithmic-normal particle size distributions that cover particle radii from approximately 0.05 μm to 10 μm. The real and imaginary parts of the complex refractive index cover the range from nonabsorbing to highly light-absorbing pollutants. Our simulations include up to 25% measurement uncertainty. The goal of our study is to provide guidance with respect to technical features of future space-borne lidars, if such lidars will be used for retrievals of microphysical data products, absorption coefficients, and single-scattering albedo. We investigate the impact of two different measurement-error models on the quality of the data products. We also obtain for the first time, to the best of our knowledge, a statistical view on systematic and statistical uncertainties, if a large volume of data is processed. Effective radius is retrieved to 50% accuracy for 58% of cases with an imaginary part up to 0.01i and up to 100% of cases with an imaginary part of 0.05i.
Similarly, volume concentration, surface-area concentration, and number concentrations are retrieved to 50% accuracy in 56%-100% of cases, 99%-100% of cases, and 54%-87% of cases, respectively, depending on the imaginary part. The numbers represent measurement uncertainties of up to 15%. If we target 20% retrieval accuracy, the numbers of cases that fall within that threshold are 36%-76% for effective radius, 36%-73% for volume concentration, 98%-100% for surface-area concentration, and 37%-61% for number concentration. That range of numbers again represents a spread in results for different values of the imaginary part. At present, we obtain an accuracy of (on average) 0.1 for the real part. A case study from the ORACLES field campaign is used to illustrate data products obtained with TiARA.
Peer reviewed
Context guided retrieval
This paper presents a hierarchical case representation that uses a context guided retrieval method. The performance of this method is compared to that of a simple flat file representation using standard nearest neighbour retrieval. The data presented in this paper is more extensive than that presented in an earlier paper by the same authors. The estimation of the construction costs of light industrial warehouse buildings is used as the test domain. Each case in the system comprises approximately 400 features. These are structured into a hierarchical case representation that holds more general contextual features at its top and specific building elements at its leaves. A modified nearest neighbour retrieval algorithm is used that is guided by contextual similarity. Problems are decomposed into sub-problems and solutions recomposed into a final solution. The comparative results show that the context guided retrieval method using the hierarchical case representation is significantly more accurate than the simpler flat file representation and standard nearest neighbour retrieval.
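The contrast between flat nearest neighbour retrieval and retrieval guided by contextual similarity can be sketched roughly as below. This is only one plausible reading of the abstract, not the authors' algorithm: the two-level `Case` structure, the similarity measure, the `min_context_sim` threshold, and the feature names are all assumptions introduced for this example, and the decomposition into sub-problems is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Case:
    name: str
    context: Dict[str, float]   # general features (top of the hierarchy)
    elements: Dict[str, float]  # specific building elements (leaves)


def similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Similarity in (0, 1]: 1 / (1 + Euclidean distance) over shared keys."""
    keys = a.keys() & b.keys()
    dist = sum((a[k] - b[k]) ** 2 for k in keys) ** 0.5
    return 1.0 / (1.0 + dist)


def flat_retrieve(query: Case, cases: List[Case]) -> Case:
    """Standard nearest neighbour over all features treated as one flat vector."""
    flat_query = {**query.context, **query.elements}
    return max(cases,
               key=lambda c: similarity(flat_query, {**c.context, **c.elements}))


def context_guided_retrieve(query: Case, cases: List[Case],
                            min_context_sim: float = 0.9) -> Case:
    """Keep only cases with a similar context, then compare leaf features."""
    candidates = [c for c in cases
                  if similarity(query.context, c.context) >= min_context_sim]
    if not candidates:  # fall back to the full case base
        candidates = cases
    return max(candidates,
               key=lambda c: similarity(query.elements, c.elements))


if __name__ == "__main__":
    query = Case("query", {"floor_area": 0.5}, {"roof": 0.4})
    case_a = Case("A", {"floor_area": 0.5}, {"roof": 0.7})   # same context
    case_b = Case("B", {"floor_area": 0.75}, {"roof": 0.4})  # different context
    print(flat_retrieve(query, [case_a, case_b]).name)
    print(context_guided_retrieve(query, [case_a, case_b]).name)
```

In this toy case base the flat search prefers case B because its overall distance is smaller, while the context guided search discards B for its dissimilar context and returns case A.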
Understanding the Limitations of CNN-based Absolute Camera Pose Regression
Visual localization is the task of accurate camera pose estimation in a known
scene. It is a key problem in computer vision and robotics, with applications
including self-driving cars, Structure-from-Motion, SLAM, and Mixed Reality.
Traditionally, the localization problem has been tackled using 3D geometry.
Recently, end-to-end approaches based on convolutional neural networks have
become popular. These methods learn to directly regress the camera pose from an
input image. However, they do not achieve the same level of pose accuracy as 3D
structure-based methods. To understand this behavior, we develop a theoretical
model for camera pose regression. We use our model to predict failure cases for
pose regression techniques and verify our predictions through experiments. We
furthermore use our model to show that pose regression is more closely related
to pose approximation via image retrieval than to accurate pose estimation via
3D structure. A key result is that current approaches do not consistently
outperform a handcrafted image retrieval baseline. This clearly shows that
additional research is needed before pose regression algorithms are ready to
compete with structure-based methods.
Comment: Initial version of a paper accepted to CVPR 201
Neural Distributed Autoassociative Memories: A Survey
Introduction. Neural network models of autoassociative, distributed memory
allow storage and retrieval of many items (vectors) where the number of stored
items can exceed the vector dimension (the number of neurons in the network).
This opens the possibility of a sublinear time search (in the number of stored
items) for approximate nearest neighbors among vectors of high dimension. The
purpose of this paper is to review models of autoassociative, distributed
memory that can be naturally implemented by neural networks (mainly with local
learning rules and iterative dynamics based on information locally available to
neurons). Scope. The survey is focused mainly on the networks of Hopfield,
Willshaw and Potts, that have connections between pairs of neurons and operate
on sparse binary vectors. We discuss not only autoassociative memory, but also
the generalization properties of these networks. We also consider neural
networks with higher-order connections and networks with a bipartite graph
structure for non-binary data with linear constraints. Conclusions. In
conclusion we discuss the relations to similarity search, advantages and
drawbacks of these techniques, and topics for further research. An interesting
and still not completely resolved question is whether neural autoassociative
memories can search for approximate nearest neighbors faster than other index
structures for similarity search, in particular for the case of very high
dimensional vectors.
Comment: 31 pages
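As a small illustration of autoassociative storage and retrieval with a local learning rule and iterative dynamics, here is a minimal Python sketch of a classical Hopfield network with Hebbian weights. It uses dense bipolar (+1/-1) vectors for brevity, whereas the survey focuses mainly on sparse binary vectors; the function names `train_hebbian` and `recall` and the example patterns are assumptions made for this sketch.

```python
from typing import List

Vector = List[int]  # entries are +1 or -1


def train_hebbian(patterns: List[Vector]) -> List[List[float]]:
    """Build a symmetric weight matrix with the Hebbian outer-product rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += p[i] * p[j] / n
    return weights


def recall(weights: List[List[float]], probe: Vector, max_sweeps: int = 10) -> Vector:
    """Update neurons one at a time in index order until the state is stable."""
    state = list(probe)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            # Each neuron only uses its own incoming weights and the current state.
            field = sum(w * s for w, s in zip(weights[i], state))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


if __name__ == "__main__":
    p1 = [1, 1, 1, 1, -1, -1, -1, -1]
    p2 = [1, -1, 1, -1, 1, -1, 1, -1]
    w = train_hebbian([p1, p2])
    noisy = [-1, 1, 1, 1, -1, -1, -1, -1]  # p1 with its first entry flipped
    print(recall(w, noisy) == p1)
```

Retrieval here is content-addressed: a corrupted probe is driven toward the stored pattern it most resembles, which is the link to approximate nearest neighbor search that the abstract raises.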
Investigating the use of semantic technologies in spatial mapping applications
Semantic Web Technologies are ideally suited to build context-aware information retrieval applications. However, the geospatial aspect of context awareness presents unique challenges such as the semantic modelling of geographical references for efficient handling of spatial queries, the reconciliation of the heterogeneity at the semantic and geo-representation levels, maintaining the quality of service and scalability of communicating, and the efficient rendering of the spatial queries' results. In this paper, we describe the modelling decisions taken to solve these challenges by analysing our implementation of an intelligent planning and recommendation tool that provides location-aware advice for a specific application domain. This paper contributes to the methodology of integrating heterogeneous geo-referenced data into semantic knowledgebases, and also proposes mechanisms for efficient spatial interrogation of the semantic knowledgebase and optimising the rendering of the dynamically retrieved context-relevant information on a web frontend.