Answering Complex Questions by Joining Multi-Document Evidence with Quasi Knowledge Graphs
Direct answering of questions that involve multiple entities and relations is a challenge for text-based QA. This problem is most pronounced when answers can be found only by joining evidence from multiple documents. Curated knowledge graphs (KGs) may yield good answers, but are limited by their inherent incompleteness and potential staleness. This paper presents QUEST, a method that can answer complex questions directly from textual sources on-the-fly, by computing similarity joins over partial results from different documents. Our method is completely unsupervised, avoiding training-data bottlenecks and being able to cope with rapidly evolving ad hoc topics and formulation style in user questions. QUEST builds a noisy quasi KG with node and edge weights, consisting of dynamically retrieved entity names and relational phrases. It augments this graph with types and semantic alignments, and computes the best answers by an algorithm for Group Steiner Trees. We evaluate QUEST on benchmarks of complex questions, and show that it substantially outperforms state-of-the-art baselines.
TK: The Twitter Top-K Keywords Benchmark
Information retrieval from textual data focuses on the construction of
vocabularies that contain weighted term tuples. Such vocabularies can then be
exploited by various text analysis algorithms to extract new knowledge, e.g.,
top-k keywords, top-k documents, etc. Top-k keywords are casually used for
various purposes, are often computed on-the-fly, and thus must be efficiently
computed. To compare competing weighting schemes and database implementations,
benchmarking is customary. To the best of our knowledge, no benchmark currently
addresses these problems. Hence, in this paper, we present a top-k keywords
benchmark, TK, which features a real tweet dataset and queries with
various complexities and selectivities. TK helps evaluate weighting
schemes and database implementations in terms of computing performance. To
illustrate TK's relevance and genericity, we successfully performed
tests on the TF-IDF and Okapi BM25 weighting schemes, on one hand, and on
different relational (Oracle, PostgreSQL) and document-oriented (MongoDB)
database implementations, on the other hand.
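As a rough illustration of the two weighting schemes named in the abstract above, the following sketch ranks the top-k keywords of a tiny in-memory corpus. It is not the TK benchmark itself: the function name `top_k_keywords`, the toy documents, and the parameter defaults `k1=1.2` and `b=0.75` are assumptions chosen for illustration, and the BM25 variant shown is one common formulation among several.

```python
import math
from collections import Counter

def top_k_keywords(docs, k=3, scheme="tfidf", k1=1.2, b=0.75):
    """Return the k highest-weighted terms over a list of tokenised documents.

    Hypothetical helper for illustration; scores are summed over all documents.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = Counter()
    for d in docs:
        tf = Counter(d)
        for term, f in tf.items():
            if scheme == "tfidf":
                idf = math.log(n_docs / df[term])
                scores[term] += f * idf
            else:  # Okapi BM25 with length normalisation
                idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
                norm = f + k1 * (1 - b + b * len(d) / avg_len)
                scores[term] += idf * f * (k1 + 1) / norm
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

docs = [["grid", "grid", "paper"], ["paper", "tweet"], ["paper", "benchmark"]]
print(top_k_keywords(docs, k=2, scheme="tfidf"))
print(top_k_keywords(docs, k=2, scheme="bm25"))
```

A term that occurs in every document (here "paper") gets a TF-IDF weight of zero, which is why it drops out of the ranking.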
Active yellow pages: a pipelined resource management architecture for wide-area network computing
This paper describes a novel, pipelined resource
management architecture for computational grids. The
design is based on two key realizations. One is that resource management involves a sequence of tasks that is
best handled by a pipeline. As shown in the paper, this
approach results in a scalable architecture for decentralized scheduling. The other realization is that static aggregation of resources for improved scheduling is inadequate in wide-area computing environments because the
needs of users and jobs change with both location and
time. The described architecture addresses this problem
by dynamically aggregating resources in a manner that
continuously optimizes system response. This is accomplished by way of an active yellow pages directory
that allows aggregation constraints to be (re)defined on
the fly. An initial prototype of the active yellow pages
service has been deployed in the PUNCH network computing environment. Experiences with the production
PUNCH system and preliminary results from controlled
experiments indicate that the active yellow pages service performs well. Peer reviewed.
A Global Optimisation Toolbox for Massively Parallel Engineering Optimisation
A software platform for global optimisation, called PaGMO, has been developed
within the Advanced Concepts Team (ACT) at the European Space Agency, and was
recently released as an open-source project. PaGMO is built to tackle
high-dimensional global optimisation problems, and it has been successfully
used to find solutions to real-life engineering problems among which the
preliminary design of interplanetary spacecraft trajectories - both chemical
(including multiple flybys and deep-space maneuvers) and low-thrust (limited,
at the moment, to single phase trajectories), the inverse design of
nano-structured radiators and the design of non-reactive controllers for
planetary rovers. Featuring an arsenal of global and local optimisation
algorithms (including genetic algorithms, differential evolution, simulated
annealing, particle swarm optimisation, compass search, improved harmony
search, and various interfaces to libraries for local optimisation such as
SNOPT, IPOPT, GSL and NLopt), PaGMO is at its core a C++ library which employs
an object-oriented architecture providing a clean and easily-extensible
optimisation framework. Adoption of multi-threaded programming ensures the
efficient exploitation of modern multi-core architectures and allows for a
straightforward implementation of the island model paradigm, in which multiple
populations of candidate solutions asynchronously exchange information in order
to speed-up and improve the optimisation process. In addition to the C++
interface, PaGMO's capabilities are exposed to the high-level language Python,
so that it is possible to easily use PaGMO in an interactive session and take
advantage of the numerous scientific Python libraries available. Comment: To be presented at 'ICATT 2010: International Conference on
Astrodynamics Tools and Techniques'
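The island model mentioned in the abstract above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not PaGMO's API or implementation: the names `sphere`, `evolve`, and `island_model` are hypothetical, the islands run sequentially rather than in asynchronous threads, and migration uses a simple ring topology in which each island's worst individual is replaced by a copy of its neighbour's best.

```python
import random

def sphere(x):
    """Toy objective: sum of squares, minimised at the origin."""
    return sum(v * v for v in x)

def evolve(pop, rng, sigma=0.3):
    """One generation: mutate each individual, keep the child only if it improves."""
    out = []
    for ind in pop:
        child = [v + rng.gauss(0.0, sigma) for v in ind]
        out.append(child if sphere(child) < sphere(ind) else ind)
    return out

def island_model(n_islands=4, pop_size=8, dim=5, generations=60,
                 migrate_every=10, seed=0):
    rng = random.Random(seed)
    islands = [[[rng.uniform(-5, 5) for _ in range(dim)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    initial_best = min(sphere(ind) for pop in islands for ind in pop)
    for gen in range(1, generations + 1):
        islands = [evolve(pop, rng) for pop in islands]
        if gen % migrate_every == 0:
            # Snapshot each island's best before any replacement happens.
            bests = [min(pop, key=sphere) for pop in islands]
            for i, pop in enumerate(islands):
                migrant = bests[i - 1]  # ring topology: take from the previous island
                worst = max(range(len(pop)), key=lambda j: sphere(pop[j]))
                pop[worst] = list(migrant)
    final_best = min(sphere(ind) for pop in islands for ind in pop)
    return initial_best, final_best

initial, final = island_model()
print(f"best objective before: {initial:.4f}, after: {final:.4f}")
```

Because selection is elitist and migrants are copies, the best objective value across all islands can never get worse from one generation to the next.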
Efficient Spherical Harmonic Transforms aimed at pseudo-spectral numerical simulations
In this paper, we report on very efficient algorithms for the spherical
harmonic transform (SHT). Explicitly vectorized variations of the algorithm
based on the Gauss-Legendre quadrature are discussed and implemented in the
SHTns library which includes scalar and vector transforms. The main
breakthrough is to achieve very efficient on-the-fly computations of the
Legendre associated functions, even for very high resolutions, by taking
advantage of the specific properties of the SHT and the advanced capabilities
of current and future computers. This allows us to simultaneously and
significantly reduce memory usage and computation time of the SHT. We measure
the performance and accuracy of our algorithms. Even though the complexity of
the algorithms implemented in SHTns is O(N^3) (where N is the maximum
harmonic degree of the transform), they perform much better than any third
party implementation, including lower complexity algorithms, even for
truncations as high as N=1023. SHTns is available at
https://bitbucket.org/nschaeff/shtns as open source software. Comment: 8 pages
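To make the idea of on-the-fly evaluation of the associated Legendre functions concrete, the sketch below generates P_l^m(x) for increasing degree l with the standard three-term recurrence, keeping only the two previous values instead of a precomputed table. It is an illustration only and not the SHTns code: the function name `legendre_on_the_fly` is hypothetical, the values are unnormalised and include the Condon-Shortley phase, and it assumes 0 <= m <= lmax and |x| <= 1. Unnormalised values grow quickly with degree, so high-resolution codes typically work with normalised variants.

```python
import math

def legendre_on_the_fly(m, lmax, x):
    """Yield (l, P_l^m(x)) for l = m..lmax using the three-term recurrence."""
    # Starting value: P_m^m(x) = (-1)^m (2m-1)!! (1 - x^2)^(m/2)
    root = math.sqrt((1.0 - x) * (1.0 + x))
    pmm = 1.0
    for i in range(1, m + 1):
        pmm *= -(2 * i - 1) * root
    yield m, pmm
    if lmax == m:
        return
    # Second value: P_{m+1}^m(x) = x (2m+1) P_m^m(x)
    prev2, prev1 = pmm, x * (2 * m + 1) * pmm
    yield m + 1, prev1
    # (l - m) P_l^m = x (2l - 1) P_{l-1}^m - (l + m - 1) P_{l-2}^m
    for l in range(m + 2, lmax + 1):
        cur = (x * (2 * l - 1) * prev1 - (l + m - 1) * prev2) / (l - m)
        prev2, prev1 = prev1, cur
        yield l, cur

for l, value in legendre_on_the_fly(0, 3, 0.5):
    print(l, value)
```

Memory use is constant in lmax, since each value is consumed as soon as it is produced.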
Analysis of Spectrum Occupancy Using Machine Learning Algorithms
In this paper, we analyze the spectrum occupancy using different machine
learning techniques. Both supervised techniques (naive Bayesian classifier
(NBC), decision trees (DT), support vector machine (SVM), linear regression
(LR)) and an unsupervised algorithm (hidden Markov model (HMM)) are studied to
find the best technique with the highest classification accuracy (CA). A
detailed comparison of the supervised and unsupervised algorithms in terms of
the computational time and classification accuracy is performed. The classified
occupancy status is further utilized to evaluate the probability of secondary
user outage for the future time slots, which can be used by system designers to
define spectrum allocation and spectrum sharing policies. Numerical results
show that SVM is the best algorithm among all the supervised and unsupervised
classifiers. Based on this, we propose a new SVM algorithm by combining it
with the firefly algorithm (FFA), which is shown to outperform all other
algorithms. Comment: 21 pages, 6 figures