
    Klee sets and Chebyshev centers for the right Bregman distance

    We systematically investigate the farthest distance function, farthest points, Klee sets, and Chebyshev centers, with respect to Bregman distances induced by Legendre functions. These objects are of considerable interest in Information Geometry and Machine Learning; when the Legendre function is specialized to the energy, one obtains classical notions from Approximation Theory and Convex Analysis. The contribution of this paper is twofold. First, we provide an affirmative answer to a recently-posed question on whether or not every Klee set with respect to the right Bregman distance is a singleton. Second, we prove uniqueness of the Chebyshev center and we present a characterization that relates to previous works by Garkavi, by Klee, and by Nielsen and Nock. Comment: 23 pages, 2 figures, 14 image
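
    For orientation, the standard textbook definition of the Bregman distance induced by a differentiable convex function $f$ is recalled below; the symbols $f$, $x$, $y$ are notation assumed here, not taken from the abstract. The second line checks the specialization to the energy $f = \tfrac{1}{2}\|\cdot\|^2$, which recovers half the squared Euclidean distance.

    \[
    D_f(x, y) = f(x) - f(y) - \langle \nabla f(y),\, x - y \rangle
    \]
    \[
    f = \tfrac{1}{2}\|\cdot\|^2 \;\Longrightarrow\; D_f(x, y) = \tfrac{1}{2}\|x\|^2 - \tfrac{1}{2}\|y\|^2 - \langle y,\, x - y \rangle = \tfrac{1}{2}\|x - y\|^2
    \]

    In general $D_f$ is not symmetric in its two arguments, which is why variants that optimize over the left or the right argument can behave differently.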

    Optimizing scoring functions and indexes for proximity search in type-annotated corpora

    We introduce a new, powerful class of text proximity queries: find an instance of a given "answer type" (person, place, distance) near "selector" tokens matching given literals or satisfying given ground predicates. An example query is type=distance NEAR Hamburg Munich. Nearness is defined as a flexible, trainable parameterized aggregation function of the selectors, their frequency in the corpus, and their distance from the candidate answer. Such queries provide a key data reduction step for information extraction, data integration, question answering, and other text-processing applications. We describe the architecture of a next-generation information retrieval engine for such applications, and investigate two key technical problems faced in building it. First, we propose a new algorithm that estimates a scoring function from past logs of queries and answer spans. Plugging the scoring function into the query processor gives high accuracy: typically, an answer is found at rank 2-4. Second, we exploit the skew in the distribution over types seen in query logs to optimize the space required by the new index structures required by our system. Extensive performance studies with a 10GB, 2-million document TREC corpus and several hundred TREC queries show both the accuracy and the efficiency of our system. From an initial 4.3GB index using 18,000 types from WordNet, we can discard 88% of the space, while inflating query times by a factor of only 1.9. Our final index overhead is only 20% of the total index space needed

    SE-KGE: A Location-Aware Knowledge Graph Embedding Model for Geographic Question Answering and Spatial Semantic Lifting

    Learning knowledge graph (KG) embeddings is an emerging technique for a variety of downstream tasks such as summarization, link prediction, information retrieval, and question answering. However, most existing KG embedding models neglect space and, therefore, do not perform well when applied to (geo)spatial data and tasks. Of the models that do consider space, most rely primarily on some notion of distance. These models suffer from higher computational complexity during training while still losing information beyond the relative distance between entities. In this work, we propose a location-aware KG embedding model called SE-KGE. It directly encodes spatial information such as point coordinates or bounding boxes of geographic entities into the KG embedding space. The resulting model is capable of handling different types of spatial reasoning. We also construct a geographic knowledge graph as well as a set of geographic query-answer pairs called DBGeo to evaluate the performance of SE-KGE in comparison to multiple baselines. Evaluation results show that SE-KGE outperforms these baselines on the DBGeo dataset for the geographic logic query answering task. This demonstrates the effectiveness of our spatially-explicit model and the importance of considering the scale of different geographic entities. Finally, we introduce a novel downstream task called spatial semantic lifting which links an arbitrary location in the study area to entities in the KG via some relations. Evaluation on DBGeo shows that our model outperforms the baseline by a substantial margin. Comment: Accepted to Transactions in GI

    Positional Encoding by Robots with Non-Rigid Movements

    Consider a set of autonomous computational entities, called \emph{robots}, operating inside a polygonal enclosure (possibly with holes), that have to perform some collaborative tasks. The boundary of the polygon obstructs both visibility and mobility of a robot. Since the polygon is initially unknown to the robots, the natural approach is to first explore and construct a map of the polygon. For this, the robots need an unlimited amount of persistent memory to store the snapshots taken from different points inside the polygon. However, it has been shown by Di Luna et al. [DISC 2017] that map construction can be done even by oblivious robots by employing a positional encoding strategy where a robot carefully positions itself inside the polygon to encode information in the binary representation of its distance from the closest polygon vertex. Of course, to execute this strategy, it is crucial for the robots to make accurate movements. In this paper, we address the question whether this technique can be implemented even when the movements of the robots are unpredictable in the sense that the robot can be stopped by the adversary during its movement before reaching its destination. However, there exists a constant $\delta > 0$, unknown to the robot, such that the robot can always reach its destination if it has to move by no more than $\delta$ amount. This model is known in literature as \emph{non-rigid} movement. We give a partial answer to the question in the affirmative by presenting a map construction algorithm for robots with non-rigid movement, but having $O(1)$ bits of persistent memory and ability to make circular moves
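
    The idea of storing information in the binary representation of a distance can be illustrated with a minimal sketch. This is not the scheme of Di Luna et al. or of the paper above: the function names `encode_bits` and `decode_bits` are hypothetical, the distance is assumed to lie in [0, 1), and exact rational arithmetic stands in for the perfectly accurate movements the abstract says the strategy depends on.

```python
from fractions import Fraction


def encode_bits(bits: str) -> Fraction:
    """Map a bit string such as '1011' to a distance in [0, 1).

    Bit i (1-based) contributes 2**-i, so the bits are exactly the
    binary digits of the returned distance after the binary point.
    """
    distance = Fraction(0)
    for i, bit in enumerate(bits, start=1):
        if bit == "1":
            distance += Fraction(1, 2 ** i)
    return distance


def decode_bits(distance: Fraction, n_bits: int) -> str:
    """Read back the first n_bits binary digits of a distance in [0, 1)."""
    digits = []
    for _ in range(n_bits):
        distance *= 2
        if distance >= 1:
            digits.append("1")
            distance -= 1
        else:
            digits.append("0")
    return "".join(digits)


if __name__ == "__main__":
    d = encode_bits("1011")
    print(d, decode_bits(d, 4))
```

    Any error in the realized distance corrupts the low-order digits first, which gives some intuition for why interrupted (non-rigid) movements make this kind of encoding hard.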

    A Semantic Distance of Natural Language Queries Based on Question-Answer Pairs

    Many Natural Language Processing (NLP) techniques have been applied in the field of Question Answering (QA) for understanding natural language queries. Practical QA systems classify a natural language query into vertical domains, and determine whether it is similar to a question with known or latent answers. Current mobile personal assistant applications process queries, recognized from voice input or translated from cross-lingual queries. Theoretically speaking, all these problems rely on an intuitive notion of semantic distance. However, it is neither definable nor computable. Many studies attempt to approximate such a semantic distance in heuristic ways, for instance, distances based on synonym dictionaries. In this paper, we propose a unified algorithm to approximate the semantic distance by a well-defined information distance theory. The algorithm depends on a pre-constructed data structure - semantic clusters, which is built from 35 million question-answer pairs automatically. From the semantic measurement of questions, we implement two practical NLP systems, including a question classifier and a translation corrector. Then a series of comparison experiments have been conducted on both implementations. Experimental results demonstrate that our distance based approach produces fewer errors in classification, compared with other academic works. Also, our translation correction system achieves significant improvements on the Google translation results

    Distance Learning In An Educational Perspective In Indonesia During The Covid-19 Pandemic

    The Covid-19 pandemic has had a serious impact on teaching and learning activities at educational institutions of every level and type: instruction that was originally face-to-face in the classroom has shifted to distance education (distance learning) over the network, delivered both online (direct) and offline (delayed). This study asks what distance learning looks like from the perspective of education in Indonesia during the Covid-19 pandemic: what is the concept, what are the problems and challenges, and what is the solution? To answer these questions, the method used is a library study, in which the author reviews the available literature to find explanations relevant to the problem. The results of the literature study state that (1) distance learning, from an educational perspective in Indonesia, is education in which learners are separated from educators and learn using various learning resources through information and communication technology and other media; it serves as a form of education for learners who cannot attend face-to-face education and aims to expand equitable access to quality and relevant education as needed; (2) distance learning during the Covid-19 pandemic is focused on life skills education, delivered through online and offline distance learning methods. In conclusion, during the Covid-19 pandemic, our education system must be ready to make the leap to online learning for all students and all teachers, entering a new era of building creativity, honing students' skills, and improving self-quality through changes in systems, perspectives, and patterns of interaction with technology