Toward Entity-Aware Search
As the Web has evolved into a data-rich repository, with the standard "page view," current search engines are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In my Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the various essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities by distilling the underlying conceptual model, the Impression Model, and developing a probabilistic ranking framework, EntityRank, that seamlessly integrates both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning (entity as input and entity as output), we propose a dual-inversion framework, with two indexing and partition schemes, towards efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results obtained so far show clear promise for entity-aware search in its usefulness, effectiveness, efficiency, and scalability.
Moa and the multi-model architecture: a new perspective on XNF2
Advanced non-traditional application domains such as geographic information systems and digital library systems demand advanced data management support. In an effort to cope with this demand, we present the concept of a novel multi-model DBMS architecture which provides evaluation of queries on complexly structured data without sacrificing efficiency. A vital role in this architecture is played by the Moa language, which features a nested relational data model based on XNF2, in which we take renewed interest. Furthermore, extensibility in Moa avoids the optimization obstacles caused by black-box treatment of ADTs. The combination of a mapping of queries on complexly structured data to an efficient physical algebra expression via a nested relational algebra, extensibility open to optimization, and the consequently better integration of domain-specific algorithms enables the Moa system to handle complex queries from non-traditional application domains efficiently and effectively.
Reverse spatial visual top-k query
With the wide application of mobile Internet techniques and location-based services (LBS), massive multimedia data with geo-tags has been generated and collected. In this paper, we investigate a novel type of spatial query problem, named reverse spatial visual top-k query (RSVQk), that aims to retrieve a set of geo-images that have the query as one of their most relevant geo-images in both geographical proximity and visual similarity. Existing approaches for reverse top-k queries are not suitable for this problem because they cannot effectively process unstructured data, such as images. To this end, we first define the RSVQk problem and introduce the similarity measurement. A novel hybrid index, named VR²-Tree, is designed, which combines a visual representation of geo-images with an R-Tree. In addition, an extension of the VR²-Tree, called the CVR²-Tree, is introduced; we discuss the calculation of lower and upper bounds and then propose an optimization technique based on the CVR²-Tree for further pruning. Finally, a search algorithm named the RSVQk algorithm is developed to support efficient RSVQk queries. Comprehensive experiments are conducted on four geo-image datasets, and the results illustrate that our approach can address the RSVQk problem effectively and efficiently.
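To make the query semantics in the abstract above concrete, the following is a minimal brute-force sketch of a reverse spatial visual top-k query. It is not the paper's index-based algorithm: there is no VR²-Tree, no bounds, and no pruning. The scoring function is an assumption made for illustration (a weighted sum of a normalized spatial proximity and a cosine visual similarity), and the names `similarity`, `reverse_top_k`, `alpha`, and `max_dist` are hypothetical.

```python
import math


def similarity(a, b, alpha=0.5, max_dist=1.0):
    """Assumed relevance score between two geo-images.

    Each geo-image is a dict with "loc" (an (x, y) tuple) and "vec"
    (a visual feature vector). The linear combination weighted by
    alpha is an illustrative assumption, not the paper's definition.
    """
    # Spatial proximity in [0, 1]: 1 at distance 0, 0 at max_dist or beyond.
    spatial = max(0.0, 1.0 - math.dist(a["loc"], b["loc"]) / max_dist)
    # Cosine similarity of the visual feature vectors.
    dot = sum(x * y for x, y in zip(a["vec"], b["vec"]))
    norm_a = math.sqrt(sum(x * x for x in a["vec"]))
    norm_b = math.sqrt(sum(y * y for y in b["vec"]))
    visual = dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
    return alpha * spatial + (1.0 - alpha) * visual


def reverse_top_k(query, images, k, alpha=0.5, max_dist=1.0):
    """Return indices of images that rank the query among their top-k.

    Brute force, O(n^2) similarity evaluations: for each image, count
    how many other database images score strictly higher than the query.
    """
    result = []
    for i, img in enumerate(images):
        q_score = similarity(img, query, alpha, max_dist)
        better = sum(
            1
            for j, other in enumerate(images)
            if j != i and similarity(img, other, alpha, max_dist) > q_score
        )
        if better < k:
            result.append(i)
    return result
```

Calling `reverse_top_k(query, images, k)` returns the positions of the database images for which the query would appear in their own top-k list. The quadratic cost of this baseline is the kind of overhead that an index with lower and upper bounds is meant to avoid.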
Energy Confused Adversarial Metric Learning for Zero-Shot Image Retrieval and Clustering
Deep metric learning has been widely applied in many computer vision tasks,
and it has recently attracted growing attention in zero-shot image retrieval and
clustering (ZSRC), where a good embedding is required so that the unseen
classes can be distinguished well. Most existing works deem this 'good'
embedding just to be the discriminative one and thus race to devise powerful
metric objectives or hard-sample mining strategies for learning discriminative
embedding. However, in this paper, we first emphasize that the generalization
ability is a core ingredient of this 'good' embedding as well and largely
affects the metric performance in zero-shot settings as a matter of fact. Then,
we propose the Energy Confused Adversarial Metric Learning (ECAML) framework to
explicitly optimize a robust metric. It is mainly achieved by introducing an
interesting Energy Confusion regularization term, which daringly breaks away
from the traditional metric learning idea of discriminative objective devising,
and seeks to 'confuse' the learned model so as to encourage its generalization
ability by reducing overfitting on the seen classes. We train this confusion
term together with the conventional metric objective in an adversarial manner.
Although it seems weird to 'confuse' the network, we show that our ECAML indeed
serves as an efficient regularization technique for metric learning and is
applicable to various conventional metric methods. This paper empirically and
experimentally demonstrates the importance of learning embedding with good
generalization, achieving state-of-the-art performances on the popular CUB,
CARS, Stanford Online Products and In-Shop datasets for ZSRC tasks.
Code available at http://www.bhchen.cn/. Comment: AAAI 2019, Spotlight