20,509 research outputs found
X-ENS: Semantic Enrichment of Web Search Results at Real-Time
While more and more semantic data are published on the Web, an important question is how typical web users can access and exploit this body of knowledge. Although, existing interaction paradigms in semantic search hide the complexity behind an easy-to-use interface, they have not managed to cover common search needs. In this paper, we present X-ENS (eXplore ENtities in Search), a web search application that enhances the classical, keyword-based, web searching with semantic information, as a means to combine the pros of both Semantic Web standards and common Web Searching. X-ENS identifies entities of interest in the snippets of the top search results which can be further exploited in a faceted interaction scheme, and thereby can help the user to limit the - often very large - search space to those hits that contain a particular piece of information. Moreover, X-ENS permits the exploration of the identified entities by exploiting semantic repositories
Ranking Archived Documents for Structured Queries on Semantic Layers
Archived collections of documents (like newspaper and web archives) serve as
important information sources in a variety of disciplines, including Digital
Humanities, Historical Science, and Journalism. However, the absence of
efficient and meaningful exploration methods still remains a major hurdle in
the way of turning them into usable sources of information. A semantic layer is
an RDF graph that describes metadata and semantic information about a
collection of archived documents, which in turn can be queried through a
semantic query language (SPARQL). This allows running advanced queries by
combining metadata of the documents (like publication date) and content-based
semantic information (like entities mentioned in the documents). However, the
results returned by such structured queries can be numerous and moreover they
all equally match the query. In this paper, we deal with this problem and
formalize the task of "ranking archived documents for structured queries on
semantic layers". Then, we propose two ranking models for the problem at hand
which jointly consider: i) the relativeness of documents to entities, ii) the
timeliness of documents, and iii) the temporal relations among the entities.
The experimental results on a new evaluation dataset show the effectiveness of
the proposed models and allow us to understand their limitation
Weaving Entities into Relations: From Page Retrieval to Relation Mining on the Web
With its sheer amount of information, the Web is clearly an important frontier for data mining. While Web mining must start with content on the Web, there is no effective ``search-based'' mechanism to help sifting through the information on the Web. Our goal is to provide a such online search-based facility for supporting query primitives, upon which Web mining applications can be built. As a first step, this paper aims at entity-relation discovery, or E-R discovery, as a useful function-- to weave scattered entities on the Web into coherent relations. To begin with, as our proposal, we formalize the concept of E-R discovery. Further, to realize E-R discovery, as our main thesis, we abstract tuple ranking-- the essential challenge of E-R discovery-- as pattern-based cooccurrence analysis. Finally, as our key insight, we observe that such relation mining shares the same core functions as traditional page-retrieval systems, which enables us to build the new E-R discovery upon today's search engines, almost for free. We report our system prototype and testbed, WISDM-ER, with real Web corpus. Our case studies have demonstrated a high promise, achieving 83%-91% accuracy for real benchmark queries-- and thus the real possibilities of enabling ad-hoc Web mining tasks with online E-R discovery
EntiTables: Smart Assistance for Entity-Focused Tables
Tables are among the most powerful and practical tools for organizing and
working with data. Our motivation is to equip spreadsheet programs with smart
assistance capabilities. We concentrate on one particular family of tables,
namely, tables with an entity focus. We introduce and focus on two specific
tasks: populating rows with additional instances (entities) and populating
columns with new headings. We develop generative probabilistic models for both
tasks. For estimating the components of these models, we consider a knowledge
base as well as a large table corpus. Our experimental evaluation simulates the
various stages of the user entering content into an actual table. A detailed
analysis of the results shows that the models' components are complimentary and
that our methods outperform existing approaches from the literature.Comment: Proceedings of the 40th International ACM SIGIR Conference on
Research and Development in Information Retrieval (SIGIR '17), 201
Toward Entity-Aware Search
As the Web has evolved into a data-rich repository, with the standard "page view," current search engines are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In my Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the various essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities, by distilling its underlying conceptual model Impression Model and developing a probabilistic ranking framework, EntityRank, that is able to seamlessly integrate both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning--entity as input and entity as output, we propose a dual-inversion framework, with two indexing and partition schemes, towards efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results we obtained so far have shown clear promise of entity-aware search, in its usefulness, effectiveness, efficiency and scalability
Towards an automated query modification assistant
Users who need several queries before finding what they need can benefit from
an automatic search assistant that provides feedback on their query
modification strategies. We present a method to learn from a search log which
types of query modifications have and have not been effective in the past. The
method analyses query modifications along two dimensions: a traditional
term-based dimension and a semantic dimension, for which queries are enriches
with linked data entities. Applying the method to the search logs of two search
engines, we identify six opportunities for a query modification assistant to
improve search: modification strategies that are commonly used, but that often
do not lead to satisfactory results.Comment: 1st International Workshop on Usage Analysis and the Web of Data
(USEWOD2011) in the 20th International World Wide Web Conference (WWW2011),
Hyderabad, India, March 28th, 201
A Web Smart Space Framework for Intelligent Search Engines
A web smart space is an intelligent environment which has additional capability of searching the information smartly and efficiently. New advancements like dynamic web contents generation has increased the size of web repositories. Among so many modern software analysis requirements, one is to search information from the given repository. But useful information extraction is a troublesome hitch due to the multi-lingual; base of the web data collection. The issue of semantic based information searching has become a standoff due to the inconsistencies and variations in the characteristics of the data. In the accomplished research, a web smart space framework has been proposed which introduces front end processing for a search engine to make the information retrieval process more intelligent and accurate. In orthodox searching anatomies, searching is performed only by using pattern matching technique and consequently a large number of irrelevant results are generated. The projected framework has insightful ability to improve this drawback and returns efficient outcomes. Designed framework gets text input from the user in the form complete question, understands the input and generates the meanings. Search engine searches on the basis of the information provided
A User-Centered Concept Mining System for Query and Document Understanding at Tencent
Concepts embody the knowledge of the world and facilitate the cognitive
processes of human beings. Mining concepts from web documents and constructing
the corresponding taxonomy are core research problems in text understanding and
support many downstream tasks such as query analysis, knowledge base
construction, recommendation, and search. However, we argue that most prior
studies extract formal and overly general concepts from Wikipedia or static web
pages, which are not representing the user perspective. In this paper, we
describe our experience of implementing and deploying ConcepT in Tencent QQ
Browser. It discovers user-centered concepts at the right granularity
conforming to user interests, by mining a large amount of user queries and
interactive search click logs. The extracted concepts have the proper
granularity, are consistent with user language styles and are dynamically
updated. We further present our techniques to tag documents with user-centered
concepts and to construct a topic-concept-instance taxonomy, which has helped
to improve search as well as news feeds recommendation in Tencent QQ Browser.
We performed extensive offline evaluation to demonstrate that our approach
could extract concepts of higher quality compared to several other existing
methods. Our system has been deployed in Tencent QQ Browser. Results from
online A/B testing involving a large number of real users suggest that the
Impression Efficiency of feeds users increased by 6.01% after incorporating the
user-centered concepts into the recommendation framework of Tencent QQ Browser.Comment: Accepted by KDD 201
- …