
    Web Search using Improved Concept Based Query Refinement

    The information extracted from Web pages can be used for effective query expansion. One aspect needed to improve the accuracy of web search engines is the inclusion of metadata, not only to analyze Web content but also to interpret it. Because today's Web is unstructured and semantically heterogeneous, keyword-based queries are likely to miss important results. Using data mining methods, our system derives dependency rules and applies them to concept-based queries. This paper presents a novel approach to query expansion that applies dependency rules mined from the World Wide Web, combining several existing techniques for data extraction and mining, and integrates the system into COMPACT, our prototype implementation of a concept-based search engine
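The rule-based expansion step described above can be sketched as follows. This is a minimal illustration, not the COMPACT implementation: the rule table and function names are invented, and real mined dependency rules would carry confidence scores.

```python
# Hypothetical mined dependency rules: a concept term implies related terms.
# (Illustrative data only; real rules would be mined from Web pages.)
RULES = {
    "jaguar": ["car", "animal"],
    "python": ["programming", "snake"],
}

def expand_query(query, rules, max_terms=2):
    """Append up to max_terms rule-derived terms for each concept in the query."""
    terms = query.lower().split()
    expansions = []
    for term in terms:
        expansions.extend(rules.get(term, [])[:max_terms])
    # Keep the original terms first, then deduplicated expansion terms.
    seen = set(terms)
    result = list(terms)
    for t in expansions:
        if t not in seen:
            seen.add(t)
            result.append(t)
    return " ".join(result)
```

For example, `expand_query("jaguar speed", RULES)` broadens the keyword query with both senses of the ambiguous concept term.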

    Graph-Based Concept Clustering for Web Search Results

    A search engine usually returns a long list of web search results in response to a user's query, and users must spend a lot of time browsing and navigating those results to find the relevant ones. Many research works have applied text clustering techniques, called web search results clustering, to address this problem. Unfortunately, a search result document returned from a search engine is a very short text, and it is difficult to cluster related documents into the same group because a short document has low informative content. In this paper, we propose a method to cluster web search results with high clustering quality using graph-based clustering with concepts extracted from an external knowledge source. The main idea is to expand the original search results with related concept terms. We applied Wikipedia as the external knowledge source for concept extraction. We compared the clustering results of our proposed method with two well-known search results clustering techniques, Suffix Tree Clustering and Lingo. The experimental results showed that our proposed method significantly outperforms both well-known clustering techniques
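The core idea, expanding short snippets with concept terms and then clustering on a graph, can be sketched roughly as below. The concept table stands in for Wikipedia-derived concepts, and connected components stand in for the paper's graph clustering step; both are simplifying assumptions, not the authors' actual algorithm.

```python
from collections import defaultdict

# Stand-in for Wikipedia concept extraction: snippet -> concept set.
CONCEPTS = {
    "jaguar car": {"automobile"},
    "jaguar speed test": {"automobile"},
    "jaguar habitat": {"wildlife"},
}

def cluster_snippets(snippets, concepts, min_shared=1):
    """Group snippets whose concept sets overlap, via connected components."""
    n = len(snippets)
    adj = defaultdict(set)
    for i in range(n):
        for j in range(i + 1, n):
            shared = concepts.get(snippets[i], set()) & concepts.get(snippets[j], set())
            if len(shared) >= min_shared:
                adj[i].add(j)
                adj[j].add(i)
    seen, clusters = set(), []
    for i in range(n):
        if i in seen:
            continue
        stack, comp = [i], []
        while stack:                      # depth-first walk of one component
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            comp.append(snippets[v])
            stack.extend(adj[v] - seen)
        clusters.append(comp)
    return clusters
```

Two short snippets that share no words but share the concept "automobile" end up in the same cluster, which is exactly the benefit concept expansion gives over word-overlap clustering of short texts.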

    Weaving Entities into Relations: From Page Retrieval to Relation Mining on the Web

    With its sheer amount of information, the Web is clearly an important frontier for data mining. While Web mining must start with content on the Web, there is no effective ``search-based'' mechanism to help sift through the information on the Web. Our goal is to provide such an online search-based facility for supporting query primitives, upon which Web mining applications can be built. As a first step, this paper aims at entity-relation discovery, or E-R discovery, as a useful function: weaving scattered entities on the Web into coherent relations. To begin with, as our proposal, we formalize the concept of E-R discovery. Further, to realize E-R discovery, as our main thesis, we abstract tuple ranking, the essential challenge of E-R discovery, as pattern-based co-occurrence analysis. Finally, as our key insight, we observe that such relation mining shares the same core functions as traditional page-retrieval systems, which enables us to build the new E-R discovery upon today's search engines, almost for free. We report our system prototype and testbed, WISDM-ER, with a real Web corpus. Our case studies have demonstrated high promise, achieving 83%-91% accuracy for real benchmark queries, and thus the real possibility of enabling ad-hoc Web mining tasks with online E-R discovery
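Pattern-based co-occurrence analysis, the abstraction the paper names for tuple ranking, can be illustrated with a toy example: count how often each candidate entity pair matches an extraction pattern across snippets, and rank pairs by count. The snippets and pattern below are invented; WISDM-ER operates over a real Web corpus through a search engine.

```python
import re
from collections import Counter

# Toy "retrieved snippets" (illustrative only).
SNIPPETS = [
    "Amazon is headquartered in Seattle",
    "Amazon is headquartered in Seattle, Washington",
    "Boeing is headquartered in Chicago",
    "Seattle hosted Amazon's annual meeting",
]

def rank_tuples(snippets, pattern):
    """Rank (entity1, entity2) tuples by pattern-match co-occurrence count."""
    counts = Counter()
    for s in snippets:
        for m in re.finditer(pattern, s):
            counts[(m.group(1), m.group(2))] += 1
    return counts.most_common()

ranking = rank_tuples(SNIPPETS, r"(\w+) is headquartered in (\w+)")
```

The pair supported by more pattern matches ranks first; the last snippet mentions both entities but not in the pattern, so it contributes nothing, which is the point of pattern-based (rather than bag-of-words) co-occurrence.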

    Enhanced Web Search Engines with Query-Concept Bipartite Graphs

    With the rapid growth of information on the Web, Web search engines have gained great momentum for exploiting valuable Web resources. Although keyword-based Web search engines provide relevant search results in response to users’ queries, further enhancement is still needed. Three important issues are: (1) search results can be diverse because ambiguous keywords in queries can be interpreted with different meanings; (2) identifying keywords in long queries is difficult for search engines; and (3) generating query-specific Web page summaries is desirable for previews of Web search results. Based on clickthrough data, this thesis proposes a query-concept bipartite graph for representing relations among queries, and applies those relations to applications such as (1) personalized query suggestion, (2) Web search with long queries, and (3) query-specific Web page summarization. Experimental results show that query-concept bipartite graphs improve performance in all three applications
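A query-concept bipartite graph built from clickthrough data can be sketched minimally as two adjacency maps, with query suggestion as a two-hop walk (query to concept to related query). The clickthrough records and concept labels below are invented for illustration and are not from the thesis.

```python
from collections import defaultdict

# Hypothetical clickthrough-derived (query, concept) pairs.
CLICKS = [
    ("apple price", "fruit"),
    ("apple price", "stock"),
    ("aapl quote", "stock"),
    ("banana price", "fruit"),
]

def build_bipartite(clicks):
    """Build both sides of the bipartite graph: query->concepts, concept->queries."""
    q2c, c2q = defaultdict(set), defaultdict(set)
    for query, concept in clicks:
        q2c[query].add(concept)
        c2q[concept].add(query)
    return q2c, c2q

def suggest(query, q2c, c2q):
    """Suggest queries reachable via a shared concept, ranked by concept overlap."""
    scores = defaultdict(int)
    for concept in q2c.get(query, ()):
        for other in c2q[concept]:
            if other != query:
                scores[other] += 1
    return sorted(scores, key=lambda q: (-scores[q], q))
```

An ambiguous query like "apple price" then yields suggestions from both of its concept neighborhoods (the stock sense and the fruit sense), which is how the graph supports personalized or diversified suggestion.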

    Exploring the academic invisible web

    Purpose: To provide a critical review of Bergman's 2001 study on the Deep Web. In addition, we bring a new concept into the discussion, the Academic Invisible Web (AIW). We define the Academic Invisible Web as consisting of all databases and collections relevant to academia but not searchable by the general-purpose internet search engines. Indexing this part of the Invisible Web is central to scientific search engines. We provide an overview of approaches followed thus far. Design/methodology/approach: Discussion of measures and calculations; estimation based on informetric laws. Literature review on approaches for uncovering information from the Invisible Web. Findings: Bergman's size estimate of the Invisible Web is highly questionable. We demonstrate some major errors in the conceptual design of the Bergman paper. A new (raw) size estimate is given. Research limitations/implications: The precision of our estimate is limited due to a small sample size and a lack of reliable data. Practical implications: We can show that no single library alone will be able to index the Academic Invisible Web. We suggest collaboration to accomplish this task. Originality/value: Provides library managers and those interested in developing academic search engines with data on the size and attributes of the Academic Invisible Web.
    Comment: 13 pages, 3 figures
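As background to the kind of size estimation debated above: a standard overlap-based (capture-recapture) estimator is often used to size collections that cannot be enumerated directly. This is a generic textbook sketch, not the paper's method or Bergman's, and the tiny samples are invented; it also illustrates the abstract's caveat that small samples make such estimates imprecise.

```python
def lincoln_petersen(sample_a, sample_b):
    """Estimate a total population size from two overlapping random samples.

    N ≈ |A| * |B| / |A ∩ B|  (the Lincoln-Petersen estimator).
    """
    overlap = len(set(sample_a) & set(sample_b))
    if overlap == 0:
        raise ValueError("samples must overlap for the estimator to be defined")
    return len(set(sample_a)) * len(set(sample_b)) / overlap

# Two samples of 100 items each, overlapping in 50, imply ~200 items total.
estimate = lincoln_petersen(range(0, 100), range(50, 150))
```

With small samples the overlap count is noisy, so the estimate's variance is large, which is exactly the "research limitation" the abstract concedes.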

    A Smart Web Crawler for a Concept Based Semantic Search Engine

    The internet is a vast collection of billions of web pages containing terabytes of information, arranged across thousands of servers using HTML. The size of this collection is itself a formidable obstacle to retrieving necessary and relevant information, which has made search engines an important part of our lives. Search engines strive to retrieve information as relevant as possible to the user. One of the building blocks of a search engine is the web crawler: a bot that traverses the internet, collecting web pages and storing them in a database for further analysis and arrangement. This project aims to create a smart web crawler for a concept-based semantic search engine. The crawler not only crawls the World Wide Web and brings back data but also performs an initial analysis to filter out unnecessary data before storage. We aim to improve the efficiency of the Concept Based Semantic Search Engine by using this smart crawler
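The crawl-then-filter loop described above can be sketched as a breadth-first traversal with a relevance gate before storage. This is a minimal in-memory sketch: the page table stands in for HTTP fetches, and the keyword test stands in for the project's concept-based relevance analysis; none of the names come from the project itself.

```python
from collections import deque

# Stand-in for the Web: url -> page text and outgoing links (illustrative).
PAGES = {
    "/a": {"text": "concept based search", "links": ["/b", "/c"]},
    "/b": {"text": "advertising spam", "links": ["/d"]},
    "/c": {"text": "semantic search engine", "links": []},
    "/d": {"text": "semantic web concepts", "links": []},
}

def crawl(seed, pages, is_relevant):
    """BFS from seed; follow every link but store only pages passing the filter."""
    seen, queue, stored = {seed}, deque([seed]), []
    while queue:
        url = queue.popleft()
        page = pages.get(url)
        if page is None:          # dead link: skip
            continue
        if is_relevant(page["text"]):
            stored.append(url)    # filtering happens before storage
        for link in page["links"]:
            if link not in seen:  # dedupe so each page is fetched once
                seen.add(link)
                queue.append(link)
    return stored

stored = crawl("/a", PAGES, lambda text: "search" in text or "semantic" in text)
```

Note that the irrelevant page is still traversed (its links may lead to relevant pages) but is never stored, which is the efficiency gain the abstract claims for filtering before storage.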