4 research outputs found

An Improved PageRank Method based on Genetic Algorithm for Web Search

Authors: Du Wencai, Gui Zhanji, Guo Qingju, Yan Lili
Publication venue: 'Elsevier BV'
Publication date: 31/12/2011

Web search engines have become a very important tool for finding information efficiently in massive Web data. Building on the PageRank algorithm, a genetic PageRank algorithm (GPRA) is proposed. While preserving the advantages of PageRank, GPRA uses a genetic algorithm to address web search. Experimental results show that GPRA outperforms both the PageRank algorithm and the genetic algorithm.

Elsevier - Publisher Connector

Efficient Keyword Search Across Heterogeneous Relational Databases

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

An Evaluation and Comparison of Current Peer-to-Peer Full-Text Keyword Search Techniques

Authors: Justin Moore, Kai Shen, Ming Zhong

Current peer-to-peer (p2p) full-text keyword search techniques fall into the following categories: document-based partitioning, keyword-based partitioning, hybrid indexing, and semantic search. This paper provides a performance evaluation and comparison of these p2p full-text keyword search techniques on a dataset with 3.7 million web pages and 6.8 million search queries. Our evaluation results can serve as a guide for choosing the most suitable p2p full-text keyword search technique based on given system parameters, such as network size, the number of documents, and the number of queries per second.

CiteSeerX

Advanced distributed data integration infrastructure and research data management portal

Author: Karataev Evgeny
Publication date: 10/01/2017

The amount of data available due to the rapid spread of advanced information technology is exploding. At the same time, continued research on data integration systems aims to provide users with uniform data access and efficient data sharing. The ability to share data is particularly important for interdisciplinary research, where a comprehensive picture of the subject requires large amounts of data from disparate sources across a variety of disciplines. While numerous data sets are available from various groups worldwide, the existing data sources are principally oriented toward regional comparative efforts rather than global applications. They vary widely in both content and format. Such data sources cannot be easily integrated and maintained by small groups of developers. I propose an advanced infrastructure for large-scale data integration based on crowdsourcing. In particular, I propose a novel architecture and algorithms to efficiently store dynamically incoming heterogeneous datasets, enabling both data integration and data autonomy. My proposed infrastructure combines machine learning algorithms and human expertise to perform efficient schema alignment and maintain relationships between the datasets. It provides efficient data exploration functionality without requiring users to write complex queries, and it performs approximate information fusion when an exact match does not exist. Finally, I introduce the Col*Fusion system, which implements the proposed advanced data integration infrastructure.

D-Scholarship@Pitt