3,276 research outputs found

    Role of Ranking Algorithms for Information Retrieval

    Full text link
    As the use of web is increasing more day by day, the web users get easily lost in the web's rich hyper structure. The main aim of the owner of the website is to give the relevant information according their needs to the users. We explained the Web mining is used to categorize users and pages by analyzing user's behavior, the content of pages and then describe Web Structure mining. This paper includes different Page Ranking algorithms and compares those algorithms used for Information Retrieval. Different Page Rank based algorithms like Page Rank (PR), WPR (Weighted Page Rank), HITS (Hyperlink Induced Topic Selection), Distance Rank and EigenRumor algorithms are discussed and compared. Simulation Interface has been designed for PageRank algorithm and Weighted PageRank algorithm but PageRank is the only ranking algorithm on which Google search engine works.Comment: Keywords: Page Rank, Web Mining, Web Structured Mining, Web Content Minin

    An integrated ranking algorithm for efficient information computing in social networks

    Full text link
    Social networks have ensured the expanding disproportion between the face of WWW stored traditionally in search engine repositories and the actual ever changing face of Web. Exponential growth of web users and the ease with which they can upload contents on web highlights the need of content controls on material published on the web. As definition of search is changing, socially-enhanced interactive search methodologies are the need of the hour. Ranking is pivotal for efficient web search as the search performance mainly depends upon the ranking results. In this paper new integrated ranking model based on fused rank of web object based on popularity factor earned over only valid interlinks from multiple social forums is proposed. This model identifies relationships between web objects in separate social networks based on the object inheritance graph. Experimental study indicates the effectiveness of proposed Fusion based ranking algorithm in terms of better search results.Comment: 14 pages, International Journal on Web Service Computing (IJWSC), Vol.3, No.1, March 201

    A Noval Approach for Web Page Ranking Based on Weights of Links

    Get PDF
    As the web is the large collection of the information and also due to the changing content/nature of the web (plenty of pages or documents and pages are newly added and deleted on the time basis).The information present on the web is of great need, the world is full of questions and the web is serving as the major source of gaining information about specific query made by the user. As per the search engine for the query made a number of pages are retrieved among which the quality of the page that are retrieved is questioned. On the pages retrieved the search engine apply the certain algorithms to bring a order to the pages retrieved so that the most relevant document or pages are displayed at the top of list. Page ranking is done on the basis of the different approaches as the content based approaches, link based approaches. This paper will provide a review to few of the linked based page ranking algorithms. DOI: 10.17762/ijritcc2321-8169.15084

    MAG: A Multilingual, Knowledge-base Agnostic and Deterministic Entity Linking Approach

    Full text link
    Entity linking has recently been the subject of a significant body of research. Currently, the best performing approaches rely on trained mono-lingual models. Porting these approaches to other languages is consequently a difficult endeavor as it requires corresponding training data and retraining of the models. We address this drawback by presenting a novel multilingual, knowledge-based agnostic and deterministic approach to entity linking, dubbed MAG. MAG is based on a combination of context-based retrieval on structured knowledge bases and graph algorithms. We evaluate MAG on 23 data sets and in 7 languages. Our results show that the best approach trained on English datasets (PBOH) achieves a micro F-measure that is up to 4 times worse on datasets in other languages. MAG, on the other hand, achieves state-of-the-art performance on English datasets and reaches a micro F-measure that is up to 0.6 higher than that of PBOH on non-English languages.Comment: Accepted in K-CAP 2017: Knowledge Capture Conferenc

    Exploiting Social Annotation for Automatic Resource Discovery

    Full text link
    Information integration applications, such as mediators or mashups, that require access to information resources currently rely on users manually discovering and integrating them in the application. Manual resource discovery is a slow process, requiring the user to sift through results obtained via keyword-based search. Although search methods have advanced to include evidence from document contents, its metadata and the contents and link structure of the referring pages, they still do not adequately cover information sources -- often called ``the hidden Web''-- that dynamically generate documents in response to a query. The recently popular social bookmarking sites, which allow users to annotate and share metadata about various information sources, provide rich evidence for resource discovery. In this paper, we describe a probabilistic model of the user annotation process in a social bookmarking system del.icio.us. We then use the model to automatically find resources relevant to a particular information domain. Our experimental results on data obtained from \emph{del.icio.us} show this approach as a promising method for helping automate the resource discovery task.Comment: 6 pages, submitted to AAAI07 workshop on Information Integration on the We

    Stochastic Query Covering for Fast Approximate Document Retrieval

    Get PDF
    We design algorithms that, given a collection of documents and a distribution over user queries, return a small subset of the document collection in such a way that we can efficiently provide high-quality answers to user queries using only the selected subset. This approach has applications when space is a constraint or when the query-processing time increases significantly with the size of the collection. We study our algorithms through the lens of stochastic analysis and prove that even though they use only a small fraction of the entire collection, they can provide answers to most user queries, achieving a performance close to the optimal. To complement our theoretical findings, we experimentally show the versatility of our approach by considering two important cases in the context of Web search. In the first case, we favor the retrieval of documents that are relevant to the query, whereas in the second case we aim for document diversification. Both the theoretical and the experimental analysis provide strong evidence of the potential value of query covering in diverse application scenarios
    • …
    corecore