
    Temporal models for mining, ranking and recommendation in the Web

    Due to their first-hand, diverse and evolution-aware reflection of nearly all areas of life, heterogeneous temporal datasets, i.e., the Web, collaborative knowledge bases and social networks, have emerged as gold mines for content analytics of many sorts. In those collections, time plays an essential role in many crucial information retrieval and data mining tasks, ranging from user intent understanding and document ranking to advanced recommendations. There are two semantically close and important constituents when modeling along the time dimension, i.e., entity and event. Time crucially serves as the context for changes driven by happenings and phenomena (events) that relate to people, organizations or places (so-called entities) in our social lives. Thus, determining what users expect, or in other words, resolving the uncertainty confounded by temporal changes, is a compelling task to support consistent user satisfaction. In this thesis, we address the aforementioned issues and propose temporal models that capture the temporal dynamics of such entities and events to serve the end tasks. Specifically, we make the following contributions in this thesis: (1) Query recommendation and document ranking in the Web - we address the issues of suggesting entity-centric queries and of ranking effectiveness surrounding the time period of an associated event. In particular, we propose a multi-criteria optimization framework that facilitates the combination of multiple temporal models to smooth out the abrupt changes when transitioning between event phases for the former, and a probabilistic approach for search result diversification of temporally ambiguous queries for the latter. (2) Entity relatedness in Wikipedia - we study the long-term dynamics of Wikipedia as a global memory place for high-impact events, specifically the reviving memories of past events.
Additionally, we propose a neural network-based approach to measure the temporal relatedness of entities and events. The model engages different latent representations of an entity (i.e., from time, link-based graph and content) and uses the collective attention from user navigation as the supervision. (3) Graph-based ranking and temporal anchor-text mining in Web Archives - we tackle the problem of discovering important documents along the time-span of Web Archives, leveraging the link graph. Specifically, we combine the problems of relevance, temporal authority, diversity and time in a unified framework. The model accounts for the incomplete link structure and natural time lagging in Web Archives in mining the temporal authority. (4) Methods for enhancing predictive models at an early stage in the social media and clinical domains - we investigate several methods to control model instability and enrich contexts of predictive models at the “cold-start” period. We demonstrate their effectiveness for the rumor detection and blood glucose prediction cases respectively. Overall, the findings presented in this thesis demonstrate the importance of tracking the temporal dynamics surrounding salient events and entities for IR applications. We show that determining such changes in time-based patterns and trends in prevalent temporal collections can better satisfy user expectations, and boost ranking and recommendation effectiveness over time.

    A webometric analysis of Australian Universities using staff and size dependent web impact factors (WIF)

    This study describes how search engines (SE) can be employed for automated, efficient data gathering for Webometric studies using predictable URLs. It then compares the use of staff-related Web Impact Factors (WIFs) with size-related impact factors for a ranking of Australian universities, showing that rankings based on staff-related WIFs correlate much better with an established ranking from the Melbourne Institute than the commonly used size-dependent WIFs. In fact, size-dependent WIFs do not correlate with the Melbourne ranking at all. It also uses the WIF data for Australian universities provided by Smith (1999) for a longitudinal comparison of the WIFs of Australian universities over the last decade. It shows that size-dependent WIF values declined for most Australian universities over the last ten years, while staff-dependent WIFs rose.
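The comparison described in this abstract can be sketched as follows. The abstract does not define the WIF, so this assumes the usual webometric definition (inlinks divided by the number of pages for the size-dependent variant, and by the number of staff for the staff-related variant) and a Spearman rank correlation without ties. All figures, university labels and function names are invented for illustration and are not taken from the study.

```python
def wif(inlinks, denominator):
    """Web Impact Factor: inlinks per page (size-dependent) or per staff member."""
    return inlinks / denominator


def ranks(values):
    """Rank positions, 1 = largest value. Assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result


def spearman(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical toy data: (inlinks, pages, staff) for four universities A-D.
data = [
    (9000, 300000, 3000),  # A
    (4000, 20000, 2000),   # B
    (1500, 100000, 1000),  # C
    (500, 5000, 500),      # D
]
reference = [1, 2, 3, 4]  # hypothetical external ranking of A-D

size_ranks = ranks([wif(inlinks, pages) for inlinks, pages, _ in data])
staff_ranks = ranks([wif(inlinks, staff) for inlinks, _, staff in data])

print("size-dependent WIF vs reference:", spearman(size_ranks, reference))
print("staff-related WIF vs reference:", spearman(staff_ranks, reference))
```

The toy values are chosen so that the two variants behave differently against the reference ranking; they say nothing about the actual Australian data.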

    Search Engine Optimisation in UK news production

    This is an Author's Accepted Manuscript of an article published in Journalism Practice, 5(4), 462 - 477, 2011, copyright Taylor & Francis, available online at: http://www.tandfonline.com/10.1080/17512786.2010.551020. This paper presents an exploratory study of an emerging culture in UK online newsrooms, the practice of Search Engine Optimisation (SEO), and assesses its impact on news production. Comprising a short-term participant observational case study at a national online news publisher, and a series of semi-structured, in-depth interviews with SEO professionals at three further UK media organisations, the study sets out to establish how SEO is operationalised in the newsroom, and what consequences these practices have for online news production. SEO practice is found to be varied and its application is not universal. Not all UK news organisations are making the most of SEO, even though some publishers take a highly sophisticated approach. Efforts are constrained by time, resources and management support, as well as off-page technical issues. SEO policy is found, in some cases, to inform editorial policy, but there is resistance to the principle of SEO driving decision-making. Several themes are established which call for further research.

    Utilising content marketing metrics and social networks for academic visibility

    There are numerous assumptions about research evaluation in terms of the quality and relevance of academic contributions. Researchers are becoming increasingly acquainted with bibliometric indicators, including citation analysis, impact factor, h-index, webometrics and academic social networking sites. In this light, this chapter presents a review of these concepts as it considers relevant theoretical underpinnings that are related to the content marketing of scholars. This contribution critically evaluates previous papers that revolve around the subject of academic reputation as it deliberates on individual researchers’ personal branding. It also explains how metrics are currently being used to rank the academic standing of journals as well as higher educational institutions. In a nutshell, this chapter implies that scholarly impact depends on a number of factors, including the accessibility of publications, peer review of academic work and social networking among scholars. Peer-reviewed.

    Index ordering by query-independent measures

    Conventional approaches to information retrieval search through all applicable entries in an inverted file for a particular collection in order to find those documents with the highest scores. For particularly large collections this may be extremely time consuming. A solution to this problem is to search only a limited portion of the collection at query time, in order to speed up the retrieval process, while also limiting the loss in retrieval efficacy (in terms of accuracy of results). We achieve this by first identifying the most “important” documents within the collection, and then sorting documents within inverted file lists in order of this “importance”. In this way we limit the amount of information to be searched at query time by eliminating documents of lesser importance, which not only makes the search more efficient, but also limits the loss in retrieval accuracy. Our experiments, carried out on the TREC Terabyte collection, report significant savings, in terms of the number of postings examined, without significant loss of effectiveness when based on several measures of importance used in isolation and in combination. Our results point to several ways in which the computational cost of searching large collections of documents can be significantly reduced.
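The idea in this abstract, posting lists sorted by a query-independent importance score and read only partially at query time, can be sketched as below. This is a minimal illustration under assumed details, not the authors' system: the term-frequency scoring, the `max_postings` cutoff, the importance values and the tiny collection are all hypothetical.

```python
from collections import defaultdict


def build_index(docs, importance):
    """Build an inverted file whose posting lists are sorted by a
    query-independent importance score, most important document first."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        counts = defaultdict(int)
        for term in text.lower().split():
            counts[term] += 1
        for term, tf in counts.items():
            index[term].append((doc_id, tf))
    for postings in index.values():
        postings.sort(key=lambda p: importance[p[0]], reverse=True)
    return dict(index)


def search(index, query, max_postings=None, k=3):
    """Score documents by summed term frequency, reading at most
    max_postings entries from the head of each posting list."""
    scores = defaultdict(int)
    examined = 0
    for term in query.lower().split():
        postings = index.get(term, [])
        head = postings if max_postings is None else postings[:max_postings]
        examined += len(head)
        for doc_id, tf in head:
            scores[doc_id] += tf
    ranked = sorted(scores.items(), key=lambda s: (-s[1], s[0]))
    return ranked[:k], examined


# Hypothetical collection and importance scores (e.g. a link-based measure).
docs = {
    "d1": "web search ranking",
    "d2": "search search engines",
    "d3": "ranking of web pages",
    "d4": "search logs",
}
importance = {"d1": 0.9, "d2": 0.7, "d3": 0.4, "d4": 0.1}

index = build_index(docs, importance)
full, full_cost = search(index, "web search")
pruned, pruned_cost = search(index, "web search", max_postings=2)
print(full, full_cost)
print(pruned, pruned_cost)
```

In this toy case the pruned search returns the same top results as the full search while examining fewer postings. Whether that holds on a real collection depends on how well the importance measure predicts relevance.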