    6 Keywords Searchable for Class Videos

    Hybrid Search: Effectively Combining Keywords and Semantic Searches

    This paper describes hybrid search, a search method supporting both document and knowledge retrieval via the flexible combination of ontology-based search and keyword-based matching. Hybrid search smoothly copes with lack of semantic coverage of document content, which is one of the main limitations of current semantic search methods. In this paper, we define hybrid search formally, discuss its compatibility with the current semantic trends, and present a reference implementation: K-Search. We then show how the method outperforms both keyword-based search and pure semantic search in terms of precision and recall in a set of experiments performed on a collection of about 18,000 technical documents. Experiments carried out with professional users show that users understand the paradigm and consider it very powerful and reliable. K-Search has been ported to two applications released at Rolls-Royce plc for searching technical documentation about jet engines.

    Describing Papers and Reviewers' Competences by Taxonomy of Keywords

    This article focuses on the importance of precisely calculating similarity factors between papers and reviewers in order to perform a fair and accurate automatic assignment of reviewers to papers. It suggests that papers and reviewers' competences should be described by a taxonomy of keywords, so that the implied hierarchical structure allows similarity measures to take into account not only the number of exactly matching keywords but also, for non-matching ones, how semantically close they are. The paper also suggests a similarity measure derived from the well-known and widely used Dice's coefficient, but adapted so that it can also be applied between sets whose elements are semantically related to each other (as concepts in a taxonomy are). It allows a non-zero similarity factor to be accurately calculated between a paper and a reviewer even if they do not share any keyword.
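
The idea of a Dice-style measure that tolerates non-matching but related keywords can be illustrated with a minimal sketch. The abstract does not give the paper's formula, so everything here is an assumption: the toy taxonomy `PARENT`, the Wu-Palmer-style `node_similarity`, and the best-match aggregation in `soft_dice` are illustrative choices, not the authors' measure. With a 0/1 similarity, `soft_dice` reduces to the classic Dice's coefficient.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "computer science": None,
    "artificial intelligence": "computer science",
    "machine learning": "artificial intelligence",
    "neural networks": "machine learning",
    "decision trees": "machine learning",
    "databases": "computer science",
    "query optimization": "databases",
}


def path_to_root(node: str) -> List[str]:
    """Return the node followed by all of its ancestors up to the root."""
    path = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def node_similarity(a: str, b: str) -> float:
    """Assumed Wu-Palmer-style similarity: 2*depth(lca) / (depth(a) + depth(b)).

    Depth counts nodes on the path to the root, so the root has depth 1.
    """
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_a = set(path_a)
    # First node on b's path that is also on a's path is the lowest common ancestor.
    lca = next(n for n in path_b if n in ancestors_a)
    return 2 * len(path_to_root(lca)) / (len(path_a) + len(path_b))


def dice(a: Set[str], b: Set[str]) -> float:
    """Classic Dice's coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def soft_dice(a: Set[str], b: Set[str]) -> float:
    """Dice-like score where each keyword contributes its best match in the other set."""
    if not a or not b:
        return 0.0
    best_a = sum(max(node_similarity(x, y) for y in b) for x in a)
    best_b = sum(max(node_similarity(x, y) for x in a) for y in b)
    return (best_a + best_b) / (len(a) + len(b))


paper = {"neural networks"}
reviewer = {"decision trees"}
print(dice(paper, reviewer))       # no shared keyword
print(soft_dice(paper, reviewer))  # non-zero: both sit under "machine learning"
```

In this sketch the paper and the reviewer share no keyword, so the classic coefficient is zero, while the taxonomy-aware variant still reports that the two keyword sets are close siblings.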

    Beyond Keywords

    The potential of social media to give insight into the dynamic evolution of public conversations, and into their reactive and constitutive role in political activities, has to date been underdeveloped. While topic modeling can give static insight into the structure of a conversation, and keyword volume tracking can show how engagement with a specific idea varies over time, there is a need for a method of analysis able to understand how conversations about societal values evolve and react to events in the world by incorporating new ideas and relating them to existing themes. In this article, we propose a method for analyzing social media messages that formalizes the structure of public conversations and allows the sociologist to study the evolution of public discourse in a rigorous, replicable, and data-driven fashion. This approach may be useful to those studying the social construction of meaning, the origins of factionalism and internecine conflict, or boundary-setting and group-identification exercises, and has potential implications. Keywords: social media, framing, public conversation, analysis tools, visualization

    Comparing the hierarchy of keywords in on-line news portals

    The tagging of on-line content with informative keywords is a widespread phenomenon, from scientific article repositories through blogs to on-line news portals. In most cases, the tags on a given item are free words chosen by the authors independently. Therefore, the relations among keywords in a collection of news items are unknown. However, in most cases the topics and concepts described by these keywords form a latent hierarchy, with the more general topics and categories at the top and more specialised ones at the bottom. Here we apply a recent, co-occurrence-based tag hierarchy extraction method to sets of keywords obtained from four different on-line news portals. The resulting hierarchies show substantial differences not just in the topics rendered as important (being at the top of the hierarchy) or of less interest (categorised low in the hierarchy), but also in the underlying network structure. This reveals discrepancies between the plausible keyword association frameworks in the studied news portals.

    T$^2$K$^2$: The Twitter Top-K Keywords Benchmark

    Information retrieval from textual data focuses on the construction of vocabularies that contain weighted term tuples. Such vocabularies can then be exploited by various text analysis algorithms to extract new knowledge, e.g., top-k keywords, top-k documents, etc. Top-k keywords are casually used for various purposes, are often computed on-the-fly, and thus must be efficiently computed. To compare competing weighting schemes and database implementations, benchmarking is customary. To the best of our knowledge, no benchmark currently addresses these problems. Hence, in this paper, we present a top-k keywords benchmark, T$^2$K$^2$, which features a real tweet dataset and queries with various complexities and selectivities. T$^2$K$^2$ helps evaluate weighting schemes and database implementations in terms of computing performance. To illustrate T$^2$K$^2$'s relevance and genericity, we successfully performed tests on the TF-IDF and Okapi BM25 weighting schemes, on the one hand, and on different relational (Oracle, PostgreSQL) and document-oriented (MongoDB) database implementations, on the other hand.
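
To make the notion of top-k keywords under a weighting scheme concrete, here is a minimal in-memory sketch using TF-IDF. It is not the benchmark's implementation or query workload: the function name `top_k_keywords`, the choice of length-normalised term frequency with a natural-log IDF, and the decision to sum a term's weights over all documents are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List, Tuple


def top_k_keywords(docs: List[List[str]], k: int) -> List[Tuple[str, float]]:
    """Rank terms by their TF-IDF weight summed over all documents.

    Assumed scheme: tf = count / document length, idf = ln(N / df).
    Ties are broken alphabetically so the result is deterministic.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term appears.
    doc_freq: Counter = Counter()
    for doc in docs:
        doc_freq.update(set(doc))

    scores: Counter = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[term])
            scores[term] += tf * idf

    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Hypothetical tokenised tweets.
tweets = [
    ["search", "keyword", "search"],
    ["keyword", "benchmark"],
    ["keyword", "tweet", "tweet"],
]
for term, weight in top_k_keywords(tweets, 2):
    print(term, round(weight, 3))
```

Because "keyword" occurs in every toy tweet, its IDF is zero and it drops out of the ranking, which is the behaviour a weighting scheme is meant to provide over raw counts. Swapping in Okapi BM25 would only change how the per-document weight is computed.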

    Predicting financial markets with Google Trends and not so random keywords

    We check the claims that data from Google Trends contain enough data to predict future financial index returns. We first discuss the many subtle (and less subtle) biases that may affect the backtest of a trading strategy, particularly when based on such data. Expectedly, the choice of keywords is crucial: by using an industry-grade backtesting system, we verify that random finance-related keywords do not contain more exploitable predictive information than random keywords related to illnesses, classic cars and arcade games. However, we show that other keywords applied on suitable assets yield robustly profitable strategies, thereby confirming the intuition of Preis et al. (2013). Comment: 8 pages, 4 figures. First names and last names swapped.