934 research outputs found

    An effective, low-cost measure of semantic relatedness obtained from Wikipedia links

    This paper describes a new technique for obtaining measures of semantic relatedness. Like other recent approaches, it uses Wikipedia to provide structured world knowledge about the terms of interest. Our approach is unique in that it does so using the hyperlink structure of Wikipedia rather than its category hierarchy or textual content. Evaluation with manually defined measures of semantic relatedness reveals this to be an effective compromise between the ease of computation of the former approach and the accuracy of the latter.
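
    The abstract does not state the relatedness formula, so the following is only an illustrative stand-in: a minimal sketch that scores two articles by the Jaccard overlap of the sets of articles linking to them. The function name `link_overlap` and the toy link sets are hypothetical and are not taken from the paper.

```python
def link_overlap(links_a, links_b):
    """Jaccard overlap of two link sets (a stand-in, not the paper's measure)."""
    a, b = set(links_a), set(links_b)
    union = a | b
    if not union:
        # Two articles with no links at all share no evidence of relatedness.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical incoming-link sets for two articles.
incoming_dog = {"Wolf", "Pet", "Mammal"}
incoming_cat = {"Pet", "Mammal", "Felidae"}
score = link_overlap(incoming_dog, incoming_cat)  # 2 shared / 4 total = 0.5
```

    A score near 1 means the two articles are linked from almost the same pages; a score of 0 means they share no linking pages.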

    Connecting every bit of knowledge: The Structure of Wikipedia’s first link network

    Apples, porcupines, and the most obscure Bob Dylan song: is every topic a few clicks from Philosophy? Within Wikipedia, the surprising answer is yes: nearly all paths lead to Philosophy. Wikipedia is the largest, most meticulously indexed collection of human knowledge ever amassed. More than information about a topic, Wikipedia is a web of naturally emerging relationships. By following the first link in each article, we algorithmically construct a directed network of all 4.7 million articles: Wikipedia's First Link Network. Here we study the English edition of Wikipedia's First Link Network for insight into how the many inventions, places, people, objects, and events are related and organized. We traverse every path, measuring the accumulation of first links, path lengths, basins, cycles, and the influence each article exerts in shaping the network. We discover that scale-free distributions describe path length, accumulation, and influence. Far from dispersed, first links disproportionately accumulate at a few articles, flowing from specific to general and culminating around fundamental notions such as Community, State, and Science. Philosophy shapes more paths than any other article by two orders of magnitude. Curiously, we also observe a gravitation towards topical articles such as Health Care and Fossil Fuel. These findings enrich our view of the connections and structure of Wikipedia's ever-growing store of knowledge.
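
    The traversal described above can be sketched roughly as follows, assuming the first link of each article has already been extracted into a dictionary. The names `follow_first_links` and `path_traffic` and the toy data are hypothetical; this is an illustration of the idea, not the authors' implementation.

```python
from collections import Counter


def follow_first_links(first_link, start):
    """Follow first links from `start` until a dead end or a repeated article.

    Returns (path, cycle): the articles visited in order, and the cycle the
    path ends in (an empty list if the path stops at a dead end).
    """
    path = [start]
    seen = {start}
    current = start
    while current in first_link:
        nxt = first_link[current]
        if nxt in seen:
            # The path has looped back onto itself; report the cycle.
            return path, path[path.index(nxt):]
        path.append(nxt)
        seen.add(nxt)
        current = nxt
    return path, []


def path_traffic(first_link):
    """Count, for each article, how many other articles' paths reach it."""
    counts = Counter()
    for article in first_link:
        path, _ = follow_first_links(first_link, article)
        counts.update(path[1:])  # skip the starting article itself
    return counts


# Hypothetical toy network: article -> target of its first link.
toy = {
    "Apple": "Fruit",
    "Fruit": "Botany",
    "Botany": "Science",
    "Science": "Knowledge",
    "Knowledge": "Philosophy",
    "Philosophy": "Knowledge",
}
path, cycle = follow_first_links(toy, "Apple")
traffic = path_traffic(toy)
```

    In this toy network every path ends in the two-article cycle between Knowledge and Philosophy, which is the kind of accumulation the abstract reports at much larger scale.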

    A Unified multilingual semantic representation of concepts

    Semantic representation lies at the core of several applications in Natural Language Processing. However, most existing semantic representation techniques cannot be used effectively for the representation of individual word senses. We put forward a novel multilingual concept representation, called MUFFIN, which not only enables accurate representation of word senses in different languages, but also provides multiple advantages over existing approaches. MUFFIN represents a given concept in a unified semantic space irrespective of the language of interest, enabling cross-lingual comparison of different concepts. We evaluate our approach on two different evaluation benchmarks, semantic similarity and Word Sense Disambiguation, reporting state-of-the-art performance on several standard datasets.

    A Trio Neural Model for Dynamic Entity Relatedness Ranking

    Measuring entity relatedness is a fundamental task for many natural language processing and information retrieval applications. Prior work often studies entity relatedness in static settings and in an unsupervised manner. However, entities in the real world are often involved in many different relationships; consequently, entity relations are highly dynamic over time. In this work, we propose a neural network-based approach for dynamic entity relatedness, leveraging collective attention as supervision. Our model is capable of learning rich and different entity representations in a joint framework. Through extensive experiments on large-scale datasets, we demonstrate that our method achieves better results than competitive baselines. Comment: In Proceedings of CoNLL 201

    Educational tool based on topology and evolution of hyperlinks in the Wikipedia

    We propose a new method to support educational exploration in the hyperlink network of the Wikipedia online encyclopedia. The learner is provided with alternative parallel ranking lists, each one promoting hyperlinks that represent a different pedagogical perspective on the desired learning topic. The learner can browse the conceptual relations between the latest versions of articles or the conceptual relations belonging to consecutive temporal versions of an article, or a mixture of both approaches. Based on her needs and intuition, the learner explores the hyperlink network while the method automatically builds concept maps that reflect her conceptualization process and can be used for varied educational purposes. Initial experiments with a prototype tool based on the method indicate enhancement to ordinary learning results and suggest further research. Peer reviewed.

    Guided generation of pedagogical concept maps from the Wikipedia

    We propose a new method for guided generation of concept maps from open-access online knowledge resources such as wikis. Based on this method we have implemented a prototype that extracts semantic relations from sentences surrounding hyperlinks in Wikipedia's articles and lets a learner create customized learning objects in real time, based on collaborative recommendations that consider her earlier knowledge. Open source modules enable pedagogically motivated exploration in wiki spaces, corresponding to an intelligent tutoring system. The method extracted compact noun–verb–noun phrases, suggested for labeling arcs between nodes that were labeled with article titles. On average, 80 percent of these phrases were useful while their length was only 20 percent of the length of the original sentences. Experiments indicate that even simple analysis algorithms can effectively support user-initiated information retrieval and the building of intuitive learning objects that follow the learner's needs. Peer reviewed.

    Creating a Phrase Similarity Graph From Wikipedia

    The paper addresses the problem of modeling the relationship between phrases in English using a similarity graph. The mathematical model stores data about the strength of the relationship between phrases, expressed as a decimal number. The graph is built from both structured data from Wikipedia (for example, that the Wikipedia page with title “Dog” belongs to the Wikipedia category “Domesticated animals”) and textual descriptions (for example, that the Wikipedia page with title “Dog” contains the word “wolf” thirty-one times). The quality of the graph data is validated by comparing the phrase-pair similarities produced by our software, which uses the graph, with the results of studies performed with human subjects. To the best of our knowledge, our software produces better correlation with the results of both the Miller and Charles study and the WordSimilarity-353 study than any other published research.
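
    The validation step can be illustrated with a small correlation check. The abstract does not say which correlation coefficient was used, so this sketch assumes Pearson correlation; the function name `pearson` and all scores below are hypothetical and are not results from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for the same four phrase pairs.
human_ratings = [3.9, 3.0, 1.2, 0.4]      # made-up human judgments
graph_scores = [0.92, 0.70, 0.35, 0.05]   # made-up similarity-graph output
agreement = pearson(human_ratings, graph_scores)
```

    A value of `agreement` close to 1 would indicate that the graph ranks and spaces the phrase pairs much as the human raters did.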