5 research outputs found

    Temporal Expressions in Wikipedia Biographical Articles

    The purpose of this Bachelor's thesis is to study the use of temporal expressions in Wikipedia biographical articles: the distribution of birth years, the distribution of temporal expression types, the distribution of years in temporal expressions, the granularity of temporal expressions, and temporal richness. On that basis, it assesses how suitable temporal expressions are for the temporal-semantic comparison of Wikipedia biographical articles. The thesis gives an overview of the data mining and processing of Wikipedia biographical articles and of the temporal expression tagging and statistics-generation processes, presents statistics on temporal expressions in Wikipedia biographical articles, and proposes a method for comparing articles on the basis of temporal expressions, together with an initial evaluation of that method. Keywords: temporal expression, Wikipedia, biography, data mining. CERCS: P175 Informatics, systems theor

    Corpus Linguistics and 17th-Century Prostitution

    Corpus linguistics has much to offer history, since both disciplines engage heavily in the analysis of large amounts of textual material. This book demonstrates the opportunities for exploring corpus linguistics as a method in historiography and in the humanities and social sciences more generally. Focusing on prostitution in 17th-century England, it shows how corpus methods can assist social research and deepen our understanding. McEnery and Baker draw principally on two sources: the newsbook Mercurius Fumigosus and the Early English Books Online Corpus. This scholarship on prostitution and the sex trade offers insight into the social position of women in history.

    Context & Semantics in News & Web Search


    Supporting Exploration of Historical Perspectives Across Collections

    The ever-growing number of textual historical collections calls for methods that can meaningfully connect and explore them. Different collections offer different perspectives, expressing views held at the time of writing or even the subjective view of the author. We propose to connect heterogeneous digital collections through the temporal references found in documents as well as through their textual content. We evaluate our approach and find that it works very well on digital-native collections. Digitized collections pose interesting challenges, but with improved preprocessing our approach performs well on them too. We introduce a novel search interface for exploring and analyzing the connected collections; it highlights different perspectives and requires little domain knowledge. In our approach, perspectives are expressed as complex queries. Our approach supports humanities scholars in exploring collections in a novel way and makes digital collections more accessible by adding new connections and new means of access.
