    Cross-Lingual and Cross-Chronological Information Access to Multilingual Historical Documents

    In this chapter, we present our work on realizing information access across different languages and periods. Digital collections of historical documents now have to handle materials written in many different languages and in different time periods. Even within a single language, grammar, vocabulary and script differ significantly over time. Our goal is to develop a method for accessing digital collections that span a wide range of periods, from ancient to modern. We introduce an information extraction method for digitized ancient Mongolian historical manuscripts to reduce labour-intensive analysis. The proposed method performs computerized analysis of Mongolian historical documents. Named entities such as personal names and place names are extracted using a support vector machine. The extracted named entities are used to create a digital edition that reflects an ancient Mongolian historical manuscript written in traditional Mongolian script. The Text Encoding Initiative guidelines are adopted to encode the named entities, transcriptions and interpretations of ancient words. A web-based prototype system is developed for using digital editions of ancient Mongolian historical manuscripts as scholarly tools. The prototype can display and search traditional Mongolian text and its transliteration in Latin letters, along with the highlighted named entities and the scanned images of the source manuscript.

    Arabic Manuscripts Analysis and Retrieval

    Arabic Manuscript Layout Analysis and Classification

    Studies on User Intent Analysis and Mining

    Predicting the goals of users can be extremely useful in e-commerce, online entertainment, information retrieval, and many other online services and applications. In this thesis, we study the task of user intent understanding, trying to bridge the gap between what users express to online services and the goals behind those expressions. As far as we know, most existing user intent studies focus on the web search and social media domains; other areas remain understudied. For example, as people rely more and more on cellphones in daily life, the information needs they express to mobile devices and related services are increasing dramatically, yet few studies address user intent mining on mobile devices. The intentions behind mobile device use also differ from those behind the use of web search engines or social networks, so existing work on user intent cannot be applied directly to this area. In addition, users' intents are not stable but change over time, and different interests influence one another. Modeling such dynamic user interests can help to understand and predict user intent accurately, but there is little existing work in this area. Moreover, user intent can be expressed explicitly or implicitly. Implicit intent expression is closer to natural human language and is valuable to recognize and mine. To study these challenges further, we first try to answer the question “What is the user intent?” Drawing on a number of previous studies, we define user intent as follows: “User intent is a task-specific, predefined or latent concept, topic or knowledge-base that is under an expression from a user who is trying to express his goal of information or service need.” We then focus on the scenario in which a user uses a cellphone while driving and study user intent in this domain. As far as we know, this is the first user intent analysis and categorization in this domain. We also build a dataset of user inputs with their intent categories and attributes through crowdsourcing and careful handcrafting. With the user intent taxonomy and dataset in hand, we perform user intent classification and user intent attribute recognition with supervised machine learning models. To classify the intent of a user query, we use a convolutional neural network to build a multi-class classifier, and we then use a sequence labeling method to recognize the intent attributes in the query. The experimental results show that our proposed method outperforms several baseline models in precision, recall, and F-score. In addition, we study implicit user intent mining from web search log data. Using a Restricted Boltzmann Machine, we exploit the correlation between query and click information to learn the latent intent behind a user's web search. We propose a user intent prediction model for online discussion forums using a Multivariate Hawkes Process, which dynamically models how user intentions change and interact over time. The method models both the internal and external factors behind users' forum response motivations and also incorporates the time decay of users' interests. We also present a data visualization method that uses an enriched domain ontology to highlight domain-specific words and entity relations within an article. Ph.D., Information Studies -- Drexel University, 201

    Traces of the Animal Past

    Understanding the relationships between humans and animals is essential to a full understanding of both our present and our shared past. Across the humanities and social sciences, researchers have embraced the ‘animal turn,’ a multispecies approach to scholarship, with historians at the forefront of new research in human-animal studies that blends traditional research methods with interdisciplinary theoretical frameworks that decenter humans in historical narratives. These exciting approaches come with core methodological challenges for scholars seeking to better understand the past from non-anthropocentric perspectives. Whether in a large public archive, a small private collection, or the oral histories of living memories, stories of animals are mediated by the humans who have inscribed the records and organized archival collections. In oral histories, the place of animals in the past is further refracted by the frailty of human memory and recollection. Only traces remain for researchers to read and interpret. Bringing together seventeen original essays by a leading group of international scholars, Traces of the Animal Past showcases the innovative methods historians use to unearth and explain how animals fit into our collective histories. Situating the historian within the narrative, bringing transparency to methodological processes, and reflecting on the processes and procedures of current research, this book presents new approaches and new directions for a maturing field of historical inquiry.

    The Classification of Religions: A domain-analytic examination of the history and epistemology of the classification of religions within the Religious Studies discipline

    While religion is a part of every culture and is entangled in many facets of the lives of those who are religious, the scientific study of religion and the Religious Studies discipline are fairly new, having developed only in the mid to late nineteenth century. One contribution of the scientific study of religions is the development of different approaches for classifying religions. As a multidisciplinary field, Religious Studies and the classification of religions have been influenced by philosophy, psychology, history, sociology and anthropology. This study, using the domain-analytic paradigm, traces the development of the Religious Studies discipline and the classification of religions, analyzes the epistemological assumptions behind the prominent approaches used to classify religions, and briefly examines their relation to the Library of Congress, Dewey Decimal and Universal Decimal classifications.