1,418 research outputs found

    FAQchat as an Information Retrieval system

    A chatbot is a conversational agent that interacts with users through natural language. In this paper, we describe a new way to access information using a chatbot. The FAQ of the School of Computing at the University of Leeds was used to retrain the ALICE chatbot system, producing FAQchat. The results returned by FAQchat are similar to those generated by search engines such as Google. For evaluation, FAQchat was compared with Google. The main objective is to demonstrate that FAQchat is a viable alternative to Google and can be used as a tool to access FAQ databases.

    The Challenging Balance of Being a Physician in Training, a Public Health Graduate Student, and Having a Life – A Commentary

    Success as both a resident physician and a graduate student is challenged by many competing demands. In this paper I comment on these challenges and offer some thoughts on finding a work-life balance that suits me. Keys to this balance include time management, flexibility, self-care, and frequent reflection on goals and priorities. Although demanding, the challenges of this balance also keep life vibrant and rewarding.

    Sunnah Arabic Corpus: Design and Methodology.

    Sunnah Arabic Corpus is an annotated linguistic resource consisting of 144K words/170K tokens of Hadith narratives (utterances attributed to the Prophet Mohammed) extracted from the book Riyāḍu Aṣṣāliḥīn. As a first layer of annotation, the corpus has been fully diacritized. In addition, each orthographic word/token is segmented into its syntactic words, and each syntactic word is tagged with its part of speech and multiple morphological features. Several Hadith translations in different languages are provided and aligned at the narrative/paragraph level. Sunnah Arabic Corpus follows the standards of the successful Quranic Arabic Corpus (corpus.quran.com). Sunnah Arabic Corpus is freely available under the Creative Commons Attribution-ShareAlike 4.0 International License.

    Arabic dialects annotation using an online game

    Modern Standard Arabic is the written standard across the Arab world, but Arabic dialects are increasingly used in social media, which makes social media an appropriate source of a corpus for research on classifying Arabic dialect texts with machine learning algorithms. An important first step is annotating the text corpus with correct dialect tags. We collected tweets from Twitter and comments from Facebook and online newspapers, aiming for representative samples of five groups of Arabic dialects: Gulf, Iraqi, Egyptian, Levantine, and North African. We then explored a crowdsourcing approach to corpus annotation. The annotation task was developed as an online game in which players can test their dialect classification skills and receive a score reflecting their knowledge. This approach has so far yielded 24K annotated documents containing 587K tokens: 16,179 documents tagged as dialect and 7,821 as Modern Standard Arabic.

    Unsupervised grammar inference systems for natural language

    In recent years there have been significant advances in the field of Unsupervised Grammar Inference (UGI) for natural languages such as English or Dutch. This paper presents a broad range of UGI implementations, in which we can begin to see how the theory has been put into practice. Several mature systems are emerging, built using complex models and capable of deriving natural language grammatical phenomena. The systems are classified into: models based on Categorial Grammar (GraSp, CLL, EMILE); Memory Based Learning models (FAMBL, RISE); evolutionary computing models (ILM, LAgts); and string-pattern searches (ABL, GB). An objectively measurable statistical comparison of the performance of the systems reviewed is not yet feasible. However, their merits and shortfalls are discussed, along with the future prospects for UGI.

    Constructing a Bilingual Hadith Corpus Using a Segmentation Tool

    This article describes the process of gathering and constructing a bilingual parallel corpus of Islamic Hadith, the set of narratives reporting different aspects of the Prophet Muhammad’s life. The corpus data is gathered from the six canonical Hadith collections using a custom segmentation tool that automatically segments and annotates the two Hadith components with 92% accuracy. This Hadith segmenter minimises the costs of language resource creation and produces consistent results independently of the previous knowledge and experience that usually influence human annotators. The corpus includes more than 10M tokens and will be freely available via the LREC repository.

    Text Segmentation Using N-grams to Annotate Hadith Corpus

    Analyses of risks associated with radiation exposure from past major solar particle events

    Radiation exposures and cancer induction/mortality risks were investigated for several major solar particle events (SPEs). The SPEs included are: February 1956, November 1960, August 1972, October 1989, and the August, September, and October 1989 events combined. The three 1989 events were treated as one since all three could affect a single lunar or Mars mission. A baryon transport code was used to propagate particles through aluminum and tissue shield materials. A free space environment was utilized for all calculations. Results show that the 30-day blood forming organs (BFO) limit of 25 rem was surpassed by all five events using 10 g/sq cm of shielding. The BFO limit is based on the depth dose at 5 cm of tissue; a more detailed shield distribution of the BFOs was also evaluated. A comparison between the 5 cm depth dose and the dose found using the BFO shield distribution shows that the 5 cm depth value is slightly higher than the BFO dose. The annual limit of 50 rem was exceeded by the August 1972, October 1989, and the three combined 1989 events with 5 g/sq cm of shielding. Cancer mortality risks ranged from 1.5 to 17 percent at 1 g/sq cm and 0.5 to 1.1 percent behind 10 g/sq cm of shielding for the five events. These ranges correspond to those for a 45-year-old male. It is shown that secondary particles comprise about 1/3 of the total risk at 10 g/sq cm of shielding. Using a computerized Space Shuttle shielding model to represent a typical spacecraft configuration in free space during the August 1972 SPE, average crew doses exceeded the BFO dose limit.