2,731 research outputs found

    An Ontology based Text-to-Picture Multimedia m-Learning System

    Get PDF
    Multimedia Text-to-Picture is the process of building mental representation from words associated with images. From the research aspect, multimedia instructional message items are illustrations of material using words and pictures that are designed to promote user realization. Illustrations can be presented in a static form such as images, symbols, icons, figures, tables, charts, and maps; or in a dynamic form such as animation, or video clips. Due to the intuitiveness and vividness of visual illustration, many text to picture systems have been proposed in the literature like, Word2Image, Chat with Illustrations, and many others as discussed in the literature review chapter of this thesis. However, we found that some common limitations exist in these systems, especially for the presented images. In fact, the retrieved materials are not fully suitable for educational purposes. Many of them are not context-based and didn’t take into consideration the need of learners (i.e., general purpose images). Manually finding the required pedagogic images to illustrate educational content for learners is inefficient and requires huge efforts, which is a very challenging task. In addition, the available learning systems that mine text based on keywords or sentences selection provide incomplete pedagogic illustrations. This is because words and their semantically related terms are not considered during the process of finding illustrations. In this dissertation, we propose new approaches based on the semantic conceptual graph and semantically distributed weights to mine optimal illustrations that match Arabic text in the children’s story domain. We combine these approaches with best keywords and sentences selection algorithms, in order to improve the retrieval of images matching the Arabic text. Our findings show significant improvements in modelling Arabic vocabulary with the most meaningful images and best coverage of the domain in discourse. We also develop a mobile Text-to-Picture System that has two novel features, which are (1) a conceptual graph visualization (CGV) and (2) a visual illustrative assessment. The CGV shows the relationship between terms associated with a picture. It enables the learners to discover the semantic links between Arabic terms and improve their understanding of Arabic vocabulary. The assessment component allows the instructor to automatically follow up the performance of learners. Our experiments demonstrate the efficiency of our multimedia text-to-picture system in enhancing the learners’ knowledge and boost their comprehension of Arabic vocabulary

    Geospatial Analysis and Modeling of Textual Descriptions of Pre-modern Geography

    Get PDF
    Textual descriptions of pre-modern geography offer a different view of classical geography. The descriptions have been produced when none of the modern geographical concepts and tools were available. In this dissertation, we study pre-modern geography by primarily finding the existing structures of the descriptions and different cases of geographical data. We first explain four major geographical cases in pre-modern Arabic sources: gazetteer, administrative hierarchies, routes, and toponyms associated with people. Focusing on hierarchical divisions and routes, we offer approaches for manual annotation of administrative hierarchies and route sections as well as a semi-automated toponyms annotation. The latter starts with a fuzzy search of toponyms from an authority list and applies two different extrapolation models to infer true or false values, based on the context, for disambiguating the automatically annotated toponyms. Having the annotated data, we introduce mathematical models to shape and visualize regions based on the description of administrative hierarchies. Moreover, we offer models for comparing hierarchical divisions and route networks from different sources. We also suggest approaches to approximate geographical coordinates for places that do not have geographical coordinates - we call them unknown places - which is a major issue in visualization of pre-modern places on map. The final chapter of the dissertation introduces the new version of al-Ṯurayyā, a gazetteer and a spatial model of the classical Islamic world using georeferenced data of a pre-modern atlas with more than 2, 000 toponyms and routes. It offers search, path finding, and flood network functionalities as well as visualizations of regions using one of the models that we describe for regions. However the gazetteer is designed using the classical Islamic world data, the spatial model and features can be used for similarly prepared datasets.:1 Introduction 1 2 Related Work 8 2.1 GIS 8 2.2 NLP, Georeferencing, Geoparsing, Annotation 10 2.3 Gazetteer 15 2.4 Modeling 17 3 Classical Geographical Cases 20 3.1 Gazetteer 21 3.2 Routes and Travelogues 22 3.3 Administrative Hierarchy 24 3.4 Geographical Aspects of Biographical Data 25 4 Annotation and Extraction 27 4.1 Annotation 29 4.1.1 Manual Annotation of Geographical Texts 29 4.1.1.1 Administrative Hierarchy 30 4.1.1.2 Routes and Travelogues 32 4.1.2 Semi-Automatic Toponym Annotation 34 4.1.2.1 The Annotation Process 35 4.1.2.2 Extrapolation Models 37 4.1.2.2.1 Frequency of Toponymic N-grams 37 4.1.2.2.2 Co-occurrence Frequencies 38 4.1.2.2.3 A Supervised ML Approach 40 4.1.2.3 Summary 45 4.2 Data Extraction and Structures 45 4.2.1 Administrative Hierarchy 45 4.2.2 Routes and Distances 49 5 Modeling Geographical Data 51 5.1 Mathematical Models for Administrative Hierarchies 52 5.1.1 Sample Data 53 5.1.2 Quadtree 56 5.1.3 Voronoi Diagram 58 5.1.4 Voronoi Clippings 62 5.1.4.1 Convex Hull 62 5.1.4.2 Concave Hull 63 5.1.5 Convex Hulls 65 5.1.6 Concave Hulls 67 5.1.7 Route Network 69 5.1.8 Summary of Models for Administrative Hierarchy 69 5.2 Comparison Models 71 5.2.1 Hierarchical Data 71 5.2.1.1 Test Data 73 5.2.2 Route Networks 76 5.2.2.1 Post-processing 81 5.2.2.2 Applications 82 5.3 Unknown Places 84 6 Al-Ṯurayyā 89 6.1 Introducing al-Ṯurayyā 90 6.2 Gazetteer 90 6.3 Spatial Model 91 6.3.1 Provinces and Administrative Divisions 93 6.3.2 Pathfinding and Itineraries 93 6.3.3 Flood Network 96 6.3.4 Path Alignment Tool 97 6.3.5 Data Structure 99 6.3.5.1 Places 100 6.3.5.2 Routes and Distances 100 7 Conclusions and Further Work 10

    Benchmarking Arabic AI with Large Language Models

    Full text link
    With large Foundation Models (FMs), language technologies (AI in general) are entering a new paradigm: eliminating the need for developing large-scale task-specific datasets and supporting a variety of tasks through set-ups ranging from zero-shot to few-shot learning. However, understanding FMs capabilities requires a systematic benchmarking effort by comparing FMs performance with the state-of-the-art (SOTA) task-specific models. With that goal, past work focused on the English language and included a few efforts with multiple languages. Our study contributes to ongoing research by evaluating FMs performance for standard Arabic NLP and Speech processing, including a range of tasks from sequence tagging to content classification across diverse domains. We start with zero-shot learning using GPT-3.5-turbo, Whisper, and USM, addressing 33 unique tasks using 59 publicly available datasets resulting in 96 test setups. For a few tasks, FMs performs on par or exceeds the performance of the SOTA models but for the majority it under-performs. Given the importance of prompt for the FMs performance, we discuss our prompt strategies in detail and elaborate on our findings. Our future work on Arabic AI will explore few-shot prompting, expand the range of tasks, and investigate additional open-source models.Comment: Foundation Models, Large Language Models, Arabic NLP, Arabic Speech, Arabic AI, , CHatGPT Evaluation, USM Evaluation, Whisper Evaluatio

    Can humain association norm evaluate latent semantic analysis?

    Get PDF
    This paper presents the comparison of word association norm created by a psycholinguistic experiment to association lists generated by algorithms operating on text corpora. We compare lists generated by Church and Hanks algorithm and lists generated by LSA algorithm. An argument is presented on how those automatically generated lists reflect real semantic relations

    A text mining approach for Arabic question answering systems

    Get PDF
    As most of the electronic information available nowadays on the web is stored as text,developing Question Answering systems (QAS) has been the focus of many individualresearchers and organizations. Relatively, few studies have been produced for extractinganswers to “why” and “how to” questions. One reason for this negligence is that when goingbeyond sentence boundaries, deriving text structure is a very time-consuming and complexprocess. This thesis explores a new strategy for dealing with the exponentially large spaceissue associated with the text derivation task. To our knowledge, to date there are no systemsthat have attempted to addressing such type of questions for the Arabic language.We have proposed two analytical models; the first one is the Pattern Recognizer whichemploys a set of approximately 900 linguistic patterns targeting relationships that hold withinsentences. This model is enhanced with three independent algorithms to discover thecausal/explanatory role indicated by the justification particles. The second model is the TextParser which is approaching text from a discourse perspective in the framework of RhetoricalStructure Theory (RST). This model is meant to break away from the sentence limit. TheText Parser model is built on top of the output produced by the Pattern Recognizer andincorporates a set of heuristics scores to produce the most suitable structure representing thewhole text.The two models are combined together in a way to allow for the development of an ArabicQAS to deal with “why” and “how to” questions. The Pattern Recognizer model achieved anoverall recall of 81% and a precision of 78%. On the other hand, our question answeringsystem was able to find the correct answer for 68% of the test questions. Our results revealthat the justification particles play a key role in indicating intrasentential relations

    Theory and Applications for Advanced Text Mining

    Get PDF
    Due to the growth of computer technologies and web technologies, we can easily collect and store large amounts of text data. We can believe that the data include useful knowledge. Text mining techniques have been studied aggressively in order to extract the knowledge from the data since late 1990s. Even if many important techniques have been developed, the text mining research field continues to expand for the needs arising from various application fields. This book is composed of 9 chapters introducing advanced text mining techniques. They are various techniques from relation extraction to under or less resourced language. I believe that this book will give new knowledge in the text mining field and help many readers open their new research fields
    corecore