228 research outputs found

    Automatic Arabic Text Summarization System (AATSS) Based on Semantic Feature Extraction

    Recently, one of the problems that has arisen from the amount of information on the web and its availability is the increased need for an effective and powerful tool to automatically summarize text. For English and European languages, intensive work has been done with high performance, and researchers now look toward multi-document and multi-language summarization. However, the Arabic language still suffers from the little attention and research done in this field. In our research we propose a model to automatically summarize Arabic text using text extraction. The approach involves several steps: preprocessing the text, extracting a set of features from sentences, classifying sentences based on a scoring method, ranking sentences, and finally generating an extract summary. The main difference between our proposed system and other Arabic summarization systems is the consideration of semantics, entity objects such as names and places, and similarity factors. The proposed system has been applied to the news domain using a dataset obtained from the Falesteen newspaper. Manual evaluation techniques are used to evaluate and test the system. The results obtained by the proposed method achieve 86.5% similarity between the system and human summarization. A comparative study between our proposed system and the Sakhr Arabic online summarization system has been conducted. The results show that our proposed system outperforms the Sakhr system.
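
    The pipeline named in this abstract (preprocess, extract features, score, rank, generate an extract) can be illustrated with a minimal sketch. It is not the authors' system: the abstract does not specify the semantic, entity, or similarity features, so average word frequency stands in as the only feature here, and the function names `split_sentences`, `tokenize`, `score_sentences`, and `summarize` are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive preprocessing: split after ., !, ? or the Arabic question mark.
    # A real system would use a language-aware sentence segmenter.
    parts = re.split(r"(?<=[.!?\u061F])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    # \w matches Unicode letters, so Arabic words are kept as tokens.
    return re.findall(r"\w+", sentence.lower())


def score_sentences(sentences):
    # Stand-in feature (an assumption): mean document-level word frequency.
    freq = Counter(tok for s in sentences for tok in tokenize(s))
    scores = []
    for s in sentences:
        toks = tokenize(s)
        scores.append(sum(freq[t] for t in toks) / len(toks) if toks else 0.0)
    return scores


def summarize(text, k=2):
    # Rank sentences by score, keep the top k, and emit them in source order.
    sentences = split_sentences(text)
    scores = score_sentences(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]
```

    Calling `summarize(text, k=2)` returns the two highest-scoring sentences in their original order; swapping in richer features only requires changing `score_sentences`.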

    A crowdsourcing recommendation model for image annotations in cultural heritage platforms

    Cultural heritage is one of many fields that has seen a significant digital transformation in the form of digitization and asset annotations for heritage preservation, inheritance, and dissemination. However, a lack of accurate and descriptive metadata in this field has an impact on the usability and discoverability of digital content, affecting cultural heritage platform visitors and resulting in an unsatisfactory user experience as well as limiting processing capabilities to add new functionalities. Over time, cultural heritage institutions were responsible for providing metadata for their collection items with the help of professionals, which is expensive and requires significant effort and time. In this sense, crowdsourcing can play a significant role in digital transformation or massive data processing, which can be useful for leveraging the crowd and enriching the metadata quality of digital cultural content. This paper focuses on a very important challenge faced by cultural heritage crowdsourcing platforms, which is how to attract users and make such activities enjoyable for them in order to achieve higher-quality annotations. One way to address this is to offer personalized interesting items based on each user's preferences, rather than making the user experience random and demanding. Thus, we present an image annotation recommendation system for users of cultural heritage platforms. The recommendation system design incorporates various technologies intending to help users in selecting the best matching images for annotations based on their interests and characteristics. Different classification methods were implemented to validate the accuracy of our work on Egyptian heritage. Agencia Estatal de Investigación | Ref. TIN2017-87604-R. Xunta de Galicia | Ref. ED431B 2020/3

    Language report for Catalan (English version)

    The central objective of the Metanet4u project is to contribute to the establishment of a pan-European digital platform that makes available language resources and services, encompassing both datasets and software tools, for speech and language processing, and supports a new generation of exchange facilities for them. Peer Reviewed. Preprint

    The 1st International Electronic Conference on Algorithms

    This book presents 22 of the accepted presentations at the 1st International Electronic Conference on Algorithms, which was held completely online from September 27 to October 10, 2021. It contains 16 proceeding papers as well as 6 extended abstracts. The works presented in the book cover a wide range of fields dealing with the development of algorithms. Many of the contributions are related to machine learning, in particular deep learning. Another main focus among the contributions is on problems dealing with graphs and networks, e.g., in connection with evacuation planning problems.

    From Scrolls to Scrolling: Sacred Texts, Materiality, and Dynamic Media Cultures

    Using the digital turn as a starting point, the essays in this volume explore the materiality of sacred texts in Judaism, Christianity, and Islam, along with transitions between various media cultures and material forms. The essays explore how material factors have shaped the production and transmission of sacred texts, as well as how they affect the way in which people engage with, use, and perform these texts, within and between religious traditions.

    Ensemble Morphosyntactic Analyser for Classical Arabic

    Classical Arabic (CA) is an influential language for Muslim lives around the world. It is the language of two sources of Islamic laws: the Quran and the Sunnah, the collection of traditions and sayings attributed to the prophet Mohammed. However, classical Arabic in general, and the Sunnah, in particular, is underexplored and under-resourced in the field of computational linguistics. This study examines the possible directions for adapting existing tools, specifically morphological analysers, designed for modern standard Arabic (MSA) to classical Arabic. Morphological analysers of CA are limited, as well as the data for evaluating them. In this study, we adapt existing analysers and create a validation data-set from the Sunnah books. Inspired by the advances in deep learning and the promising results of ensemble methods, we developed a systematic method for transferring morphological analysis that is capable of handling different labelling systems and various sequence lengths. In this study, we handpicked the best four open access MSA morphological analysers. Data generated from these analysers are evaluated before and after adaptation through the existing Quranic Corpus and the Sunnah Arabic Corpus. The findings are as follows: first, it is feasible to analyse under-resourced languages using existing comparable language resources given a small sufficient set of annotated text. Second, analysers typically generate different errors and this could be exploited. Third, an explicit alignment of sequences and the mapping of labels is not necessary to achieve comparable accuracies given a sufficient size of training dataset. Adapting existing tools is easier than creating tools from scratch. The resulting quality is dependent on training data size and number and quality of input taggers. Pipeline architecture performs less well than the End-to-End neural network architecture due to error propagation and limitation on the output format. 
    A valuable tool and data for annotating classical Arabic are made freely available.
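
    The observation that analysers make different errors, and that this can be exploited, can be illustrated with the simplest ensemble: a per-token majority vote. This is a baseline sketch, not the neural ensemble the abstract describes. It assumes the taggers' outputs are already aligned token by token and mapped to one label set, which is exactly the step the abstract says its method does not need. The name `majority_vote` and the tags in the usage note are hypothetical.

```python
from collections import Counter


def majority_vote(tag_sequences):
    """Combine aligned tag sequences from several taggers, one token at a time."""
    if not tag_sequences:
        return []
    length = len(tag_sequences[0])
    if any(len(seq) != length for seq in tag_sequences):
        raise ValueError("all tag sequences must have the same length")

    voted = []
    for position_tags in zip(*tag_sequences):
        counts = Counter(position_tags)
        best = max(counts.values())
        # On a tie, prefer the tag proposed by the earliest tagger in the list.
        voted.append(next(t for t in position_tags if counts[t] == best))
    return voted
```

    For three taggers that output `["N", "V", "P"]`, `["N", "N", "P"]`, and `["V", "V", "P"]`, the vote yields `["N", "V", "P"]`: each single-tagger error is outvoted by the other two.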

    Modeling second language learners' interlanguage and its variability: a computer-based dynamic assessment approach to distinguishing between errors and mistakes

    Despite a long history, interlanguage variability research is a debatable topic as most paradigms do not distinguish between competence and performance. While interlanguage performance has been proven to be variable, determining whether interlanguage competence is exposed to random and/or systematic variations is complex, given that a distinction between competence-dependent errors and performance-related mistakes should be established to best represent the interlanguage competence. This thesis suggests a dynamic assessment model grounded in sociocultural theory to distinguish between errors and mistakes in texts written by learners of French, to then investigate the extent to which interlanguage competence varies across time, text types, and students. The key outcomes include: 1. An expanded model based on dynamic assessment principles to distinguish between errors and mistakes, which also provides the structure to create and observe learners’ zone of proximal development; 2. A method to increase the accuracy of the part-of-speech tagging procedure whose reliability correlates with the number of incorrect words contained in learners’ texts; 3. A sociocultural insight into interlanguage variability research. Results demonstrate that interlanguage competence is as variable as performance. The main finding shows that knowledge over time is subject to not only systematic, but also unsystematic variations.

    Designing MOOC: a shared view on didactical principles

    The innovative impact of the paper can be highlighted by the following statements: 1. Applying Group Concept Mapping, a non-traditional and powerful research methodology for objectively identifying the shared vision of a group of experts on MOOC didactical principles. 2. Defining MOOC didactical principles and their operationalisations in more concrete guidelines. 3. Formulating suggestions for combining xMOOC and cMOOC. Supported by the European Commission, DG EAC, under the Erasmus+ Programme