57 research outputs found

    An application of distributional semantics for the analysis of the Holy Quran

    In this contribution we describe the methodology and results of an experiment in which we applied Distributional Semantics Models to the analysis of the Holy Quran. Our aim was to gather information on the potential differences in meaning that the same words take on in Modern Standard Arabic compared with their usage in the Quran. To do so, we used the Penn Arabic Treebank as a contrastive corpus
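
    The abstract does not say which distributional model was used, so the following is only a minimal count-based sketch of the general idea: build a context vector for the same word in two corpora and compare the vectors. The window size, the function names, and the two toy corpora (English placeholders, not data from the study) are all assumptions made for illustration.

```python
from collections import Counter
import math


def context_vector(tokens, target, window=2):
    """Count the words that occur within +/- `window` tokens of `target`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpora standing in for the two text collections.
corpus_a = "the book of guidance is a book of light".split()
corpus_b = "she read the book and returned the book to the library".split()

vec_a = context_vector(corpus_a, "book")
vec_b = context_vector(corpus_b, "book")

# A low similarity suggests the word keeps different company in each corpus.
similarity = cosine(vec_a, vec_b)
```

    In practice the raw counts would usually be reweighted and computed over far larger corpora; the sketch only shows the shape of the comparison.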

    Towards a flexible open-source software library for multi-layered scholarly textual studies: An Arabic case study dealing with semi-automatic language processing

    This paper presents both the general model and a case study of the Computational and Collaborative Philology Library (CoPhiLib), an ongoing initiative at the Institute for Computational Linguistics (ILC) of the National Research Council (CNR), Pisa, Italy. The library, designed and organized as a reusable, abstract and open-source software component, aims to meet the needs of multi-lingual and cross-lingual analysis by exposing common Application Programming Interfaces (APIs). The core modules, written in Java, form the groundwork of a Web platform designed for textual scholarship. The Web application, implemented according to the Java Enterprise specifications, focuses on multi-layered analysis for the study of literary documents and related multimedia sources. The ambitious goal is to manage textual resources by abstracting from the particular language on the one hand and by decoupling from the specific requirements of individual projects on the other. This goal is pursued through methodologies drawn from the 'agile process' and through suitable use case modeling, design patterns, and component-based architectures. The reusability and flexibility of the system have been tested on an Arabic case study: the system allows users to choose the morphological engine (such as AraMorph or Al-Khalil), along with the linguistic granularity (i.e. with or without declension). Finally, the application enables the construction of annotated resources that can serve as training sets for further statistical engines. © 2014 IEEE
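
    The idea of choosing a morphological engine and a granularity behind one common interface can be sketched as follows. This is not CoPhiLib's API (the library itself is written in Java): the `MorphologicalEngine` protocol, the `Analysis` record, the two toy engines and the `annotate` function are all hypothetical names, and the toy engines return made-up analyses rather than calling AraMorph or Al-Khalil.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol


@dataclass(frozen=True)
class Analysis:
    """One candidate analysis of a surface form (hypothetical structure)."""
    lemma: str
    pos: str
    declension: Optional[str] = None  # None when declension is not recorded


class MorphologicalEngine(Protocol):
    """Common interface that every pluggable engine is assumed to satisfy."""
    def analyze(self, word: str) -> List[Analysis]: ...


class ToyEngineA:
    """Stand-in engine returning a single made-up analysis per word."""
    def analyze(self, word: str) -> List[Analysis]:
        return [Analysis(word.lower(), "NOUN", "NOM")]


class ToyEngineB:
    """Stand-in engine returning two analyses that differ only in declension."""
    def analyze(self, word: str) -> List[Analysis]:
        return [
            Analysis(word.lower(), "NOUN", "NOM"),
            Analysis(word.lower(), "NOUN", "ACC"),
        ]


def annotate(words: List[str], engine: MorphologicalEngine,
             with_declension: bool = True) -> Dict[str, List[Analysis]]:
    """Analyze each word with the chosen engine at the chosen granularity."""
    result: Dict[str, List[Analysis]] = {}
    for word in words:
        analyses = engine.analyze(word)
        if not with_declension:
            # Coarser granularity: drop declension and merge duplicates.
            merged: List[Analysis] = []
            for a in analyses:
                coarse = Analysis(a.lemma, a.pos)
                if coarse not in merged:
                    merged.append(coarse)
            analyses = merged
        result[word] = analyses
    return result


fine = annotate(["Kitab"], ToyEngineB())
coarse = annotate(["Kitab"], ToyEngineB(), with_declension=False)
```

    Because the caller depends only on the protocol, swapping `ToyEngineB()` for another engine changes nothing else in the calling code, which is the decoupling the abstract describes.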

    Motore morfologico della lingua araba (Morphological engine for the Arabic language)

    The morphological engine has been designed to perform a double function: to generate automatically, from one Arabic entry, all its forms (including their morpho-syntactic classification); and to allow morphological analysis, that is, to go back from one form to the dictionary entry (or entries

    Commerce Numérique: traffic signals for crossroads between cultures.

    Commerce is a French literary journal, founded by Princess Margherita Caetani, which relied on the collaboration of three prestigious writers: Paul Valéry, Léon-Paul Fargue, and Valéry Larbaud. The journal consists of twenty-nine volumes published between 1924 and 1932. Each volume includes varied literary material, such as poems and novels, written by both well-known and unknown writers, who also translated important authors like Joyce, T.S. Eliot, Pirandello, Ungaretti, Saint-John Perse, Rilke, and Hofmannsthal. Given the historical, literary, and cultural importance of Commerce, our project “Commerce numérique” aims to digitize the journal and make its contents freely available online to both the general public and the research community. This article describes how the journal was encoded, with particular attention to the encoding of the poems it contains. Some poems appear in the original language accompanied by their French translation; others appear only in French translation, without the original text. In order to express these phenomena and their structures fully and accurately, we adopted some aspects of the TEI framework, which are explained in detail. The French translation of a Moroccan Arabic poem from the 13th century is also considered. The original Arabic poem is interesting because it presents aspects of both the Moroccan dialect and the oral text. Studying and encoding the Arabic poem in parallel with its translation highlights some important structural differences between Arabic poetry and Western poetry

    Enseigner l’« histoire des religions » Que faire de l’Antiquité ? (Teaching the "history of religions": what to do with Antiquity?)

    To teach not religions, still less to "transmit religious beliefs", but their history, in relation to contexts, milieus, major events and figures, and cultures: hence a history of distant spaces and differentiated temporalities. On this point, agreement is now nearly general, but this consensus leaves many problems unresolved, as shown by a twofold debate among specialists (and "interested parties") on the defini..

    AraMorph Data Plus

    The original AraMorph engine (https://sourceforge.net/projects/aramorph/files/aramorph/1.2.1) uses six linguistic files. Three are Arabic-English lexicon files: prefixes (299 entries), suffixes (618 entries), and stems (82158 entries representing 38600 lemmas). The other three are morphological compatibility tables that control prefix-stem combinations (1648 entries), stem-suffix combinations (1285 entries), and prefix-suffix combinations (598 entries). The present data consists of the updated lexical resources used by the AraMorph engine. The updates take advantage of a number of orthographic, morpho-syntactic and semantic constraints that operate at the word level. The updated Arabic-English lexicon files contain: prefixes (335 entries), suffixes (876 entries), and stems (35475 entries). Note that the number of stems is smaller in Plus than in Original, due to the removal of obsolete entries and of a number of foreign names that are unlikely to be found in Arabic texts. The updated morphological compatibility tables control prefix-stem combinations (2698 entries), stem-suffix combinations (2161 entries), and prefix-suffix combinations (1295 entries).
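
    How three lexicons and three compatibility tables can work together is sketched below. The abstract only states that the tables control the pairwise combinations; the segmentation loop, the function name `analyze`, the category labels and every lexicon entry here are toy assumptions for illustration, not the contents or format of the AraMorph files.

```python
# Toy lexicons: surface string -> morphological categories (all hypothetical).
PREFIXES = {"": ["Pref-0"], "wa": ["Pref-Wa"]}
STEMS = {"katab": ["PV"], "kitAb": ["N"]}
SUFFIXES = {"": ["Suff-0"], "a": ["PVSuff-a"], "u": ["NSuff-u"]}

# Toy compatibility tables: sets of category pairs allowed to co-occur.
PREFIX_STEM = {
    ("Pref-0", "PV"), ("Pref-0", "N"),
    ("Pref-Wa", "PV"), ("Pref-Wa", "N"),
}
STEM_SUFFIX = {("PV", "PVSuff-a"), ("N", "NSuff-u"), ("N", "Suff-0")}
PREFIX_SUFFIX = {
    ("Pref-0", "PVSuff-a"), ("Pref-0", "NSuff-u"), ("Pref-0", "Suff-0"),
    ("Pref-Wa", "PVSuff-a"), ("Pref-Wa", "NSuff-u"), ("Pref-Wa", "Suff-0"),
}


def analyze(word):
    """Return every (prefix, stem, suffix) split licensed by the tables."""
    results = []
    n = len(word)
    for i in range(n + 1):             # i marks the end of the prefix
        for j in range(i + 1, n + 1):  # j marks the end of a non-empty stem
            prefix, stem, suffix = word[:i], word[i:j], word[j:]
            for p_cat in PREFIXES.get(prefix, []):
                for s_cat in STEMS.get(stem, []):
                    for x_cat in SUFFIXES.get(suffix, []):
                        # A split is valid only if all three pairs are allowed.
                        if ((p_cat, s_cat) in PREFIX_STEM
                                and (s_cat, x_cat) in STEM_SUFFIX
                                and (p_cat, x_cat) in PREFIX_SUFFIX):
                            results.append((prefix, stem, suffix))
    return results


accepted = analyze("wakitAbu")  # prefix "wa" + noun stem + nominal suffix
rejected = analyze("katabu")    # verb stem with a nominal suffix
```

    In this toy setup the second word is rejected because the pair ("PV", "NSuff-u") is absent from the stem-suffix table, even though both pieces exist in the lexicons. Adding rows to the tables, as the updated resources do, therefore changes which combinations are admitted without touching the lexicon entries themselves.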

    Vers une ontologie de la culture arabo-musulmane (Towards an ontology of Arab-Muslim culture)

    The project "Vers une ontologie de la culture arabo-musulmane" (Towards an ontology of Arab-Muslim culture) aims to create a digital lexical-semantic resource through the automatic extraction of data from the Arabic lexicon al-qāmūs al-muḥīṭ (qāmūs) compiled by ’al-fīrūz’ābādī (1329-1414). Using MonolingualExternalRef attributes, each lemma of the digitized qāmūs resource will be linked to the corresponding synset of the English WordNet (PWN) and to the corresponding concept of the SUMO ontology (whenever possible)

    al-qāmūs l-muḥīṭ: a digital Arabic dictionary: letter wāw

    The dossier for the letter wāw contains: a TXT file with the plain text of the section for the letter wāw; XML files without translation, produced by converting the text into XML through information extraction and tagging of lemma, part of speech, lexical information, derivational information, and meanings; and XML files with translation, enriched with translations of lemmas and their corresponding senses taken from the bilingual dictionary "An Advanced Learner's Arabic-English Dictionary" by H. Anthony Salmoné, published in 1889. This section contains 28 chapters, 374 roots and 2877 lexical entries.

    al-qāmūs l-muḥīṭ: a digital Arabic dictionary: letter zāy

    The dossier for the letter zāy contains: a TXT file with the plain text of the section for the letter zāy; XML files without translation, produced by converting the text into XML through information extraction and tagging of lemma, part of speech, lexical information, derivational information, and meanings; and XML files with translation, enriched with translations of lemmas and their corresponding senses taken from the bilingual dictionary "An Advanced Learner's Arabic-English Dictionary" by H. Anthony Salmoné, published in 1889. This section contains 24 chapters, 284 roots and 1450 lexical entries.

    al-qāmūs l-muḥīṭ: a digital Arabic dictionary: letter jīm

    The dossier for the letter jīm contains: a TXT file with the plain text of the section for the letter jīm; XML files without translation, produced by converting the text into XML through information extraction and tagging of lemma, part of speech, lexical information, derivational information, and meanings; and XML files with translation, enriched with translations of lemmas and their corresponding senses taken from the bilingual dictionary "An Advanced Learner's Arabic-English Dictionary" by H. Anthony Salmoné, published in 1889. This section contains 28 chapters, 461 roots and 1921 lexical entries.