7 research outputs found

    Spatial database modeling for the Atlas Linguistico-Etnografico de Colombia

    This article presents a methodological proposal for designing and developing a database that supports the computer processing, digitization, and systematization of the lexical, ethnographic, and supplementary materials of the Atlas Lingüístico-Etnográfico de Colombia (ALEC). The project aims to safeguard Colombia's linguistic heritage and to make the ALEC materials accessible and widely disseminated by modeling a spatial database that underpins a Geographic Information System and a Web Atlas.

    Design and implementation of the web : linguistic and ethnographic atlas of Colombia

    The Atlas Lingüístico y Etnográfico de Colombia (Linguistic and Ethnographic Atlas of Colombia), known as “ALEC”, is a compilation of the popular spoken Spanish of Colombia's communities; the underlying research was carried out over more than fifty years. The result is a collection of thematic maps organized in six volumes, plus supplements, in analog format. The “Interactive ALEC” project was created to develop a digital, interactive web version of the Linguistic and Ethnographic Atlas of Colombia (1983) and its supplements. To that end, the Corpus Linguistics research group of the Instituto Caro y Cuervo and the NIDE research group of the Universidad Distrital “Francisco José de Caldas” have been working together on the design and development of the Web Atlas, which allows users to view and query the spatial information contained in volume III of the analog ALEC by applying concepts from Geographic Information Systems and web cartography. This paper describes the design and development of the ALEC web prototype as a collection of static and dynamic maps that combine spatial information with multimedia content; in addition to the maps, the full compendium includes images, illustrations, photographs, audio, and text comments. The interactive ALEC also illustrates how current geotechnology tools support the dissemination of geolinguistic information over the internet, widening access to and distribution of the Web Atlas.

    Análisis dialectométrico del nivel fonético del Atlas Lingüístico Pluridimensional de Panamá

    This work examines the dialectal division of Spanish in Panama based on the quantitative distribution of the phonetic features recorded in the Atlas Lingüístico Pluridimensional de Panamá (ALPEP) and on their dialectometric analysis. The results identify five dialectal zones: the central-western zone, comprising Panama, Portobelo, Salud, Penonome, Santa Fe, Chitre, and Puerto Armuelles; the western zone, comprising Pedasi, Santiago, El Tigre, Tole, Cerro Punta, and Changuinola; the central-eastern zone, formed by Meteti and Canita; the eastern zone, formed by La Palma and Yaviza; and the Guna Yala zone of El Porvenir.

    Dictionary Writing Systems y otras herramientas informáticas para la elaboración, administración y publicación de diccionarios

    This work describes the functioning, components, main features, and some examples of Dictionary Writing Systems (DWS), which are essential software tools for writing and managing dictionaries. They make it possible to automate parts of the entry-writing process, to reuse dictionary databases for other purposes, and to facilitate remote teamwork, among other things. These functionalities depend on the type of DWS or other program and on the types of dictionaries or purposes they were designed to address.

    Dialectones : finding statistically significant dialectal boundaries using twitter data

    Most NLP applications assume that a language is homogeneous across the regions where it is spoken. However, each language varies considerably throughout its geographical distribution. To make NLP sensitive to dialects, a reliable, representative, and up-to-date source of information that quantitatively represents such geographical variation is necessary. Some current approaches have disadvantages, such as the need for parameters, the disregard of geographical coordinates in the analysis, and the use of linguistic alternations that presuppose the existence of specific dialectal varieties. The detection of "ecotones" is an analogous problem in ecology: it focuses on identifying boundaries, rather than regions, in ecosystems, which facilitates the construction of statistical tests. We adapted the concept of "ecotone" to "dialectone" for the detection of dialectal boundaries by using two non-parametric statistical tests: the Hilbert-Schmidt independence criterion (HSIC) and the Wilcoxon signed-rank test. The proposed method was applied to a large corpus of Spanish tweets produced in 160 locations in Colombia through the analysis of unigram features. The resulting dialectones proved meaningful but difficult to compare against regions identified by other authors using classical dialectometry. We concluded that the automatic detection of dialectones is a convenient alternative to classical methods in dialectometry and a potential source of information for automatic language applications.
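
    To make the HSIC step more concrete, the following is a minimal sketch of a generic HSIC permutation test between location coordinates and the relative frequency of a single feature. It is not the authors' pipeline: the Gaussian kernels, the bandwidths `sigma_x` and `sigma_y`, the permutation count, and all function names are assumptions chosen for illustration, and the abstract does not say how the test is combined with the Wilcoxon signed-rank test.

```python
import math
import random


def gaussian_gram(points, sigma):
    """Gram matrix of a Gaussian (RBF) kernel over a list of equal-length tuples."""
    n = len(points)
    return [
        [
            math.exp(
                -sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                / (2 * sigma ** 2)
            )
            for j in range(n)
        ]
        for i in range(n)
    ]


def center(gram):
    """Double-center a symmetric Gram matrix: H K H with H = I - (1/n) 11^T."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(gram_x, gram_y):
    """Biased HSIC estimate, trace(K H L H) / n^2, for symmetric Gram matrices."""
    n = len(gram_x)
    kc = center(gram_x)
    return sum(kc[i][j] * gram_y[i][j] for i in range(n) for j in range(n)) / n ** 2


def hsic_permutation_test(coords, values, sigma_x=1.0, sigma_y=1.0,
                          n_perm=200, seed=0):
    """Return (observed HSIC, permutation p-value) for coords vs. feature values."""
    rng = random.Random(seed)
    n = len(values)
    kc = center(gaussian_gram(coords, sigma_x))
    gram_y = gaussian_gram([(v,) for v in values], sigma_y)

    def stat(order):
        # Permuting the values is the same as permuting rows and columns of gram_y.
        return sum(
            kc[i][j] * gram_y[order[i]][order[j]]
            for i in range(n) for j in range(n)
        ) / n ** 2

    order = list(range(n))
    observed = stat(order)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        if stat(order) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    # Hypothetical data: 20 locations on a line; the feature switches halfway.
    coords = [(float(i), 0.0) for i in range(20)]
    values = [0.0] * 10 + [1.0] * 10
    print(hsic_permutation_test(coords, values, sigma_x=3.0, sigma_y=0.5))
```

    A small p-value indicates that the feature's frequency depends on location. Note that the kernel bandwidths here are free parameters of the sketch, whereas the abstract emphasizes avoiding the need for parameters, so the published method presumably handles this choice differently.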

    El componente del subcorpus Oral ALEC como muestra del Español hablado en la Amazonía Colombiana

    The Oral Corpus of the Atlas Lingüístico-Etnográfico de Colombia (ALEC) is composed of 650 hours of recordings collected during research conducted in Colombia between 1956 and 1983. This corpus includes 45 audio files of speech samples collected in the Amazonian region (departments of Putumayo, Caquetá, and Amazonas). These sound files are a representative sample of the Spanish spoken in the region. The Corpus Linguistics and Computational Research Group (grupo de Lingüística de corpus y computacional, LICC) of the Instituto Caro y Cuervo has worked on the systematization of these files using both structural and descriptive metadata. The goal is to make linguistic preservation material available to the academic community, with information that can be exploited for diverse quantitative and/or qualitative analyses at different linguistic levels (phonetic-phonological, morphosyntactic, lexical-semantic, and pragmatic).