
    Paronyms for Accelerated Correction of Semantic Errors

    * Work done under partial support of the Mexican Government (CONACyT, SNI), IPN (CGPI, COFAA) and the Korean Government (KIPA Professorship for Visiting Faculty Positions). The second author is currently on sabbatical leave at Chung-Ang University.
    The errors that authors usually make during text preparation are classified. The notion of semantic error is elaborated, and malapropisms are singled out among such errors: a malapropism is a word similar to the intended one that essentially distorts the meaning of the text. Whatever the method of malapropism correction, we propose to compile dictionaries of paronyms beforehand, i.e. of words similar to each other in letters, sounds or morphs. The proposed classification of errors and paronyms is illustrated with English and Russian examples and is valid for many languages. Specific dictionaries of literal and morphemic paronyms are compiled for Russian. It is shown that literal paronyms drastically cut down (by up to 360 times) the search for correction candidates, while morphemic paronyms make it possible to correct errors that have not been studied so far and are characteristic of foreigners.
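The idea of compiling a paronym dictionary in advance can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes that "similar in letters" means reachable by a single letter edit (deletion, insertion, replacement, or swap of adjacent letters), and the function names and the tiny English lexicon are hypothetical.

```python
from collections import defaultdict


def one_edit_variants(word, alphabet):
    """Return every string reachable from `word` by one letter edit."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    swaps = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    inserts = {a + c + b for a, b in splits for c in alphabet}
    return (deletes | swaps | replaces | inserts) - {word}


def build_paronym_dictionary(lexicon):
    """Map each word to the sorted lexicon words that are one edit away."""
    words = set(lexicon)
    # Assumption: the alphabet is simply the set of letters seen in the lexicon.
    alphabet = sorted({ch for w in words for ch in w})
    paronyms = defaultdict(list)
    for w in words:
        for v in one_edit_variants(w, alphabet):
            if v in words:
                paronyms[w].append(v)
    return {w: sorted(vs) for w, vs in paronyms.items()}


# Hypothetical toy lexicon, for illustration only.
LEXICON = ["massage", "message", "passage", "sage"]
PARONYMS = build_paronym_dictionary(LEXICON)
print(PARONYMS["massage"])  # ['message', 'passage']
```

With such a dictionary built once, a corrector only needs to examine the few paronyms of a suspect word instead of the whole lexicon, which is the kind of search reduction the abstract reports.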

    Experiments in Detection and Correction of Russian Malapropisms by Means of the WEB

    A malapropism is a semantic error that is hard to detect because it usually retains the syntactic links between words in the sentence but replaces one content word with a similar word of quite different meaning. A method for automatic detection of malapropisms is described, based on Web statistics and a specially defined Semantic Compatibility Index (SCI). For correction of the detected errors, special dictionaries and heuristic rules are proposed, which retain only a few highly SCI-ranked correction candidates for the user’s selection. Experiments on Web-assisted detection and correction of Russian malapropisms are reported, demonstrating the efficacy of the described method.
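The detect-then-rank workflow described above might look roughly like the sketch below. The abstract does not give the definition of the SCI, so a pointwise-mutual-information-style score is used here as a stand-in, not as the authors' index. All page counts are invented for illustration, and the names `compatibility`, `suggest`, `threshold` and `keep` are hypothetical.

```python
import math

# Invented page counts standing in for Web search statistics (assumption).
TOTAL_PAGES = 1_000_000
WORD_HITS = {"hysterical": 2_000, "historical": 50_000, "center": 80_000}
PAIR_HITS = {("hysterical", "center"): 1, ("historical", "center"): 12_000}


def compatibility(word, neighbour):
    """PMI-style stand-in for the SCI: log of observed over expected co-occurrence."""
    pair = PAIR_HITS.get((word, neighbour), 0)
    if pair == 0:
        return float("-inf")
    expected = WORD_HITS[word] * WORD_HITS[neighbour] / TOTAL_PAGES
    return math.log(pair / expected)


def suggest(word, neighbour, paronyms, threshold=0.0, keep=3):
    """Return up to `keep` best-scoring paronyms if the pair looks incompatible."""
    if compatibility(word, neighbour) >= threshold:
        return []  # the pair looks compatible, so nothing is flagged
    scored = [(compatibility(p, neighbour), p) for p in paronyms]
    good = sorted((s, p) for s, p in scored if s >= threshold)
    return [p for _, p in reversed(good)][:keep]


print(suggest("hysterical", "center", ["historical"]))  # ['historical']
```

The threshold of zero and the limit of three candidates are arbitrary choices for this sketch; a real system would tune both against data.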

    Una aproximación para resolución de ambigüedad estructural empleando tres mecanismos diferentes (An approach to structural ambiguity resolution using three different mechanisms)

    Structural ambiguity is one of the most difficult problems to solve in natural language processing systems. We consider two types of structural ambiguity resolution that can be used in the analysis of unrestricted text: lexical knowledge and a certain kind of context. In this work, we propose a model based on three different mechanisms to reveal the correct syntactic structure, and a classification module to obtain the most probable structures for the analysed sentence. Our model is aimed at the analysis of unrestricted text, and the tools developed require neither disambiguation of morphological tags nor any kind of syntactic tags. Work done with partial support from CONACyT, SNI and CGEPI-IPN, Mexico.

    Crowdsourcing Fungal Biodiversity: Revision of iNaturalist Observations in Northwestern Siberia

    The paper presents the first analysis of crowdsourcing data of all observations of fungi (including lichens) and myxomycetes in Northwestern Siberia uploaded to iNaturalist.org to date (24.02.2022). The Introduction presents an analysis of fungal diversity crowdsourcing globally, in Russia, and in the region of interest. The Materials and Methods describe the protocol of uploading data to iNaturalist.org, the structure of the crowdsourcing community, the initiative to revise the accumulated data, the procedures of data analysis, and the compilation of a dataset of revised crowdsourced data. The Results present the analysis of accumulated data by several parameters: temporal, geographical and taxonomical scope, observation and identification efforts, identifiability of various taxa, species novelty, and the Red Data Book categories and protection status of registered observations. The Discussion provides data on the usability of crowdsourcing data for biodiversity research and conservation of fungi, including pros and cons. The Electronic Supplements to the paper include an annotated checklist of observations of protected species with information on Red Data Book categories and protection status, and an annotated checklist of regional records of new taxa. The paper is supplemented with a dataset of about 15 000 revised and annotated records available through the Global Biodiversity Information Facility (GBIF).
    The tradition of crowdsourcing is rooted in mycological societies around the world, including Russia. In Northwestern Siberia, a regional mycological club was established in 2018, encouraging its members to contribute observations of fungi on iNaturalist.org. A total of about 15 000 observations of fungi and myxomycetes have been uploaded so far, by about 200 observers, from three administrative regions (Yamalo-Nenetsky Autonomous Okrug, Khanty-Mansi Autonomous Okrug, and Tyumen Region). The geographical coverage of crowdsourcing observations remains low. However, the observation activity has increased in the last four years. The goal of this study was a collaborative effort of professional mycologists invited to help with the identification of these observations and the analysis of the accumulated data. As a result, all observations were reviewed by at least one expert. About half of all the observations have been identified reliably to the species level and received Research Grade status. Of those, 90 species (195 records) represented records of taxa new to their respective regions; 876 records of 53 protected species provide important data for conservation programmes. The other half of the observations consists of records still under-identified for various reasons: poor quality photographs, complex taxa (impossible to identify without microscopic or molecular study), or lack of experts in a particular taxonomic group. The Discussion section summarises the pros and cons of the use of crowdsourcing for the study and conservation of regional fungal diversity, and summarises the dispute on this subject among mycologists. Further research initiatives involving crowdsourcing data must focus on an increase in the quality of observations and strive to introduce the habit of collecting voucher specimens among the community of amateurs. Timely feedback from experts is also important for ensuring quality and increasing personal involvement. Peer reviewed.

    Phenological shifts of abiotic events, producers and consumers across a continent

    Ongoing climate change can shift organism phenology in ways that vary depending on species, habitats and climate factors studied. To probe for large-scale patterns in associated phenological change, we use 70,709 observations from six decades of systematic monitoring across the former Union of Soviet Socialist Republics. Among 110 phenological events related to plants, birds, insects, amphibians and fungi, we find a mosaic of change, defying simple predictions of earlier springs, later autumns and stronger changes at higher latitudes and elevations. Site mean temperature emerged as a strong predictor of local phenology, but the magnitude and direction of change varied with trophic level and the relative timing of an event. Beyond temperature-associated variation, we uncover high variation among both sites and years, with some sites being characterized by disproportionately long seasons and others by short ones. Our findings emphasize concerns regarding ecosystem integrity and highlight the difficulty of predicting climate change outcomes. The authors use systematic monitoring across the former USSR to investigate phenological changes across taxa. The long-term mean temperature of a site emerged as a strong predictor of phenological change, with further imprints of trophic level, event timing, site, year and biotic interactions. Peer reviewed.

    Chronicles of nature calendar, a long-term and large-scale multitaxon database on phenology

    We present an extensive, large-scale, long-term and multitaxon database on phenological and climatic variation, involving 506,186 observation dates acquired in 471 localities in the Russian Federation, Ukraine, Uzbekistan, Belarus and Kyrgyzstan. The data cover the period 1890-2018, with 96% of the data being from 1960 onwards. The database is rich in plants, birds and climatic events, but also includes insects, amphibians, reptiles and fungi. The database includes multiple events per species, such as the onset days of leaf unfolding and leaf fall for plants, and the days of first spring and last autumn occurrences for birds. The data were acquired using standardized methods by permanent staff of national parks and nature reserves (87% of the data) and members of a phenological observation network (13% of the data). The database is valuable for exploring how species respond in their phenology to climate change. Large-scale analyses of spatial variation in phenological response can help to better predict the consequences of species and community responses to climate change. Peer reviewed.