31,279 research outputs found

    A random forest system combination approach for error detection in digital dictionaries

    When digitizing a print bilingual dictionary, whether via optical character recognition or manual entry, it is inevitable that errors are introduced into the electronic version that is created. We investigate automating the process of detecting errors in an XML representation of a digitized print dictionary using a hybrid approach that combines rule-based, feature-based, and language model-based methods. We investigate combining methods and show that using random forests is a promising approach. We find that in isolation, unsupervised methods rival the performance of supervised methods. Random forests typically require training data so we investigate how we can apply random forests to combine individual base methods that are themselves unsupervised without requiring large amounts of training data. Experiments reveal empirically that a relatively small amount of data is sufficient and can potentially be further reduced through specific selection criteria.
    Comment: 9 pages, 7 figures, 10 tables; appeared in Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data, April 201

    Natural hybridization between Populus nigra L. and P. x canadensis Moench. Hybrid offspring competes for niches along the Rhine river in the Netherlands

    Black poplar (Populus nigra L.) is a major species for European riparian forests but its abundance has decreased over the decades due to human influences. For restoration of floodplain woodlands, the remaining black poplar stands may act as a source population. A potential problem is that P. nigra and Populus deltoides have contributed to many interspecific hybrids, which have been planted in large numbers. As these Populus x canadensis clones can intercross with wild P. nigra trees, their offspring could establish themselves along European rivers. In this study, we sampled 44 poplar seedlings and young trees that occurred spontaneously along the Rhine river and its tributaries in the Netherlands. Along these rivers, only a few native P. nigra L. populations exist in combination with many planted cultivated P. x canadensis trees. By comparison to reference material from P. nigra, P. deltoides and P. x canadensis, species-specific AFLP bands and microsatellite alleles indicated that nearly half of the sampled trees were not pure P. nigra but progeny of natural hybridisation that had colonised the Rhine river banks. The posterior probability method as implemented in NewHybrids using microsatellite data was the superior method in establishing the most likely parentage. The results of this study indicate that offspring of hybrid cultivated poplars compete for the same ecological niche as native black poplars.

    Wild dogs at stake: deforestation threatens the only Amazon endemic canid, the short-eared dog (Atelocynus microtis)

    The persistent high deforestation rate and fragmentation of the Amazon forests are the main threats to their biodiversity. To anticipate and mitigate these threats, it is important to understand and predict how species respond to the rapidly changing landscape. The short-eared dog Atelocynus microtis is the only Amazon-endemic canid and one of the most understudied wild dogs worldwide. We investigated short-eared dog habitat associations on two spatial scales. First, we used the largest record database ever compiled for short-eared dogs in combination with species distribution models to map species habitat suitability, estimate its distribution range and predict shifts in species distribution in response to predicted deforestation across the entire Amazon (regional scale). Second, we used systematic camera trap surveys and occupancy models to investigate how forest cover and forest fragmentation affect the space use of this species in the Southern Brazilian Amazon (local scale). Species distribution models suggested that the short-eared dog potentially occurs over an extensive and continuous area, through most of the Amazon region south of the Amazon River. However, approximately 30% of the short-eared dog's current distribution is expected to be lost or suffer sharp declines in habitat suitability by 2027 (within three generations) due to forest loss. This proportion might reach 40% of the species distribution in unprotected areas and exceed 60% in some interfluves (i.e. portions of land separated by large rivers) of the Amazon basin. Our local-scale analysis indicated that the presence of forest positively affected short-eared dog space use, while the density of forest edges had a negative effect. Beyond shedding light on the ecology of the short-eared dog and refining its distribution range, our results stress that forest loss poses a serious threat to the conservation of the species in a short time frame. Hence, we propose a re-assessment of the short-eared dog's current IUCN Red List status (Near Threatened) based on findings presented here. Our study exemplifies how data can be integrated across sources and modelling procedures to improve our knowledge of relatively understudied species.