
    Computational Sociolinguistics: A Survey

    Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction, and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges.
    Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201

    Using Global Constraints and Reranking to Improve Cognates Detection

    Global constraints and reranking have not previously been used in cognates detection research. We propose methods that apply global constraints by rescoring the score matrices produced by state-of-the-art cognates detection systems. Rescoring with global constraints is complementary to state-of-the-art cognates detection methods and yields significant performance improvements over the current state of the art on publicly available datasets. The improvements hold across different language pairs and a range of conditions, including different levels of baseline performance and different data sizes, among them larger and more realistic data sizes than have been evaluated in the past.
    Comment: 10 pages, 6 figures, 6 tables; published in the Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pages 1983-1992, Vancouver, Canada, July 201

    A coupled terrestrial and aquatic biogeophysical model of the Upper Merrimack River watershed, New Hampshire, to inform ecosystem services evaluation and management under climate and land-cover change

    Accurate quantification of ecosystem services (ES) at regional scales is increasingly important for making informed decisions in the face of environmental change. We linked terrestrial and aquatic ecosystem process models to simulate the spatial and temporal distribution of hydrological and water quality characteristics related to ecosystem services. The linked model integrates two existing models (a forest ecosystem model and a river network model) to establish consistent responses to changing drivers across climate, terrestrial, and aquatic domains. The linked model is spatially distributed, accounts for terrestrial–aquatic and upstream–downstream linkages, and operates on a daily time-step, all characteristics needed to understand regional responses. The model was applied to the diverse landscapes of the Upper Merrimack River watershed, New Hampshire, USA. Potential changes in future environmental functions were evaluated using statistically downscaled global climate model simulations (both a high and low emission scenario) coupled with scenarios of changing land cover (centralized vs. dispersed land development) for the time period of 1980–2099. Projections of climate, land cover, and water quality were translated into a suite of environmental indicators that represent conditions relevant to important ecosystem services and were designed to be readily understood by the public. Model projections show that climate will have a greater influence on future aquatic ecosystem services (flooding, drinking water, fish habitat, and nitrogen export) than plausible changes in land cover. Minimal changes in aquatic environmental indicators are predicted through 2050, after which the high emissions scenarios show intensifying impacts. The spatially distributed modeling approach indicates that heavily populated portions of the watershed will show the strongest responses. 
    Management of land cover could attenuate some of the changes associated with climate change and should be considered in future planning for the region.

    Approaching the Symbol Grounding Problem with Probabilistic Graphical Models

    In order for robots to engage in dialog with human teammates, they must have the ability to map between words in the language and aspects of the external world. A solution to this symbol grounding problem (Harnad, 1990) would enable a robot to interpret commands such as “Drive over to receiving and pick up the tire pallet.” In this article we describe several of our results that use probabilistic inference to address the symbol grounding problem. Our specific approach is to develop models that factor according to the linguistic structure of a command. We first describe an early result, a generative model that factors according to the sequential structure of language, and then discuss our new framework, generalized grounding graphs (G3). The G3 framework dynamically instantiates a probabilistic graphical model for a natural language input, enabling a mapping between words in language and concrete objects, places, paths and events in the external world. We report on corpus-based experiments where the robot is able to learn and use word meanings in three real-world tasks: indoor navigation, spatial language video retrieval, and mobile manipulation.
    U.S. Army Research Laboratory. Collaborative Technology Alliance Program (Cooperative Agreement W911NF-10-2-0016)
    United States. Office of Naval Research (MURI N00014-07-1-0749)

    Anaphora resolution for Arabic machine translation: a case study of nafs

    PhD Thesis
    In the age of the internet, email, and social media there is an increasing need for processing online information, for example, to support education and business. This has led to the rapid development of natural language processing technologies such as computational linguistics, information retrieval, and data mining. As a branch of computational linguistics, anaphora resolution has attracted much interest. This is reflected in the large number of papers on the topic published in journals such as Computational Linguistics. Mitkov (2002) and Ji et al. (2005) have argued that the overall quality of anaphora resolution systems remains low, despite practical advances in the area, and that major challenges include dealing with real-world knowledge and accurate parsing. This thesis investigates the following research question: can an algorithm be found for the resolution of the anaphor nafs in Arabic text which is accurate to at least 90%, scales linearly with text size, and requires a minimum of knowledge resources? A resolution algorithm intended to satisfy these criteria is proposed. Testing on a corpus of contemporary Arabic shows that it does indeed satisfy the criteria.
    Egyptian Government