Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social
nature. Recently, there has been a surge of interest within the computational
linguistics (CL) community in the social dimension of language. In this article
we present a survey of the emerging field of "Computational Sociolinguistics"
that reflects this increased interest. We aim to provide a comprehensive
overview of CL research on sociolinguistic themes, featuring topics such as the
relation between language and social identity, language use in social
interaction and multilingual communication. Moreover, we demonstrate the
potential for synergy between the research communities involved, by showing how
the large-scale data-driven methods that are widely used in CL can complement
existing sociolinguistic studies, and how sociolinguistics can inform and
challenge the methods and assumptions employed in CL studies. We hope to convey
the possible benefits of a closer collaboration between the two communities and
conclude with a discussion of open challenges.Comment: To appear in Computational Linguistics. Accepted for publication:
18th February, 201
Using Global Constraints and Reranking to Improve Cognates Detection
Global constraints and reranking have not been used in cognates detection
research to date. We propose methods for applying global constraints by
rescoring the score matrices produced by state-of-the-art cognates detection
systems. This rescoring is complementary to state-of-the-art methods and yields
significant performance improvements over current state-of-the-art results on
publicly available datasets. The improvements hold across different language
pairs and across conditions such as different levels of baseline
state-of-the-art performance and different data sizes, including larger and
more realistic data sizes than have been evaluated in the past.
Comment: 10 pages, 6 figures, 6 tables; published in the Proceedings of the
55th Annual Meeting of the Association for Computational Linguistics, pages
1983-1992, Vancouver, Canada, July 201
A coupled terrestrial and aquatic biogeophysical model of the Upper Merrimack River watershed, New Hampshire, to inform ecosystem services evaluation and management under climate and land-cover change
Accurate quantification of ecosystem services (ES) at regional scales is increasingly important for making informed decisions in the face of environmental change. We linked terrestrial and aquatic ecosystem process models to simulate the spatial and temporal distribution of hydrological and water quality characteristics related to ecosystem services. The linked model integrates two existing models (a forest ecosystem model and a river network model) to establish consistent responses to changing drivers across climate, terrestrial, and aquatic domains. The linked model is spatially distributed, accounts for terrestrial–aquatic and upstream–downstream linkages, and operates on a daily time-step, all characteristics needed to understand regional responses. The model was applied to the diverse landscapes of the Upper Merrimack River watershed, New Hampshire, USA. Potential changes in future environmental functions were evaluated using statistically downscaled global climate model simulations (both a high and low emission scenario) coupled with scenarios of changing land cover (centralized vs. dispersed land development) for the time period of 1980–2099. Projections of climate, land cover, and water quality were translated into a suite of environmental indicators that represent conditions relevant to important ecosystem services and were designed to be readily understood by the public. Model projections show that climate will have a greater influence on future aquatic ecosystem services (flooding, drinking water, fish habitat, and nitrogen export) than plausible changes in land cover. Minimal changes in aquatic environmental indicators are predicted through 2050, after which the high emissions scenarios show intensifying impacts. The spatially distributed modeling approach indicates that heavily populated portions of the watershed will show the strongest responses. 
Management of land cover could attenuate some of the changes associated with climate change and should be considered in future planning for the region.
Approaching the Symbol Grounding Problem with Probabilistic Graphical Models
In order for robots to engage in dialog with human teammates, they must have the ability to map between words in the language and aspects of the external world. A solution to this symbol grounding problem (Harnad, 1990) would enable a robot to interpret commands such as “Drive over to receiving and pick up the tire pallet.” In this article we describe several of our results that use probabilistic inference to address the symbol grounding problem. Our specific approach is to develop models that factor according to the linguistic structure of a command. We first describe an early result, a generative model that factors according to the sequential structure of language, and then discuss our new framework, generalized grounding graphs (G3). The G3 framework dynamically instantiates a probabilistic graphical model for a natural language input, enabling a mapping between words in language and concrete objects, places, paths and events in the external world. We report on corpus-based experiments where the robot is able to learn and use word meanings in three real-world tasks: indoor navigation, spatial language video retrieval, and mobile manipulation.
U.S. Army Research Laboratory. Collaborative Technology Alliance Program (Cooperative Agreement W911NF-10-2-0016)
United States. Office of Naval Research (MURI N00014-07-1-0749)
Anaphora resolution for Arabic machine translation: a case study of nafs
PhD Thesis
In the age of the internet, email, and social media, there is an increasing need for processing online information, for example, to support education and business. This has led to the rapid development of natural language processing technologies such as computational linguistics, information retrieval, and data mining. As a branch of computational linguistics, anaphora resolution has attracted much interest. This is reflected in the large number of papers on the topic published in journals such as Computational Linguistics. Mitkov (2002) and Ji et al. (2005) have argued that the overall quality of anaphora resolution systems remains low, despite practical advances in the area, and that major challenges include dealing with real-world knowledge and accurate parsing.
This thesis investigates the following research question: can an algorithm be found for the resolution of the anaphor nafs in Arabic text which is accurate to at least 90%, scales linearly with text size, and requires a minimum of knowledge resources? A resolution algorithm intended to satisfy these criteria is proposed. Testing on a corpus of contemporary Arabic shows that it does indeed satisfy the criteria.
Egyptian Government