17 research outputs found

    Treebanking user-generated content: A proposal for a unified representation in universal dependencies

    The paper presents a discussion on the main linguistic phenomena of user-generated texts found in web and social media, and proposes a set of annotation guidelines for their treatment within the Universal Dependencies (UD) framework. Given on the one hand the increasing number of treebanks featuring user-generated content, and its somewhat inconsistent treatment in these resources on the other, the aim of this paper is twofold: (1) to provide a short, though comprehensive, overview of such treebanks - based on available literature - along with their main features and a comparative analysis of their annotation criteria, and (2) to propose a set of tentative UD-based annotation guidelines, to promote consistent treatment of the particular phenomena found in these types of texts. The main goal of this paper is to provide a common framework for those teams interested in developing similar resources in UD, thus enabling cross-linguistic consistency, which is a principle that has always been in the spirit of UD

    Limits to reproduction and seed size-number trade-offs that shape forest dominance and future recovery

    The relationships that control seed production in trees are fundamental to understanding the evolution of forest species and their capacity to recover from increasing losses to drought, fire, and harvest. A synthesis of fecundity data from 714 species worldwide allowed us to examine hypotheses that are central to quantifying reproduction, a foundation for assessing fitness in forest trees. Four major findings emerged. First, seed production is not constrained by a strict trade-off between seed size and numbers. Instead, seed numbers vary over ten orders of magnitude, with species that invest in large seeds producing more seeds than expected from the 1:1 trade-off. Second, gymnosperms have lower seed production than angiosperms, potentially due to their extra investments in protective woody cones. Third, nutrient-demanding species, indicated by high foliar phosphorus concentrations, have low seed production. Finally, sensitivity of individual species to soil fertility varies widely, limiting the response of community seed production to fertility gradients. In combination, these findings can inform models of forest response that need to incorporate reproductive potential

    Treebanking user-generated content: a UD based overview of guidelines, corpora and unified recommendations

    This article presents a discussion on the main linguistic phenomena which cause difficulties in the analysis of user-generated texts found on the web and in social media, and proposes a set of annotation guidelines for their treatment within the Universal Dependencies (UD) framework of syntactic analysis. Given on the one hand the increasing number of treebanks featuring user-generated content, and its somewhat inconsistent treatment in these resources on the other, the aim of this article is twofold: (1) to provide a condensed, though comprehensive, overview of such treebanks—based on available literature—along with their main features and a comparative analysis of their annotation criteria, and (2) to propose a set of tentative UD-based annotation guidelines, to promote consistent treatment of the particular phenomena found in these types of texts. The overarching goal of this article is to provide a common framework for researchers interested in developing similar resources in UD, thus promoting cross-linguistic consistency, which is a principle that has always been central to the spirit of UD
