576 research outputs found

    Unsupervised induction of semantic roles

    In recent years, a considerable amount of work has been devoted to the task of automatic frame-semantic analysis. Given the relative maturity of syntactic parsing technology, which is an important prerequisite, frame-semantic analysis represents a realistic next step towards broad-coverage natural language understanding and has been shown to benefit a range of natural language processing applications such as information extraction and question answering. Due to the complexity which arises from variations in syntactic realization, data-driven models based on supervised learning have become the method of choice for this task. However, the reliance on large amounts of semantically labeled data, which is costly to produce for every language, genre and domain, presents a major barrier to the widespread application of the supervised approach. This thesis therefore develops unsupervised machine learning methods, which automatically induce frame-semantic representations without making use of semantically labeled data. If successful, unsupervised methods would render manual data annotation unnecessary and therefore greatly benefit the applicability of automatic frame-semantic analysis. We focus on the problem of semantic role induction, in which all the argument instances occurring together with a specific predicate in a corpus are grouped into clusters according to their semantic role. Our hypothesis is that semantic roles can be induced without human supervision from a corpus of syntactically parsed sentences, by leveraging the syntactic relations conveyed through parse trees with lexical-semantic information. We argue that semantic role induction can be guided by three linguistic principles. The first is the well-known constraint that semantic roles are unique within a particular frame. The second is that the arguments occurring in a specific syntactic position within a specific linking all bear the same semantic role. The third principle is that the (asymptotic) distribution over argument heads is the same for two clusters which represent the same semantic role. We consider two approaches to semantic role induction based on two fundamentally different perspectives on the problem. Firstly, we develop feature-based probabilistic latent structure models which capture the statistical relationships that hold between the semantic role and other features of an argument instance. Secondly, we conceptualize role induction as the problem of partitioning a graph whose vertices represent argument instances and whose edges express similarities between these instances. The graph thus represents all the argument instances for a particular predicate occurring in the corpus. The similarities with respect to different features are represented on different edge layers and accordingly we develop algorithms for partitioning such multi-layer graphs. We empirically validate our models and the principles they are based on and show that our graph partitioning models have several advantages over the feature-based models. In a series of experiments on both English and German the graph partitioning models outperform the feature-based models and yield significantly better scores over a strong baseline which directly identifies semantic roles with syntactic positions. In sum, we demonstrate that relatively high-quality shallow semantic representations can be induced without human supervision and foreground a promising direction of future research aimed at overcoming the problem of acquiring large amounts of lexical-semantic knowledge.

    Step Two in Flood Recovery of Pastures Is Renovation

    As flood waters recede, the renovation of flooded pastures is just beginning. Now is a good time to check pasture plants for survival. Forage production is a function of the plant species and of their density and growth. Evaluate live plants (plant vigor), plant density, and desirable species versus weeds.

    Unsupervised Induction of Semantic Roles

    Datasets annotated with semantic roles are an important prerequisite to developing high-performance role labeling systems. Unfortunately, the reliance on manual annotations, which are both difficult and highly expensive to produce, presents a major obstacle to the widespread application of these systems across different languages and text genres. In this paper we describe a method for inducing the semantic roles of verbal arguments directly from unannotated text. We formulate the role induction problem as one of detecting alternations and finding a canonical syntactic form for them. Both steps are implemented in a novel probabilistic model, a latent-variable variant of the logistic classifier. Our method increases the purity of the induced role clusters by a wide margin over a strong baseline.
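
The purity score mentioned in this abstract can be illustrated with a short sketch. It is not the authors' evaluation code: the function name `purity`, the toy labels, and the choice of a baseline that clusters arguments by syntactic position are all assumptions made for illustration. Purity is computed here in its usual form, as the fraction of instances that carry the majority gold label of their cluster.

```python
from collections import Counter, defaultdict


def purity(clusters, gold):
    """Fraction of instances whose gold label is the majority label of their cluster.

    clusters, gold: equal-length sequences, one entry per argument instance.
    """
    if len(clusters) != len(gold) or not gold:
        raise ValueError("inputs must be non-empty and of equal length")
    by_cluster = defaultdict(Counter)
    for cluster_id, gold_label in zip(clusters, gold):
        by_cluster[cluster_id][gold_label] += 1
    # Each cluster contributes the count of its most frequent gold label.
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(gold)


# Hypothetical toy data: a baseline that uses the syntactic position as the cluster id.
positions = ["subj", "obj", "subj", "obj", "subj", "pp"]
gold_roles = ["A0", "A1", "A1", "A1", "A0", "A2"]
baseline_purity = purity(positions, gold_roles)
```

On this toy input only one instance (a subject bearing role A1) disagrees with its cluster's majority label, so five of the six instances count as correct.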

    Unsupervised Semantic Role Induction via Split-Merge Clustering

    In this paper we describe an unsupervised method for semantic role induction which holds promise for relieving the data acquisition bottleneck associated with supervised role labelers. We present an algorithm that iteratively splits and merges clusters representing semantic roles, thereby leading from an initial clustering to a final clustering of better quality. The method is simple, surprisingly effective, and allows linguistic knowledge to be integrated transparently. By combining role induction with a rule-based component for argument identification we obtain an unsupervised end-to-end semantic role labeling system. Evaluation on the CoNLL 2008 benchmark dataset demonstrates that our method outperforms competitive unsupervised approaches by a wide margin.

    The 20-m shuttle run: Assessment and interpretation of data in relation to youth aerobic fitness and health

    Cardiorespiratory fitness (CRF) is a good summative measure of the body’s ability to perform continuous, rhythmic, dynamic, large-muscle group physical activity and exercise. In children, CRF is meaningfully associated with health, independent of physical activity levels, and it is an important determinant of sports and athletic performance. Although gas-analyzed peak oxygen uptake is the criterion physiological measure of children’s CRF, it is not practical for population-based testing. Field testing offers a simple, cheap, practical alternative to gas analysis. The 20-m shuttle run test (20mSRT), a progressive aerobic exercise test involving continuous running between 2 lines 20 m apart in time to audio signals, is probably the most widely used field test of CRF. This review aims to clarify the international utility of the 20mSRT by synthesizing the evidence describing measurement variability, validity, reliability, feasibility, and the interpretation of results, as well as to provide future directions for international surveillance. The authors show that the 20mSRT is an acceptable, feasible, and scalable measure of CRF and functional/exercise capacity, and that it has moderate criterion validity and high to very high reliability. The assessment is pragmatic, easily interpreted, and results are transferable to meaningful and understandable situations. The authors recommend that CRF, assessed by the 20mSRT, be considered as an international population health surveillance measure to provide additional insight into pediatric population health.

    Cardiorespiratory fitness in children: Evidence for criterion-referenced cut-points

    Introduction: Criterion-referenced cut-points for field-based cardiorespiratory fitness (CRF) for children are lacking. This study determined: (a) the association between CRF and obesity, (b) the optimal cut-points for low CRF associated with obesity in children, and (c) the association between obesity and peak oxygen uptake (V̇O2peak) estimated from the 20-m shuttle run test (20mSRT) using two different prediction equations. Methods: A total of 8,740 children aged 10.1±1.2 years were recruited from 11 sites across Canada. CRF was assessed using the 20mSRT, reported as running speed at the last completed stage, number of completed laps, and predicted V̇O2peak, which was estimated at the age-by-sex level using the Léger et al. and FitnessGram equations. Body mass index (BMI) and waist circumference (WC) z-scores were used to identify obesity. Receiver operating characteristic (ROC) curves and logistic regression determined the discriminatory ability of CRF for predicting obesity. Results: The 20mSRT had satisfactory predictive ability to detect obesity estimated by BMI, WC, and BMI and WC combined (area under the curve [AUC] > 0.65). The FitnessGram equation (AUC > 0.71) presented somewhat higher discriminatory power for obesity than the equation of Léger et al. (AUC > 0.67) at most ages. Sensitivity was strong (> 70%) for all age- and sex-specific cut-points, with optimal cut-points in 8- to 12-year-olds for obesity identified as 39 mL·kg⁻¹·min⁻¹ (laps: 15; speed: 9.0 km/h) and 41 mL·kg⁻¹·min⁻¹ (laps: 15–17; speed: 9.0 km/h) for girls and boys, respectively. Conclusions: 20mSRT performance is negatively associated with obesity, and CRF cut-points from ROC analyses have good discriminatory power for obesity.
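
How a cut-point can be read off an ROC analysis is sketched below. This is a minimal illustration under stated assumptions, not the study's procedure: the abstract does not say which optimality criterion was used, so the sketch assumes Youden's J (sensitivity + specificity − 1). The function name `best_cut_point` and the toy fitness values are hypothetical and are not study data.

```python
def best_cut_point(values, obese):
    """Return (cut, sensitivity, specificity) maximising Youden's J.

    values: predicted fitness per child (for example mL/kg/min).
    obese:  1 if the child is classified as obese, else 0.
    A child is flagged as low-fit when value < cut.
    """
    n_pos = sum(obese)
    n_neg = len(obese) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one obese and one non-obese child")
    best = None
    for cut in sorted(set(values)):
        # True positives: obese children flagged as low-fit.
        tp = sum(1 for v, o in zip(values, obese) if o and v < cut)
        # True negatives: non-obese children not flagged.
        tn = sum(1 for v, o in zip(values, obese) if not o and v >= cut)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1
        # Strict comparison keeps the lowest cut when J values tie.
        if best is None or j > best[0]:
            best = (j, cut, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical toy sample, not taken from the study.
vo2 = [35, 37, 38, 40, 42, 44, 46, 48]
is_obese = [1, 1, 1, 0, 1, 0, 0, 0]
cut, sensitivity, specificity = best_cut_point(vo2, is_obese)
```

For this toy sample the selected cut is 40, with sensitivity 0.75 and specificity 1.0. Real analyses would typically stratify by age and sex, as the abstract describes, and may weight sensitivity and specificity differently.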

    Testing validity of FitnessGram in two samples of US adolescents (12–15 years)

    Background: This study examined the validity of the FitnessGram® criterion-referenced cut-points for cardiorespiratory fitness (CRF) based on two samples of US adolescents (aged 12–15 years). This study also established the CRF cut-points for metabolically healthy weight status based on a recent national fitness survey for the purposes of cross-validating with pre-existing cut-points including FitnessGram. Methods: Two cross-sectional datasets from the 2003–2004 National Health and Nutrition Examination Survey (NHANES) (n = 378) and the 2012 NHANES National Youth Fitness Survey (NNYFS) (n = 451) were used. CRF (V̇O2max in mL/kg/min) was estimated from a submaximal exercise test. CRF categories based on FitnessGram cut-points, a clustered cardiometabolic risk factors score and weight status were used. A series of Receiver Operating Characteristic (ROC) curve analyses were conducted to identify age- and sex-specific CRF cut-points that were optimal for metabolically healthy weight status. Results: Based on FitnessGram cut-points, having high-risk CRF, but not low-risk CRF, was associated with high cardiometabolic risk (OR = 3.17, 95% CI = 1.14–8.79) and unhealthy weight status (OR = 5.81, 95% CI = 3.49–9.68). The optimal CRF cut-points for 12- to 13-year-olds and 14- to 15-year-olds were 40 and 43 mL/kg/min in males and 39 and 34 mL/kg/min in females, respectively. Compared to meeting new CRF cut-points, not meeting new CRF cut-points was associated with higher odds of showing high cardiometabolic risk (OR = 2.91, 95% CI = 1.47–5.77) and metabolically unhealthy weight status (OR = 4.47, 95% CI = 2.83–7.05). Conclusion: The FitnessGram CRF cut-points themselves have rarely been scrutinized in previous literature. Our findings provide partial support for FitnessGram based on two samples of US adolescents. CRF cut-points established in this study support international criterion-referenced cut-points as well as FitnessGram cut-points only for males. FitnessGram should be continuously monitored and scrutinized using different samples.