
    The CoNLL 2007 shared task on dependency parsing

    The Conference on Computational Natural Language Learning features a shared task, in which participants train and test their learning systems on the same data sets. In 2007, as in 2006, the shared task has been devoted to dependency parsing, this year with both a multilingual track and a domain adaptation track. In this paper, we define the tasks of the different tracks and describe how the data sets were created from existing treebanks for ten languages. In addition, we characterize the different approaches of the participating systems, report the test results, and provide a first analysis of these results.
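
    For readers unfamiliar with the task, the shared-task treebanks are distributed in a tab-separated, one-token-per-line format with blank lines between sentences. The minimal reader below assumes the usual CoNLL-X-style column order (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, ...); it is an illustrative sketch, not code from the shared task.

    # Minimal sketch of reading CoNLL-X/2007-style dependency data
    # (tab-separated columns, one token per line, blank line between sentences).
    # The column layout is an assumption for illustration.

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Token:
        idx: int      # token position in the sentence (1-based)
        form: str     # surface word form
        head: int     # index of the syntactic head (0 = artificial root)
        deprel: str   # dependency relation label

    def read_conll(path: str) -> List[List[Token]]:
        sentences, current = [], []
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if not line:                      # blank line ends a sentence
                    if current:
                        sentences.append(current)
                        current = []
                    continue
                cols = line.split("\t")
                # Columns assumed: ID FORM LEMMA CPOSTAG POSTAG FEATS HEAD DEPREL ...
                current.append(Token(int(cols[0]), cols[1], int(cols[6]), cols[7]))
        if current:
            sentences.append(current)
        return sentences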

    Knowledge Base Population using Semantic Label Propagation

    A crucial aspect of a knowledge base population system that extracts new facts from text corpora is the generation of training data for its relation extractors. In this paper, we present a method that maximizes the effectiveness of newly trained relation extractors at a minimal annotation cost. Manual labeling can be significantly reduced by Distant Supervision, which is a method to construct training data automatically by aligning a large text corpus with an existing knowledge base of known facts. For example, all sentences mentioning both 'Barack Obama' and 'US' may serve as positive training instances for the relation born_in(subject,object). However, distant supervision typically results in a highly noisy training set: many training sentences do not really express the intended relation. We propose to combine distant supervision with minimal manual supervision in a technique called feature labeling, to eliminate noise from the large and noisy initial training set, resulting in a significant increase of precision. We further improve on this approach by introducing the Semantic Label Propagation method, which uses the similarity between low-dimensional representations of candidate training instances to extend the training set in order to increase recall while maintaining high precision. Our proposed strategy for generating training data is studied and evaluated on an established test collection designed for knowledge base population tasks. The experimental results show that the Semantic Label Propagation strategy leads to substantial performance gains when compared to existing approaches, while requiring an almost negligible manual annotation effort.
    Comment: Submitted to Knowledge-Based Systems, special issue on Knowledge Bases for Natural Language Processing.
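
    As a hedged illustration of the two ideas named in this abstract, the sketch below pairs knowledge-base facts with sentences that mention both entities (distant supervision) and then expands a small, manually vetted seed set by embedding similarity (label propagation). The embedding source, the 0.8 threshold, and the data structures are assumptions for illustration, not the authors' implementation.

    # Illustrative sketch (not the authors' implementation): distant supervision
    # plus similarity-based label propagation over sentence embeddings.
    # The sentence dicts and the 0.8 threshold are assumptions.

    import numpy as np
    from sklearn.metrics.pairwise import cosine_similarity

    def distant_supervision(sentences, kb_pairs):
        """Label a sentence positive if it mentions both entities of a KB fact."""
        return [s for s in sentences
                if any(e1 in s["text"] and e2 in s["text"] for e1, e2 in kb_pairs)]

    def propagate_labels(seed_vecs, candidate_vecs, threshold=0.8):
        """Promote candidates whose embedding is close enough (cosine similarity
        above `threshold`) to any manually vetted seed instance."""
        sims = cosine_similarity(candidate_vecs, seed_vecs)   # (n_candidates, n_seeds)
        keep = sims.max(axis=1) >= threshold
        return np.where(keep)[0]    # indices of candidates added to the training set

    The design point stressed in the abstract is that the manual effort is spent on a small vetted seed set, while recall is recovered by the similarity-based expansion.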

    Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images

    Beyond sample curation and basic pathologic characterization, the digitized H&E-stained images of TCGA samples remain underutilized. To highlight this resource, we present mappings of tumor-infiltrating lymphocytes (TILs) based on H&E images from 13 TCGA tumor types. These TIL maps are derived through computational staining using a convolutional neural network trained to classify patches of images. Affinity propagation revealed local spatial structure in TIL patterns and correlation with overall survival. TIL map structural patterns were grouped using standard histopathological parameters. These patterns are enriched in particular T cell subpopulations derived from molecular measures. TIL densities and spatial structure were differentially enriched among tumor types, immune subtypes, and tumor molecular subtypes, implying that spatial infiltrate state could reflect particular tumor cell aberration states. Obtaining spatial lymphocytic patterns linked to the rich genomic characterization of TCGA samples demonstrates one use for the TCGA image archives, with insights into the tumor-immune microenvironment.
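
    A rough sketch of the two computational steps named above: thresholding per-patch TIL probabilities produced by a patch classifier, then clustering the positive patch coordinates with affinity propagation. The classifier output, the patch grid, and the threshold are placeholders; this is not the published pipeline.

    # Rough sketch, not the published pipeline: threshold per-patch TIL
    # probabilities from some patch classifier, then cluster the TIL-positive
    # patch grid coordinates with affinity propagation (scikit-learn).

    import numpy as np
    from sklearn.cluster import AffinityPropagation

    def til_clusters(patch_probs: np.ndarray, threshold: float = 0.5):
        """patch_probs: 2-D array of TIL probabilities on a slide's patch grid."""
        rows, cols = np.where(patch_probs >= threshold)        # TIL-positive patches
        coords = np.column_stack([rows, cols]).astype(float)
        if len(coords) == 0:
            return coords, np.array([])
        labels = AffinityPropagation(random_state=0).fit(coords).labels_
        return coords, labels                                   # spatial clusters of TIL patches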

    Analysis of Producer and Consumer Cattle Surveys

    This thesis presents two separate studies focusing on producers and consumers in the United States cattle industry. The objective of the first study was to analyze the differences between a text cheap talk script and a visual cheap talk script in an online choice experiment to see if it decreased or eliminated hypothetical bias. The product evaluated was Tennessee Certified Beef, specifically USDA Choice boneless ribeye, with other attributes to complement the beef product. Using a random parameters logit model, results indicated that willingness to pay (WTP) estimates for respondents who saw the visual cheap talk script were higher than the WTP estimates for respondents who saw the text cheap talk script. The study also evaluated the respondent's preferred learning style (visual or verbal) and found that this too had an impact on WTP. The second study's objective was to analyze the differences between operating and closed dairies in the Southeastern United States through farm and operator characteristics. Probit regression model results indicated variables that were related to the operational status of a dairy, such as the number of cows and the dairy's average daily production. The study also found there were other factors besides the size of the dairy operation that were significant in determining the operational status of the dairy.
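
    To make the second study's model concrete, a probit regression relates a binary outcome (a dairy still operating versus closed) to farm and operator characteristics. The sketch below, using statsmodels with invented column names, is illustrative only and is not the thesis's actual specification.

    # Illustrative probit sketch (not the thesis's actual specification):
    # binary operational status regressed on farm/operator characteristics.
    # The DataFrame `df` and its column names are assumptions.

    import statsmodels.api as sm

    def fit_operational_status_probit(df):
        y = df["operating"]                      # 1 = still operating, 0 = closed
        X = sm.add_constant(df[["herd_size", "avg_daily_production_lbs",
                                "operator_age"]])
        model = sm.Probit(y, X).fit()
        return model.summary()                   # coefficients, z-stats, p-values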

    The Impact of a Visual Cheap Talk Script in an Online Choice Experiment

    Hypothetical bias causes willingness to pay (WTP) values to be inaccurate and is a prevalent issue in choice experiments. Research has shown that a “cheap talk” script may reduce hypothetical bias; however, it is uncertain which cheap talk script format is the best at controlling hypothetical bias. Therefore, we conduct a choice experiment using a between-subjects design in which half of the participants saw a “visual” cheap talk script and half saw a “text” cheap talk script prior to the choice sets. Random parameters logit model results indicate hypothetical bias was more prevalent when participants saw the visual cheap talk script compared to the more conventional text cheap talk script. Text learners also appeared to be less prone to hypothetical bias than visual learners.
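
    In choice experiments of this kind, willingness to pay for an attribute is typically recovered as the negative ratio of the attribute's coefficient to the price coefficient from the estimated logit model. The coefficient values in the sketch below are placeholders, not estimates from this study.

    # WTP computed the usual way from (hypothetical) logit coefficients:
    # WTP_k = -beta_k / beta_price. The numbers are placeholders.

    def willingness_to_pay(beta_attribute: float, beta_price: float) -> float:
        return -beta_attribute / beta_price

    # e.g. an attribute coefficient of 0.9 and a price coefficient of -0.3
    # would imply a WTP of $3.00 per unit of the attribute.
    print(willingness_to_pay(0.9, -0.3))   # 3.0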

    Detecting complex events in user-generated video using concept classifiers

    Automatic detection of complex events in user-generated videos (UGV) is a challenging task due to its new characteristics, which differ from those of broadcast video. In this work, we first summarize the new characteristics of UGV and then explore how to utilize concept classifiers to recognize complex events in UGV content. The method starts from manually selecting a variety of relevant concepts, followed by constructing classifiers for these concepts. Finally, complex event detectors are learned by using the concatenated probabilistic scores of these concept classifiers as features. Further, we also compare three different fusion operations of probabilistic scores, namely Maximum, Average and Minimum fusion. Experimental results suggest that our method provides promising results. It also shows that Maximum fusion tends to give better performance for most complex events.
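
    The fusion step can be pictured as collapsing frame-level concept probabilities into one video-level feature vector before learning an event detector. The max/avg/min operators below mirror the three fusion operations compared in the paper, while the frame-level scores and the choice of downstream classifier are illustrative assumptions, not the paper's exact setup.

    # Sketch of score fusion for event detection: frame-level concept
    # probabilities (n_frames x n_concepts) are collapsed into one
    # video-level vector with Maximum, Average or Minimum fusion,
    # then fed to an event classifier (here a linear SVM, as an assumption).

    import numpy as np
    from sklearn.svm import LinearSVC

    def fuse_scores(frame_scores: np.ndarray, op: str = "max") -> np.ndarray:
        """frame_scores: (n_frames, n_concepts) concept probabilities for one video."""
        return {"max": frame_scores.max(axis=0),
                "avg": frame_scores.mean(axis=0),
                "min": frame_scores.min(axis=0)}[op]

    def train_event_detector(videos, labels, op="max"):
        X = np.vstack([fuse_scores(v, op) for v in videos])   # one row per video
        return LinearSVC().fit(X, labels)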