6,056 research outputs found

    A large annotated corpus for learning natural language inference

    Full text link
    Understanding entailment and contradiction is fundamental to understanding natural language, and inference about entailment and contradiction is a valuable testing ground for the development of semantic representations. However, machine learning research in this area has been dramatically limited by the lack of large-scale resources. To address this, we introduce the Stanford Natural Language Inference corpus, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning. At 570K pairs, it is two orders of magnitude larger than all other resources of its type. This increase in scale allows lexicalized classifiers to outperform some sophisticated existing entailment models, and it allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.Comment: To appear at EMNLP 2015. The data will be posted shortly before the conference (the week of 14 Sep) at http://nlp.stanford.edu/projects/snli

    Ask, and shall you receive?: Understanding Desire Fulfillment in Natural Language Text

    Full text link
    The ability to comprehend wishes or desires and their fulfillment is important to Natural Language Understanding. This paper introduces the task of identifying if a desire expressed by a subject in a given short piece of text was fulfilled. We propose various unstructured and structured models that capture fulfillment cues such as the subject's emotional state and actions. Our experiments with two different datasets demonstrate the importance of understanding the narrative and discourse structure to address this task

    The SNLI Corpus

    Get PDF
    The SNLI corpus (version 1.0) is a collection of 570k human-written English sentence pairs manually labeled for balanced classification with the labels entailment, contradiction, and neutral, supporting the task of natural language inference (NLI), also known as recognizing textual entailment (RTE). We aim for it to serve both as a benchmark for evaluating representational systems for text, especially including those induced by representation learning methods, as well as a resource for developing NLP models of any kind.We gratefully acknowledge support from a Google Faculty Research Award, a gift from Bloomberg L.P., the Defense Advanced Research Projects Agency (DARPA) Deep Exploration and Filtering of Text (DEFT) Program under Air Force Research Laboratory (AFRL) contract no. FA8750-13-2-0040, the National Science Foundation under grant no. IIS 1159679, and the Department of the Navy, Office of Naval Research, under grant no. N00014-10-1-0109. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of Google, Bloomberg L.P., DARPA, AFRL NSF, ONR, or the US government. We also thank our many excellent Mechanical Turk contributors

    Towards a killer app for the Semantic Web

    Get PDF
    Killer apps are highly transformative technologies that create new markets and widespread patterns of behaviour. IT generally, and the Web in particular, has benefited from killer apps to create new networks of users and increase its value. The Semantic Web community on the other hand is still awaiting a killer app that proves the superiority of its technologies. There are certain features that distinguish killer apps from other ordinary applications. This paper examines those features in the context of the Semantic Web, in the hope that a better understanding of the characteristics of killer apps might encourage their consideration when developing Semantic Web applications

    Visualization of database structures for information retrieval

    Get PDF
    This paper describes the Book House system, which is designed to support children's information retrieval in libraries as part of their education. It is a shareware program available on CD‐ROM or floppy disks, and comprises functionality for database searching as well as for classifying and storing book information in the database. The system concept is based on an understanding of children's domain structures and their capabilities for categorization of information needs in connection with their activities in schools, in school libraries or in public libraries. These structures are visualized in the interface by using metaphors and multimedia technology. Through the use of text, images and animation, the Book House encourages children ‐ even at a very early age ‐ to learn by doing in an enjoyable way, which plays on their previous experiences with computer games. Both words and pictures can be used for searching; this makes the system suitable for all age groups. Even children who have not yet learned to read properly can, by selecting pictures, search for and find those books they would like to have read aloud. Thus, at the very beginning of their school life, they can learn to search for books on their own. For the library community, such a system will provide an extended service which will increase the number of children's own searches and also improve the relevance, quality and utilization of the book collections in the libraries. A market research report on the need for an annual indexing service for books in the Book House format is in preparation by the Danish Library Centre A/S

    Wittgenstein on line / on the line

    Get PDF
    wo independent publishing projects have thoroughly changed the state of Wittgenstein scholarship in recent years. Michael Nedo's 'Wiener Ausgabe'1 offers a traditional critical edition of Wittgenstein's philosophical writings ranging from 1929 up to and including the 'Big Typescript' (1933). Considering the eclectic and - at times - arbitrary editorial policy underlying previous publications from the Nachlass2 Nedo's project offers unprecedented philosophical rigor as well as textual criticism in volumes designed for comfortable reading. A second, more ambitious, attempt at a critical edition is the Bergen electronic edition.3 It is planned to include 4 CD-ROMs, covering the entire range of the philosopher's unpublished writing. Two disks are currently available, comprising all of Wittgenstein's manuscripts from 1929-1939, as well as type-scripts, beginning with 'Notes on Logic' (1913) and leading up to Typescript 226, composed in 1939.\ud \ud Wittgenstein's writings from the Thirties are, therefore, available in independent, reliable printed and electronic editions respectively. Readers can, for the first time, observe the philosopher at work, transferring paragraphs from pocket notebooks to handwritten 'volumes'; picking acceptable remarks to be included in type-scripts that are, at a later stage, cut up into slips of paper which are again annotated, rearranged and put together in further volumes and type-scripts. But this is only half the excitement. The 'Wiener Ausgabe' and the 'Bergen Edition' stake their success on different media, inevitably provoking a comparison between the well known features of printed scholarly editions and the not so familiar realm of digitized texts

    A Data Science Course for Undergraduates: Thinking with Data

    Get PDF
    Data science is an emerging interdisciplinary field that combines elements of mathematics, statistics, computer science, and knowledge in a particular application domain for the purpose of extracting meaningful information from the increasingly sophisticated array of data available in many settings. These data tend to be non-traditional, in the sense that they are often live, large, complex, and/or messy. A first course in statistics at the undergraduate level typically introduces students with a variety of techniques to analyze small, neat, and clean data sets. However, whether they pursue more formal training in statistics or not, many of these students will end up working with data that is considerably more complex, and will need facility with statistical computing techniques. More importantly, these students require a framework for thinking structurally about data. We describe an undergraduate course in a liberal arts environment that provides students with the tools necessary to apply data science. The course emphasizes modern, practical, and useful skills that cover the full data analysis spectrum, from asking an interesting question to acquiring, managing, manipulating, processing, querying, analyzing, and visualizing data, as well communicating findings in written, graphical, and oral forms.Comment: 21 pages total including supplementary material
    • …
    corecore