2,533 research outputs found

    Basic research planning in mathematical pattern recognition and image analysis

    Get PDF
    Fundamental problems encountered while attempting to develop automated techniques for applications of remote sensing are discussed under the following categories: (1) geometric and radiometric preprocessing; (2) spatial, spectral, temporal, syntactic, and ancillary digital image representation; (3) image partitioning, proportion estimation, and error models in object scene interference; (4) parallel processing and image data structures; and (5) continuing studies in polarization; computer architectures and parallel processing; and the applicability of "expert systems" to interactive analysis

    Distributed Representations for Compositional Semantics

    Full text link
    The mathematical representation of semantics is a key issue for Natural Language Processing (NLP). A lot of research has been devoted to finding ways of representing the semantics of individual words in vector spaces. Distributional approaches --- meaning distributed representations that exploit co-occurrence statistics of large corpora --- have proved popular and successful across a number of tasks. However, natural language usually comes in structures beyond the word level, with meaning arising not only from the individual words but also the structure they are contained in at the phrasal or sentential level. Modelling the compositional process by which the meaning of an utterance arises from the meaning of its parts is an equally fundamental task of NLP. This dissertation explores methods for learning distributed semantic representations and models for composing these into representations for larger linguistic units. Our underlying hypothesis is that neural models are a suitable vehicle for learning semantically rich representations and that such representations in turn are suitable vehicles for solving important tasks in natural language processing. The contribution of this thesis is a thorough evaluation of our hypothesis, as part of which we introduce several new approaches to representation learning and compositional semantics, as well as multiple state-of-the-art models which apply distributed semantic representations to various tasks in NLP.Comment: DPhil Thesis, University of Oxford, Submitted and accepted in 201

    Table-to-Text: Generating Descriptive Text for Scientific Tables from Randomized Controlled Trials

    Get PDF
    Unprecedented amounts of data have been generated in the biomedical domain, and the bottleneck for biomedical research has shifted from data generation to data management, interpretation, and communication. Therefore, it is highly desirable to develop systems to assist in text generation from biomedical data, which will greatly improve the dissemination of scientific findings. However, very few studies have investigated issues of data-to-text generation in the biomedical domain. Here I present a systematic study for generating descriptive text from tables in randomized clinical trials (RCT) articles, which includes: (1) an information model for representing RCT tables; (2) annotated corpora containing pairs of RCT table and descriptive text, and labeled structural and semantic information of RCT tables; (3) methods for recognizing structural and semantic information of RCT tables; (4) methods for generating text from RCT tables, evaluated by a user study on three aspects: relevance, grammatical quality, and matching. The proposed hybrid text generation method achieved a low bilingual evaluation understudy (BLEU) score of 5.69; but human review achieved scores of 9.3, 9.9 and 9.3 for relevance, grammatical quality and matching, respectively, which are comparable to review of original human-written text. To the best of our knowledge, this is the first study to generate text from scientific tables in the biomedical domain. The proposed information model, labeled corpora and developed methods for recognizing tables and generating descriptive text could also facilitate other biomedical and informatics research and applications

    Eyes Wide Open: an interactive learning method for the design of rule-based systems

    Get PDF
    International audienceWe present in this paper a new general method, the Eyes Wide Open method (EWO) for the design of rule-based document recognition systems. Our contribution is to introduce a learning procedure, through machine learning techniques, in interaction with the user to design the recognition system. Therefore, and unlike many approaches that are manually designed, ours can easily adapt to a new type of documents while taking advantage of the expressiveness of rule-based systems and their ability to convey the hierarchical structure of a document. The EWO method is independent of any existing recognition system. An automatic analysis of an annotated corpus, guided by the user, is made to help the adaption of the recognition system to a new kind of document. The user will then bring sense to the automatically extracted information. In this paper, we validate EWO by producing two rule-based systems: one for the Mau-rdor international competition, on a heterogeneous corpus of documents, containing handwritten and printed documents, written in different languages and another one for the RIMES competition corpus, a homogeneous corpus of French handwritten business letters. On the RIMES corpus, our method allows an assisted design of a grammatical description that gives better results than all the previously proposed statistical systems
    corecore