5,320 research outputs found

    A Hierarchical Neural Autoencoder for Paragraphs and Documents

    Full text link
    Natural language generation of coherent long texts like paragraphs or longer documents is a challenging problem for recurrent networks models. In this paper, we explore an important step toward this generation task: training an LSTM (Long-short term memory) auto-encoder to preserve and reconstruct multi-sentence paragraphs. We introduce an LSTM model that hierarchically builds an embedding for a paragraph from embeddings for sentences and words, then decodes this embedding to reconstruct the original paragraph. We evaluate the reconstructed paragraph using standard metrics like ROUGE and Entity Grid, showing that neural models are able to encode texts in a way that preserve syntactic, semantic, and discourse coherence. While only a first step toward generating coherent text units from neural models, our work has the potential to significantly impact natural language generation and summarization\footnote{Code for the three models described in this paper can be found at www.stanford.edu/~jiweil/

    Research and Development Workstation Environment: the new class of Current Research Information Systems

    Get PDF
    Against the backdrop of the development of modern technologies in the field of scientific research the new class of Current Research Information Systems (CRIS) and related intelligent information technologies has arisen. It was called - Research and Development Workstation Environment (RDWE) - the comprehensive problem-oriented information systems for scientific research and development lifecycle support. The given paper describes design and development fundamentals of the RDWE class systems. The RDWE class system's generalized information model is represented in the article as a three-tuple composite web service that include: a set of atomic web services, each of them can be designed and developed as a microservice or a desktop application, that allows them to be used as an independent software separately; a set of functions, the functional filling-up of the Research and Development Workstation Environment; a subset of atomic web services that are required to implement function of composite web service. In accordance with the fundamental information model of the RDWE class the system for supporting research in the field of ontology engineering - the automated building of applied ontology in an arbitrary domain area, scientific and technical creativity - the automated preparation of application documents for patenting inventions in Ukraine was developed. It was called - Personal Research Information System. A distinctive feature of such systems is the possibility of their problematic orientation to various types of scientific activities by combining on a variety of functional services and adding new ones within the cloud integrated environment. The main results of our work are focused on enhancing the effectiveness of the scientist's research and development lifecycle in the arbitrary domain area.Comment: In English, 13 pages, 1 figure, 1 table, added references in Russian. Published. Prepared for special issue (UkrPROG 2018 conference) of the scientific journal "Problems of programming" (Founder: National Academy of Sciences of Ukraine, Institute of Software Systems of NAS Ukraine

    Mapping Big Data into Knowledge Space with Cognitive Cyber-Infrastructure

    Full text link
    Big data research has attracted great attention in science, technology, industry and society. It is developing with the evolving scientific paradigm, the fourth industrial revolution, and the transformational innovation of technologies. However, its nature and fundamental challenge have not been recognized, and its own methodology has not been formed. This paper explores and answers the following questions: What is big data? What are the basic methods for representing, managing and analyzing big data? What is the relationship between big data and knowledge? Can we find a mapping from big data into knowledge space? What kind of infrastructure is required to support not only big data management and analysis but also knowledge discovery, sharing and management? What is the relationship between big data and science paradigm? What is the nature and fundamental challenge of big data computing? A multi-dimensional perspective is presented toward a methodology of big data computing.Comment: 59 page

    A semantic partition based text mining model for document classification.

    Get PDF
    corecore