
    Autoplot: A browser for scientific data on the web

    Autoplot is software developed for the Virtual Observatories in Heliophysics to provide intelligent, automated plotting for many typical data products stored in a variety of file formats or databases. Autoplot has proven to be a flexible tool for exploring, accessing, and viewing data resources as typically found on the web, usually in the form of a directory of data files, each containing multiple parameters. Data from a data source are abstracted into a common internal data model called QDataSet. Autoplot is built from individually useful components that can be extended and reused to create specialized data handling and analysis applications, and it is already used in a variety of science visualization and analysis applications. Although originally developed for viewing heliophysics-related time series and spectrograms, its flexible and generic data representation model makes it potentially useful for the Earth sciences. Comment: 16 pages

    Matching Natural Language Sentences with Hierarchical Sentence Factorization

    Semantic matching of natural language sentences, or identifying the relationship between two sentences, is a core research problem underlying many natural language tasks. Depending on whether training data is available, prior research has proposed both unsupervised distance-based schemes and supervised deep learning schemes for sentence matching. However, previous approaches either omit or fail to fully utilize the ordered, hierarchical, and flexible structures of language objects, as well as the interactions between them. In this paper, we propose Hierarchical Sentence Factorization, a technique to factorize a sentence into a hierarchical representation, with the components at each scale reordered into a "predicate-argument" form. The proposed sentence factorization technique leads to: 1) a new unsupervised distance metric that calculates the semantic distance between a pair of text snippets by solving a penalized optimal transport problem while preserving the logical relationship of words in the reordered sentences, and 2) new multi-scale deep learning models for supervised semantic training, based on factorized sentence hierarchies. We apply our techniques to text-pair similarity estimation and text-pair relationship classification tasks on multiple datasets, including STSbenchmark, the Microsoft Research paraphrase identification (MSRP) dataset, and the SICK dataset. Extensive experiments show that the proposed hierarchical sentence factorization can significantly improve the performance of existing unsupervised distance-based metrics as well as multiple supervised deep learning models based on the convolutional neural network (CNN) and long short-term memory (LSTM). Comment: Accepted by WWW 2018, 10 pages