
    Learning Criteria and Evaluation Metrics for Textual Transfer between Non-Parallel Corpora

    We consider the problem of automatically generating textual paraphrases with modified attributes or stylistic properties, focusing on the setting without parallel data (Hu et al., 2017; Shen et al., 2017). This setting poses challenges for learning and evaluation. We show that the metric of post-transfer classification accuracy is insufficient on its own, and propose additional metrics based on semantic content preservation and fluency. For reliable evaluation, all three metric categories must be taken into account. We contribute new loss functions and training strategies to address the new metrics. Semantic preservation is addressed by adding a cyclic consistency loss and a loss based on paraphrase pairs, while fluency is improved by integrating losses based on style-specific language models. Automatic and manual evaluation show large improvements over the baseline method of Shen et al. (2017). Our hope is that these losses and metrics can be general and useful tools for a range of textual transfer settings without parallel corpora.

    Using a novel source-localized phase regressor technique for evaluation of the vascular contribution to semantic category area localization in BOLD fMRI.

    Numerous studies have shown that gradient-echo blood oxygen level dependent (BOLD) fMRI is biased toward large draining veins. However, the impact of this large vein bias on the localization and characterization of semantic category areas has not been examined. Here we address this issue by comparing standard magnitude measures of BOLD activity in the Fusiform Face Area (FFA) and Parahippocampal Place Area (PPA) to those obtained using a novel method that suppresses the contribution of large draining veins: source-localized phase regressor (sPR). Unlike previous suppression methods that utilize the phase component of the BOLD signal, sPR yields robust and unbiased suppression of large draining veins even in voxels with no task-related phase changes. This is confirmed in ideal simulated data as well as in FFA/PPA localization data from four subjects. It was found that approximately 38% of right PPA, 14% of left PPA, 16% of right FFA, and 6% of left FFA voxels predominantly reflect signal from large draining veins. Surprisingly, with the contributions from large veins suppressed, semantic category representation in PPA actually tends to be lateralized to the left rather than the right hemisphere. Furthermore, semantic category areas larger in volume and higher in fSNR were found to have more contributions from large veins. These results suggest that previous studies using gradient-echo BOLD fMRI were biased toward semantic category areas that receive relatively greater contributions from large veins

    Structure-semantics interplay in complex networks and its effects on the predictability of similarity in texts

    There are different ways to define similarity for grouping similar texts into clusters, as the concept of similarity may depend on the purpose of the task. For instance, in topic extraction similar texts mean those within the same semantic field, whereas in author recognition stylistic features should be considered. In this study, we introduce ways to classify texts employing concepts of complex networks, which may be able to capture syntactic, semantic and even pragmatic features. The interplay between the various metrics of the complex networks is analyzed with three applications, namely identification of machine translation (MT) systems, evaluation of quality of machine translated texts and authorship recognition. We shall show that topological features of the networks representing texts can enhance the ability to identify MT systems in particular cases. For evaluating the quality of MT texts, on the other hand, high correlation was obtained with methods capable of capturing the semantics. This was expected because the gold standards used are themselves based on word co-occurrence. Notwithstanding, the Katz similarity, which involves both semantics and structure in the comparison of texts, achieved the highest correlation with the NIST measurement, indicating that in some cases the combination of both approaches can improve the ability to quantify quality in MT. In authorship recognition, again the topological features were relevant in some contexts, though for the books and authors analyzed good results were obtained with semantic features as well. Because hybrid approaches encompassing semantic and topological features have not been extensively used, we believe that the methodology proposed here may be useful to enhance text classification considerably, as it combines well-established strategies.

    BLEU is Not Suitable for the Evaluation of Text Simplification

    BLEU is widely considered to be an informative metric for text-to-text generation, including Text Simplification (TS). TS includes both lexical and structural aspects. In this paper we show that BLEU is not suitable for the evaluation of sentence splitting, the major structural simplification operation. We manually compiled a sentence splitting gold standard corpus containing multiple structural paraphrases, and performed a correlation analysis with human judgments. We find low or no correlation between BLEU and the grammaticality and meaning preservation parameters where sentence splitting is involved. Moreover, BLEU often negatively correlates with simplicity, essentially penalizing simpler sentences. Comment: Accepted to EMNLP 2018 (Short papers).

    Robots for Exploration, Digital Preservation and Visualization of Archeological Sites

    Monitoring and conservation of archaeological sites are important activities necessary to prevent damage or to perform restoration on cultural heritage. Standard techniques, like mapping and digitizing, are typically used to document the status of such sites. While these tasks are normally accomplished manually by humans, this is not possible when dealing with hard-to-access areas. For example, due to the possibility of structural collapses, underground tunnels like catacombs are considered highly unstable environments. Moreover, they are full of radon, a radioactive gas, which limits the presence of people to only a few minutes. The progress recently made in the fields of artificial intelligence and robotics has opened new possibilities for mobile robots to be used in locations where humans are not allowed to enter. The ROVINA project aims at developing autonomous mobile robots to make the monitoring of archaeological sites faster, cheaper and safer. ROVINA will be evaluated on the catacombs of Priscilla (in Rome) and S. Gennaro (in Naples).

    A semantic-based platform for the digital analysis of architectural heritage

    This essay focuses on the fields of architectural documentation and digital representation. We present research on the development of an information system at the scale of architecture, taking into account the relationships that can be established between the representation of buildings (shape, dimension, state of conservation, hypothetical restitution) and heterogeneous information from various fields (such as technical, documentary or historical information). The proposed approach aims to organize multiple representations (and associated information) around a semantic description model with the goal of defining a system for the multi-field analysis of buildings.

    Dual Logic Concepts based on Mathematical Morphology in Stratified Institutions: Applications to Spatial Reasoning

    Several logical operators are defined as dual pairs, in different types of logics. Such dual pairs of operators also occur in other algebraic theories, such as mathematical morphology. Based on this observation, this paper proposes to define, at the abstract level of institutions, a pair of abstract dual and logical operators as morphological erosion and dilation. Standard quantifiers and modalities are then derived from these two abstract logical operators. These operators are studied both on sets of states and sets of models. To cope with the lack of an explicit set of states in institutions, the proposed abstract logical dual operators are defined in an extension of institutions, the stratified institutions, which take into account the notion of open sentences, the satisfaction of which is parametrized by sets of states. A hint on the potential interest of the proposed framework for spatial reasoning is also provided. Comment: 36 pages.