510 research outputs found

    Establishing Robustness of a Spatial Dataset in a Tolerance-Based Vector Model

    Spatial data are usually described through a vector model in which geometries are represented by a set of coordinates embedded into a Euclidean space. The use of a finite representation, instead of the real numbers theoretically required, causes many robustness problems which are well known in the literature. Such problems are made even worse in a distributed context, where data is exchanged between different systems and several perturbations can be introduced in the data representation. In this context, a spatial dataset is said to be robust if the evaluation of the spatial relations existing among its objects can be performed in different systems, always producing the same result. In order to discuss the robustness of a spatial dataset, two implementation models have to be distinguished, since they determine different ways to evaluate the relations existing among geometric objects: the identity model and the tolerance model. The robustness of a dataset in the identity model has been widely discussed in [Belussi et al., 2012, Belussi et al., 2013, Belussi et al., 2015a] and some algorithms of the Snap Rounding (SR) family [Hobby, 1999, Halperin and Packer, 2002, Packer, 2008, Belussi et al., 2015b] can be successfully applied in that context. Conversely, this problem has been less explored in the tolerance model. The aim of this paper is to propose an algorithm, inspired by those of the SR family, for establishing or restoring the robustness of a vector dataset in the tolerance model. The main ideas are to introduce an additional operation which spreads geometries instead of snapping them, in order to preserve the original relation between them, and to use a tolerance region for such an operation instead of a single snapping location. Finally, some experiments on real-world datasets are presented, which confirm that the proposed algorithm can establish the robustness of a dataset.

    CTC-based Compression for Direct Speech Translation

    Previous studies demonstrated that a dynamic phone-informed compression of the input audio is beneficial for speech translation (ST). However, they required a dedicated model for phone recognition and did not test this solution for direct ST, in which a single model translates the input audio into the target language without intermediate representations. In this work, we propose the first method able to perform a dynamic compression of the input in direct ST models. In particular, we exploit the Connectionist Temporal Classification (CTC) to compress the input sequence according to its phonetic characteristics. Our experiments demonstrate that our solution brings a 1.3-1.5 BLEU improvement over a strong baseline on two language pairs (English-Italian and English-German), while at the same time reducing the memory footprint by more than 10%. Comment: Accepted at EACL202
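
The compression idea in this abstract can be illustrated with a minimal sketch. It assumes that a per-frame label has already been obtained (for example, the most likely CTC output for each frame) and that consecutive frames sharing a label are merged by averaging. The merging strategy, the function name `ctc_compress`, and the toy data are assumptions made for illustration; the abstract does not specify them.

```python
from itertools import groupby


def ctc_compress(frames, labels):
    """Merge runs of consecutive frames that share the same predicted label.

    frames: list of equal-length feature vectors (lists of floats).
    labels: one predicted label per frame (hypothetical CTC argmax output).
    Each run of identical labels is replaced by the mean of its frames.
    Averaging is an assumption of this sketch, not a detail from the abstract.
    """
    if len(frames) != len(labels):
        raise ValueError("frames and labels must have the same length")

    compressed = []
    start = 0
    for _, run in groupby(labels):
        length = len(list(run))
        chunk = frames[start:start + length]
        # Column-wise mean over the frames in this run.
        compressed.append([sum(column) / length for column in zip(*chunk)])
        start += length
    return compressed


# Toy example: three frames, the first two predicted as the same label.
toy_frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
toy_labels = ["a", "a", "b"]
toy_result = ctc_compress(toy_frames, toy_labels)
```

In this toy case the sequence shrinks from three vectors to two, which is the kind of length reduction that would lower memory use in later layers.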

    A template-based approach for the specification of 3D topological constraints

    Several different models have been defined in the literature for the definition of 3D scenes that include a geometrical representation of objects together with a semantic classification of them. Such semantic characterization encapsulates important details about the object properties and behavior and often includes spatial relations that are defined only implicitly or through natural language, such as “an external access shall be in touch with the building only when it is classified as a direct access”. The problem of ensuring the coherence between geometric and semantic information is well known in the literature. Many attempts exist which try to extend the OCL to allow the representation of spatial integrity constraints in a UML model. However, this approach requires a deep knowledge of the OCL formalism and the implementation of ad-hoc procedures to validate the constraints specified at the conceptual level. Therefore, a new approach is needed that helps designers to define complex OCL constraints and at the same time allows the automatic generation of the code to test them on a given dataset. The aim of this paper is to propose a set of predefined templates to express, on the classes of a UML data model, a family of 3D spatial integrity constraints based on topological relations, without requiring domain experts to know any formal language and while supporting the automatic translation of the constraints into validation procedures.

    Snap Rounding with Restore: an Algorithm for Producing Robust Geometric Datasets

    This paper presents a new algorithm called Snap Rounding with Restore (SRR), which aims to make geometric datasets robust and to increase the quality of geometric approximation and the preservation of topological structure. It is based on the well-known Snap Rounding algorithm, but improves it by eliminating from the snap rounded arrangement the configurations in which the distance between a vertex and a non-incident edge is smaller than half the width of a pixel of the rounding grid. Therefore, the goal of SRR is exactly the same as the goal of another algorithm, Iterated Snap Rounding (ISR), and of its evolution, Iterated Snap Rounding with Bounded Drift (ISRBD). However, SRR produces an output with a quality of approximation that is on average better than that of ISRBD, both in terms of the distance from the original segments and of the conservation of their topological structure. The paper also reports some cases where ISRBD, notwithstanding the bounded drift, produces strong topological modifications while SRR does not. A statistical analysis on a large collection of input datasets confirms these differences. It follows that the proposed Snap Rounding with Restore algorithm is suitable for applications that require robustness, a guaranteed geometric approximation, and a good topological approximation.
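
The configuration that this abstract says SRR eliminates (a vertex closer than half a pixel width to a non-incident edge) can be detected with a simple distance test. The sketch below only illustrates that check; it is not the SRR algorithm itself, and the function names and the tuple-based representation of points and edges are assumptions made for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the closed segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the supporting line and clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def violates_half_pixel_rule(vertex, edge, pixel_width):
    """True if vertex is not an endpoint of edge and lies closer to it
    than half the pixel width of the rounding grid."""
    a, b = edge
    if vertex == a or vertex == b:
        return False  # incident edges are excluded from the check
    return point_segment_distance(vertex, a, b) < pixel_width / 2


# Toy example on a grid with unit-width pixels.
toy_edge = ((0.0, 0.0), (2.0, 0.0))
near = violates_half_pixel_rule((1.0, 0.25), toy_edge, 1.0)
far = violates_half_pixel_rule((1.0, 0.75), toy_edge, 1.0)
```

Here the first vertex lies 0.25 from the edge and is flagged, while the second lies 0.75 away and is not.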

    One-step, one-lane chemical DNA sequencing by N-methylformamide in the presence of metal ions

    We report on a chemical method that allows DNA sequencing by a single reaction. It is based on treatment of 5′-end-labeled DNA with N-methylformamide in the presence of manganese. This method keeps the manipulation of samples to a minimum and consists of a single chemical step that requires about 30 minutes to complete base degradation, phosphodiester bond cleavage and denaturation. Examples of one-treatment, one-lane DNA sequencing of both radioactively and fluorescently 5′-end-labeled DNAs are reported.

    No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation

    Automatic speech recognition (ASR) systems are known to be sensitive to the sociolinguistic variability of speech data, in which gender plays a crucial role. This can result in disparities in recognition accuracy between male and female speakers, primarily due to the under-representation of the latter group in the training data. While several solutions have been proposed in the context of hybrid ASR models, the gender bias issue has not been explicitly addressed in end-to-end neural architectures. To fill this gap, we propose a data augmentation technique that manipulates the fundamental frequency (f0) and formants. This technique reduces the data unbalance among genders by simulating voices of the under-represented female speakers and increases the variability within each gender group. Experiments on spontaneous English speech show that our technique yields a relative WER improvement of up to 9.87% for utterances by female speakers, with larger gains for the least-represented f0 ranges. Comment: Accepted at ASRU 202

    Towards a methodology for evaluating automatic subtitling

    In response to the increasing interest in automatic subtitling, this EAMT-funded project aimed at collecting subtitle post-editing data in a real use case scenario where professional subtitlers edit automatically generated subtitles. The post-editing setting includes, for the first time, automatic generation of timestamps and segmentation, and focuses on the effect of timing and segmentation edits on the post-editing process. The collected data will serve as the basis for investigating how subtitlers interact with automatic subtitling and for devising evaluation methods geared to the multimodal nature and formal requirements of subtitling.

    Untargeted Metabolomics Analysis of the Orchid Species Oncidium sotoanum Reveals the Presence of Rare Bioactive C-Diglycosylated Chrysin Derivatives

    Plants are valuable sources of secondary metabolites with pharmaceutical properties, but only a small proportion of plant life has been actively exploited for medicinal purposes to date. Underexplored plant species are therefore likely to contain novel bioactive compounds. In this study, we investigated the content of secondary metabolites in the flowers, leaves and pseudobulbs of the orchid Oncidium sotoanum using an untargeted metabolomics approach. We observed the strong accumulation of C-diglycosylated chrysin derivatives, which are rarely found in nature. Further characterization revealed evidence of antioxidant activity (FRAP and DPPH assays) and potential activity against neurodegenerative disorders (MAO-B inhibition assay) depending on the specific molecular structure of the metabolites. Natural product bioprospecting in underexplored plant species based on untargeted metabolomics can therefore help to identify novel chemical structures with diverse pharmaceutical properties.