7 research outputs found

    Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe

    Privacy concerns have attracted increasing attention in data-driven products due to the tendency of machine learning models to memorize sensitive training data. Generating synthetic versions of such data with a formal privacy guarantee, such as differential privacy (DP), provides a promising path to mitigating these privacy concerns, but previous approaches in this direction have typically failed to produce synthetic data of high quality. In this work, we show that a simple and practical recipe in the text domain is effective: simply fine-tuning a pretrained generative language model with DP enables the model to generate useful synthetic text with strong privacy protection. Through extensive empirical analyses on both benchmark and private customer data, we demonstrate that our method produces synthetic text that is competitive in terms of utility with its non-private counterpart, while providing strong protection against potential privacy leakages. Comment: ACL 2023 Main Conference (Honorable Mention)

    Predicative possession in Medieval Slavic Bible translations Predicative Possession in Early Biblical Slavic

    Late Proto-Slavic (LPS) had an inventory of three constructions for expressing predicative possession. Using the earliest Slavic Bible translations from Old Church Slavic (OCS), and to a lesser degree Old Czech, a number of conclusions can be drawn about the status of predicative possession for LPS. The verb iměti ‘have’ was the most frequent and least syntactically and semantically restricted predicative possessive construction (PPC). Existential PPCs with a dative possessor appear primarily with kinship relations, abstract possessums, and in a number of other fixed construction types; existential PPCs with the possessor in an u + genitive prepositional phrase primarily appear with concrete and countable possessums. Both existential PPCs call for an animate, most often pronominal, possessor. The u + genitive was the rarest type of PPC in LPS, though it had undoubtedly grammaticalized as a PPC.

    The competing roles of SV(O) and VS(O) word orders in Xoždenie igumena Daniila

    Though both Early East Slavic (EES) and Modern Russian have a relatively free word order, the distribution and function of word order in EES are quite distinct from those in Modern Russian. This paper is a study of word order within a single EES text, Xoždenie igumena Daniila, which is split into two major subdivisions: travel guide and narrative. In the travel guide, existential, stance, and motion verbs occur more frequently in VS order, and VS(O) order is more frequent overall; copular and transitive verbs occur more frequently in SV(O) order. Instances of the less frequent word order for the clause type occur as a result of specific conditioning contexts. The narrative, in contrast, has proportionally more SV(O) clauses and transitive verbs than the travel guide.

    Synchrony and diachrony of the Bulgarian predicative possessive constructions

    The paper investigates the system of predicative possession in Bulgarian from a Slavic and Balkan perspective. The constructions are described in terms of their semantic and syntactic properties, and several generalizations are made about the distribution of possessive features such as alienable vs inalienable and permanent vs temporary. In the second part of the paper, I bring forward some observations about the diachrony of the Bulgarian predicative possessive constructions and their potential (Slavic or Balkan) source.