
    Synonymy and Polysemy in Legal Terminology and Their Applications to Bilingual and Bijural Translation

    The paper focuses on synonymy and polysemy in the language of law in English-speaking countries. The introductory part briefly outlines the process of legal translation and tackles the specificity of bijural translation. Then, the traditional understanding of what a term is and its application to legal terminology is considered; three different levels of vocabulary used in legal texts are outlined and their relevance to bijural translation explained. Next, synonyms in the language of law are considered with respect to their intension and distribution, and examples are given to show that most expressions or phrases which are interchangeable synonyms in the general language should be treated carefully in legal translation. Finally, polysemes in legal terminology are discussed and examples are given to illustrate problems potentially encountered by translators.

    Translating near-synonyms: Possibilities and preferences in the interlingua

    This paper argues that an interlingual representation must explicitly represent some parts of the meaning of a situation as possibilities (or preferences), not as necessary or definite components of meaning (or constraints). Possibilities enable the analysis and generation of nuance, something required for faithful translation. Furthermore, the representation of the meaning of words, especially of near-synonyms, is crucial, because it specifies which nuances words can convey in which contexts. (Comment: 8 pages, LaTeX2e, 1 eps figure; uses colacl.sty, epsfig.sty, avm.sty, times.sty.)
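
    As a rough illustration of the constraints-versus-preferences distinction the abstract draws, here is a minimal sketch in Python; the class, the nuance names, and the near-synonym data are all invented for illustration and are not the paper's actual interlingua.

```python
from dataclasses import dataclass, field

@dataclass
class InterlingualConcept:
    """Hypothetical interlingual node: 'constraints' are necessary components
    of meaning; 'preferences' are nuances that may be expressed, with
    strengths in [0, 1]."""
    core: str
    constraints: dict = field(default_factory=dict)
    preferences: dict = field(default_factory=dict)

# Invented nuance profiles for two near-synonyms of 'error'.
NUANCES = {
    "mistake": {"blameworthiness": 0.3},
    "blunder": {"blameworthiness": 0.8, "stupidity": 0.7},
}

concept = InterlingualConcept(
    core="error",
    constraints={"agent": "person"},          # must hold in any realization
    preferences={"blameworthiness": 0.75},    # convey only if the lexicon allows
)

def preference_fit(word, concept):
    """Negative distance between a word's conveyable nuances and the
    concept's preferences: higher means a better lexical choice."""
    nuances = NUANCES[word]
    return -sum(abs(nuances.get(k, 0.0) - v)
                for k, v in concept.preferences.items())

print(max(NUANCES, key=lambda w: preference_fit(w, concept)))  # -> blunder
```

    The point of the sketch is that the constraints must hold in any realization, while the preferences only bias lexical choice toward the near-synonym whose conveyable nuances best match.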

    The polysemy of the Spanish verb sentir: a behavioral profile analysis

    This study investigates the intricate polysemy of the Spanish perception verb sentir (‘feel’) which, like the more-studied visual perception verbs ver (‘see’) and mirar (‘look’), displays a wide gamut of semantic uses in various syntactic environments. The investigation is based on a corpus-based behavioral profile (BP) analysis. Besides its methodological merits as a quantitative, systematic and verifiable approach to the study of meaning and to polysemy in particular, the BP analysis offers qualitative usage-based evidence for cognitive linguistic theorizing. With regard to the polysemy of sentir, the following questions were addressed: (1) What is the prototype of each cluster of senses? (2) How are the different senses structured: how many senses should be distinguished, i.e. which senses cluster together and which senses should be kept separate? (3) Which senses are more related to each other and which are highly distinguishable? (4) What morphosyntactic variables make them more or less distinguishable? The results show that two significant meaning clusters can be distinguished, which coincide with the division between the middle voice uses (sentirse) and the other uses (sentir). Within these clusters, a number of meaningful subclusters emerge, which seem to coincide largely with the more general semantic categories of physical, cognitive and emotional perception.
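
    The core of a behavioral profile analysis of this kind can be sketched in a few lines: each annotated corpus use of sentir is coded for morphosyntactic ID tags, the per-sense tag frequencies form the profile, and hierarchical clustering over the profiles suggests which senses group together. The tags, sense labels, and counts below are toy stand-ins, not the study's data.

```python
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster

# Toy behavioral profile data: each row is one annotated corpus use of
# 'sentir'; columns are illustrative ID tags (morphosyntactic variables).
uses = pd.DataFrame([
    # sense,      middle_voice, human_subject, clausal_complement
    ("emotion",   1, 1, 0),
    ("emotion",   1, 1, 0),
    ("cognition", 0, 1, 1),
    ("cognition", 0, 1, 1),
    ("physical",  0, 1, 0),
    ("physical",  1, 1, 0),
], columns=["sense", "middle_voice", "human_subject", "clausal_complement"])

# Behavioral profile = per-sense relative frequency of each ID tag.
profile = uses.groupby("sense").mean()

# Hierarchical clustering over the profiles shows which senses group together
# and which stay distinguishable.
tree = linkage(profile.values, method="ward")
print(profile)
print(fcluster(tree, t=2, criterion="maxclust"))  # two clusters of senses
```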

    Antecedent selection techniques for high-recall coreference resolution

    We investigate methods to improve recall in coreference resolution by also trying to resolve those definite descriptions for which no earlier mention of the referent shares the same lexical head (coreferent bridging). The problem, which is notably harder than identifying coreference relations among mentions that share a lexical head, has been tackled with several rather different approaches, and we attempt to provide a meaningful classification along with a quantitative comparison. Based on the different merits of the methods, we discuss possibilities to improve them and show how they can be effectively combined.
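
    A minimal sketch of the kind of antecedent ranking involved in coreferent bridging, under the assumption that head similarity comes from some semantic resource; the similarity table, scoring weights, and function names below are invented for illustration, not the paper's model.

```python
# Illustrative head-to-head semantic similarities (a real system would draw
# these from WordNet, distributional vectors, or pattern-mined data).
HEAD_SIM = {
    ("automaker", "company"): 0.8,
    ("vehicle", "car"): 0.9,
}

def head_sim(h1, h2):
    if h1 == h2:                      # same-head coreference: the easy case
        return 1.0
    return HEAD_SIM.get((h1, h2), HEAD_SIM.get((h2, h1), 0.0))

def select_antecedent(mention_head, candidates, threshold=0.5):
    """candidates: list of (head, sentence_distance), nearest first.
    Combine head similarity with a mild recency penalty; return None if no
    candidate clears the threshold, trading recall against precision."""
    scored = [(head, head_sim(mention_head, head) - 0.05 * dist)
              for head, dist in candidates]
    best = max(scored, key=lambda s: s[1])
    return best if best[1] >= threshold else None

# 'the company' has no same-head antecedent: bridge it to 'automaker'.
print(select_antecedent("company", [("vehicle", 1), ("automaker", 2)]))
```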

    Russian Lexicographic Landscape: a Tale of 12 Dictionaries

    The paper reports on a quantitative analysis of 12 Russian dictionaries at three levels: 1) headwords: the size and overlap of word lists, coverage of large corpora, and presence of neologisms; 2) synonyms: overlap of synsets in different dictionaries; 3) definitions: distribution of definition lengths and numbers of senses, as well as textual similarity of same-headword definitions in different dictionaries. The total amount of data in the study is 805,900 dictionary entries, 892,900 definitions, and 84,500 synsets. The study reveals multiple connections and mutual influences between dictionaries, uncovers differences between modern electronic and traditional printed resources, and suggests directions for developing new and improving existing lexical semantic resources.
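
    One simple way to quantify the headword-level overlap the abstract mentions is pairwise Jaccard similarity of the word lists; the dictionaries and headwords below are toy stand-ins, and the choice of Jaccard is ours, not necessarily the paper's metric.

```python
from itertools import combinations

# Toy word lists standing in for the dictionaries' headword inventories.
dictionaries = {
    "DictA": {"дом", "кот", "бежать", "синий"},
    "DictB": {"дом", "кот", "собака", "синий"},
    "DictC": {"дом", "нейросеть", "кот"},   # e.g. a resource with neologisms
}

def jaccard(a, b):
    """Overlap of two headword sets: |A ∩ B| / |A ∪ B|."""
    return len(a & b) / len(a | b)

# Pairwise overlap across all dictionaries.
for (n1, w1), (n2, w2) in combinations(dictionaries.items(), 2):
    print(f"{n1} vs {n2}: {jaccard(w1, w2):.2f}")
```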

    Distributional Measures of Semantic Distance: A Survey

    The ability to mimic human notions of semantic distance has widespread applications. Some measures rely only on raw text (distributional measures) and some rely on knowledge sources such as WordNet. Although extensive studies have been performed to compare WordNet-based measures with human judgment, the use of distributional measures as proxies to estimate semantic distance has received little attention. Even though they have traditionally performed poorly when compared to WordNet-based measures, they lay claim to certain uniquely attractive features, such as their applicability in resource-poor languages and their ability to mimic both semantic similarity and semantic relatedness. Therefore, this paper presents a detailed study of distributional measures. Particular attention is paid to the strengths and limitations of both WordNet-based and distributional measures, and to how distributional measures of distance can be brought more in line with human notions of semantic distance. We conclude with a brief discussion of recent work on hybrid measures.
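
    The simplest distributional measure can be sketched as cosine similarity over co-occurrence count vectors; the toy corpus and window size below are ours, and practical measures typically weight the counts (e.g. with PMI) rather than using them raw.

```python
from collections import Counter
from math import sqrt

corpus = ("the cat chased the mouse and the dog chased the cat "
          "the mouse feared the cat and the mouse hid").split()

def cooccurrence_vector(target, tokens, window=2):
    """Count words appearing within +/-window of each occurrence of target."""
    vec = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if j != i:
                    vec[tokens[j]] += 1
    return vec

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

cat, mouse = (cooccurrence_vector(w, corpus) for w in ("cat", "mouse"))
print(f"distributional similarity(cat, mouse) = {cosine(cat, mouse):.2f}")
```

    Words that occur in similar contexts end up with similar vectors, which is why such measures capture relatedness as well as similarity.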

    Segmenting broadcast news streams using lexical chains

    In this paper we propose a coarse-grained NLP approach to text segmentation based on the analysis of lexical cohesion within text. Most work in this area has focused on the discovery of textual units that discuss subtopic structure within documents. In contrast, our segmentation task requires the discovery of topical units of text, i.e. distinct news stories from broadcast news programmes. Our system, SeLeCT, first builds a set of lexical chains in order to model the discourse structure of the text. A boundary detector is then used to search for breaking points in this structure, indicated by patterns of cohesive strength and weakness within the text. We evaluate this technique on a test set of concatenated CNN news story transcripts and compare it with an established statistical approach to segmentation called TextTiling.
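
    A crude stand-in for this idea, not the SeLeCT system itself: treat a "chain" as the span between a word's first and last mention, and posit a story boundary at the gap crossed by the fewest chains, i.e. the point of weakest cohesion. A real system would filter stopwords and link words through thesaural relations, not just repetition.

```python
def chain_spans(sentences):
    """Span (first, last sentence index) of every word repeated across
    sentences; single-mention words form no chain."""
    first, last = {}, {}
    for i, sent in enumerate(sentences):
        for w in set(sent.lower().split()):
            first.setdefault(w, i)
            last[w] = i
    return [(s, e) for s, e in ((first[w], last[w]) for w in first) if e > s]

def weakest_boundary(sentences):
    spans = chain_spans(sentences)
    # Cohesion at gap g (between sentences g and g+1) = chains spanning it.
    cohesion = [sum(s <= g < e for s, e in spans)
                for g in range(len(sentences) - 1)]
    return cohesion.index(min(cohesion)) + 1, cohesion

sents = ["the senate passed the budget bill",
         "the bill funds the senate programs",
         "a storm hit the coast overnight",
         "the storm flooded the coast roads"]
boundary, cohesion = weakest_boundary(sents)
print(cohesion, "-> story break before sentence", boundary)
```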

    Mapping wordnets from the perspective of inter-lingual equivalence

    This paper explores inter-lingual equivalence from the perspective of linking two large lexico-semantic databases, namely the Princeton WordNet of English and the plWordNet (pl. Słowosieć) of Polish. Wordnets are built as networks of lexico-semantic relations between words and their meanings, and constitute a type of monolingual dictionary cum thesaurus. The development of wordnets for different languages has given rise to many wordnet linking projects (e.g. EuroWordNet; Vossen, 2002). Regardless of the linking method used, these projects require defining rules for establishing equivalence links between wordnet building blocks, known as synsets (sets of synonymous lexical units, i.e. lemma–sense pairs). This paper analyses the set of inter-wordnet relations used in mapping plWordNet onto the Princeton WordNet and attempts to relate them to the equivalence taxonomies described in the specialist literature on bilingual lexicography and translation.
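
    A minimal sketch of what typed inter-wordnet links look like; the relation inventory and the example synsets below are illustrative, loosely following the equivalence taxonomies the paper discusses, and are not plWordNet's actual relation set.

```python
from dataclasses import dataclass
from enum import Enum

class ILRelation(Enum):
    """Illustrative inter-lingual relation types between synsets."""
    SYNONYMY = "full equivalence"
    PARTIAL_SYNONYMY = "partial equivalence"
    HYPONYMY = "target synset is more general"

@dataclass(frozen=True)
class Synset:
    lang: str
    lemmas: tuple

@dataclass(frozen=True)
class Link:
    source: Synset
    target: Synset
    relation: ILRelation

# Polish 'palec' covers both finger and toe, so mapping it onto the narrower
# English synset cannot be a one-to-one synonymy link.
palec = Synset("pl", ("palec",))
finger = Synset("en", ("finger",))
mapping = [Link(palec, finger, ILRelation.PARTIAL_SYNONYMY)]

for link in mapping:
    print(f"{link.source.lemmas[0]} -> {link.target.lemmas[0]}: "
          f"{link.relation.value}")
```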