21 research outputs found

    Workshop Proceedings of the 12th edition of the KONVENS conference

    The 2014 edition of KONVENS is even more of a forum for exchange: its main topic is the interaction between Computational Linguistics and Information Science, and the synergies that such interaction, cooperation and integrated views can produce. This topic, at the crossroads of research traditions that treat natural language as a container of knowledge and develop methods to extract and manage linguistically represented knowledge, is close to the heart of many researchers at the Institut für Informationswissenschaft und Sprachtechnologie of Universität Hildesheim: it has long been one of the institute's research topics, and it has received even more attention over the last few years.

    Approximation in Morphology

    This Special Issue, "Approximation in Morphology", collates peer-reviewed papers presented at ApproxiMo, a 'discontinuous' workshop held online between December 2021 and May 2022 and organized by Francesca Masini (Bologna), Muriel Norde (Berlin) and Kristel Van Goethem (Louvain).

    Automating the Production of the Balance Mix in Music Production

    Historically, the junior engineer was an individual who assisted the sound engineer in producing a mix by performing a number of mixing and pre-processing tasks ahead of the main session. With improvements in technology, these tasks can be done more efficiently, so many aspects of this role are now assigned to the lead engineer. Similarly, these technological advances mean that amateur producers now have access to similar mixing tools at home, without the need for studio time or record label investment. Because the junior engineer's role is now embedded into the process, it creates a steeper learning curve for these amateur engineers and adds time to the mixing process. In order to build tools that help users overcome the hurdles associated with this increased workload, we first aim to quantify the role of a modern studio engineer. To do this, a production environment was built to collect session data, allowing subjects to construct a balance mix, which is the starting point of the mixing life-cycle. The balance mix is generally designed to ensure that all the recordings in a mix are audible, as well as to build routing structures and apply pre-processing. Improvements in web technologies allow this data-collection system to run in a browser, making remote data acquisition feasible in a short space of time. The data collected in this study was then used to develop a set of assistive tools, designed to be non-intrusive and to provide guidance, allowing the engineer to understand the process. From the data, grouping of the audio tracks proved to be one of the most important, yet most overlooked, tasks in the production life-cycle. This step is often misunderstood by novice engineers, yet when performed well it can enhance the quality of the final product. The first assistive tool we present in this thesis takes multi-track audio sessions and uses semantic information to group and label them.
    The system can work with any collection of audio tracks and can be embedded into a production environment. It was also apparent from the data that the minimisation of masking is a primary task of the mixing stage. We therefore present a tool which can automatically balance a mix by minimising the masking between separate audio tracks. Using evolutionary computing as a solver, the mix space can be searched effectively without the requirement for complex models to be trained on production data. The evaluation of these systems shows that they are capable of producing a session structure similar to that of a real engineer. This provides a balance mix which is routed and pre-processed before creative mixing can take place, giving an engineer several completed steps, similar to the work of a junior engineer.
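    The abstract does not specify how the evolutionary solver is implemented. As a hedged illustration only, the sketch below shows how an evolutionary search over per-track gains might minimise a crude masking proxy (pairwise overlap of per-band energies) while penalising inaudibly low gains; the function names, the fitness definition and the audibility floor are all hypothetical, not the thesis's actual system.

```python
import random

def masking_proxy(gains, band_energy):
    # Crude masking proxy: for each frequency band, sum the energy overlap
    # (the minimum of the two post-gain levels) over every pair of tracks.
    total = 0.0
    n = len(gains)
    for i in range(n):
        for j in range(i + 1, n):
            for bi, bj in zip(band_energy[i], band_energy[j]):
                total += min(gains[i] * bi, gains[j] * bj)
    return total

def fitness(gains, band_energy, floor=0.4):
    # Penalise gains below an audibility floor so the solver cannot
    # "win" by simply muting tracks: a balance mix must keep every
    # recording audible.
    penalty = sum(max(0.0, floor - g) for g in gains) * 100.0
    return masking_proxy(gains, band_energy) + penalty

def evolve_balance(band_energy, pop_size=20, generations=100, seed=0):
    # Simple truncation-selection evolutionary search over gain vectors.
    rng = random.Random(seed)
    n = len(band_energy)
    pop = [[rng.uniform(0.4, 1.0) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, band_energy))
        survivors = pop[: pop_size // 2]
        # Each survivor spawns one mutated child (Gaussian perturbation,
        # clipped to the valid gain range).
        children = [[min(1.0, max(0.0, g + rng.gauss(0.0, 0.05))) for g in p]
                    for p in survivors]
        pop = survivors + children
    return min(pop, key=lambda g: fitness(g, band_energy))
```

    Here `band_energy` would hold per-band levels for each track; a real system would use a perceptual masking model rather than raw spectral overlap, but the search loop itself needs no training data, which is the point the abstract makes.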

    Proceedings of the Eighth Italian Conference on Computational Linguistics CLiC-it 2021

    The eighth edition of the Italian Conference on Computational Linguistics (CLiC-it 2021) was held at Università degli Studi di Milano-Bicocca from 26th to 28th January 2022. After the 2020 edition, which was held fully online due to the health emergency related to Covid-19, CLiC-it 2021 was the first opportunity for the Italian Computational Linguistics research community to meet in person after more than a year of full or partial lockdown.

    Gamer speak: a case study of gaming terminology in Spain

    The globalization of video games has opened new investigation pathways for translation studies. While research is being performed on video game localization, little academic research has focused on the real-life application of gaming terminology by Spanish gamers. In order to bring awareness to "gamer speak", the real-world gaming lingo used by Spanish players, the influence of English on their lexicon requires academic attention. Through an exploratory corpus study, gaming terminology is extracted from a selection of "Let's Play" videos posted by two Spanish YouTubers. The lexicology of the terms is analyzed to uncover the neology processes and word-creation mechanisms that give rise to the lexis shared by the Spanish gaming community. The analysis of the data extracted from the corpus confirms that Spanish gamers rely heavily on English terminology during gameplay, borrowing and adapting foreign words, and generally ignore officially localized terms in favor of colloquially established jargon. Further investigation must be performed into this discrepancy in order to understand the mechanics behind these preferences.

    Towards More Human-Like Text Summarization: Story Abstraction Using Discourse Structure and Semantic Information.

    PhD thesis. With the massive amount of textual data being produced every day, the ability to effectively summarise text documents is becoming increasingly important. Automatic text summarization entails the selection and generalisation of the most salient points of a text in order to produce a summary. Approaches to automatic text summarization fall into one of two categories: abstractive or extractive. Extractive approaches involve the selection and concatenation of spans of text from a given document. Research in automatic text summarization began with extractive approaches, scoring and selecting sentences based on the frequency and proximity of words. In contrast, abstractive approaches are based on a process of interpretation, semantic representation and generalisation. This is closer to the processes that psycholinguistics tells us humans perform when reading, remembering and summarising. However, in the sixty years since its inception, the field has largely remained focused on extractive approaches. This thesis aims to answer the following questions. Does knowledge about the discourse structure of a text aid the recognition of summary-worthy content? If so, which specific aspects of discourse structure provide the greatest benefit? Can this structural information be used to produce abstractive summaries, and are these more informative than extractive summaries? To thoroughly examine these questions, they are each considered in isolation, and as a whole, on the basis of both manual and automatic annotations of texts. Manual annotations facilitate an investigation into the upper bounds of what can be achieved by the approach described in this thesis. Results based on automatic annotations show how this same approach is impacted by the current performance of imperfect preprocessing steps, and indicate its feasibility.
    Extractive approaches to summarization are intrinsically limited by the surface text of the input document, in terms of both content selection and summary generation. Beginning with a motivation for moving away from these commonly used methods of producing summaries, I set out my methodology for a more human-like approach to automatic summarization, which examines the benefits of using discourse-structural information. The potential benefit of this is twofold: moving away from a reliance on the wording of a text in order to detect important content, and generating concise summaries that are independent of the input text. The importance of discourse structure in signalling key textual material has previously been recognised; however, it has seen little applied use in the field of automatic summarization. A consideration of evaluation metrics also features significantly in the proposed methodology. These play a role both in preprocessing steps and in the evaluation of the final summary product. I provide evidence of a disparity between the performance of coreference resolution systems as indicated by their standard evaluation metrics and their performance in extrinsic tasks. Additionally, I point out a range of problems with the most commonly used metric, ROUGE, and suggest that at present summary evaluation should not be automated. To illustrate the general solutions proposed to the questions raised in this thesis, I use Russian folktales as an example domain. This genre of text has been studied in depth and, most importantly, it has a rich narrative structure that has been recorded in detail. The rules of this formalism are suitable for the narrative structure reasoning system presented as part of this thesis. The specific discourse-structural elements considered cover the narrative structure of a text, coreference information, and the story roles fulfilled by different characters.
    The proposed narrative structure reasoning system produces high-level interpretations of a text according to the rules of a given formalism. For the example domain of Russian folktales, a system is implemented which constructs such interpretations of a tale according to an existing set of rules and restrictions. I discuss how this process of detecting narrative structure can be transferred to other genres, and a key factor in its success: how constrained the rules of the formalism are. The system enumerates all possible interpretations according to a set of constraints, meaning that a less restricted rule set leads to a greater number of interpretations. For the example domain, sentence-level discourse-structural annotations are then used to predict summary-worthy content. The results of this study are analysed in three parts. First, I examine the relative utility of individual discourse features and provide a qualitative discussion of these results. Second, the predictive abilities of these features when manually annotated are compared to when they are annotated with varying degrees of automation. Third, these results are compared to the predictive capabilities of classic extractive algorithms. I show that discourse features can be used to predict summary-worthy content more accurately than classic extractive algorithms. This holds true for automatically obtained annotations, but with a much clearer difference when using manual annotations. The classifiers learned in the prediction of summary-worthy sentences are subsequently used to inform the production of both extractive and abstractive summaries of a given length. A human-based evaluation is used to compare these summaries, as well as the outputs of a classic extractive summarizer. I analyse the impact of knowledge about discourse structure, obtained both manually and automatically, on summary production.
    This allows for some insight into the knock-on effects on summary production that can occur from inaccurate discourse information (narrative structure and coreference information). My analyses show that, even given inaccurate discourse information, the resulting abstractive summaries are considered more informative than their extractive counterparts. With human-level knowledge about discourse structure, these results are even clearer. In conclusion, this research provides a framework which can be used to detect the narrative structure of a text, and shows its potential to provide a more human-like approach to automatic summarization. I show the limit of what is achievable with this approach both when manual annotations are obtainable and when only automatic annotations are feasible. Nevertheless, this thesis supports the suggestion that the future of summarization lies with abstractive rather than extractive techniques.
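    The classic extractive baseline this abstract contrasts itself with (scoring and selecting sentences by word frequency) can be illustrated with a minimal sketch. This is not the thesis's system: the sentence splitter, the stopword list and the scoring rule are simplifying assumptions chosen for brevity.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "it", "this", "that"}

def extractive_summary(text, n_sentences=2):
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Score each sentence by the average corpus frequency of its content words.
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    def score(sentence):
        tokens = [w for w in re.findall(r"[a-z]+", sentence.lower())
                  if w not in STOPWORDS]
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    # Select the highest-scoring sentences, then restore document order.
    top = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    return " ".join(s for s in sentences if s in top)
```

    Because the summary is a concatenation of surface spans, it is limited in exactly the ways the abstract describes: content selection depends on the input's wording, and nothing is generalised or rephrased.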

    LEGAL STYLE MARKERS AND THEIR TRANSLATION IN WRITTEN PLEADINGS BEFORE THE EUROPEAN COURT OF HUMAN RIGHTS

    This corpus-based study investigates language use in the occluded genre of written pleadings before the European Court of Human Rights through the paradigms of legal phraseology and Translation Studies. The analysis is carried out on three subcorpora of authentic texts: (a) pleadings translated from Russian into English, (b) pleadings translated from Italian into English and (c) pleadings originally drafted in English. Legal language is intricate and formulaic, and frequently makes recourse to prefabricated patterns and routines. Legal phraseology is a major challenge for professional legal translators, and yet its translation has not received much scholarly attention until recently. Legal phraseological units are prefabricated patterns that form the matrix of legal texts and reveal interesting information about both the language and the structure of the genre of written pleadings. Over the last thirty years, linguistic deviations occurring in the translation process have constituted one of the main areas of inquiry within Translation Studies. It has been postulated that translated language has distinctive linguistic characteristics. Legal translation, in addition to linguistic factors, is conditioned by the tension between the legal systems involved, which can result in peculiar language dynamics in the translation of legal texts. This study draws inspiration from Toury's (1995) and Chesterman's (2004a) works to describe the different dynamics of translated language, applying a combination of translation norms and universals to identify and describe regularities in translated pleadings. This work is carried out using both linguistic and translational insights in order to demonstrate empirically how written pleadings can be characterised in terms of their phraseological content and how translated pleadings differ from non-translated pleadings.
    Distributional patterns of recurrent and anomalous legal phraseological units are compared across the corpora and analysed for typicality of frequencies and patterning, as well as for quantity and quality of linguistic variation. The results include a list of legal style markers typical of this genre, obtained from a translational and phraseological perspective. The list supplements the rather scant information available about the language of written pleadings at the European Court of Human Rights. The analysis also provides confirmatory evidence of the differences between translated and non-translated texts, specifically proving the co-existence of two opposite tendencies in translation: conventionalisation (translation of source-text textemes with conventional repertoremes of the target environment) and discourse transfer (introduction of prefabricated patterns from the source language). The results may also be of use to Russian-to-English and Italian-to-English translators, helping them avoid interference and the use of unnatural or overly conservative patterns.
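    The abstract does not detail the comparison procedure. Under the simplifying assumption that phraseological units can be approximated by word n-grams, a distributional comparison across two corpora might be sketched as follows; the function names and the per-1,000-token normalisation are illustrative choices, not the study's method.

```python
import re
from collections import Counter

def ngram_freq_per_1000(text, n=3):
    # Normalised n-gram frequencies: occurrences per 1,000 running tokens,
    # so corpora of different sizes can be compared directly.
    tokens = re.findall(r"[a-z']+", text.lower())
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    total = max(len(tokens), 1)
    return {g: 1000.0 * c / total for g, c in grams.items()}

def largest_differences(corpus_a, corpus_b, n=3, top=5):
    # Rank n-grams by the absolute difference in normalised frequency
    # between the corpora (positive = overused in corpus_a, e.g. a
    # pattern transferred from the source language).
    fa = ngram_freq_per_1000(corpus_a, n)
    fb = ngram_freq_per_1000(corpus_b, n)
    diffs = {g: fa.get(g, 0.0) - fb.get(g, 0.0) for g in set(fa) | set(fb)}
    return sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top]
```

    Applied to a translated and a non-translated subcorpus, patterns overused in the translated side would be candidates for discourse transfer, and patterns underused there candidates for conventionalisation.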

    Contrastive terminography

    Contrastive methods have long been employed in lexicography, in particular in bi- and multilingual dictionary projects. The main rationale for this is the necessity to comprehensively study, i.e. compare and contrast, two or more linguistic systems that are to be presented in one way or another in the respective dictionaries. Similarly, the contrastive approach is of paramount importance in terminographic undertakings, on account of the need to draw a distinction between the terminological (conceptual) systems existing in various languages and across cultures. It must be emphasised, however, that the contrastive element is not only a part of terminographic practice, but also of the theory of terminography. This article aims to present the role of contrastive research across various spheres of specialised (LSP) lexicography.

    Factors Influencing Customer Satisfaction towards E-shopping in Malaysia

    Online shopping, or e-shopping, has changed the world of business, and e-shopping has increased substantially in Malaysia in recent years. The rapid growth of the e-commerce industry in Malaysia has created the need to emphasize how to increase customer satisfaction in the e-retailing environment. It is very important that customers are satisfied with a website; otherwise, they will not return. Companies must therefore ensure that their customers are satisfied with their purchases, which is essential from an e-commerce perspective. With this in mind, this study investigated customer satisfaction towards e-shopping in Malaysia. A total of 400 questionnaires were distributed among students randomly selected from various public and private universities located within the Klang Valley area. In total, 369 questionnaires were returned, of which 341 were found usable for further analysis. Structural equation modelling (SEM) was employed to test the hypotheses. This study found that customer satisfaction towards e-shopping in Malaysia is to a great extent influenced by ease of use, trust, website design, online security and e-service quality. Finally, recommendations and future research directions are provided.
    Keywords: E-shopping, Customer satisfaction, Trust, Online security, E-service quality, Malaysia