68 research outputs found

    Text complexity and text simplification in the crisis management domain

    Because emergency situations can lead to substantial losses, both financial and in terms of human lives, it is essential that texts used in a crisis situation be clearly understandable. This thesis is concerned with the study of the complexity of the crisis management sub-language and with methods to produce new, clear texts and to rewrite pre-existing crisis management documents which are too complex to be understood. In doing so, this interdisciplinary study makes several contributions to the crisis management field. First, it contributes to the knowledge of the complexity of the texts used in the domain, by analysing the presence of a set of written language complexity issues derived from the psycholinguistic literature in a novel corpus of crisis management documents. Second, since the text complexity analysis shows that crisis management documents indeed exhibit high numbers of text complexity issues, the thesis adapts controlled language writing guidelines to the English language; when applied to the crisis management language, these guidelines reduce its complexity and ambiguity, leading to clear text documents. Third, since low quality of communication can have fatal consequences in emergency situations, the proposed controlled language guidelines and a set of texts which were re-written according to them are evaluated from multiple points of view. To achieve that, the thesis both applies existing evaluation approaches and develops new methods which are more appropriate for the task. These are used in two evaluation experiments: evaluation on extrinsic tasks and evaluation of users’ acceptability. 
The evaluations on extrinsic tasks (evaluating the impact of the controlled language on text complexity, reading comprehension under stress, manual translation, and machine translation tasks) show a positive impact of the controlled language on simplified documents and thus ensure the quality of the resource. The evaluation of users’ acceptability contributes additional findings about manual simplification and helps to determine directions for future implementation. The thesis also gives insight into reading comprehension, machine translation, and cross-language adaptability, and provides original contributions to machine translation, controlled languages, and natural language generation evaluation techniques, which make it valuable for several scientific fields, including Linguistics, Psycholinguistics, and a number of different sub-fields of NLP. EThOS - Electronic Theses Online Service, GB, United Kingdo

    Novel statistical approaches to text classification, machine translation and computer-assisted translation

    This thesis presents several contributions to the fields of automatic text classification, machine translation and computer-assisted translation within the statistical framework. In automatic text classification, a new application called bilingual text classification is proposed, together with a series of models designed to capture this bilingual information. To this end, two approaches to this application are presented: the first rests on a naive assumption of independence between the two languages involved, while the second, more sophisticated, considers the existence of a correlation between words in different languages. The first approach led to the development of five models based on unigram models and smoothed n-gram models. These models were evaluated on three tasks of increasing complexity, the most complex of which was analysed from the point of view of a document-indexing assistance system. The second approach is characterised by translation models capable of capturing correlation between words in different languages. In our case, the translation model chosen was the M1 model together with a unigram model. This model was evaluated on two of the simpler tasks, outperforming the naive approach, which assumes independence between words in different languages coming from bilingual texts. In machine translation, the word-based statistical translation models M1, M2 and HMM are extended within the mixture-modelling framework, with the aim of defining context-dependent translation models. Likewise, an iterative search algorithm based on dynamic programming, originally designed for the M2 model, is extended to the case of mixtures of M2 models. This search algorithm …
    Civera Saiz, J. (2008). Novel statistical approaches to text classification, machine translation and computer-assisted translation [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/2502 Palanci

    Tune your brown clustering, please

    Brown clustering, an unsupervised hierarchical clustering technique based on n-gram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration; the appropriateness of this configuration has gone predominantly unexplored. Accordingly, we present information for practitioners on the behaviour of Brown clustering in order to assist hyper-parameter tuning, in the form of a theoretical model of Brown clustering utility. This model is then evaluated empirically in two sequence labelling tasks over two text types. We explore the dynamic between the input corpus size, chosen number of classes, and quality of the resulting clusters, which has an impact for any approach using Brown clustering. In every scenario that we examine, our results reveal that the values most commonly used for the clustering are sub-optimal.

    Characterizing Spoken Discourse in Individuals with Parkinson Disease Without Dementia

    Background: The effects of Parkinson disease (PD) on cognition, word retrieval, syntax, and speech/voice processes may interact to manifest uniquely in spoken language tasks. A handful of studies have explored spoken discourse production in PD and, while not ubiquitously, have reported a number of impairments including: reduced words per minute, reduced grammatical complexity, reduced informativeness, and increased verbal disruption. Methodological differences have impeded cross-study comparisons. As such, the profile of spoken language impairments in PD remains ambiguous. Method: A prospective, cross-sectional, between-groups study with a cross-genre, multi-level discourse analysis was conducted with 19 PD participants (mean age = 70.74, mean UPDRS-III = 30.26) and 19 healthy controls (mean age = 68.16) without dementia. The extensive protocol included a battery of cognitive, language, and speech measures in addition to four discourse tasks. Two tasks each from two discourse genres (picture sequence description; story retelling) were collected. Discourse samples were analysed using both microlinguistic and macrostructural measures. Discourse variables were collapsed statistically to a primal set of variables used to distinguish the spoken discourse of PD vs. controls. Results: Participants with PD differed significantly from controls along a continuum of productivity, grammar, informativeness, and verbal disruption domains including total words F(1,36) = 3.87, p = .06; words/minute F(1,36) = 7.74, p = .01; % grammatical utterances F(1,36) = 11.92, p = .001; total Correct Information Units (CIUs) F(1,36) = 13.30, p = .001; % CIUs F(1,36) = 9.35, p = .004; CIUs/minute F(1,36) = 14.06, p = .001; and verbal disruptions/100 words F(1,36) = 3.87, p = .06 (α = .10). Discriminant function analyses showed that optimally weighted discourse variables discriminated the spoken discourse of PD vs. controls with 81.6% sensitivity and 86.8% specificity. 
For both discourse genres, discourse performance showed robust, positive correlations with global cognition. In PD (picture sequence description), more impaired discourse performance correlated significantly with more severe motor impairment, more advanced disease staging, and higher doses of PD medications. Conclusions: The spoken discourse in PD without dementia differs significantly and predictably from controls. Results have both research and clinical implications.

    The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE)


    Can human association norm evaluate latent semantic analysis?

    This paper presents a comparison of a word association norm created by a psycholinguistic experiment with association lists generated by algorithms operating on text corpora. We compare lists generated by the Church and Hanks algorithm and lists generated by the LSA algorithm. An argument is presented on how those automatically generated lists reflect real semantic relations.
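
    Corpus-derived association lists of the kind described in this abstract can be illustrated with a minimal sketch. Assuming the "Church and Hanks algorithm" refers to their mutual-information-based association ratio, the Python below scores ordered word pairs by log2 of P(x, y) / (P(x) P(y)), with P(x, y) estimated from co-occurrence within a fixed window. The function names, the toy token list and the window size are illustrative assumptions, not taken from the paper, and the estimate is simplified (no frequency threshold or smoothing).

```python
import math
from collections import Counter


def association_ratio(tokens, window=5):
    """Score ordered pairs (x, y), where y follows x within `window` words.

    Simplified sketch: score = log2( P(x, y) / (P(x) * P(y)) ), with all
    probabilities estimated as raw counts divided by the corpus length.
    """
    n = len(tokens)
    word_counts = Counter(tokens)
    pair_counts = Counter()
    for i, x in enumerate(tokens):
        # A window of w words is x itself plus the next w - 1 words.
        for y in tokens[i + 1 : i + window]:
            pair_counts[(x, y)] += 1

    scores = {}
    for (x, y), f_xy in pair_counts.items():
        p_xy = f_xy / n
        p_x = word_counts[x] / n
        p_y = word_counts[y] / n
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


def top_associates(scores, cue, k=3):
    """Return up to k words most strongly associated with `cue`."""
    ranked = sorted(
        ((score, y) for (x, y), score in scores.items() if x == cue),
        reverse=True,
    )
    return [y for _, y in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical toy corpus, for illustration only.
    toy = ["doctor", "nurse", "doctor", "nurse", "bread", "butter"]
    toy_scores = association_ratio(toy, window=2)
    print(top_associates(toy_scores, "doctor"))
```

    A list produced this way for a cue word could then be compared with the responses human participants gave to the same cue, which is the kind of comparison the abstract describes.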

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) workshop came into being in 1999 from the particularly felt need of sharing know-how, objectives and results between areas that until then seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects concerning the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial issues have grown and spread to other aspects of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy.

    Language transfer in second language acquisition. Some effects of L1 instruction (Romanian) on L2/L3 learning (Catalan/Spanish)

    In migration contexts, the diversity of languages in contact triggers the processes of second language (L2) acquisition and language transfer, and draws attention to the importance of mother tongue (L1) maintenance. The present study examines the processes of L2 acquisition (Catalan and Spanish), L1 (Romanian) maintenance, and L1 transfer in the case of 130 immigrant Romanian students, as well as the effect of attendance at L1 classes and length of residence on the three languages analysed. Accordingly, three parallel language competence tests were applied in seven public schools of Compulsory Secondary Education in Catalonia. Generally, the results indicate that language transfer from the L1 to the L2s occurs and that a longer length of residence facilitates the learning of Catalan and Spanish but, at the same time, hinders the level of competence in L1. Also, attendance at Romanian classes seems to influence the maintenance of the mother tongue and the acquisition of the second languages.

    Manufacturing systems simulation using the principles of system dy

    Manufacturing is the largest single contributor to the global economy. The evolution of consumer demands has pressurised companies into producing a larger variety of products, with improved specifications, reduced costs, and shorter lead times. In this context, companies have found simulation techniques useful in their manufacturing systems design processes; simulation based on Discrete Event Simulation (DES) is the preferred technique. The complexity of manufacturing systems, and the mechanisms of DES, means that the simulation task often consumes excessive time and resources, such as data, software, and training. Evidence suggests that an alternative modelling technique, named System Dynamics (SD), is also appropriate for conducting this task. SD has been applied successfully in other fields, where its graphical notation is considered beneficial. However, the lack of an SD tool that is tailored toward manufacturing systems has prevented industry from adopting this technique more extensively. This thesis determines the extent to which SD can provide a credible alternative to DES in the manufacturing system design process. Information concerning DES, SD and practitioners' needs was gathered from published literature and from an interview survey. A functional prototype of a tool based on the SD principles, but tailored to model manufacturing systems was then developed. Three case studies then provided valuable information concerning the requirements of industry and the capabilities of the SD technique. This research programme has found SD to be sufficiently accurate and quicker than DES tools under certain conditions, requiring less data and skills. In addition, the user interface appears to have had a significant impact on the lack of adoption of SD techniques within the manufacturing sector. Simplifications made by this technique can reduce both model building and model execution time, and thus, experimentation time. 
However, evidence suggests that DES is still more prevalent, and that further work is required to develop SD based tools tailored to manufacturing systems. Therefore, this thesis provides a much improved understanding of the capabilities of SD as an aid to manufacturing systems design.

    Proceedings of Monterey Workshop 2001 Engineering Automation for Software Intensive System Integration

    The 2001 Monterey Workshop on Engineering Automation for Software Intensive System Integration was sponsored by the Office of Naval Research, Air Force Office of Scientific Research, Army Research Office and the Defense Advanced Research Projects Agency. It is our pleasure to thank the workshop advisory and sponsors for their vision of a principled engineering solution for software and for their many-year tireless effort in supporting a series of workshops to bring everyone together. This workshop is the 8th in a series of international workshops. The workshop was held at the Monterey Beach Hotel, Monterey, California, during June 18-22, 2001. The general theme of the workshop has been to present and discuss research that aims at increasing the practical impact of formal methods for software and systems engineering. The particular focus of this workshop was "Engineering Automation for Software Intensive System Integration". Previous workshops have focused on issues including "Real-time & Concurrent Systems", "Software Merging and Slicing", "Software Evolution", "Software Architecture", "Requirements Targeting Software" and "Modeling Software System Structures in a fastly moving scenario". Approved for public release, distribution unlimite