
    Designing Statistical Language Learners: Experiments on Noun Compounds

    The goal of this thesis is to advance the exploration of the statistical language learning design space. In pursuit of that goal, the thesis makes two main theoretical contributions: (i) it identifies a new class of designs by specifying an architecture for natural language analysis in which probabilities are given to semantic forms rather than to more superficial linguistic elements; and (ii) it explores the development of a mathematical theory to predict the expected accuracy of statistical language learning systems in terms of the volume of data used to train them. The theoretical work is illustrated by applying statistical language learning designs to the analysis of noun compounds. Both syntactic and semantic analysis of noun compounds are attempted using the proposed architecture. Empirical comparisons demonstrate that the proposed syntactic model is significantly better than those previously suggested, approaching the performance of human judges on the same task, and that the proposed semantic model, the first statistical approach to this problem, exhibits significantly better accuracy than the baseline strategy. These results suggest that the new class of designs identified is a promising one. The experiments also highlight the need for a widely applicable theory of data requirements.

    Comment: PhD thesis (Macquarie University, Sydney; December 1995), LaTeX source, xii+214 pages.
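    The syntactic task described here, bracketing a three-noun compound as left- or right-branching, can be illustrated with a toy sketch. The counts, corpus size, and association measure below are invented stand-ins; the thesis itself estimates word-pair association from corpus statistics in a more principled way.

    ```python
    # Toy sketch of a dependency-style bracketing decision for a three-noun
    # compound (w1 w2 w3): compare how strongly w1 associates with w2 versus
    # with w3, and bracket accordingly. All counts are hypothetical.
    from collections import Counter

    # Invented co-occurrence counts standing in for corpus statistics.
    pair_counts = Counter({
        ("computer", "science"): 120,
        ("computer", "department"): 15,
        ("science", "department"): 90,
    })
    word_counts = Counter({"computer": 500, "science": 400, "department": 300})
    TOTAL = 100_000  # hypothetical corpus size

    def assoc(w1, w2):
        """Ratio of joint probability to product of marginals (a
        pointwise-mutual-information-style association score)."""
        joint = pair_counts[(w1, w2)] / TOTAL
        if joint == 0:
            return float("-inf")
        return joint / ((word_counts[w1] / TOTAL) * (word_counts[w2] / TOTAL))

    def bracket(w1, w2, w3):
        """Choose left-branching [[w1 w2] w3] if w1 associates more
        strongly with w2 than with w3, else right-branching."""
        if assoc(w1, w2) >= assoc(w1, w3):
            return f"[[{w1} {w2}] {w3}]"
        return f"[{w1} [{w2} {w3}]]"
    ```

    With the invented counts above, "computer" associates far more strongly with "science" than with "department", so the sketch brackets "computer science department" left-branching.
    
    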

    The process of constructing ontological meaning based on criminal law verbs

    This study intends to account for the process involved in the construction of the conceptual meaning of verbs (#EVENTS) directly related to legal aspects of terrorism and organized crime, based on the evidence provided by the Globalcrimeterm Corpus and the consistent application of specific criteria for term extraction. The 49 selected concepts have been integrated into the Core Ontology of FunGramKB (Functional Grammar Knowledge Base), a knowledge base founded on the principles of deep semantics and aimed at the computational development of the Lexical Constructional Model (www.fungramkb.com). To achieve this purpose, key phases of the COHERENT methodology (Periñán Pascual & Mairal Usón 2011) are followed, particularly those involving the modelling, subsumption and hierarchisation of the aforementioned verbal concepts. The final outcome of this research shows that most of the apparently specialised conceptual units should be included in the Core Ontology rather than the specific Globalcrimeterm Subontology, because the semantic content of their corresponding lexical units can be found in widely used learner's dictionaries; consequently, this conceptual information is shared not only by experts in the field but also by laypersons and average speakers of the language.

    Ontology-based modelling for simulating newborn behaviours during cardiopulmonary resuscitation

    This chapter concerns the formulation of a methodology, and its implementation, to elaborate a training simulator for medical staff who may be confronted with the critical situations of newborn resuscitation. The simulator reproduces the different cardiopulmonary pathological behaviours of newborns, the working environment of resuscitation rooms, and the monitoring and control environment of the learners by a teacher. Conceptual models of newborn behaviours, combined with cardiopulmonary resuscitation gestures, have been developed. The methodological process jointly uses cognitive approaches with formal modelling and simulation. Cognitive approaches are mobilized to elaborate application ontologies that form the basis for the development of the conceptual models and the specification of the simulator. The ontologies have been developed on the basis of a corpus of academic documents, return-on-experience documents, and practitioner interviews, by means of the Knowledge Oriented Design (KOD) method. A discrete event formalism has been used to formalize the conceptual models of newborn behaviours. As a result, a simulator has been built to train medical practitioners to face situations that are reported to potentially cause errors, and thus to improve the safety of the resuscitation gestures.
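    The discrete event formalism mentioned above can be illustrated with a minimal event-queue loop. The event names, timings, and follow-up rule below are invented for illustration and are not taken from the simulator described in the chapter.

    ```python
    # Minimal discrete-event loop: events are (time, name) pairs processed in
    # time order, and handling one event may schedule new ones. Event names
    # and timings are hypothetical.
    import heapq

    def run(events, horizon):
        """Process events in time order up to `horizon`, returning the log."""
        queue = list(events)
        heapq.heapify(queue)
        log = []
        while queue and queue[0][0] <= horizon:
            t, name = heapq.heappop(queue)
            log.append((t, name))
            if name == "bradycardia":
                # A pathological state schedules a follow-up alarm event.
                heapq.heappush(queue, (t + 30, "alarm"))
        return log

    trace = run([(0, "birth"), (60, "bradycardia")], horizon=120)
    ```

    The same pattern scales to richer state: each handled event can update a model of the patient and schedule consequences, which is the essence of discrete-event simulation.
    
    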

    Computer Vision and Architectural History at Eye Level: Mixed Methods for Linking Research in the Humanities and in Information Technology

    Information on the history of architecture is embedded in our daily surroundings: in vernacular and heritage buildings and in physical objects, photographs and plans. Historians study these tangible and intangible artefacts and the communities that built and used them. Valuable insights are thus gained into the past and the present, which also provide a foundation for designing the future. Given that our understanding of the past is limited by the inadequate availability of data, the article demonstrates that advanced computer tools can help gain more, and better-linked, data from the past. Computer vision can make a decisive contribution to the identification of image content in historical photographs. This application is particularly interesting for architectural history, where visual sources play an essential role in understanding the built environment of the past, yet a lack of reliable metadata often hinders the use of materials. Automated recognition contributes to making a variety of image sources usable for research.

    Mixing Methods: Practical Insights from the Humanities in the Digital Age

    The digital transformation is accompanied by two simultaneous processes: the digital humanities challenging the humanities, their theories, methodologies and disciplinary identities, and pushing computer science to get involved in new fields. But how can qualitative and quantitative methods be usefully combined in one research project? What are the theoretical and methodological principles shared across disciplinary digital approaches? This volume focusses on driving innovation and conceptualising the humanities in the 21st century. Building on the results of 10 research projects, it serves as a useful tool for designing cutting-edge research that goes beyond conventional strategies.

    Alternating ditransitives in English: a corpus-based study

    This thesis is a large-scale investigation of ditransitive constructions and their alternants in English. Typically, both constructions involve three participants: participant A transfers an element B to participant C. A speaker can linguistically encode this type of situation in one of two ways: by using either a double object construction or a prepositional paraphrase. This study examines this syntactic choice in the British component of the International Corpus of English (ICE-GB), a fully tagged and parsed corpus incorporating both spoken and written English. After a general introduction, chapter 2 reviews the different grammatical treatments of the constructions. Chapter 3 discusses whether indirect objects have to be considered necessary complements or optional adjuncts of the verb. I then examine the tension between rigid classification and authentic (corpus) data in order to demonstrate that the distinction between complements and adjuncts exhibits gradient categorisation effects. This study has both a linguistic and a methodological angle. The overall design and methodology employed in this study are discussed in chapter 4. The thesis considers a number of variables that help predict the occurrence of each pattern. The evaluation of the variables, the determination of their significance, and the measurement of their contribution to the model rely on statistical methods (but not statistical software packages). Chapters 5, 6, and 7 review pragmatic factors claimed to influence a speaker's choice of construction, among them the information status and the syntactic 'heaviness' of the constituents involved. The explanatory power and coverage of these factors are tested independently against the corpus data, in order to highlight several features which only emerge after examining authentic sources. Chapter 8 posits a novel method of bringing these factors together; the resulting model predicts the dative alternation with almost 80% accuracy in ICE-GB. Conclusions are offered in chapter 9.
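    The idea of combining pragmatic factors into a single predictive model, as in chapter 8, can be sketched as a logistic combination of binary factors. The factor names, weights, and bias below are hypothetical illustrations, not the thesis's actual model or coefficients.

    ```python
    # Hypothetical logistic combination of two pragmatic factors discussed in
    # the abstract: information status of the recipient, and relative
    # constituent weight. Weights and bias are invented.
    import math

    WEIGHTS = {"recipient_given": 1.2, "theme_heavier": 0.8}
    BIAS = -0.5

    def p_double_object(recipient_given: bool, theme_heavier: bool) -> float:
        """Probability of choosing the double object construction
        ("gave the student a book") over the prepositional paraphrase
        ("gave a book to the student") in this toy model."""
        score = BIAS
        score += WEIGHTS["recipient_given"] * recipient_given
        score += WEIGHTS["theme_heavier"] * theme_heavier
        return 1 / (1 + math.exp(-score))

    # A discourse-given recipient and a heavy theme both push towards the
    # double object construction in this sketch.
    p = p_double_object(recipient_given=True, theme_heavier=True)
    ```

    Fitting such weights to corpus data (e.g. by logistic regression over ICE-GB tokens) is one standard way to measure each factor's contribution and to predict the alternation, in the spirit of the model the abstract describes.
    
    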