
    The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants

    Reasoning is a crucial part of natural language argumentation. To comprehend an argument, one must analyze its warrant, which explains why its claim follows from its premises. Because arguments are highly contextualized, warrants are usually presupposed and left implicit. Comprehension therefore requires not only language understanding and logical skills but also common sense. In this paper we develop a methodology for systematically reconstructing warrants. We operationalize it in a scalable crowdsourcing process, resulting in a freely licensed dataset with warrants for 2k authentic arguments from news comments. On this basis, we present a new and challenging task, the argument reasoning comprehension task: given an argument with a claim and a premise, the goal is to choose the correct implicit warrant from two options. Both warrants are plausible and lexically close, but lead to contradicting claims. A solution to this task would be a substantial step towards automatic warrant reconstruction. However, experiments with several neural attention and language models reveal that current approaches do not suffice. Comment: Accepted as NAACL 2018 Long Paper; see details on the front page.
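A minimal sketch of the two-choice setup described above. The instance below is invented for illustration and is NOT drawn from the actual dataset; real instances are constructed so that shallow lexical cues like this overlap baseline do not separate the two warrants, which is precisely why the task is hard.

```python
def overlap(warrant: str, context: str) -> int:
    """Crude lexical-overlap score between a warrant and claim+premise."""
    return len(set(warrant.lower().split()) & set(context.lower().split()))

def predict(instance: dict) -> int:
    """Pick the warrant with the higher overlap score (a weak baseline)."""
    context = instance["premise"] + " " + instance["claim"]
    return max((0, 1), key=lambda i: overlap(instance["warrants"][i], context))

# A hypothetical instance, for illustration only.
instance = {
    "premise": "The city removed parking spaces downtown.",
    "claim": "Traffic congestion will decrease.",
    "warrants": [
        "Fewer parking spaces discourage people from driving downtown.",
        "Fewer parking spaces make drivers circle around looking for a spot.",
    ],
    "label": 0,  # index of the correct warrant
}
```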

    The Illinois Studies in Inquiry Training: A Critical Review


    Generalized Hidden Filter Markov Models Applied to Speaker Recognition

    Classification of time series has wide Air Force, DoD and commercial interest, from automatic target recognition systems on munitions to recognition of speakers in diverse environments. The ability to effectively model the temporal information contained in a sequence is of paramount importance. Toward this goal, this research develops theoretical extensions to a class of stochastic models and demonstrates their effectiveness on the problem of text-independent (language-constrained) speaker recognition. Specifically, within the hidden Markov model architecture, additional constraints are implemented which better incorporate observation correlations and context, where standard approaches fail. Two methods of modeling correlations are developed, and their mathematical properties of convergence and reestimation are analyzed. The methods differ in whether they model correlations present in the time samples or those present in the processed features, such as Mel-frequency cepstral coefficients. The system models speaker-dependent phonemes, making use of word dictionary grammars, and recognition is based on normalized log-likelihood Viterbi decoding. Both closed-set identification and speaker verification using cohorts are performed on the YOHO database. YOHO is the only large-scale, multiple-session, high-quality speech database for speaker authentication and contains over one hundred speakers stating combination locks. Equal error rates of 0.21% for males and 0.31% for females are demonstrated. A critical error analysis using a hypothesis-test formulation provides the maximum number of errors observable while still meeting the goal error rates of 1% False Reject and 0.1% False Accept. Our system achieves this goal.
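The Viterbi decoding mentioned above can be sketched as the textbook log-domain recursion, returning both the best state path and a length-normalised log-likelihood score. This is a generic sketch with a toy two-state model; it does not include the correlation extensions developed in this research.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Log-domain Viterbi decoding for a discrete-output HMM."""
    V = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: V[t - 1][p] + log_trans[p][s])
            back[-1][s] = prev
            V[t][s] = V[t - 1][prev] + log_trans[prev][s] + log_emit[s][obs[t]]
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for ptr in reversed(back):  # backtrack through the stored pointers
        path.append(ptr[path[-1]])
    path.reverse()
    # Length-normalised log-likelihood, as used for scoring above.
    return path, V[-1][last] / len(obs)

# Toy HMM: state A mostly emits "x", state B mostly emits "y".
states = ("A", "B")
lg = math.log
log_start = {"A": lg(0.5), "B": lg(0.5)}
log_trans = {"A": {"A": lg(0.8), "B": lg(0.2)},
             "B": {"A": lg(0.2), "B": lg(0.8)}}
log_emit = {"A": {"x": lg(0.9), "y": lg(0.1)},
            "B": {"x": lg(0.1), "y": lg(0.9)}}
path, score = viterbi(["x", "x", "y"], states, log_start, log_trans, log_emit)
```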

    Defining and Assessing Critical Thinking: toward an automatic analysis of HiEd students’ written texts

    The main goal of this PhD thesis is to test, through two empirical studies, the reliability of a method aimed at automatically assessing Critical Thinking (CT) manifestations in Higher Education students’ written texts. The empirical studies were based on a critical literature review aimed at proposing a new classification for systematising different CT definitions and their related theoretical approaches. The review also investigates the relationship between the different adopted CT definitions and CT assessment methods. It highlights the need to focus on open-ended measures for CT assessment and to develop automatic tools based on Natural Language Processing (NLP) techniques to overcome the current limitations of open-ended measures, such as scoring reliability and cost. Based on a rubric developed and implemented by the Center for Museum Studies – Roma Tre University (CDM) research group for the evaluation and analysis of CT levels within open-ended answers (Poce, 2017), an NLP prototype for the automatic measurement of CT indicators was designed. The first empirical study, carried out on a group of 66 university teachers, showed satisfactory reliability levels for the CT evaluation rubric, while the evaluation carried out by the prototype was not yet sufficiently reliable. The results were used to understand how and under what conditions the model works better. The second empirical investigation was aimed at understanding which NLP features are most strongly associated with six CT sub-dimensions, as assessed by human raters in essays written in Italian.
The study used a corpus of 103 pre-post essays by students who attended a Master’s Degree module in “Experimental Education and School Assessment”. Within the module, we proposed two activities to stimulate students’ CT: Open Educational Resources (OER) assessment (mandatory and online) and OER design (optional and blended). The essays were assessed both by expert evaluators, considering six CT sub-dimensions, and by an algorithm that automatically calculates different kinds of NLP features. The study shows a positive internal reliability and a medium-to-high inter-coder agreement in the expert evaluation. Students’ CT levels improved significantly in the post-test. Three NLP indicators correlate significantly with the CT total score: corpus length, syntax complexity, and an adapted term frequency–inverse document frequency (tf-idf) weight. The results collected during this PhD have both theoretical and practical implications for CT research and assessment. From a theoretical perspective, this thesis shows unexplored similarities among different CT traditions, perspectives, and study methods. These similarities could be exploited to open up an interdisciplinary dialogue among experts and build a shared understanding of CT. Automatic assessment methods can enhance the use of open-ended measures for CT assessment, especially in online teaching. Indeed, they can support teachers and researchers in dealing with the growing amount of linguistic data produced within educational platforms (e.g. Learning Management Systems). To this end, it is pivotal to develop automatic methods for the evaluation of large amounts of data that would be impossible to analyse manually, providing teachers and evaluators with support for monitoring and assessing the competencies students demonstrate online.
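The three correlated indicators can be sketched as follows. The abstract does not specify the thesis’ exact operationalisations, so the syntax-complexity proxy (mean sentence length) and the classic tf-idf formula below are assumptions for illustration.

```python
import math

def corpus_length(tokens):
    """Indicator 1: simple token count of an essay."""
    return len(tokens)

def mean_sentence_length(sentences):
    """Indicator 2: a crude proxy for syntactic complexity
    (each sentence is a list of tokens)."""
    return sum(len(s) for s in sentences) / len(sentences)

def tf_idf(term, doc, docs):
    """Indicator 3: classic tf-idf weight of a term in one essay
    relative to the whole essay corpus. Assumes the term occurs
    in at least one document."""
    tf = doc.count(term) / len(doc)
    df = sum(term in d for d in docs)
    return tf * math.log(len(docs) / df)
```

For example, a term appearing in half the documents and twice in a four-token essay gets weight `0.5 * log(2)`.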

    Contribution to Data-Driven Failure Prognostics

    This Habilitation (HDR) manuscript presents, in the first part, a synthesis of my teaching and research work carried out at the National Institute of Mechanics and Microtechnologies (ENSMM) and at the FEMTO-ST Institute. This work falls within the topic of Prognostics and Health Management (PHM) and concerns the development of an integrated data-driven failure prognostics approach.
The proposed approach relies on the acquisition of data representative of system degradation, the extraction of relevant features and construction of health indicators, degradation modeling, health assessment, and Remaining Useful Life (RUL) prediction. It uses two families of tools: on the one hand, probabilistic/stochastic tools, such as dynamic Bayesian networks; on the other, nonlinear regression models, such as support vector regression and Gaussian process regression. The second part of the manuscript presents a research project on the PHM of complex systems and MEMS (Micro-Electro-Mechanical Systems), oriented towards a hybrid prognostics approach combining model-based and data-driven approaches.
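As a deliberately simple illustration of the data-driven RUL idea, the sketch below fits a straight line to a degrading health indicator and extrapolates to a failure threshold. This linear-trend stand-in is an assumption for illustration, not the dynamic Bayesian networks or support vector regression models developed in the manuscript.

```python
def estimate_rul(times, indicator, failure_threshold):
    """Fit indicator ~ a*t + b by ordinary least squares, then return
    the time remaining until the fitted line crosses the threshold."""
    n = len(times)
    mt = sum(times) / n
    mi = sum(indicator) / n
    slope = (sum((t - mt) * (h - mi) for t, h in zip(times, indicator))
             / sum((t - mt) ** 2 for t in times))
    intercept = mi - slope * mt
    t_failure = (failure_threshold - intercept) / slope
    return t_failure - times[-1]  # time left after the last observation

# Health indicator degrading from 10 towards a failure threshold of 2.
rul = estimate_rul([0, 1, 2, 3], [10.0, 9.0, 8.0, 7.0], 2.0)
```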

    Cultural Consultations in Criminal Forensic Psychology: A Thematic Analysis of the Literature

    The importance of culture as a reference point in clinical practices such as forensic psychology has been considerably valued yet poorly understood, especially in an age where precision and sophistication outlast cultural authenticity and patient-clinician relationship. This paper looks at the gaps and inconsistencies that exist in current forensic psychology research. The topic is introduced by delving into the understanding of the phenomenon of culture and its influences on our everyday conditioning. Aspects such as language, biological development, traditions, rituals, and narratives are emphasized as potent tools that drive individuals to create and mold culture according to needs and requirements of the moment. These elements are then used for signifying the inherent ways in which culture can result in both despair as well as positive enforcement, thereby being a powerful element of consideration in forensic assessment practice. The essential concept explored in this paper involves the clinicians’ perspectives on the meaning of cultural values, norms and beliefs that shape the behavior of the patient. Through this exploration I attempted to understand how the clinical practice of forensic psychology can be made more authentic and less cold and calculated by consideration of cultural malleability. By using thematic analysis, I reviewed a large collection of the relevant literature in an attempt to understand the core concepts that drive clinicians in their cultural considerations. I emphasized attention to the malleable nature of culture and the intricate ways in which culture is related to biological, psychological, anthropological, and legal aspects of forensic psychology. 
The conclusions of the paper include specific considerations for creating a well-structured cultural consultation model, which emphasizes attention to aspects such as the clinical approach, the patient’s family of origin and current community, as well as the biological and psychological conditions of the patient and the patient’s cultural perspective on those conditions.

    Better predictions when models are wrong or underspecified

    Many statistical methods rely on models of reality in order to learn from data and to make predictions about future data. By necessity, these models usually do not match reality exactly, but are either wrong (none of the hypotheses in the model provides an accurate description of reality) or underspecified (the hypotheses in the model describe only part of the data). In this thesis, we discuss three scenarios involving models that are wrong or underspecified. In each case, we find that standard statistical methods may fail, sometimes dramatically, and we present alternative methods that continue to perform well even if the models are wrong or underspecified. The first two scenarios involve regression problems and investigate AIC (Akaike's Information Criterion) and Bayesian statistics. The third scenario has the famous Monty Hall problem as a special case, and considers the question of how we can update our belief about an unknown outcome given new evidence when the precise relation between outcome and evidence is unknown.
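The Monty Hall connection can be made concrete with a small simulation. Note this sketches only the standard protocol (the host always opens a non-chosen, non-prize door); the thesis addresses the harder setting where that protocol, i.e. the relation between outcome and evidence, is unknown.

```python
import random

def monty_hall_trial(switch, rng):
    """One round of the standard Monty Hall game; returns True on a win."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # Host opens a door that is neither picked nor hiding the prize.
    opened = rng.choice([d for d in doors if d not in (pick, prize)])
    if switch:
        pick = next(d for d in doors if d not in (pick, opened))
    return pick == prize

def win_rate(switch, trials=20000, seed=1):
    rng = random.Random(seed)
    return sum(monty_hall_trial(switch, rng) for _ in range(trials)) / trials
```

Under the standard protocol, switching wins about 2/3 of the time and staying about 1/3; once the host may behave differently, those numbers no longer follow.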

    Privacy-Preserving Statistical Analysis Based on Secure Multi-Party Computation

    The electronic version of this dissertation does not include the publications. In a modern society, from the moment a person is born, a digital record is created. From there on, the person’s behaviour is constantly tracked and data are collected about the different aspects of his or her life. Whether one is swiping a customer loyalty card in a store, going to the doctor, doing taxes or simply moving around with a mobile phone in one’s pocket, sensitive data are being gathered and stored by governments and companies. Sometimes we give our permission for this kind of surveillance in exchange for some benefit; for instance, we may get a discount by using a customer loyalty card. Other times we face a difficult choice: either we cannot make phone calls or our movements are tracked based on cellular data. The government tracks information about our health, education and income to cure us, educate us and collect taxes. We hope that the data are used in a meaningful way, but we also have an expectation of privacy. This work focuses on how to perform statistical analyses in a way that preserves the privacy of the individual. To achieve this goal, we use secure multi-party computation. This cryptographic technique allows data to be analysed without ever seeing the individual values. Even though secure multi-party computation is a time-consuming process, we show that it is feasible even for large-scale databases. We have developed ways of using the most popular statistical analysis methods with secure multi-party computation. We introduce a privacy-preserving statistical analysis tool called Rmind that contains all of our resulting implementations. Rmind is similar to the tools that statistical analysts are used to, allowing them to carry out studies without having to know the details of the underlying cryptographic protocols.
The methods described in the thesis are used in practice to prepare a statistical study that joins two Estonian national databases, to find out whether Estonian students who work during their university studies are less likely to graduate in nominal time than their peers who focus on their studies.
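As a minimal illustration of the cryptographic idea (not Rmind’s actual protocols, which are considerably more involved), additive secret sharing lets parties compute a sum, and hence a mean, without any single party ever seeing an individual value; the modulus and party count below are arbitrary choices for the sketch.

```python
import random

MODULUS = 2**61 - 1  # an arbitrary public modulus for this sketch

def share(value, n_parties, rng):
    """Split `value` into n additive shares modulo MODULUS; any strict
    subset of shares is uniformly random and reveals nothing."""
    shares = [rng.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def reconstruct(shares):
    return sum(shares) % MODULUS

def private_sum(values, n_parties, rng):
    """Each party locally adds the shares it holds; only the final
    total is reconstructed, never an individual input."""
    per_value = [share(v, n_parties, rng) for v in values]
    party_totals = [sum(col) % MODULUS for col in zip(*per_value)]
    return reconstruct(party_totals)
```

Dividing the reconstructed total by the (public) record count then yields a mean without exposing any single record.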