9,172 research outputs found

    Better representation learning for TPMS

With the increase in popularity of AI and machine learning, participation numbers have exploded in AI/ML conferences. The large number of submitted papers and the evolving nature of topics pose additional challenges for the peer-review systems that are crucial to our scientific communities.
Some conferences have moved towards automating the reviewer assignment for submissions, TPMS [1] being one such existing system. Currently, TPMS prepares content-based profiles of researchers and submitted papers to model the suitability of reviewer-submission pairs. In this work, we explore different approaches to self-supervised fine-tuning of BERT transformers on conference paper data. We demonstrate some new approaches to augmentation views for self-supervision in natural language processing, which until now have been explored mostly for problems in computer vision. We then use these individual paper representations to build an expertise model that learns to combine the representations of a reviewer's different published works and predict their relevance for reviewing a submitted paper. Finally, we show that better individual paper representations and better expertise modeling lead to better performance on the reviewer-suitability prediction task.
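The augmentation-view approach described above can be illustrated with a SimCLR/SimCSE-style contrastive objective. The sketch below is a minimal NumPy version of an InfoNCE loss, not the TPMS or paper implementation; it assumes an encoder (e.g., BERT) has already produced two augmented-view embeddings per paper:

```python
import numpy as np

def normalize(v):
    """L2-normalize rows so dot products become cosine similarities."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def info_nce_loss(view_a, view_b, temperature=0.1):
    """InfoNCE loss: the two augmented views of the same paper (row i of
    each matrix) should be more similar to each other than to any other
    paper's view in the batch."""
    a, b = normalize(view_a), normalize(view_b)
    logits = a @ b.T / temperature                   # batch x batch similarities
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_probs).mean()                # positives on the diagonal
```

Minimizing this loss pulls the two views of each paper together while pushing apart views of different papers, which is what makes the learned paper representations useful for downstream expertise matching.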

    AN ANALYSIS OF ENGLISH SUMMATIVE TEST FOR THE SECOND GRADE STUDENTS OF JUNIOR HIGH SCHOOL 2 KARTASURA IN ACADEMIC YEAR 2016/2017

    Rahmawati, Maisa. 2013. An Analysis of the English Summative Test for the Second Grade Students of Junior High School 2 Kartasura in Academic Year 2016/2017. Thesis. English Education Study Program, Islamic Education and Teacher Training Faculty. Advisor: Dra. Hj. Woro Retnaningsih, M.Pd. Key words: summative test, item analysis. The objective of this study is to describe whether the content material tested in the English summative test for the second grade students of SMPN 2 Kartasura is suitable to their English KTSP syllabus. The research analyzed the match between the syllabus and the summative test given as the final test of the second semester for the eighth grade at SMPN 2 Kartasura in the 2016/2017 academic year. The quality of a test can be determined by performing test item analysis, which has several advantages: (a) it provides information about the test, (b) it reveals students' learning progress so that it can later be improved, and (c) it gives teachers knowledge about constructing quality questions. The researcher used descriptive qualitative research, and this form of research was used to analyze the data. The researcher collected the data from the English teacher of SMPN 2 Kartasura, requesting the syllabus and the summative test of the English subject for the second semester of the 2016/2017 academic year for the second grade. The researcher then analyzed which test items were and were not suitable to the curriculum syllabus. The test was measured against the syllabus and indicators, especially for the reading, speaking, and writing skills. The results show that the final test items for the second semester of the eighth grade students of SMPN 2 Kartasura, Pabelan, in the 2016/2017 academic year are good and suitable to the syllabus and lesson plan. The test items are of two kinds: multiple choice and essay. Of the 50 multiple-choice items, 47 are suitable to the syllabus and 3 are not; of the 5 essay items, 3 are suitable and 2 are not. Based on the data analysis, the research concludes that the final test items for the second semester of the eighth grade students of SMPN 2 Kartasura, Pabelan, in the 2016/2017 academic year are good and suitable to the syllabus and lesson plan used at SMPN 2 Kartasura in the 2016/2017 academic year.

    Context-aware ranking: from search to dialogue

    Information retrieval (IR) or search systems have been widely used to quickly find desired information for users. Ranking is the central function of IR, which aims at ordering the candidate documents in a ranked list according to their relevance to a user query. While IR only considered a single query in the early stages, more recent systems take into account the context information. For example, in a search session, the search context, such as the previous queries and interactions with the user, is widely used to understand the user's search intent and to help document ranking.
In addition to the traditional ad-hoc search, IR has been extended to dialogue systems (i.e., retrieval-based dialogue, e.g., XiaoIce), where one assumes a large repository of previous dialogues and the goal is to retrieve the most relevant response to a user's current utterance. Again, the dialogue context is a key element for determining the relevance of a response. The utilization of context information has been investigated in many studies, which range from extracting important keywords from the context to expand the query or current utterance, to building a neural context representation used with the query or current utterance for search. We notice two important insufficiencies in the existing literature. (1) To learn to use context information, one has to extract positive and negative samples for training. It has been generally assumed that a positive sample is formed when a user interacts with a document in a context, and a negative sample is formed when no interaction is observed. In reality, user interactions are scarce and noisy, making the above assumption unrealistic. It is thus important to build more appropriate training examples. (2) In dialogue systems, especially chitchat systems, responses are typically retrieved or generated without referring to external knowledge. This may easily lead to hallucinations. A solution is to ground dialogue on external documents or knowledge graphs, where the grounding document or knowledge can be seen as new types of context. Document- and knowledge-grounded dialogue have been extensively studied, but the approaches remain simplistic in that the document content or knowledge is typically concatenated to the current utterance. In reality, only parts of the grounding document or knowledge are relevant, which warrant a specific model for their selection. In this thesis, we study the problem of context-aware ranking for ad-hoc document ranking and retrieval-based dialogue. We focus on the two problems mentioned above. 
Specifically, we propose approaches to learning a ranking model for ad-hoc retrieval based on training examples selected from noisy user interactions (i.e., query logs), and approaches to exploiting external knowledge for response retrieval in retrieval-based dialogue. The thesis is based on five published articles. The first two articles are about context-aware document ranking. They address a problem in existing studies, which treat all clicks in the search logs as positive samples and sample unclicked documents as negatives. In the first paper, we propose an unsupervised data augmentation strategy to simulate potential variations of user behavior sequences, taking into account the scarcity of user behaviors. We then apply contrastive learning to identify these variations and generate a more robust representation of user behavior sequences. On the other hand, understanding the search intent of a session may present different levels of difficulty -- some intents are easy to understand while others are more difficult. Directly mixing these search sessions in the same training batch disturbs model optimization. Therefore, in the second paper, we propose a curriculum learning framework that presents training samples in an easy-to-hard order. Both proposed methods outperform existing methods on two real search-log datasets. The last three articles focus on knowledge-grounded retrieval-based dialogue systems. We first propose a content selection mechanism for document-grounded dialogue and demonstrate that selecting relevant document content based on the dialogue context can effectively reduce noise in the document and increase dialogue quality. Second, we explore a new dialogue task that requires generating dialogue according to a narrative description. We collect a new dataset in the movie domain to support our study.
The knowledge is defined as a narrative that describes part of a movie script (similar to dialogues). The goal is to create dialogues corresponding to the narrative. To this end, we design a new model that tracks the coverage of the narrative along the dialogues and determines the uncovered part to address in the next turn. Third, we explore a proactive dialogue model that can proactively lead the dialogue to cover the required topics. We design an explicit knowledge prediction module to select relevant pieces of knowledge to use. To train the selection process, we generate weak-supervision signals using a heuristic method. All three papers investigate how various types of knowledge can be integrated into dialogue. Context is an important element in ad-hoc search and dialogue, but we argue that context should be understood in a broad sense. In this thesis, we include both previous interactions and the grounding document and knowledge as part of the context. This series of studies is one step in the direction of incorporating broad context information into search and dialogue.
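The unsupervised augmentation of user behavior sequences described in the first paper can be sketched roughly as follows. The specific operations (dropping a query, masking a term) and the probability `p` are illustrative assumptions, not the thesis's exact recipe:

```python
import random

def augment_session(session, rng, p=0.3, mask_token="[MASK]"):
    """Produce one perturbed view of a user behavior sequence (a list of
    query strings) by randomly dropping queries or masking single terms,
    simulating plausible variations of sparse, noisy click-log behavior."""
    view = []
    for query in session:
        r = rng.random()
        if r < p and len(session) > 1:
            continue                                  # drop this interaction
        if r < 2 * p:
            terms = query.split()
            terms[rng.randrange(len(terms))] = mask_token
            view.append(" ".join(terms))              # mask one term
        else:
            view.append(query)                        # keep as-is
    return view or list(session)                      # never return an empty view
```

Two such views of the same session can then serve as a positive pair for contrastive learning, while views of other sessions in the batch act as negatives.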

    A Project Portfolio Management model adapted to non-profit organizations

    As they strive towards greater professionalism in carrying out their activities, non-profit organizations (NPOs) have begun paying attention to project management. The non-profit sector (NPS) has also begun to adopt strategic planning techniques, thus making the acceptance of project portfolio management (PPM) methodology a natural consequence. This article aims to propose a project portfolio management model adapted to the context of NPOs

    Persuasive system design does matter: a systematic review of adherence to web-based interventions

    Background: Although web-based interventions for promoting health and health-related behavior can be effective, poor adherence is a common issue that needs to be addressed. Technology as a means to communicate the content in web-based interventions has been neglected in research. Indeed, technology is often seen as a black-box, a mere tool that has no effect or value and serves only as a vehicle to deliver intervention content. In this paper we examine technology from a holistic perspective. We see it as a vital and inseparable aspect of web-based interventions to help explain and understand adherence. Objective: This study aims to review the literature on web-based health interventions to investigate whether intervention characteristics and persuasive design affect adherence to a web-based intervention. Methods: We conducted a systematic review of studies into web-based health interventions. Per intervention, intervention characteristics, persuasive technology elements and adherence were coded. We performed a multiple regression analysis to investigate whether these variables could predict adherence. Results: We included 101 articles on 83 interventions. The typical web-based intervention is meant to be used once a week, is modular in set-up, is updated once a week, lasts for 10 weeks, includes interaction with the system and a counselor and peers on the web, includes some persuasive technology elements, and about 50% of the participants adhere to the intervention. Regarding persuasive technology, we see that primary task support elements are most commonly employed (mean 2.9 out of a possible 7.0). Dialogue support and social support are less commonly employed (mean 1.5 and 1.2 out of a possible 7.0, respectively). 
When comparing the interventions of the different health care areas, we find significant differences in intended usage (p = .004), setup (p < .001), updates (p < .001), frequency of interaction with a counselor (p < .001), the system (p = .003) and peers (p = .017), duration (F = 6.068, p = .004), adherence (F = 4.833, p = .010), and the number of primary task support elements (F = 5.631, p = .005). Our final regression model explained 55% of the variance in adherence. In this model, an RCT study (as opposed to an observational study), increased interaction with a counselor, more frequent intended usage, more frequent updates, and more extensive employment of dialogue support significantly predicted better adherence. Conclusions: Using intervention characteristics and persuasive technology elements, a substantial amount of variance in adherence can be explained. Although there are differences between health care areas in intervention characteristics, health care area per se does not predict adherence. Rather, the differences in technology and interaction predict adherence. The results of this study can be used to make an informed decision about how to design a web-based intervention to which patients are more likely to adhere.
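The kind of multiple regression used here can be sketched as an ordinary least squares fit of adherence on intervention characteristics. The feature set and numbers below are invented toy data, not the 83 reviewed interventions:

```python
import numpy as np

# Hypothetical characteristics for six toy interventions:
# [intended uses/week, updates/week, counselor contacts/week,
#  dialogue-support elements (0-7)] -- NOT the study's actual data.
X = np.array([
    [1, 1, 1, 2],
    [2, 1, 0, 1],
    [3, 2, 2, 4],
    [1, 0, 0, 0],
    [2, 2, 1, 3],
    [3, 1, 2, 5],
], dtype=float)
y = np.array([0.45, 0.50, 0.70, 0.30, 0.60, 0.75])   # fraction adhering

A = np.column_stack([np.ones(len(X)), X])            # add intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)         # ordinary least squares
pred = A @ coef
r2 = 1 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```

The coefficient vector plays the role of the study's predictors (e.g., interaction frequency, dialogue support), and `r2` corresponds to the proportion of adherence variance explained by the model.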

    To blockchain or not to blockchain, these are the questions: A structured analysis of blockchain decision schemes

    Blockchain technology has garnered significant attention in recent years, prompting researchers, entrepreneurs, and businesses to seek viable ways to validate the application of blockchain within their specific use cases. Blockchain decision schemes (BDSs) can assist in this decision-making process, offering a potentially more cost-effective alternative to domain experts. Flow-chart blockchain decision schemes (FC-BDSs) constitute 77.5% of all BDSs, and this paper systematically reviews them by standardising and aggregating the most prominent schemes into an open-source package. Central to our approach is the definition of an FC-BDS as a directed acyclic graph (DAG). On this mathematical foundation, we conduct a meticulous exploration and analysis of the elements of FC-BDSs. We present an in-depth analysis of their structure, exploring features such as vertex count, question categorisation, and outcome distribution. Notably, the majority of FC-BDS questions ask about data and participation (34.1%), above other domains such as security (18.6%) and performance (10.8%). Observations regarding outcomes show an overall balance between suggesting the usage and the avoidance of blockchains; however, there is a discrepancy in the average number of questions required to reach these outcomes, revealing potential biases within schemes. Further analysis using similarity metrics (based on both structural and semantic features) identifies significant overlaps between FC-BDSs, with some schemes showing over 90% similarity. These observations could be attributed to a more informal publishing routine for FC-BDSs, and help trace the evolution of FC-BDSs over time. The insights drawn from this research illuminate the broader BDS landscape and constitute a significant step towards the standardisation of FC-BDSs, thereby promoting more coherent and effective use of these decision-making tools in blockchain technology application.
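The DAG formulation of an FC-BDS can be sketched in a few lines. The scheme below is a toy example with invented questions, not one of the published schemes the paper reviews; decision vertices map answers to successors, and any name absent from the mapping is an outcome leaf:

```python
# A toy flow-chart blockchain decision scheme (FC-BDS) encoded as a DAG:
# each decision vertex maps an answer to a successor vertex; names not in
# the dict are outcome leaves. (Illustrative, not a published scheme.)
FC_BDS = {
    "need_shared_db": {"no": "OUT_no_blockchain", "yes": "multiple_writers"},
    "multiple_writers": {"no": "OUT_no_blockchain", "yes": "writers_trusted"},
    "writers_trusted": {"yes": "OUT_shared_db", "no": "OUT_permissioned_chain"},
}

def decide(scheme, answers, start="need_shared_db"):
    """Walk the DAG from the start vertex until an outcome leaf is reached.
    Returns the outcome and the list of questions asked; the path length
    is the per-outcome question count whose imbalance the analysis flags."""
    vertex, path = start, []
    while vertex in scheme:
        path.append(vertex)
        vertex = scheme[vertex][answers[vertex]]
    return vertex, path
```

On this representation, structural features such as vertex count, question categories, and outcome distribution reduce to simple graph statistics, which is what makes the DAG definition a convenient basis for comparing schemes.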

    A comparative study of three ICT network programs using usability testing

    Thesis (M. Tech. (Information Technology)) -- Central University of Technology, Free State, 2013. This study compared the usability of three Information and Communication Technology (ICT) network programs in a learning environment. The researcher wanted to establish which program was most adequate from a usability perspective among second-year Information Technology (IT) students at the Central University of Technology (CUT), Free State. The Software Usability Measurement Inventory (SUMI) testing technique can measure software quality from a user perspective. The technique is supported by an extensive reference database to measure a software product's quality in use and is embedded in an effective analysis and reporting tool called the SUMI scorer (SUMISCO). SUMI was applied in a controlled laboratory environment, where second-year IT students at the CUT used it as part of their networking subject, System Software 1 (SPG1), to evaluate each of the three ICT network programs. The results, strengths and weaknesses, and usability improvements identified by SUMISCO are discussed to determine the best ICT network program from a usability perspective according to SPG1 students.

    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted its focus to more informal spoken content produced spontaneously, outside the studio and in conversational settings. This survey provides an overview of the field of SCR, encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who seek deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

    Argumentation Mining in User-Generated Web Discourse

    The goal of argumentation mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's argumentation. In this article, we go beyond the state of the art in several ways. (i) We deal with actual Web data and take up the challenges given by the variety of registers, multiple domains, and unrestricted noisy user-generated Web discourse. (ii) We bridge the gap between normative argumentation theories and argumentation phenomena encountered in actual data by adapting an argumentation model tested in an extensive annotation study. (iii) We create a new gold standard corpus (90k tokens in 340 documents) and experiment with several machine learning methods to identify argument components. We offer the data, source codes, and annotation guidelines to the community under free licenses. Our findings show that argumentation mining in user-generated Web discourse is a feasible but challenging task. Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17
    • 

    corecore