9,172 research outputs found

    Better representation learning for TPMS

With the increase in popularity of AI and machine learning, participation numbers have exploded in AI/ML conferences. The large number of submitted papers and the evolving nature of topics pose additional challenges for the peer-review systems that are crucial to our scientific communities.
Some conferences have moved towards automating the reviewer assignment for submissions, TPMS [1] being one such existing system. Currently, TPMS prepares content-based profiles of researchers and submitted papers to model the suitability of reviewer-submission pairs. In this work, we explore different approaches to self-supervised fine-tuning of BERT transformers on conference paper data. We demonstrate some new approaches to augmentation views for self-supervision in natural language processing, which until now have been explored mostly for problems in computer vision. We then use these individual paper representations to build an expertise model that learns to combine the representations of a reviewer's different published works and predict their relevance for reviewing a submitted paper. Finally, we show that better individual paper representations and better expertise modeling lead to better performance on the reviewer-suitability prediction task.
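The augmentation-view approach described above can be illustrated with a SimCLR/SimCSE-style contrastive objective. The sketch below is a minimal NumPy version of an InfoNCE loss, not the TPMS or paper implementation; it assumes an encoder (e.g., BERT) has already produced two augmented-view embeddings per paper:

```python
import numpy as np

def normalize(v):
    """L2-normalize rows so dot products become cosine similarities."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def info_nce_loss(view_a, view_b, temperature=0.1):
    """InfoNCE loss: the two augmented views of the same paper (row i of
    each matrix) should be more similar to each other than to any other
    paper's view in the batch."""
    a, b = normalize(view_a), normalize(view_b)
    logits = a @ b.T / temperature                   # batch x batch similarities
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_probs).mean()                # positives on the diagonal
```

Minimizing this loss pulls the two views of each paper together while pushing apart views of different papers, which is what makes the learned paper representations useful for downstream expertise matching.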

    AN ANALYSIS OF ENGLISH SUMMATIVE TEST FOR THE SECOND GRADE STUDENTS OF JUNIOR HIGH SCHOOL 2 KARTASURA IN ACADEMIC YEAR 2016/2017

    Rahmawati, Maisa. 2013. An Analysis of the English Summative Test for the Second Grade Students of Junior High School 2 Kartasura in Academic Year 2016/2017. Thesis. English Education Study Program, Islamic Education and Teacher Training Faculty. Advisor: Dra. Hj. Woro Retnaningsih, M.Pd. Key words: summative test, item analysis. The objective of this study is to describe whether the content material tested in the English summative test for the second grade students of SMPN 2 Kartasura is suitable to their English KTSP syllabus. The research analyzed the match between the syllabus and the summative test given as the final test of the second semester for the eighth grade at SMPN 2 Kartasura in the 2016/2017 academic year. The quality of a test can be determined by performing test item analysis, which has several advantages: (a) it provides information about the test, (b) it reveals students' learning progress so that it can later be improved, and (c) it gives teachers knowledge about constructing quality questions. The researcher used descriptive qualitative research, and this form of research was used to analyze the data. The researcher collected the data from the English teacher of SMPN 2 Kartasura, requesting the syllabus and the summative test of the English subject for the second semester of the 2016/2017 academic year for the second grade. The researcher then analyzed which test items were and were not suitable to the curriculum syllabus. The test was measured against the syllabus and indicators, especially for the reading, speaking, and writing skills. The results show that the final test items for the second semester of the eighth grade students of SMPN 2 Kartasura, Pabelan, in the 2016/2017 academic year are good and suitable to the syllabus and lesson plan. The test items are of two kinds: multiple choice and essay. Of the 50 multiple-choice items, 47 are suitable to the syllabus and 3 are not; of the 5 essay items, 3 are suitable and 2 are not. Based on the data analysis, the research concludes that the final test items for the second semester of the eighth grade students of SMPN 2 Kartasura, Pabelan, in the 2016/2017 academic year are good and suitable to the syllabus and lesson plan used at SMPN 2 Kartasura in the 2016/2017 academic year.

    Context-aware ranking: from search to dialogue

    Information retrieval (IR) or search systems have been widely used to quickly find desired information for users. Ranking is the central function of IR, which aims at ordering the candidate documents in a ranked list according to their relevance to a user query. While IR only considered a single query in the early stages, more recent systems take into account the context information. For example, in a search session, the search context, such as the previous queries and interactions with the user, is widely used to understand the user's search intent and to help document ranking.
In addition to the traditional ad-hoc search, IR has been extended to dialogue systems (i.e., retrieval-based dialogue, e.g., XiaoIce), where one assumes a large repository of previous dialogues and the goal is to retrieve the most relevant response to a user's current utterance. Again, the dialogue context is a key element for determining the relevance of a response. The utilization of context information has been investigated in many studies, which range from extracting important keywords from the context to expand the query or current utterance, to building a neural context representation used with the query or current utterance for search. We notice two important insufficiencies in the existing literature. (1) To learn to use context information, one has to extract positive and negative samples for training. It has been generally assumed that a positive sample is formed when a user interacts with a document in a context, and a negative sample is formed when no interaction is observed. In reality, user interactions are scarce and noisy, making the above assumption unrealistic. It is thus important to build more appropriate training examples. (2) In dialogue systems, especially chitchat systems, responses are typically retrieved or generated without referring to external knowledge. This may easily lead to hallucinations. A solution is to ground dialogue on external documents or knowledge graphs, where the grounding document or knowledge can be seen as new types of context. Document- and knowledge-grounded dialogue have been extensively studied, but the approaches remain simplistic in that the document content or knowledge is typically concatenated to the current utterance. In reality, only parts of the grounding document or knowledge are relevant, which warrant a specific model for their selection. In this thesis, we study the problem of context-aware ranking for ad-hoc document ranking and retrieval-based dialogue. We focus on the two problems mentioned above. 
Specifically, we propose approaches to learning a ranking model for ad-hoc retrieval based on training examples selected from noisy user interactions (i.e., query logs), and approaches to exploiting external knowledge for response retrieval in retrieval-based dialogue. The thesis is based on five published articles. The first two articles are about context-aware document ranking. They address a problem in existing studies, which treat all clicks in the search logs as positive samples and sample unclicked documents as negatives. In the first paper, we propose an unsupervised data augmentation strategy to simulate potential variations of user behavior sequences, taking into account the scarcity of user behaviors. We then apply contrastive learning to identify these variations and generate a more robust representation of user behavior sequences. On the other hand, understanding the search intent of a session may present different levels of difficulty -- some intents are easy to understand while others are more difficult. Directly mixing these search sessions in the same training batch disturbs model optimization. Therefore, in the second paper, we propose a curriculum learning framework that presents training samples in an easy-to-hard order. Both proposed methods outperform existing methods on two real search-log datasets. The last three articles focus on knowledge-grounded retrieval-based dialogue systems. We first propose a content selection mechanism for document-grounded dialogue and demonstrate that selecting relevant document content based on the dialogue context can effectively reduce noise in the document and increase dialogue quality. Second, we explore a new dialogue task that requires generating dialogue according to a narrative description. We collect a new dataset in the movie domain to support our study.
The knowledge is defined as a narrative that describes part of a movie script (similar to dialogues). The goal is to create dialogues corresponding to the narrative. To this end, we design a new model that tracks the coverage of the narrative along the dialogues and determines the uncovered part to address in the next turn. Third, we explore a proactive dialogue model that can proactively lead the dialogue to cover the required topics. We design an explicit knowledge prediction module to select relevant pieces of knowledge to use. To train the selection process, we generate weak-supervision signals using a heuristic method. All three papers investigate how various types of knowledge can be integrated into dialogue. Context is an important element in ad-hoc search and dialogue, but we argue that context should be understood in a broad sense. In this thesis, we include both previous interactions and the grounding document and knowledge as part of the context. This series of studies is one step in the direction of incorporating broad context information into search and dialogue.
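The unsupervised augmentation of user behavior sequences described in the first paper can be sketched roughly as follows. The specific operations (dropping a query, masking a term) and the probability `p` are illustrative assumptions, not the thesis's exact recipe:

```python
import random

def augment_session(session, rng, p=0.3, mask_token="[MASK]"):
    """Produce one perturbed view of a user behavior sequence (a list of
    query strings) by randomly dropping queries or masking single terms,
    simulating plausible variations of sparse, noisy click-log behavior."""
    view = []
    for query in session:
        r = rng.random()
        if r < p and len(session) > 1:
            continue                                  # drop this interaction
        if r < 2 * p:
            terms = query.split()
            terms[rng.randrange(len(terms))] = mask_token
            view.append(" ".join(terms))              # mask one term
        else:
            view.append(query)                        # keep as-is
    return view or list(session)                      # never return an empty view
```

Two such views of the same session can then serve as a positive pair for contrastive learning, while views of other sessions in the batch act as negatives.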

    A Project Portfolio Management model adapted to non-profit organizations

    As they strive towards greater professionalism in carrying out their activities, non-profit organizations (NPOs) have begun paying attention to project management. The non-profit sector (NPS) has also begun to adopt strategic planning techniques, thus making the acceptance of project portfolio management (PPM) methodology a natural consequence. This article aims to propose a project portfolio management model adapted to the context of NPOs

    Persuasive system design does matter: a systematic review of adherence to web-based interventions

    Background: Although web-based interventions for promoting health and health-related behavior can be effective, poor adherence is a common issue that needs to be addressed. Technology as a means to communicate the content in web-based interventions has been neglected in research. Indeed, technology is often seen as a black-box, a mere tool that has no effect or value and serves only as a vehicle to deliver intervention content. In this paper we examine technology from a holistic perspective. We see it as a vital and inseparable aspect of web-based interventions to help explain and understand adherence. Objective: This study aims to review the literature on web-based health interventions to investigate whether intervention characteristics and persuasive design affect adherence to a web-based intervention. Methods: We conducted a systematic review of studies into web-based health interventions. Per intervention, intervention characteristics, persuasive technology elements and adherence were coded. We performed a multiple regression analysis to investigate whether these variables could predict adherence. Results: We included 101 articles on 83 interventions. The typical web-based intervention is meant to be used once a week, is modular in set-up, is updated once a week, lasts for 10 weeks, includes interaction with the system and a counselor and peers on the web, includes some persuasive technology elements, and about 50% of the participants adhere to the intervention. Regarding persuasive technology, we see that primary task support elements are most commonly employed (mean 2.9 out of a possible 7.0). Dialogue support and social support are less commonly employed (mean 1.5 and 1.2 out of a possible 7.0, respectively). 
When comparing the interventions of the different health care areas, we find significant differences in intended usage (p = .004), setup (p < .001), updates (p < .001), frequency of interaction with a counselor (p < .001), the system (p = .003) and peers (p = .017), duration (F = 6.068, p = .004), adherence (F = 4.833, p = .010), and the number of primary task support elements (F = 5.631, p = .005). Our final regression model explained 55% of the variance in adherence. In this model, an RCT study (as opposed to an observational study), increased interaction with a counselor, more frequent intended usage, more frequent updates, and more extensive employment of dialogue support significantly predicted better adherence. Conclusions: Using intervention characteristics and persuasive technology elements, a substantial amount of variance in adherence can be explained. Although there are differences between health care areas in intervention characteristics, health care area per se does not predict adherence. Rather, the differences in technology and interaction predict adherence. The results of this study can be used to make an informed decision about how to design a web-based intervention to which patients are more likely to adhere.
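The kind of multiple regression used here can be sketched as an ordinary least squares fit of adherence on intervention characteristics. The feature set and numbers below are invented toy data, not the 83 reviewed interventions:

```python
import numpy as np

# Hypothetical characteristics for six toy interventions:
# [intended uses/week, updates/week, counselor contacts/week,
#  dialogue-support elements (0-7)] -- NOT the study's actual data.
X = np.array([
    [1, 1, 1, 2],
    [2, 1, 0, 1],
    [3, 2, 2, 4],
    [1, 0, 0, 0],
    [2, 2, 1, 3],
    [3, 1, 2, 5],
], dtype=float)
y = np.array([0.45, 0.50, 0.70, 0.30, 0.60, 0.75])   # fraction adhering

A = np.column_stack([np.ones(len(X)), X])            # add intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)         # ordinary least squares
pred = A @ coef
r2 = 1 - ((y - pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
```

The coefficient vector plays the role of the study's predictors (e.g., interaction frequency, dialogue support), and `r2` corresponds to the proportion of adherence variance explained by the model.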

    To blockchain or not to blockchain, these are the questions: A structured analysis of blockchain decision schemes

    Blockchain technology has garnered significant attention in recent years, prompting researchers, entrepreneurs, and businesses to seek viable ways to validate the application of blockchain within their specific use cases. Blockchain decision schemes (BDSs) can assist in this decision-making process, offering a potentially more cost-effective alternative to domain experts. Flow-chart blockchain decision schemes (FC-BDSs) constitute 77.5% of all BDSs, and this paper systematically reviews them by standardising and aggregating the most prominent schemes into an open-source package. Central to our approach is the definition of an FC-BDS as a directed acyclic graph (DAG). On this mathematical foundation, we conduct a meticulous exploration and analysis of the elements of FC-BDSs. We present an in-depth analysis of their structure, exploring features such as vertex count, question categorisation, and outcome distribution. Notably, the majority of FC-BDS questions ask about data and participation (34.1%), above other domains such as security (18.6%) and performance (10.8%). Observations regarding outcomes show an overall balance between suggesting the usage and the avoidance of blockchains; however, there is a discrepancy in the average number of questions required to reach these outcomes, revealing potential biases within schemes. Further analysis using similarity metrics (based on both structural and semantic features) identifies significant overlaps between FC-BDSs, with some schemes showing over 90% similarity. These observations could be attributed to a more informal publishing routine for FC-BDSs, and help trace the evolution of FC-BDSs over time. The insights drawn from this research illuminate the broader BDS landscape and constitute a significant step towards the standardisation of FC-BDSs, thereby promoting more coherent and effective use of these decision-making tools in blockchain technology application.
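The DAG formulation of an FC-BDS can be sketched in a few lines. The scheme below is a toy example with invented questions, not one of the published schemes the paper reviews; decision vertices map answers to successors, and any name absent from the mapping is an outcome leaf:

```python
# A toy flow-chart blockchain decision scheme (FC-BDS) encoded as a DAG:
# each decision vertex maps an answer to a successor vertex; names not in
# the dict are outcome leaves. (Illustrative, not a published scheme.)
FC_BDS = {
    "need_shared_db": {"no": "OUT_no_blockchain", "yes": "multiple_writers"},
    "multiple_writers": {"no": "OUT_no_blockchain", "yes": "writers_trusted"},
    "writers_trusted": {"yes": "OUT_shared_db", "no": "OUT_permissioned_chain"},
}

def decide(scheme, answers, start="need_shared_db"):
    """Walk the DAG from the start vertex until an outcome leaf is reached.
    Returns the outcome and the list of questions asked; the path length
    is the per-outcome question count whose imbalance the analysis flags."""
    vertex, path = start, []
    while vertex in scheme:
        path.append(vertex)
        vertex = scheme[vertex][answers[vertex]]
    return vertex, path
```

On this representation, structural features such as vertex count, question categories, and outcome distribution reduce to simple graph statistics, which is what makes the DAG definition a convenient basis for comparing schemes.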

    A comparative study of three ICT network programs using usability testing

    Thesis (M. Tech. (Information Technology)) -- Central University of Technology, Free State, 2013. This study compared the usability of three Information and Communication Technology (ICT) network programs in a learning environment. The researcher wanted to establish which program was most adequate from a usability perspective among second-year Information Technology (IT) students at the Central University of Technology (CUT), Free State. The Software Usability Measurement Inventory (SUMI) testing technique can measure software quality from a user perspective. The technique is supported by an extensive reference database to measure a software product's quality in use and is embedded in an effective analysis and reporting tool called the SUMI scorer (SUMISCO). SUMI was applied in a controlled laboratory environment, where second-year IT students at the CUT used it as part of their networking subject, System Software 1 (SPG1), to evaluate each of the three ICT network programs. The results, strengths and weaknesses, and usability improvements identified by SUMISCO are discussed to determine the best ICT network program from a usability perspective according to SPG1 students.

    Spoken content retrieval: A survey of techniques and technologies

    Speech media, that is, digital audio and video containing spoken content, has blossomed in recent years. Large collections are accruing on the Internet as well as in private and enterprise settings. This growth has motivated extensive research on techniques and technologies that facilitate reliable indexing and retrieval. Spoken content retrieval (SCR) requires the combination of audio and speech processing technologies with methods from information retrieval (IR). SCR research initially investigated planned speech structured in document-like units, but has subsequently shifted its focus to more informal spoken content produced spontaneously, outside the studio and in conversational settings. This survey provides an overview of the field of SCR, encompassing component technologies, the relationship of SCR to text IR and automatic speech recognition, and user interaction issues. It is aimed at researchers with backgrounds in speech technology or IR who seek deeper insight into how these fields are integrated to support research and development, thus addressing the core challenges of SCR.

    Argumentation Mining in User-Generated Web Discourse

    The goal of argumentation mining, an evolving research field in computational linguistics, is to design methods capable of analyzing people's argumentation. In this article, we go beyond the state of the art in several ways. (i) We deal with actual Web data and take up the challenges given by the variety of registers, multiple domains, and unrestricted noisy user-generated Web discourse. (ii) We bridge the gap between normative argumentation theories and argumentation phenomena encountered in actual data by adapting an argumentation model tested in an extensive annotation study. (iii) We create a new gold standard corpus (90k tokens in 340 documents) and experiment with several machine learning methods to identify argument components. We offer the data, source codes, and annotation guidelines to the community under free licenses. Our findings show that argumentation mining in user-generated Web discourse is a feasible but challenging task. Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17
    • 

    corecore