1,585 research outputs found
Menetelmiä luonnollisella kielellä kirjoitettujen raporttien automaattiseen tuottamiseen (Methods for the automatic generation of reports written in natural language)
The use of computer software to automatically produce natural language texts expressing factual content is of interest to practitioners of multiple fields, ranging from journalists to researchers to educators. This thesis studies natural language report generation from structured data for the purposes of journalism. The topic is approached from three directions.
First, we analyse what requirements the journalistic domain imposes on the software, and how software might be architected to account for those requirements. This includes identifying the key domain norms (such as the "objectivity norm") and business requirements (such as system transferability) and mapping them to software requirements. Based on the identified requirements, we then describe how a modular data-to-text approach to natural language generation can be implemented in the specific context of hard news reporting.
Second, we investigate how the highly domain-specific natural language generation subtask of document planning - deciding what information is to be included in an automatically produced text, and in what order - might be conducted in a less domain-specific manner. To this end, we describe an approach to operationalizing the complex concept of "newsworthiness" in a manner where a natural language generation system can employ it. We also present a broadly applicable baseline method for structuring the content in a data-to-text setting without explicit domain knowledge.
Third, we discuss how bias in text generation systems is perceived by key stakeholders, and whether those perceptions align with the reality of news automation. This discussion includes identifying how automated systems might exhibit bias and how the biases might be - potentially unconsciously - embedded in the systems. As a result, we conclude that common perceptions of automated journalism as fundamentally "unbiased" are unfounded, and that beliefs about "unbiased" automation might have the negative effect of further entrenching pre-existing biases in organizations or society.
Together, through these three avenues, the thesis sketches out a way towards more widespread use of news automation in newsrooms, taking into account the various ethical questions associated with the use of such systems.
This thesis deals with the automatic generation of natural language (for example, Finnish or English) in contexts where the factual correctness of the content is critical. Such computer systems are used, for example, to write weather reports, sports and financial news, and patient descriptions. The thesis approaches the topic from three perspectives, focusing especially on journalism.
First, the thesis examines how the journalistic context affects the way a computer system that generates natural language should be built. It analyses the norms and practices associated with journalism and translates them into software engineering requirements. Based on these requirements, the thesis identifies a software architecture for natural language generation that suits journalistic purposes.
Second, the thesis looks into one subproblem of natural language generation, document planning. In the document planning stage, the pieces of information to be included in the text are selected and then ordered so that they form an understandable text. This stage has generally been regarded as one of the most application-dependent stages of text generation. This means that it has to be solved separately for each application: a method that structures election news is not necessarily suitable for structuring financial news. The thesis analyses the concept of "newsworthiness" used in journalism and describes a method based on it for selecting pieces of information. In addition, the thesis presents a broadly applicable method for ordering the pieces of information. Together, these methods simplify the construction of new text generation systems in certain contexts.
Third, the thesis addresses biases in text generation systems. It describes how people in key positions for the journalistic use of automated text generation perceive the threat of bias, and how these perceptions correspond to the reality of automated text generation. More specifically, it describes what kinds of biases might be found in automated text generation systems and how biases can end up in the systems. In this respect, the thesis concludes that automated text generation systems should not be regarded as inherently less biased than humans, and that beliefs in the built-in "fairness" of automated methods may lead to undesirable effects by entrenching the biases of organizations and society.
Through these three perspectives, the thesis outlines a path towards wider use of automated text generation systems, especially in newsrooms, in an ethically sustainable way.
Ludwig Wittgenstein & Gertrude Stein – Meeting in Language
Former Director of Studies: Professor Antonio Caronia
Tine Melzer: Ludwig Wittgenstein & Gertrude Stein – Meeting in Language
The purpose of this study is to show transitions between verbal and visual meaning in ordinary language, based on philosophical concepts and conceptual artworks. It offers models for artistic research and collaboration in arts and science. Shared experiences in ordinary language are fundamental to this thesis and make it an accessible and trans-disciplinary study. Language as such is approached from different practices and disciplines and becomes the central object of investigation.
The research introduces a general set of mechanisms in language, stemming from the Wittgensteinian notion of the language-game. The study examines the possibility of a meeting between the philosopher Ludwig Wittgenstein and the writer Gertrude Stein in a linguistic, biographical and poetic sense. The main claim is that Wittgenstein and Stein share the understanding of language as a game, which is a fruitful principle for artistic and poetic production.
Gertrude Stein developed a dimension in her writing which partly succeeds in showing this notion of creating meaning-as-practice and making sense on the "edge" of conventional meaning. In this way she augments Wittgenstein's idea of the language-game and puts it into practice, testing its limits on her own language and on the reader's habits. The artistic works represented in this thesis are equally experimental tests of Wittgenstein's meaning-as-use hypothesis. They put his ideas into practice. They extend the research with strategies from the arts, poetry and fiction.
The methodology of the research is based on Wittgenstein's notion of meaning as context-dependent use. This concept defines the meaning of a word by the way it is used in a specific context. This perspective is then challenged with visual artistic work. The hypothesis is tested throughout the research by applying tools and concepts from several practices, such as computational linguistic tools, collaboration with writers and artists from other fields, and autonomous visual and poetic work, to augment the study of facts.
Conceptual artworks, often produced in collaboration, function as language experiments, or language-games. The Wittgensteinian differentiation between what can be shown and what can be said is examined. The context of the research lies in the practices developed as a conceptual artist in which theoretical research informs artistic practice. This thesis, on the border between verbal and visual language, is founded upon antecedent studies in philosophy of language and the practice of Fine Arts. Against this background the research focuses on the relationship between word, context and meaning: issues of communication, ordinary language, words and their composition, context-based meaning, naming visual phenomena, examination of word-and-world-relationships and vocabularies.
Main sources are the major works and biographies of Ludwig Wittgenstein, Gertrude Stein, the critical work of Marjorie Perloff, language philosophers concerned with ordinary language and the contrastive corpus linguistic approach.
The results of this research are generated by several interdisciplinary productive methods. These include artworks and poetic and scientific work, all of which employ modes of language and whose domains overlap. Additionally, the notion of meeting acts as a model metaphor for the development of a solid trans-disciplinary methodology for research between science and the arts. One major result of comparing their ideas on language is reflected in the meeting of the language used by Wittgenstein and Stein. Their meeting is materialized in the computer-generated Shared Vocabulary, a list of words which both Wittgenstein and Stein used in their writing. It applies tools from contrastive corpus linguistics to compare their vocabularies (corpora), which offers new methods for investigating the works of the philosopher Wittgenstein and the writer Stein.
Generally, this thesis may act as an introduction to language as an ideal foundation for interdisciplinary study. The application of the principle of the language-game (Wittgenstein) is a significant way of displaying possible strategies for artists and researchers who work transdisciplinarily. The research results directly inform practice and practitioners from other fields, which means that collaboration is central to the research. It implies that language permeates every sort of research, art and its discourse. It also suggests that the meaning of words and images depends on their use, which extends the Wittgensteinian meaning-as-use hypothesis to visual language. The findings of the research on vocabularies are quite specific, but they also offer simple general mechanisms of the language-game. The consistent alliance of the discussion with the language of the everyday makes the research a general contribution for everyone who is genuinely interested in language and the arts.
Parts of this research were supported by The Netherlands Foundation for Visual Arts, Design and Architecture (Fonds BKVB, Studiebeurs) and Prins Bernhard Cultuurfonds Amsterdam (Cultuurfondsbeurs)
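The Shared Vocabulary described in the abstract above is, at its core, the set of words that occur in both authors' corpora. The following is only a minimal sketch of that idea, not the thesis's actual tooling: the function names, the simple regular-expression tokenizer and the two toy sentences are all assumptions made for illustration.

```python
import re


def vocabulary(text: str) -> set[str]:
    """Return the set of lowercase word types found in a text."""
    # Assumption: a word is a run of ASCII letters; real corpus tools
    # would handle diacritics, lemmatization and multiple languages.
    return set(re.findall(r"[a-z]+", text.lower()))


def shared_vocabulary(corpus_a: str, corpus_b: str) -> list[str]:
    """Return the alphabetically sorted word types present in both corpora."""
    return sorted(vocabulary(corpus_a) & vocabulary(corpus_b))


# Toy stand-in texts (hypothetical, not quotations from either author).
text_a = "The meaning of a word is its use in the language"
text_b = "A word is a word and the use of it is a game"

print(shared_vocabulary(text_a, text_b))
```

A set intersection keeps the sketch short; a contrastive corpus study would typically also compare word frequencies, which this sketch does not attempt.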
Jewish Studies in the Digital Age
The digitisation boom of the last two decades, and the rapid advancement of digital tools to analyse data in myriad ways, have opened up new avenues for humanities research. This volume discusses how the so-called digital turn has affected the field of Jewish Studies, explores the current state of the art and probes how digital developments can be harnessed to address the specific questions, challenges and problems in the field
Processing temporal information in unstructured documents
Doctoral thesis, Informatics (Computer Science), Universidade de Lisboa, Faculdade de Ciências, 2013
Temporal information processing has received substantial attention in the last few years, due to the appearance of evaluation challenges focused on the extraction of temporal information from texts written in natural language. This research area belongs to the broader field of information extraction, which aims to automatically find specific pieces of information in texts, producing structured representations of that information, which can then be easily used by other computer applications. It has the potential to be useful in several applications that deal with natural language, given that many languages, among which we find Portuguese, extensively refer to time. Despite that, temporal processing is still incipient for many languages, Portuguese being one of them. The present dissertation has various goals. On the one hand, it addresses this gap by developing and making available resources that support the development of tools for this task in Portuguese, and also by developing precisely this kind of tool. On the other hand, its purpose is also to report on important results of the research in this area of temporal processing. This work shows how temporal processing requires and benefits from modeling different kinds of knowledge: grammatical knowledge, logical knowledge, knowledge about the world, etc. Additionally, both machine learning methods and rule-based approaches are explored and used in the development of hybrid systems that are capable of taking advantage of the strengths of each of these two types of approach.
The processing of temporal information has received considerable attention in recent years, owing to the emergence of evaluation challenges focused on extracting temporal information from texts written in natural language. This research area falls within the broader field of information extraction, which aims to automatically find specific information present in texts and to produce structured representations of it that can then be easily used by other computer applications. It has the potential to be useful in many applications that deal with natural language, given the almost ubiquitous reference to chronological time in many languages, Portuguese among them. Nevertheless, temporal processing is still incipient for quite a few languages, Portuguese being one of them. The present dissertation has several goals. On the one hand, it fills this existing gap by developing and making available resources that support the development of tools for this task in Portuguese, and also by developing precisely this kind of tool. On the other hand, it also serves to report important research results in this area of temporal processing. This work shows how temporal processing requires and benefits from modelling knowledge at several levels: grammatical, logical, about the world, etc. Additionally, both machine learning methods and rule-based approaches are explored, leading to hybrid systems capable of taking advantage of the strengths of each of these two types of approach.
Fundação para a Ciência e a Tecnologia (FCT, SFRH/BD/40140/2007)
Dublin Institute of Technology, Kevin Street : Calendar 1991/92
Calendar of academic year 1991/92.
Contents include: DIT courses, fee structures, undergraduate programmes, short courses, fees, research & development, campus companies, student services, college regulations, graduates and prizewinners, awards and external examiners, advisory services for prospective students, college structures, college staff and college library.
Foreword by F.M. Brennan, President
B!SON: A Tool for Open Access Journal Recommendation
Finding a suitable open access journal to publish scientific work is a complex task: researchers have to navigate a constantly growing number of journals, institutional agreements with publishers, funders' conditions and the risk of predatory publishers. To help with these challenges, we introduce a web-based journal recommendation system called B!SON. It is developed based on a systematic requirements analysis, built on open data, gives publisher-independent recommendations and works across domains. It suggests open access journals based on title, abstract and references provided by the user. The recommendation quality has been evaluated using a large test set of 10,000 articles. Development by two German scientific libraries ensures the longevity of the project.
Teaching Classics in the Digital Age
The papers and videos presented here are the result of the international conference 'Teaching Classics in the Digital Age', held online on 15 and 16 June 2020. As digital media provide new possibilities for teaching and outreach in Classics, the conference aimed at presenting current approaches to digital teaching and sharing best practices by bringing together different projects and practitioners from all fields of Classics (including Classical Archaeology, Greek and Latin Studies and Ancient History). Furthermore, it aimed at starting a discussion about principles, problems and the future of teaching Classics in the 21st century within and beyond its single fields.
Creating a frequency-based Turkish-English Loanword Cognates Word List (TELCWL)
This lexical study aims to establish a frequency-based Turkish-English Loanword Cognates Word List (TELCWL) to assist Turkish learners of English in their language learning and to support the corresponding pedagogical practice. A final list of 582 Turkish-English loan-based cognate word pairs was derived from the New General Service List (NGSL) and the Frequency Dictionary of Turkish (FDT). For pedagogical purposes, the TELCWL was divided into five sublists according to the spelling and pronunciation features of the cognates. On average, the coverage of the TELCWL was particularly high in discipline- and field-specific corpora, accounting for more than 7%, compared with general service written (5%) and spoken (3.5%) corpora. This result suggests that the TELCWL may be more beneficial for enhancing learners' reading and writing ability; in addition, not only general Turkish learners of English but also learners who need to improve their English language proficiency in specific disciplines can benefit from the TELCWL. Further pedagogical implications are drawn for English instructors regarding the use of the TELCWL in English classrooms in Turkey.
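The coverage figures quoted above express what share of a corpus's running words belongs to the word list. As a rough sketch of how such a percentage can be computed, under the assumption that coverage is counted over raw tokens (the study may well count lemmas or word families instead), consider the following. The function name, the tokenizer and the two-word sample list are hypothetical and are not taken from the TELCWL.

```python
import re


def coverage(word_list: set[str], corpus_text: str) -> float:
    """Return the percentage of corpus tokens that appear in word_list."""
    # Assumption: tokens are runs of ASCII letters, compared in lowercase.
    tokens = re.findall(r"[a-z]+", corpus_text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in word_list)
    return 100.0 * hits / len(tokens)


# Hypothetical two-item list and toy corpus, for illustration only.
sample_list = {"doctor", "taxi"}
sample_corpus = "The doctor went to the hospital by taxi"

print(f"{coverage(sample_list, sample_corpus):.1f}%")
```

Here two of the eight tokens are on the sample list, so the sketch reports a coverage of 25.0%.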
CyberResearch on the Ancient Near East and Eastern Mediterranean
CyberResearch on the Ancient Near East and Neighboring Regions provides case studies on archaeology, objects, cuneiform texts, and online publishing, digital archiving, and preservation.
Eleven chapters present a rich array of material, spanning the fifth through the first millennium BCE, from Anatolia, the Levant, Mesopotamia, and Iran. Customized cyber- and general glossaries support readers who lack either a technical background or familiarity with the ancient cultures. Edited by Vanessa Bigot Juloux, Amy Rebecca Gansell, and Alessandro Di Ludovico, this volume is dedicated to broadening the understanding and accessibility of digital humanities tools, methodologies, and results to Ancient Near Eastern Studies. Ultimately, this book provides a model for introducing cyber-studies to the mainstream of humanities research