3,197 research outputs found

    Discourse structure and language technology

    This publication is, with permission of the rights owner, freely accessible due to an Alliance licence and a national licence (funded by the DFG, German Research Foundation) respectively. An increasing number of researchers and practitioners in Natural Language Engineering face the prospect of having to work with entire texts, rather than individual sentences. While it is clear that text must have useful structure, its nature may be less clear, making it more difficult to exploit in applications. This survey of work on discourse structure thus provides a primer on the bases on which discourse is structured, along with some of their formal properties. It then lays out the current state of the art with respect to algorithms for recognizing these different structures, and how these algorithms are currently being used in Language Technology applications. After identifying resources that should prove useful in improving algorithm performance across a range of languages, we conclude by speculating on future discourse structure-enabled technology. Peer reviewed.

    Econometrics meets sentiment : an overview of methodology and applications

    The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, which is a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software
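
As a rough illustration of turning qualitative text into a quantitative sentiment variable, the sketch below scores each document with a word list and averages the scores per period. The lexicon, the `document_sentiment` and `sentiment_index` names, and the per-period mean are all assumptions made for this example; the abstract does not specify any particular scoring or aggregation scheme.

```python
from collections import defaultdict

# Hypothetical polarity lexicon; a real study would use a curated resource.
LEXICON = {"growth": 1, "strong": 1, "loss": -1, "weak": -1}


def document_sentiment(text):
    """Net lexicon polarity of a document, normalised by its token count."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(LEXICON.get(token, 0) for token in tokens) / len(tokens)


def sentiment_index(dated_docs):
    """Average document sentiment per period, returned in period order."""
    by_period = defaultdict(list)
    for period, text in dated_docs:
        by_period[period].append(document_sentiment(text))
    return {p: sum(s) / len(s) for p, s in sorted(by_period.items())}


docs = [
    ("2020-01", "strong growth reported today"),
    ("2020-01", "weak demand and loss"),
    ("2020-02", "loss widens"),
]
index = sentiment_index(docs)
```

The resulting per-period series could then enter a regression alongside other variables, which is the kind of use the abstract describes.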

    COMPENDIUM: a text summarisation tool for generating summaries of multiple purposes, domains, and genres

    In this paper, we present a Text Summarisation tool, COMPENDIUM, capable of generating the most common types of summaries. Regarding the input, single- and multi-document summaries can be produced; as for the output, the summaries can be extractive or abstractive-oriented; and finally, concerning their purpose, the summaries can be generic, query-focused, or sentiment-based. The proposed architecture for COMPENDIUM is divided into various stages, making a distinction between core and additional stages. The former constitute the backbone of the tool and are common to the generation of any type of summary, whereas the latter are used for enhancing the capabilities of the tool. The main contributions of COMPENDIUM with respect to state-of-the-art summarisation systems are that (i) it specifically deals with the problem of redundancy, by means of textual entailment; (ii) it combines statistical and cognitive-based techniques for determining relevant content; and (iii) it proposes an abstractive-oriented approach for facing the challenge of abstractive summarisation. The evaluation performed in different domains and textual genres, comprising traditional texts as well as texts extracted from the Web 2.0, shows that COMPENDIUM is very competitive and appropriate to be used as a tool for generating summaries. This research has been supported by the project “Desarrollo de Técnicas Inteligentes e Interactivas de Minería de Textos” (PROMETEO/2009/119) and the project reference ACOMP/2011/001 from the Valencian Government, as well as by the Spanish Government (grant no. TIN2009-13391-C04-01)

    Investigating and extending the methods in automated opinion analysis through improvements in phrase based analysis

    Opinion analysis is an area of research which deals with the computational treatment of opinion statements and subjectivity in textual data. Opinion analysis has emerged over the past couple of decades as an active area of research, as it provides solutions to the issues raised by information overload. The problem of information overload has emerged with the advancements in communication technologies, which gave rise to an exponential growth in user-generated subjective data available online. Opinion analysis has a rich set of applications which create opportunities for organisations, ranging from tracking user opinions about products and social issues in communities through to engagement in political participation. The area has been highly active in recent years, and research at different levels of granularity has been, and is being, undertaken. However, it is observed that there are limitations in the state of the art, especially as dealing with each level of granularity on its own does not solve current research issues. Therefore a novel sentence-level opinion analysis approach utilising clause- and phrase-level analysis is proposed. This approach uses linguistic and syntactic analysis of sentences to understand the interdependence of words within sentences, and further uses rule-based analysis at the phrase level to calculate the opinion at each hierarchical structure of a sentence. The proposed opinion analysis approach requires lexical and contextual resources for implementation. In the context of this thesis, the approach is further presented as part of an extended unifying framework for opinion analysis, resulting in the design and construction of a novel corpus. The above contributions to the field (approach, framework and corpus) are evaluated within the thesis and are found to improve on existing limitations in the field, particularly with regard to opinion analysis automation.
Further work is required in integrating a mechanism for greater word sense disambiguation and in lexical resource development

    Automatic Summarization of Video Game Reviews

    Video game reviews have constituted a unique means of interaction between players and companies for many years. The dynamics appearing through online publishing have significantly increased the number of comments per game, giving rise to very interesting communities. This growth has, in turn, made it difficult to deal with the volume and varying quality of the comments as a source of information. This work studies whether and how game reviews can be summarized, based on existing notions from aspect-based summarization and sentiment analysis. We initially provide a formal definition of the problem in order to set the basis for our suggested approach. We then devise a baseline implementation that attempts to tackle the individual subtasks that constitute the problem of video game review summarization. More precisely, given a set of reviews of a video game, we apply k-means clustering in order to identify groups of similar sentences. We then utilize word lists with the aim of mapping the produced clusters to predefined game aspects, like graphics and gameplay. Subsequently, we apply sentiment analysis using a rule-based method in order to extract the sentiments that pervade each cluster. Additionally, we offer preliminary findings on whether aspects detected in a set of comments can be consistently evaluated by human users. The evaluation ascertains that review summarization subtasks are achievable and sets out a method for evaluating the performance of future systems
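
The three steps named in this abstract (k-means over sentences, word lists for aspect mapping, rule-based sentiment) can be sketched roughly as follows. This is a toy illustration and not the authors' implementation: the bag-of-words features, the word lists, the single negation rule, the deterministic centroid initialisation, and every function name are assumptions made for the example.

```python
from collections import Counter

# Hypothetical word lists; the abstract does not give the actual lexicons.
ASPECT_WORDS = {
    "graphics": {"graphics", "visuals", "textures"},
    "gameplay": {"gameplay", "controls", "combat"},
}
POSITIVE = {"great", "stunning", "smooth", "fun"}
NEGATIVE = {"clunky", "boring", "ugly", "broken"}
NEGATORS = {"not", "never"}


def tokenize(sentence):
    words = (w.strip(".,!?;:").lower() for w in sentence.split())
    return [w for w in words if w]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=10):
    """Plain k-means; assumes len(vectors) >= k. The first k vectors seed
    the centroids, a toy choice that keeps the example deterministic."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_aspect(tokens):
    """Map a cluster to the aspect whose word list it matches most often."""
    counts = {a: sum(t in words for t in tokens)
              for a, words in ASPECT_WORDS.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "other"


def sentiment_score(tokens):
    """Rule-based polarity: +1/-1 per lexicon hit, flipped after a negator."""
    score = 0
    for i, t in enumerate(tokens):
        polarity = (t in POSITIVE) - (t in NEGATIVE)
        if i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score


def summarize_reviews(sentences, k):
    token_lists = [tokenize(s) for s in sentences]
    vocab = sorted({t for toks in token_lists for t in toks})
    vectors = []
    for toks in token_lists:
        counts = Counter(toks)
        vectors.append([float(counts[w]) for w in vocab])
    labels = kmeans(vectors, k)
    summary = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if not members:
            continue
        tokens = [t for i in members for t in token_lists[i]]
        score = sum(sentiment_score(token_lists[i]) for i in members)
        label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
        summary.append((cluster_aspect(tokens), label,
                        [sentences[i] for i in members]))
    return summary


reviews = [
    "The graphics are stunning.",
    "The gameplay is clunky.",
    "Stunning graphics and great visuals.",
    "Clunky gameplay and boring combat.",
]
summary = summarize_reviews(reviews, k=2)
```

Each entry of `summary` is an (aspect, sentiment, sentences) tuple for one cluster. A practical system would likely use a library k-means with better initialisation and richer sentence features than raw word counts.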

    Linguistic and Cultural Analysis of Empathy: Strategies for Japanese-English Translation

    Examining linguistic and pragmatic aspects of the translation of Japanese empathy and politeness in contemporary novels reveals that socio-cultural meaning is often neutralised. From an educational perspective, examples for intercultural language teaching and learning, universal and culturally specific values, and the attribution of meaning in collectivist and individualist societies can be examined. Implications for the viability of a universal approach to translation are discussed in relation to values that are specific to Japanese culture

    Essential Speech and Language Technology for Dutch: Results by the STEVIN-programme

    Computational Linguistics; Germanic Languages; Artificial Intelligence (incl. Robotics); Computing Methodologies