214 research outputs found

    Language technologies for an eLearning scenario

    One of the problems eLearning platforms face when collating documents from different sources is the retrieval and accessibility of those documents. Enriching documents with additional metadata by means of Language Technologies enables users to access information more effectively. In this paper we present an overview of the objectives and results of the LT4eL Project, which aims to provide Language Technologies to eLearning platforms and to integrate semantic knowledge in order to facilitate the management, distribution and retrieval of learning material. Peer-reviewed.

    Approaches towards a Lexical Web: the role of Interoperability

    After highlighting some of the major dimensions that are relevant for Language Resources (LR) and contribute to their infrastructural role, I underline some priority areas of concern today with respect to implementing an open Language Infrastructure, and specifically what we could call a "Lexical Web". My objective is to show that it is imperative to define an underlying global strategy behind the set of initiatives which are or can be launched in Europe and world-wide, and that an all-embracing vision and cooperation among different communities are necessary to achieve more coherent and useful results. I end by mentioning two new European initiatives that go in this direction and promise to be influential in shaping the future of the LR area.

    Romanian Language Technology — a view from an academic perspective

    The article reports on research and development pursued by the Research Institute for Artificial Intelligence "Mihai Draganescu" of the Romanian Academy in order to narrow the gaps identified by the in-depth analysis of the European languages presented in the Meta-Net white papers, published by Springer in 2012. With the exception of English, all the European languages needed significant research and development in order to reach an adequate technological level, in line with the expectations and requirements of the knowledge society.

    YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia

    We present YAGO2, an extension of the YAGO knowledge base, in which entities, facts, and events are anchored in both time and space. YAGO2 is built automatically from Wikipedia, GeoNames, and WordNet. It contains 80 million facts about 9.8 million entities. Human evaluation confirmed an accuracy of 95% of the facts in YAGO2. In this paper, we present the extraction methodology, the integration of the spatio-temporal dimension, and our knowledge representation SPOTL, an extension of the original SPO-triple model to time and space.

    VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

    We introduce VoxPopuli, a large-scale multilingual corpus providing 100K hours of unlabelled speech data in 23 languages. It is the largest open data to date for unsupervised representation learning as well as semi-supervised learning. VoxPopuli also contains 1.8K hours of transcribed speeches in 16 languages and their aligned oral interpretations into 5 other languages totaling 5.1K hours. We provide speech recognition baselines and validate the versatility of VoxPopuli unlabelled data in semi-supervised learning under challenging out-of-domain settings. We will release the corpus at https://github.com/facebookresearch/voxpopuli under an open license. Comment: Accepted to ACL 2021 (long paper).

    Computational Etymology: Word Formation and Origins

    While there are over seven thousand languages in the world, substantial language technologies exist only for a small percentage of these. The large majority of world languages do not have enough bilingual or even monolingual data for developing technologies like machine translation using current approaches. The computational study and modeling of word origins and word formation is a key step in developing comprehensive translation dictionaries for low-resource languages. This dissertation presents novel foundational work in computational etymology, a promising field which this work is pioneering. The dissertation also includes novel models of core vocabulary, of dictionary information distillation, and of the diverse linguistic processes of word formation and concept realization between languages, including compounding, derivation, sense-extension, borrowing, and historical cognate relationships, utilizing statistical and neural models trained on the unprecedented scale of thousands of languages. Collectively these are important components in tackling the grand challenges of universal translation, endangered language documentation and revitalization, and supporting technologies for speakers of thousands of underserved languages.

    MSPGI : a geoportal feasibility study - Planning Authority MSP geoportal MSP Implementation Initiative

    Directive 2014/89/EU calls for Member States to apply Maritime Spatial Planning (MSP) in their marine waters. In applying this framework, Member States are required to adopt a process to analyse and organise human activities to achieve ecological, economic and social objectives. The preparation of an MSP plan is the key deliverable expected from Member States, and in doing so they are expected to organise the use of the best available data and to decide how to organise the sharing of information necessary for MSP plans. The availability of information for stakeholders can also contribute towards effective co-ordination at a national level, particularly in regulating different maritime sectors. EASME/EMFF/2015/

    Multidimensional opinion mining from social data

    The popularity and importance of social media are on the increase due to people using it for various types of social interaction across multiple channels. This thesis focuses on the evolving research area of Social Opinion Mining, tasked with the identification of multiple opinion dimensions, such as subjectivity, sentiment polarity, emotion, affect, sarcasm, and irony, from user-generated content represented across multiple social media platforms and in various media formats, like textual, visual, and audio. Mining people’s social opinions from social sources, such as social media platforms and newswire comment sections, is a valuable business asset that can be utilised in many ways and in multiple domains, such as Politics, Finance, and Government. The main objective of this research is to investigate how a multidimensional approach to Social Opinion Mining affects fine-grained opinion search and summarisation at an aspect-based level, and whether such a multidimensional approach outperforms single-dimension approaches in an extrinsic human evaluation conducted in a real-world context: the Malta Government Budget, where five social opinion dimensions are taken into consideration, namely subjectivity, sentiment polarity, emotion, irony, and sarcasm. This human evaluation determines whether the multidimensional opinion summarisation results provide added value to potential end-users, such as policy-makers and decision-takers, thereby providing a nuanced voice to the general public on their social opinions on topics of national importance. Results obtained indicate that a more fine-grained aspect-based opinion summary based on the combined dimensions of subjectivity, sentiment polarity, emotion, and sarcasm or irony is more informative and more useful than one based on sentiment polarity only.
    This research contributes towards the advancement of intelligent search and information retrieval from social data and impacts entities utilising Social Opinion Mining results towards effective policy formulation, policy-making, decision-making, and decision-taking at a strategic level.

    Blogging the hyperlocal : the disruption and renegotiation of hegemony in Malta

    This thesis examines how blogging is being deployed to disrupt institutional hegemony in Malta. The island state is an example of a hyperlocal context that includes strong political, ecclesiastical and media institutions, advanced take-up of social technologies and a popular culture adjusting to the promise of modernity represented by EU membership. Popular discourse is dominated by political partisanship and advocacy journalism, with Malta being the only European country that permits political parties to directly own broadcasting stations. The primary evidence in this study is derived from an analysis of online texts during an organic crisis that eventually led to a national referendum to consider the introduction of divorce legislation in Malta. Using netnography supplemented by critical discourse analysis, the research identifies a set of strategies bloggers used to resist, challenge and disrupt the discourse of a hegemonic alliance that included the ruling political party, the Roman Catholic Church and their media. The empirical results indicate that blogging in Malta is contributing to the erosion of the Church’s hegemony. Subjects that were previously marginalised as alternative are increasingly finding an online outlet in blog posts, social media networks and commentary on newspaper portals. Nevertheless, a culture of social surveillance, together with the natural barriers of size and the permeability of the social web, facilitates the appropriation of blogging by political blocs, who remain vigilant to the opportunity of extending their influence in new media to disrupt horizontal networks of information exchange. Blogging is increasingly operating as a component of a hybrid media ecosystem that thrives on reflexive cycles of entertainment: the independent newspaper media, long an active partner in the hegemonic set-up in Malta, are being transformed and rendered more permeable at the same time as their power and influence are being eroded.
    The study concludes that a new episteme is more likely to emerge through the symbiosis of hybrid media and reflexive waves of networked individualism than through systemic, organised attempts at online political disruption.