webLyzard technology gmbh
    102 research outputs found

    Topic Wizard - Interactive Visual Tool for Defining and Disambiguating Topics via Regular Expressions

    The Topic Wizard presented in this paper is a new interactive tool that supports the definition and revision of topics in the form of regular expressions. It combines a dialog for prefix and suffix extension with a word tree-based representation of phrases for restricting the query to specific expressions. The examples stem from the Media Watch on Climate Change, a public Web portal based on the webLyzard Web intelligence platform that aggregates environmental stakeholder communication from multiple online sources
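
As an illustration of what a topic defined as a regular expression with prefix and suffix extension might look like, here is a minimal Python sketch. The function name `build_topic_pattern`, its parameters, and the example terms are hypothetical and are not taken from the Topic Wizard itself.

```python
import re

def build_topic_pattern(stem, prefixes=(), suffixes=()):
    """Compile a case-insensitive regex for `stem` with optional prefix/suffix extensions.

    Hypothetical helper for illustration; not the Topic Wizard's actual API.
    """
    prefix_part = ""
    if prefixes:
        # Optional non-capturing group listing every allowed prefix
        prefix_part = "(?:" + "|".join(re.escape(p) for p in prefixes) + ")?"
    suffix_part = ""
    if suffixes:
        # Optional non-capturing group listing every allowed suffix
        suffix_part = "(?:" + "|".join(re.escape(s) for s in suffixes) + ")?"
    # Word boundaries keep the stem from matching inside unrelated words
    return re.compile(r"\b" + prefix_part + re.escape(stem) + suffix_part + r"\b",
                      re.IGNORECASE)

pattern = build_topic_pattern("warm", prefixes=["global "], suffixes=["ing", "er"])
text = "Global warming is accelerating; warmer oceans store heat. A swarm of bees is unrelated."
matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)  # ['Global warming', 'warmer']
```

The word boundaries are what disambiguate the topic here: "swarm" contains the stem but is not matched, because the stem does not start at a word boundary.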

    Optimizing Dependency Parsing Throughput

    Dependency parsing is considered a key technology for improving information extraction tasks. Research indicates that dependency parsers spend more than 95% of their total runtime on feature computations. Based on this insight, this paper investigates the potential of improving parsing throughput by designing feature representations that are optimized for combining single features into more complex feature templates, and by optimizing parser constraints. Applying these techniques to MDParser increased its throughput fourfold, yielding Syntactic Parser, a dependency parser that outperforms comparable approaches by a factor of 25 to 400

    User Profile Modelling in Online Communities

    With the rise of social networking sites, user information is becoming increasingly complex and sophisticated. The needs, behaviours and preferences of users change dynamically, depending on their background knowledge, their current task, and many other parameters. Existing ontology models capture demographic information as well as the users’ activities and interactions in online communities. These vocabularies represent the raw data, but actionable knowledge comes from filtering these data, selecting useful features, and mining the resulting information to uncover the most salient preferences, behaviours and needs of the users. In this paper we propose reusing and reengineering ontological resources to provide a broader representation of users and the dynamics that emerge from the virtual social environments in which they participate

    Visualizing Contextual Information in Aggregated Web Content Repositories

    Understanding stakeholder perceptions and the impact of campaigns are key insights for communication experts and policy makers. A structured analysis of Web content can help answer these questions, particularly if this analysis involves the ability to extract, disambiguate and visualize contextual information. After summarizing methods used for acquiring and annotating Web content repositories, we present visualization techniques to explore the lexical, geospatial and relational context of entities in these repositories. The examples stem from the Media Watch on Climate Change, a publicly available Web portal that aggregates environmental resources from various online sources

    Enriching Semantic Knowledge Bases for Opinion Mining in Big Data Applications

    This paper presents a novel method for contextualizing and enriching large semantic knowledge bases for opinion mining with a focus on Web intelligence platforms and other high-throughput big data applications. The method is not only applicable to traditional sentiment lexicons, but also to more comprehensive, multi-dimensional affective resources such as SenticNet. It comprises the following steps: (i) identify ambiguous sentiment terms, (ii) provide context information extracted from a domain-specific training corpus, and (iii) ground this contextual information to structured background knowledge sources such as ConceptNet and WordNet. A quantitative evaluation shows a significant improvement when using an enriched version of SenticNet for polarity classification. Crowdsourced gold standard data in conjunction with a qualitative evaluation sheds light on the strengths and weaknesses of the concept grounding, and on the quality of the enrichment process
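
Steps (i) and (ii) above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the method from the paper: the ambiguity criterion (a term appearing in both positive and negative documents with at least a `min_share` minority share), the function names, and the toy corpus are all hypothetical. Step (iii), grounding to ConceptNet and WordNet, is omitted because it depends on external resources.

```python
from collections import Counter, defaultdict

def find_ambiguous_terms(labeled_docs, lexicon, min_share=0.3):
    """Step (i), simplified: lexicon terms seen in both polarities often enough.

    `labeled_docs` is a list of (tokens, label) pairs with label "pos" or "neg".
    The `min_share` threshold is an assumption made for this sketch.
    """
    counts = defaultdict(Counter)  # term -> Counter({"pos": n, "neg": m})
    for tokens, label in labeled_docs:
        for term in set(tokens) & lexicon:
            counts[term][label] += 1
    ambiguous = set()
    for term, c in counts.items():
        total = c["pos"] + c["neg"]
        if total and min(c["pos"], c["neg"]) / total >= min_share:
            ambiguous.add(term)
    return ambiguous

def context_profiles(labeled_docs, ambiguous):
    """Step (ii), simplified: per-polarity counts of terms co-occurring with each ambiguous term."""
    profiles = {t: {"pos": Counter(), "neg": Counter()} for t in ambiguous}
    for tokens, label in labeled_docs:
        for term in set(tokens) & ambiguous:
            profiles[term][label].update(t for t in tokens if t != term)
    return profiles

# Toy training corpus (illustrative only)
docs = [
    (["cold", "beer", "refreshing"], "pos"),
    (["cold", "service", "rude"], "neg"),
    (["great", "beer"], "pos"),
    (["great", "view"], "pos"),
]
ambiguous = find_ambiguous_terms(docs, lexicon={"cold", "great"})
profiles = context_profiles(docs, ambiguous)
```

In this toy corpus "cold" is flagged as ambiguous, and its context profile separates "beer" and "refreshing" (positive) from "service" and "rude" (negative), while "great" stays unambiguous.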

    Metadata Enriched Visualization of Keywords in Context

    This paper presents an interactive, synchronized and metadata-enriched implementation of the Word Tree metaphor, an interactive visualization technique for showing Keywords-in-Context (KWIC). Embedded into a Web intelligence platform focusing on climate change coverage, it provides users with a tool to better understand the usage of terms in large document collections. One of the novelties is the implementation of filters for the Word Tree, which shift the focus of attention directly onto significant phrases instead of the punctuation or fill-words inherent to natural language usage
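
To make the idea of a filtered Word Tree concrete, the following minimal Python sketch collects the right-hand context of a keyword, drops punctuation and fill-words, and merges the contexts into a nested dictionary. The function names, the stopword list, and the sample text are assumptions for illustration and do not reflect the implementation described in the paper.

```python
import re

# Tiny illustrative stopword list; a real filter would be far larger
STOPWORDS = {"the", "a", "an", "of", "is", "to", "and", "in"}

def kwic_right_contexts(text, keyword, width=2, stopwords=STOPWORDS):
    """Return up to `width` significant tokens following each occurrence of `keyword`."""
    tokens = re.findall(r"\w+", text.lower())  # tokenizing on \w+ discards punctuation
    contexts = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            following = [t for t in tokens[i + 1:] if t not in stopwords]
            contexts.append(following[:width])
    return contexts

def build_word_tree(contexts):
    """Merge context sequences into a nested dict so shared prefixes share a branch."""
    tree = {}
    for ctx in contexts:
        node = tree
        for word in ctx:
            node = node.setdefault(word, {})
    return tree

text = ("Climate change is a threat. Climate change threatens the coast. "
        "Climate policy is changing.")
tree = build_word_tree(kwic_right_contexts(text, "climate"))
print(tree)
# {'change': {'threat': {}, 'threatens': {}}, 'policy': {'changing': {}}}
```

Because fill-words such as "is" and "a" are removed before the tree is built, the two "change" branches diverge directly at the significant words "threat" and "threatens". Note that this sketch ignores sentence boundaries, so a context may run into the next sentence.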

    Supporting the Collaborative Editing of Documents with Real-Time Content Recommendations

    A new culture of participation and social innovation is driven by advances in Web technology and the proliferation of social media platforms. The document editing environment introduced in this paper reflects this trend, enabling users to collaboratively create and edit documents. In order to support this process, the system provides tailored information services in the form of content recommendations based on the evolving text of the co-authored documents. The context-sensitive editor is part of the real-time synchronization framework of the Media Watch on Climate Change, a public Web intelligence portal that provides a rich repository of environmental knowledge and a portfolio of visual components including tag clouds, keyword graphs and geographic projections

    Linked Enterprise Data for Fine Grained Named Entity Linking and Web Intelligence

    To identify trends and assign metadata elements such as location and sentiment to the correct entities, Web intelligence applications require methods for linking named entities and revealing relations between organizations, persons and products. For this purpose we introduce Recognyze, a named entity linking component that uses background knowledge obtained from linked data repositories. This paper outlines the underlying methods, provides insights into the migration of proprietary knowledge sources to linked enterprise data, and discusses the lessons learned from adapting linked data for named entity linking. A large dataset obtained from Orell Füssli, the largest Swiss business information provider, serves as the main showcase. This dataset includes more than nine million triples on companies, their contact information, management, products and brands. We identify major challenges towards applying this data for named entity linking and conduct a comprehensive evaluation based on several news corpora to illustrate how Recognyze helps address them, and how it improves the performance of named entity linking components drawing upon linked data rather than machine learning techniques

    Enhancing Web Intelligence with the Content of Online Video Fragments

    This demo will show work to enhance a Web intelligence platform, which crawls and analyses online news and social media content about climate change topics to uncover sentiment and opinions around those topics over time, so that it also incorporates the content of non-textual media, in our case YouTube videos. YouTube contains a lot of organisational and individual opinion about climate change, which currently cannot be taken into account by the platform's sentiment and opinion mining technology. We describe the approach taken to extract and include the content of YouTube videos and explain why we believe this can lead to improved Web intelligence capabilities

    Games with a Purpose or Mechanised Labour? A Comparative Study

    Mechanised labour and games with a purpose are the two most popular human computation genres, frequently employed to support research activities in fields as diverse as natural language processing, semantic web or databases. Research projects typically rely on either one or the other of these genres, and therefore there is a general lack of understanding of how these two genres compare and whether and how they could be used together to offset their respective weaknesses. This paper addresses these open questions. It first identifies the differences between the two genres, primarily in terms of cost, speed and result quality, based on existing studies in the literature. Secondly, it reports on a comparative study which involves performing the same task through both genres and comparing the results. The study’s findings demonstrate that the two genres are highly complementary, which not only makes them suitable for different types of projects, but also opens new opportunities for building cross-genre human computation solutions that exploit the strengths of both genres simultaneously

    99 full texts, 102 metadata records. Updated in last 30 days.
    webLyzard technology gmbh is based in Austria