    Building the Great Recession News Corpus (GRNC): a contemporary diachronic corpus of economy news in English

    The paper describes the process involved in developing the Great Recession News Corpus (GRNC): a specialized web corpus containing a wide range of written texts obtained from the Business sections of The Guardian and The New York Times between 2007 and 2015. The corpus was compiled as the main resource in a sentiment analysis project on the economic/financial domain. In this paper we describe its design, compilation criteria, and methodological approach, as well as the overall creation process. Although the corpus can be used for a variety of purposes, we include a sentiment analysis study on the evolution of the sentiment conveyed by the word credit during the years of the Great Recession, which we think provides validation of the corpus. Funding: Ministerio de Economía, Industria y Competitividad, Proyecto "Lingmotif2: Plataforma Universal de Análisis de Sentimiento" (FFI2016-78141-P).
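
    The kind of diachronic, word-level sentiment tracking described above can be approximated with off-the-shelf tools. The sketch below is illustrative only (it is not the Lingmotif pipeline used in the project): it assumes a plain-text corpus stored as one file per article named YYYY-*.txt and uses NLTK's VADER scorer to average the sentiment of sentences mentioning a target word, year by year.

        # Minimal sketch, not the authors' method: average sentiment of sentences
        # mentioning a target word, per year. The file naming (YYYY-*.txt), the
        # target word, and the VADER scorer are all illustrative assumptions.
        # Requires: nltk.download("punkt"); nltk.download("vader_lexicon")
        from collections import defaultdict
        from pathlib import Path

        from nltk import sent_tokenize
        from nltk.sentiment import SentimentIntensityAnalyzer

        def sentiment_by_year(corpus_dir, target="credit"):
            sia = SentimentIntensityAnalyzer()
            scores = defaultdict(list)
            for path in Path(corpus_dir).glob("*.txt"):
                year = int(path.name[:4])  # assumed YYYY-*.txt naming
                for sentence in sent_tokenize(path.read_text(encoding="utf-8")):
                    if target in sentence.lower():
                        scores[year].append(sia.polarity_scores(sentence)["compound"])
            return {y: sum(v) / len(v) for y, v in sorted(scores.items()) if v}

        # e.g. sentiment_by_year("grnc/", "credit") -> {year: mean compound score, ...}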

    Artificial Intelligence, Social Media and Supply Chain Management: The Way Forward

    Supply chain management (SCM) involves a complex network of entities ranging from business partners to end consumers. These stakeholders frequently use social media platforms, such as Twitter and Facebook, to voice their opinions and concerns. AI-based applications, such as sentiment analysis, allow us to extract relevant information from these deliberations. We argue that the context-specific application of AI, compared to generic approaches, is more efficient at retrieving meaningful insights from social media data for SCM. We present a conceptual overview of prevalent techniques and available resources for information extraction. Subsequently, we identify specific areas of SCM where context-aware sentiment analysis can enhance overall efficiency.
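
    As a rough illustration of context-aware filtering before sentiment scoring, the sketch below tags posts with assumed supply-chain topics and scores only the relevant ones. The topic keywords, data shape, and use of NLTK's VADER are assumptions for illustration, not the context-specific models discussed in the paper.

        # Illustrative sketch: keep only supply-chain-relevant posts, tag them with
        # a topic, and attach a generic sentiment score. The keywords and the VADER
        # scorer are assumptions, not the paper's context-specific approach.
        from nltk.sentiment import SentimentIntensityAnalyzer

        SCM_TOPICS = {
            "delivery": ["delivery", "shipping", "delayed"],
            "inventory": ["out of stock", "backorder", "inventory"],
            "supplier": ["supplier", "recall", "shortage"],
        }

        def tag_and_score(posts):
            sia = SentimentIntensityAnalyzer()
            results = []
            for text in posts:
                lowered = text.lower()
                topics = [t for t, kws in SCM_TOPICS.items()
                          if any(k in lowered for k in kws)]
                if topics:  # discard posts with no supply-chain context
                    results.append({"text": text, "topics": topics,
                                    "sentiment": sia.polarity_scores(text)["compound"]})
            return results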

    Employability skills: Profiling data scientists in the digital labour market

    Data scientists are expected to make sense of vast stores of big data, which are becoming increasingly complex and heterogeneous, and the role is evolving in step with today's rapid technological development and its application in a growing array of fields. The present study provides insight into the current expectations of employers seeking to hire individuals with this job title. It is argued that gaining a better understanding of data scientists' employability criteria and the evolution of this professional role is crucial. The focus is placed on the desired prerequisites articulated through job advertisements, thus deriving relevant means for furthering theory and practice. This was achieved by harvesting relevant data from job advertisements published on US employment websites, which currently attract the US market's highest recruitment traffic. The key contribution of this study is to have identified means of systematically mapping the skills, experience, and qualifications sought by employers for their data scientists, thus providing a data-driven pathway for employability and avoiding skills gaps and mismatches in a profession that is pivotal to Industry 4.0.
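
    A minimal sketch of this kind of skill profiling follows: it counts how often terms from a small, assumed skill vocabulary appear in harvested job-advertisement texts. The vocabulary and input format are illustrative assumptions, not the study's taxonomy or harvesting pipeline.

        # Count skill mentions across job advertisements. SKILL_VOCAB and the
        # input (a list of ad texts) are illustrative assumptions.
        import re
        from collections import Counter

        SKILL_VOCAB = ["python", "r", "sql", "machine learning", "spark",
                       "statistics", "communication", "phd"]

        def profile_skills(ads):
            counts = Counter()
            for ad in ads:
                text = ad.lower()
                for skill in SKILL_VOCAB:
                    # word-boundary match so "r" does not fire inside other words
                    if re.search(rf"\b{re.escape(skill)}\b", text):
                        counts[skill] += 1
            return counts

        # profile_skills(ads).most_common(10) yields a ranked skill profile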

    Damage Detection and Mitigation in Open Collaboration Applications

    Collaborative functionality is changing the way information is amassed, refined, and disseminated in online environments. A subclass of these systems, characterized by open collaboration, uniquely allows participants to *modify* content with low barriers to entry. A prominent example and our case study, English Wikipedia, exemplifies the vulnerabilities: 7%+ of its edits are blatantly unconstructive. Our measurement studies show this damage manifests in novel socio-technical forms, limiting the effectiveness of computational detection strategies from related domains. In turn, this has made much of the mitigation the responsibility of a poorly organized and ill-routed human workforce. We aim to improve all facets of this incident response workflow. Complementing language-based solutions, we first develop content-agnostic predictors of damage. We implicitly glean reputations for system entities and overcome sparse behavioral histories with a spatial reputation model combining evidence from multiple granularities. We also identify simple yet indicative metadata features that capture participatory dynamics and content maturation. When brought to bear over damage corpora, our contributions: (1) advance benchmarks over a broad set of security issues (vandalism), (2) perform well in the first anti-spam-specific approach, and (3) demonstrate their portability over diverse open collaboration use cases. Probabilities generated by our classifiers can also intelligently route human assets using prioritization schemes optimized for capture rate or impact minimization. Organizational primitives are introduced that improve workforce efficiency. These strategies are then implemented in a tool (STiki) that has been used to revert 350,000+ damaging instances from Wikipedia. These uses are analyzed to learn about human aspects of the edit review process, including properties such as scalability, motivation, and latency. Finally, we conclude by measuring practical impacts of this work, discussing how to better integrate our solutions, and revealing outstanding vulnerabilities that speak to research challenges for open collaboration security.
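
    To make the content-agnostic idea concrete, the sketch below builds a few metadata features per edit and trains a probabilistic classifier whose scores could feed a review-priority queue. The Edit record, the feature choices, and the use of scikit-learn's LogisticRegression are assumptions for illustration; this is not the STiki feature set or model.

        # Hedged sketch of content-agnostic damage scoring: metadata features
        # plus a probabilistic classifier. Feature names and the labeled corpus
        # are illustrative assumptions.
        from dataclasses import dataclass

        from sklearn.linear_model import LogisticRegression

        @dataclass
        class Edit:
            is_anonymous: bool       # registered vs. IP editor
            editor_age_days: float   # account age at edit time
            local_hour: int          # time-of-day participatory signal
            article_age_days: float  # content-maturation signal
            comment_length: int      # edit-summary length

        def features(e):
            return [float(e.is_anonymous), e.editor_age_days, float(e.local_hour),
                    e.article_age_days, float(e.comment_length)]

        def train(edits, labels):  # labels: 1 = damaging edit, 0 = constructive
            clf = LogisticRegression(max_iter=1000)
            clf.fit([features(e) for e in edits], labels)
            return clf

        # clf.predict_proba(...) can then rank pending edits so reviewers see
        # the most probable damage first (capture-rate-oriented routing).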

    A Data Quality Multidimensional Model for Social Media Analysis

    Social media platforms have become a new source of useful information for companies. Ensuring the business value of social media first requires an analysis of the quality of the relevant data and then the development of practical business intelligence solutions. This paper aims at building high-quality datasets for social business intelligence (SoBI). The proposed method offers an integrated and dynamic approach to identify the relevant quality metrics for each analysis domain. It employs a novel multidimensional data model for the construction of cubes with impact measures for various quality metrics. In this model, quality metrics and indicators are organized along two main axes. The first concerns the kind of facts to be extracted, namely posts, users, and topics. The second refers to the quality perspectives to be assessed, namely credibility, reputation, usefulness, and completeness. Additionally, quality cubes include a user-role dimension so that quality metrics can be evaluated in terms of the users' business roles. To demonstrate the usefulness of this approach, the authors have applied their method to two separate domains: automotive business and natural disaster management. Results show that the trade-off between quantity and quality for social media data is concentrated in a small percentage of relevant users. Thus, data filtering can be easily performed by simply ranking the posts according to the quality metrics identified with the proposed method. As far as the authors know, this is the first approach that integrates both the extraction of analytical facts and the assessment of social media data quality in the same framework. Funding for open access charge: CRUE-Universitat Jaume
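
    As a rough illustration of the final filtering step (ranking posts by quality before analysis), the sketch below combines one assumed proxy per quality perspective into a single score. The Post fields, proxies, and weights are illustrative assumptions, not the paper's multidimensional model or quality cubes.

        # Illustrative quality ranking: one proxy per quality perspective
        # (credibility, reputation, usefulness, completeness), combined with
        # assumed weights. Fields and thresholds are illustrative only.
        from dataclasses import dataclass

        @dataclass
        class Post:
            text: str
            author_followers: int   # proxy for reputation
            retweets: int           # proxy for usefulness
            is_verified: bool       # proxy for credibility
            has_location: bool      # proxy for completeness

        def quality_score(p, weights=(0.25, 0.3, 0.3, 0.15)):
            w_cred, w_rep, w_use, w_comp = weights
            credibility = 1.0 if p.is_verified else 0.0
            reputation = min(p.author_followers / 10_000, 1.0)
            usefulness = min(p.retweets / 100, 1.0)
            completeness = 1.0 if p.has_location else 0.0
            return (w_cred * credibility + w_rep * reputation
                    + w_use * usefulness + w_comp * completeness)

        def rank_posts(posts, top_n=100):
            return sorted(posts, key=quality_score, reverse=True)[:top_n]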