7 research outputs found

    Data Science, Machine learning and big data in Digital Journalism: A survey of state-of-the-art, challenges and opportunities

    Digital journalism has faced a dramatic change and media companies are challenged to use data science algorithms to be more competitive in a Big Data era. While this is a relatively new area of study in the media landscape, the use of machine learning and artificial intelligence has increased substantially over the last few years. In particular, the adoption of data science models for personalization and recommendation has attracted the attention of several media publishers. Following this trend, this paper presents a research literature analysis on the role of Data Science (DS) in Digital Journalism (DJ). Specifically, the aim is to present a critical literature review, synthesizing the main application areas of DS in DJ, highlighting research gaps, challenges, and opportunities for future studies. Through a systematic literature review integrating bibliometric search, text mining, and qualitative discussion, the relevant literature was identified and extensively analyzed. The review reveals an increasing use of DS methods in DJ, with almost 47% of the research being published in the last three years. A hierarchical clustering highlighted six main research domains focused on text mining, event extraction, online comment analysis, recommendation systems, automated journalism, and exploratory data analysis along with some machine learning approaches. Future research directions comprise developing models to improve personalization and engagement features, exploring recommendation algorithms, testing new automated journalism solutions, and improving paywall mechanisms. Acknowledgements: This work was supported by the FCT - Fundação para a Ciência e Tecnologia, under the Projects: UIDB/04466/2020, UIDP/04466/2020, and UIDB/00319/2020.

    Combining privileged information to improve context-aware recommender systems

    A recommender system is an information filtering technology which can be used to predict preference ratings of items (products, services, movies, etc.) and/or to output a ranking of items that are likely to be of interest to the user. Context-aware recommender systems (CARS) learn and predict the tastes and preferences of users by incorporating available contextual information in the recommendation process. One of the major challenges in context-aware recommender systems research is the lack of automatic methods to obtain contextual information for these systems. Considering this scenario, in this paper, we propose to use contextual information from topic hierarchies of the items (web pages) to improve the performance of context-aware recommender systems. The topic hierarchies are constructed by an extension of the LUPI-based Incremental Hierarchical Clustering method that considers three types of information: traditional bag-of-words (technical information), and the combination of named entities (privileged information I) with domain terms (privileged information II). We evaluated the contextual information in four context-aware recommender systems. Different weights were assigned to each type of information. The empirical results demonstrated that topic hierarchies with the combination of the two kinds of privileged information can provide better recommendations. FAPESP (grant #2010/20564-8, #2012/13830-9, and #2013/16039-3, São Paulo Research Foundation (FAPESP)); CAPE

    Mining future spatiotemporal events and their sentiment from online news articles for location-aware recommendation system

    The task of mining future-related information from online web resources such as news articles and blogs has been getting more attention due to its potential usefulness in supporting individuals' decision making in a world where massive new data are generated daily. Instead of building a data-driven model to predict the future, one can extract from these massive data future events that have a high probability of occurring at a future time and a specific geographic location. Such spatiotemporal future events can be utilized by a recommender system on a location-aware device to provide localized future event suggestions. In this paper, we describe a systematic approach for mining future spatiotemporal events from the web, in particular from news articles. In our application context, a valid event is defined both spatially and temporally. The mining procedure consists of two main steps

    Automated Detection of Financial Events in News Text

    Today’s financial markets are inextricably linked with financial events like acquisitions, profit announcements, or product launches. Information extracted from news messages that report on such events could hence be beneficial for financial decision making. The ubiquity of news, however, makes manual analysis impossible, and due to the unstructured nature of text, the (semi-)automatic extraction and application of financial events remains a non-trivial task. Therefore, the studies composing this dissertation investigate 1) how to accurately identify financial events in news text, and 2) how to effectively use such extracted events in financial applications. Based on a detailed evaluation of current event extraction systems, this thesis presents a competitive, knowledge-driven, semi-automatic system for financial event extraction from text. A novel pattern language, which makes clever use of the system’s underlying knowledge base, allows for the definition of simple, yet expressive event extraction rules that can be applied to natural language texts. The system’s knowledge-driven internals remain synchronized with the latest market developments through the accompanying event-triggered update language for knowledge bases, enabling the definition of update rules. Additional research covered by this dissertation investigates the practical applicability of extracted events. In automated stock trading experiments, the best performing trading rules not only make use of traditional numerical signals, but also employ news-based event signals. Moreover, when stock data are cleaned of disruptions caused by financial events, financial risk analyses yield more accurate results. These results suggest that events detected in news can be used advantageously as supplementary parameters in financial applications.