2,121 research outputs found

    Content-based Filtering Recommendation Approach to Label Irish Legal Judgements

    Machine learning approaches are applied across several domains to simplify or automate tasks, directly saving time or cost. Text document labelling is one such task: it requires immense human knowledge of the domain and effort to review, understand and label the documents. The company Stare Decisis summarises legal judgements and labels them as they are made available on the Irish public legal source www.courts.ie. This research presents a recommendation-based approach to reduce the time spent by solicitors at Stare Decisis by narrowing the large number of available labels down to a concentrated few that potentially contain the relevant label for a given judgement. To solve this problem, traditional and state-of-the-art text feature representations, along with a K-Nearest Neighbour (KNN) recommender using both cosine similarity and word mover's distance, are developed and compared. A series of experiments is designed, starting from TF vectors and a KNN recommender, which is set as the baseline. Each further experiment was designed after observing the results of the preceding one. Pre-trained word2vec was used as a baseline for state-of-the-art approaches, and domain-specific embeddings were developed using data scraped from legal text sources
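
The baseline named in the abstract above (TF vectors with a KNN recommender scored by cosine similarity) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the whitespace tokenizer, the toy corpus, the label names, and the helpers `tf_vector`, `cosine_similarity`, and `recommend_labels` are all hypothetical, and the word mover's distance and word2vec variants are not covered.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector over lowercase whitespace tokens (assumed tokenizer)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_labels(query, labelled_docs, k=3):
    """Return the labels of the k most similar documents, best first, without repeats."""
    q = tf_vector(query)
    ranked = sorted(
        labelled_docs,
        key=lambda doc: cosine_similarity(q, tf_vector(doc[0])),
        reverse=True,
    )
    shortlist = []
    for _, label in ranked[:k]:
        if label not in shortlist:
            shortlist.append(label)
    return shortlist


# Hypothetical toy corpus of (judgement text, label) pairs, for illustration only.
corpus = [
    ("appeal against planning permission refusal", "planning"),
    ("judicial review of planning authority decision", "planning"),
    ("dispute over breach of employment contract", "employment"),
    ("unfair dismissal claim by employee", "employment"),
    ("sentence appeal in criminal assault case", "criminal"),
]

shortlist = recommend_labels("appeal of planning permission decision", corpus, k=2)
print(shortlist)
```

The recommender returns a short list of candidate labels rather than a single prediction, which matches the stated goal of narrowing the label set for a human reviewer instead of replacing their decision.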

    Data Science, Machine learning and big data in Digital Journalism: A survey of state-of-the-art, challenges and opportunities

    Digital journalism has faced a dramatic change and media companies are challenged to use data science algorithms to be more competitive in a Big Data era. While this is a relatively new area of study in the media landscape, the use of machine learning and artificial intelligence has increased substantially over the last few years. In particular, the adoption of data science models for personalization and recommendation has attracted the attention of several media publishers. Following this trend, this paper presents a research literature analysis on the role of Data Science (DS) in Digital Journalism (DJ). Specifically, the aim is to present a critical literature review, synthesizing the main application areas of DS in DJ, highlighting research gaps, challenges, and opportunities for future studies. Through a systematic literature review integrating bibliometric search, text mining, and qualitative discussion, the relevant literature was identified and extensively analyzed. The review reveals an increasing use of DS methods in DJ, with almost 47% of the research being published in the last three years. A hierarchical clustering highlighted six main research domains focused on text mining, event extraction, online comment analysis, recommendation systems, automated journalism, and exploratory data analysis along with some machine learning approaches. Future research directions comprise developing models to improve personalization and engagement features, exploring recommendation algorithms, testing new automated journalism solutions, and improving paywall mechanisms. Acknowledgements: This work was supported by the FCT-Fundação para a Ciência e Tecnologia, under the Projects: UIDB/04466/2020, UIDP/04466/2020, and UIDB/00319/2020

    Human Resources Recommender system based on discrete variables

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. Natural Language Processing and Understanding has become one of the most exciting and challenging fields in the area of Artificial Intelligence and Machine Learning. With the rapidly changing business environment, having data transformed in a way that makes it easy to interpret is the greatest competitive advantage a company can have. The purpose of this dissertation is therefore to implement a recommender system for the Human Resources department of a company that will aid the decision-making process of filling a specific job position with the right candidate. The recommender system will be fed with applicants, each represented by their skills, and will produce a subset of the most adequate candidates for a given job position. This work uses StarSpace, a novel neural embedding model, whose aim is to represent entities in a common vector space and then compute similarity measures among them

    A Survey on Linked Data and the Social Web as facilitators for TEL recommender systems

    Personalisation, adaptation and recommendation are central features of TEL environments. In this context, information retrieval techniques are applied as part of TEL recommender systems to filter and recommend learning resources or peer learners according to user preferences and requirements. However, the suitability and scope of possible recommendations are fundamentally dependent on the quality and quantity of available data, for instance, metadata about TEL resources as well as users. On the other hand, over recent years, the Linked Data (LD) movement has succeeded in providing a vast body of well-interlinked and publicly accessible Web data. This in particular includes Linked Data of explicit or implicit educational nature. The potential of LD to facilitate TEL recommender systems research and practice is discussed in this paper. In particular, an overview of the most relevant LD sources and techniques is provided, together with a discussion of their potential for the TEL domain in general and TEL recommender systems in particular. Results from highly related European projects are presented and discussed together with an analysis of prevailing challenges and preliminary solutions. LinkedU

    User-Generated Data Network Effects and Market Competition Dynamics

    This Article defines User-Generated Data (“UGD”) network effects, distinguishes them from the more familiar concept of traditional network effects, and explores their implications for market competition dynamics. It explains that UGD network effects produce various efficiencies for digital service providers (“data platforms”) by empowering their services’ optimization, personalization, and continuous diversification. In light of these efficiencies, competition dynamics in UGD-driven markets tend to be unstable and lead to the formation of dominant multi-industry conglomerates. These processes will enhance social welfare because they are natural and efficient. Conversely, countervailing UGD network effects also empower data platforms to detect and neutralize competitive threats, price discriminate among users, and manipulate users’ behaviors. The realization of these effects will result in inefficiencies, which will undermine social welfare. After a comprehensive analysis of conflicting economic forces, this Article sets the ground for informed policymaking. It suggests that emerging calls to aggravate antitrust enforcement and to “break up” Big Tech are ill-advised. Instead, this Article calls for policymakers to draw inspiration from traditional network industries’ public utility and open-access regulations

    An approach to human-machine teaming in legal investigations using anchored narrative visualisation and machine learning

    During legal investigations, analysts typically create external representations of an investigated domain as a resource for cognitive offloading, reflection and collaboration. For investigations involving very large numbers of documents as evidence, creating such representations can be slow and costly, but essential. We believe that software tools, including interactive visualisation and machine learning, can be transformative in this arena, but that design must be predicated on an understanding of how such tools might support and enhance investigator cognition and team-based collaboration. In this paper, we propose an approach to this problem by: (a) allowing users to visually externalise their evolving mental models of an investigation domain in the form of thematically organized Anchored Narratives; and (b) using such narratives as a (more or less) tacit interface to cooperative, mixed-initiative machine learning. We elaborate our approach through a discussion of representational forms significant to legal investigations and discuss the idea of linking such representations to machine learning