623 research outputs found

    An Ontological Approach to Misinformation: Quickly Finding Relevant Information

    Identifying misinformation (i.e., rumors) is a growing area of research in the information systems field. During recent tragedies (e.g., the Boston bombings and the Ebola outbreak), rumors spread rapidly on social media platforms and obscured the facts about the event. This, in turn, caused the rumors to spread even further and the facts to become harder to find. In this study, we draw on semantic web research to tackle this problem. We propose that ontologies and related concepts can help find accurate information about a case quickly. Combined with a weighting formula, they will allow us to display the most relevant results to an interested party. In this research in progress, we outline our plan for accomplishing this once an ontology and a dataset are found.

    A systematic survey of online data mining technology intended for law enforcement

    As an increasing amount of crime takes on a digital aspect, law enforcement bodies must tackle an online environment generating huge volumes of data. With manual inspections becoming increasingly infeasible, law enforcement bodies are optimising online investigations through data-mining technologies. Such technologies must be well designed and rigorously grounded, yet no survey of the online data-mining literature exists that examines their techniques, applications and rigour. This article fills that gap with a systematic mapping study of the online data-mining literature that visibly targets law enforcement applications, using evidence-based survey practices to produce a replicable analysis that can be methodologically examined for deficiencies.

    Mining Meaning from Wikipedia

    Wikipedia is a goldmine of information, not just for its many readers but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing, using it to facilitate information retrieval, using it for information extraction, and using it as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced. Comment: An extensive survey of re-using information in Wikipedia in natural language processing, information retrieval and extraction, and ontology building. Accepted for publication in International Journal of Human-Computer Studies.

    Transformers and tradition: using Generative AI and Deep Learning for financial markets prediction

    Artificial intelligence has revolutionized numerous industries, and financial markets are no exception. With the ability to process vast amounts of data quickly and accurately, AI algorithms have been increasingly used in finance to predict stock prices, detect fraud, and optimize investment strategies. However, the full potential of AI in finance remains underexplored, and researchers continue to seek new ways to apply machine learning techniques to financial challenges. This thesis investigates whether advanced Generative AI and Deep Learning techniques are more effective in extracting information for predicting financial markets than conventional natural language processing methods. The first part of this thesis analyzes quarterly SEC 10-Q filings for S&P 500 companies from January 2000 to December 2019 to show how artificial intelligence techniques can provide reasoning about changes in corporate disclosures indicative of future company performance. This thesis finds that by leveraging the reasoning capabilities of the Claude2 large language model on the Management Discussion & Analysis section of a 10-Q, negative excess returns of -5.5% over 180 days (-11% annualized) can be avoided. This part introduces two novel approaches: A) concatenating Deep Learning architectures that compare quarterly filings, and B) summarization methods using Claude2 to extract sentiment signals related to significant business risks, profitability, legal, and market pressures. Together, these techniques demonstrate new ways of expanding beyond the rudimentary natural language processing approaches that many investment firms have historically used, such as lexicons and cosine similarity, to answer fundamental questions related to firm performance. The second part of the thesis takes a step further, developing an enhanced sentiment model and utilizing Bitcoin subreddit data from December 2010 to January 2022 to predict the price of Bitcoin 60 days in advance. Reddit text data is known for its high noise level, containing content irrelevant to price, such as advertisements or technical advice. This noise can significantly impact the accuracy of the predictions. To address this, the research proposes a novel approach that combines a Few-Shot RoBERTa topic classification model with sample augmentation on training data powered by ChatGPT. This approach effectively reduces the noise, creating a more robust sentiment signal. The enhanced sentiment signal is then integrated with other Bitcoin on-chain features in a nonlinear multivariate LightGBM model. The results clearly demonstrate the impact of noise reduction, with the F1 score for predicting the sign of Bitcoin's price movement 60 days in advance increasing from 0.26 to 0.63 on the test set.
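The abstract above names cosine similarity as one of the conventional baselines for comparing filings. A minimal sketch of that baseline follows, assuming simple word-count vectors. It is not the thesis's method: the tokenizer, the function names, and the two sample excerpts are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between the word-count vectors of two documents."""
    counts_a = Counter(tokenize(doc_a))
    counts_b = Counter(tokenize(doc_b))
    # Only words present in both documents contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[word] * counts_b[word] for word in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # An empty document has no direction to compare.
    return dot / (norm_a * norm_b)


# Hypothetical excerpts standing in for two consecutive quarterly filings.
previous_filing = "Revenue increased due to strong demand for our products."
current_filing = "Revenue decreased due to weak demand and pending litigation."

score = cosine_similarity(previous_filing, current_filing)
print(f"similarity between filings: {score:.3f}")
```

A score near 1 means the two texts use nearly the same words in similar proportions, while a lower score flags a change in wording between quarters. The measure ignores word order and meaning, which is the limitation the abstract attributes to such baselines.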

    SUURJ Volume 7 Entire Volume
