
    PredictChain: Empowering Collaboration and Data Accessibility for AI in a Decentralized Blockchain-based Marketplace

    Limited access to computing resources and training data poses significant challenges for individuals and groups aiming to train and utilize predictive machine learning models. Although numerous publicly available machine learning models exist, they are often unhosted, requiring end-users to establish their own computational infrastructure. Alternatively, these models may only be accessible through paid cloud-based mechanisms, which can prove costly for general public use. Moreover, model and data providers need a more streamlined way to track resource usage and to capitalize, financially and otherwise, on subsequent usage by others. An effective mechanism for contributing high-quality data to improve model performance is also lacking. To address these issues, we propose "PredictChain," a blockchain-based marketplace for predictive machine learning models. This marketplace enables users to upload datasets for training predictive machine learning models, request model training on previously uploaded datasets, or submit queries to trained models. Nodes within the blockchain network, equipped with available computing resources, operate these models and offer a range of archetype machine learning models with varying characteristics, such as cost, speed, simplicity, power, and cost-effectiveness. This decentralized approach empowers users to develop improved models accessible to the public, promotes data sharing, and reduces reliance on centralized cloud providers.
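    To make the three marketplace operations concrete, the following is a minimal Python sketch of how a compute node might represent and dispatch them. All class and field names (DatasetUpload, TrainRequest, QueryRequest, Node) are illustrative assumptions for this sketch, not PredictChain's actual interface.

```python
# Illustrative sketch of the three operations described in the abstract:
# uploading a dataset, requesting training, and querying a trained model.
# Names are hypothetical; this is not PredictChain's real API.
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class DatasetUpload:
    dataset_id: str
    uri: str        # where the raw data lives (e.g. a content hash or URL)
    uploader: str   # address credited when the dataset is used later


@dataclass
class TrainRequest:
    dataset_id: str
    archetype: str  # e.g. a cheap/fast model class vs. a powerful/slow one
    payment: int    # fee in the network's native token


@dataclass
class QueryRequest:
    model_id: str
    features: Dict[str, Any]
    payment: int


class Node:
    """A node with spare compute that trains archetype models and answers queries."""

    def __init__(self) -> None:
        self.datasets: Dict[str, DatasetUpload] = {}
        self.models: Dict[str, Any] = {}

    def handle(self, tx: object) -> str:
        if isinstance(tx, DatasetUpload):
            self.datasets[tx.dataset_id] = tx
            return f"stored dataset {tx.dataset_id}"
        if isinstance(tx, TrainRequest):
            # Placeholder: a real node would fetch the data and fit the requested archetype.
            self.models[tx.dataset_id] = ("trained", tx.archetype)
            return f"trained {tx.archetype} model on {tx.dataset_id}"
        if isinstance(tx, QueryRequest):
            return f"prediction from {tx.model_id}: 0.0 (stub)"
        raise ValueError("unknown transaction type")
```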

    Data Preprocessing and Visualization Tool (Datan esikäsittely- ja visualisointityökalu)

    Abstract. For effective use of machine learning methods, it is important that the user has information about the dataset and its structures. A common first step with new data is therefore to turn to visualization. The purpose of visualization is to find similarity relationships in the data and to form an initial understanding of its structures; for example, it is of interest whether the data falls into distinct groups. Before visualization, high-dimensional data must be reduced to two or three dimensions so that humans can make observations from it, which is where dimensionality reduction methods come in. Beyond visualization, dimensionality reduction also plays a role in machine learning as a way to make features more effective. In addition to the problems caused by dimensionality, most machine learning methods require the data to be scaled or normalized before use. Scaling or normalization is generally important because the value ranges of a dataset's features often differ considerably. This bachelor's thesis examines data scaling and normalization as well as dimensionality reduction with various methods, and explores the structure of several real-life datasets using them. The purpose of the work is to highlight the importance of becoming familiar with a new dataset and to present common methods that can improve the results of machine learning. The concrete contribution of the thesis is a data analysis tool developed in the Python programming language that makes data handling and visualization easy through a graphical user interface; it is primarily intended for educational purposes.
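    As a concrete illustration of the scale-then-reduce-then-plot workflow described above, a minimal sketch using scikit-learn and matplotlib (a generic example on a standard dataset, not the thesis's own tool) could look like this:

```python
# Scale the features, reduce to two dimensions, then plot.
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Standardize each feature to zero mean and unit variance so that
# features with larger value ranges do not dominate the projection.
X_scaled = StandardScaler().fit_transform(X)

# Project the 4-dimensional data down to 2 components for plotting.
X_2d = PCA(n_components=2).fit_transform(X_scaled)

plt.scatter(X_2d[:, 0], X_2d[:, 1], c=y, cmap="viridis", s=20)
plt.xlabel("PC 1")
plt.ylabel("PC 2")
plt.title("Iris data after scaling and PCA")
plt.show()
```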

    Realizing EDGAR: eliminating information asymmetries through artificial intelligence analysis of SEC filings

    The U.S. Securities and Exchange Commission (SEC) maintains a publicly accessible database of all required filings of all publicly traded companies. Known as EDGAR (Electronic Data Gathering, Analysis, and Retrieval), this database contains documents ranging from annual reports of major companies to personal disclosures of senior managers. However, the common user, and particularly the retail investor, is overwhelmed by the deluge of information rather than empowered by it. EDGAR as it currently functions entrenches the information asymmetry between these retail investors and the large financial institutions with which they often trade. With substantial research staffs and budgets, coupled with an industry standard of “playing both sides” of a transaction, these investors “in the know” lead price fluctuations while others must follow. In general, this thesis applies recent technological advancements to the development of software tools that can derive valuable insights from EDGAR documents within a useful time frame. While numerous such commercial products currently exist, all come with significant price tags and many still rely on substantial human involvement to derive such insights. Recent years, however, have seen an explosion in the fields of Machine Learning (ML) and Natural Language Processing (NLP), which show promise in automating many of these functions with greater efficiency. ML aims to develop software which learns parameters from large datasets, as opposed to traditional software which merely applies a programmer’s logic. NLP aims to read, understand, and generate language naturally, an area where recent ML advancements have proven particularly adept. Specifically, this thesis serves as an exploratory study in applying recent advancements in ML and NLP to the vast range of documents contained in the EDGAR database. While algorithms will likely never replace the hordes of research analysts that now saturate securities markets, nor the advantages that accrue to large and diverse trading desks, they do hold the potential to provide small yet significant insights at little cost. This study first examines methods for document acquisition from EDGAR, with a focus on a baseline efficiency sufficient for the real-time trading needs of market participants. Next, it applies recent advancements in ML and NLP, specifically recurrent neural networks, to the task of standardizing financial statements across different filers. Finally, the conclusion contextualizes these findings in an environment of continued technological and commercial evolution.
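    As a concrete example of the document-acquisition step, a minimal Python sketch that pulls one quarterly master index from EDGAR's public full-index area and keeps only 10-K filings might look like the following. The User-Agent string is a placeholder (the SEC asks requesters to identify themselves), and the year and quarter are arbitrary choices for illustration.

```python
# Fetch a quarterly EDGAR master index and list 10-K filings.
# The full-index URL pattern and the pipe-delimited layout
# (CIK|Company Name|Form Type|Date Filed|Filename) are the SEC's published format.
import requests

INDEX_URL = "https://www.sec.gov/Archives/edgar/full-index/2016/QTR1/master.idx"
HEADERS = {"User-Agent": "Example Research example@example.com"}  # placeholder contact

resp = requests.get(INDEX_URL, headers=HEADERS, timeout=30)
resp.raise_for_status()

filings = []
for line in resp.text.splitlines():
    parts = line.split("|")
    if len(parts) == 5 and parts[2] == "10-K":   # keep annual reports only
        cik, company, form, date_filed, path = parts
        filings.append((company, date_filed, "https://www.sec.gov/Archives/" + path))

print(f"{len(filings)} 10-K filings in 2016 Q1")
print(filings[:3])
```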

    Three Risky Decades: A Time for Econophysics?

    We publish our Special Issue at a turning point unlike any we have faced since World War II. Interconnected long-term global shocks such as the coronavirus pandemic, the war in Ukraine, and catastrophic climate change have imposed significant humanitarian, socio-economic, political, and environmental restrictions on the globalization process and on all aspects of economic and social life, including the existence of individual people. The planet is trapped: the current situation seems to be the prelude to an apocalypse whose long-term effects we will feel for decades. A concept for the planet's survival therefore urgently needs to be built; only on this basis can the conditions for its development be created. The Special Issue gives evidence of the state of econophysics before the current situation. It can therefore provide an excellent econophysical, inter-, and cross-disciplinary starting point for a rational approach to a new era.

    Essays in High Frequency Trading and Market Structure

    High Frequency Trading (HFT) is the use of algorithmic trading technology to gain a speed advantage when operating in financial markets. The increasing gap between the fastest and the slowest players in financial markets raises questions about the efficiency of markets, the strategies players must use to trade effectively, and the overall fairness of markets that regulators must maintain. This research explores markets affected by HFT activity from three perspectives. First, an updated microstructure model is proposed to allow empirical exploration of current levels of noise in financial markets; this shows that current noise levels are not disruptive to dominant trading strategies. Second, an ARCH-type model is used to decompose market data into a series of traders' working price levels, demonstrating that in cases of suspected market abuse, regulators can assess the impact individual traders make on price even in fast markets. Finally, various HFT control measures are reviewed in terms of effectiveness and against an ordoliberal benchmark of fairness. The work illustrates the extent to which HFT activity is not yet disruptive, but also shows where HFT can be a conduit for market abuse, and provides a series of recommendations around the use of circuit breakers, algorithmic governance standards, and additional considerations where assets are dual-listed in different countries.
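    For readers unfamiliar with the ARCH family referenced above, a minimal numpy/scipy sketch of fitting an ARCH(1) model to a simulated return series by maximum likelihood is shown below. This illustrates the model class only, under assumed parameter values; it is not the thesis's decomposition of traders' working price levels.

```python
# ARCH(1) illustration: r_t = sigma_t * e_t,  sigma_t^2 = omega + alpha * r_{t-1}^2.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Simulate T returns from a known ARCH(1) process (omega=0.1, alpha=0.4).
T, omega_true, alpha_true = 2000, 0.1, 0.4
r = np.zeros(T)
for t in range(1, T):
    sigma2 = omega_true + alpha_true * r[t - 1] ** 2
    r[t] = np.sqrt(sigma2) * rng.standard_normal()

def neg_log_likelihood(params, returns):
    """Gaussian negative log-likelihood of an ARCH(1) model."""
    omega, alpha = params
    sigma2 = omega + alpha * returns[:-1] ** 2            # conditional variances
    ll = -0.5 * (np.log(2 * np.pi) + np.log(sigma2) + returns[1:] ** 2 / sigma2)
    return -ll.sum()

res = minimize(neg_log_likelihood, x0=[0.05, 0.2], args=(r,),
               bounds=[(1e-6, None), (0.0, 0.999)])
print("estimated omega, alpha:", res.x)   # should be close to (0.1, 0.4)
```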

    2016 Oklahoma Research Day Full Program

    This document contains all abstracts from the 2016 Oklahoma Research Day held at Northeastern State University.