9 research outputs found

    Big Data Fusion Model for Heterogeneous Financial Market Data (FinDF)

    The dawn of big data has seen the volume, variety, and velocity of data sources increase dramatically. Enormous amounts of structured, semi-structured, and unstructured heterogeneous data can be garnered at a rapid rate, making the analysis of such big data a herculean task. This has never been truer for data relating to financial stock markets, the biggest challenge being the 7 Vs of big data, which relate to the collection, pre-processing, storage, and real-time processing of such huge quantities of disparate data sources. Data fusion techniques have been adopted in a wide range of fields to cope with vast amounts of heterogeneous data from multiple sources, fusing them together to produce a more comprehensive view of the data and its underlying relationships. Research into the fusing of heterogeneous financial data is scant within the literature, with existing work only considering the fusing of text-based financial documents. The lack of integration between financial stock market data, social media comments, financial discussion board posts, and broker agencies means that the benefits of data fusion are not being realised to their full potential. This paper proposes a novel data fusion model, inspired by the data fusion model introduced by the Joint Directors of Laboratories, for the fusing of disparate data sources relating to financial stocks. Data with a diverse set of features from different sources will supplement each other in order to obtain a Smart Data Layer, which will assist in scenarios such as irregularity detection and the prediction of stock prices.

    A methodology for the resolution of cashtag collisions on Twitter – A natural language processing & data fusion approach

    Investors utilise social media such as Twitter as a means of sharing news surrounding financial stocks listed on international stock exchanges. Company ticker symbols are used to uniquely identify companies listed on stock exchanges and can be embedded within tweets to create clickable hyperlinks referred to as cashtags, allowing investors to associate their tweets with specific companies. The main limitation is that identical ticker symbols are present on exchanges all over the world, and when searching for such cashtags on Twitter, a stream of tweets is returned matching any company to which the cashtag refers - we refer to this as a cashtag collision. The presence of colliding cashtags could sow confusion for investors seeking news regarding a specific company. A resolution to this issue would benefit investors who rely on the speed of tweets for financial information, saving them precious time. We propose a methodology to resolve this problem which combines Natural Language Processing and Data Fusion to construct company-specific corpora that aid in the detection and resolution of colliding cashtags, so that tweets can be classified as being related to a specific stock exchange or not. Supervised machine learning classifiers are trained twice on each tweet – once on a count vectorisation of the tweet text, and again with the assistance of features contained in the company-specific corpora. We validate the cashtag collision methodology by carrying out an experiment involving companies listed on the London Stock Exchange. Results show that several machine learning classifiers benefit from the use of the custom corpora, yielding higher classification accuracy in the prediction and resolution of colliding cashtags.
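    The two-pass training described above can be sketched as follows. This is an illustrative sketch only: the tweets, labels, and corpus terms are invented stand-ins for the paper's company-specific corpora, and a single corpus-overlap count stands in for the full corpus-derived feature set.

```python
import numpy as np
from scipy.sparse import hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy tweets all carrying the colliding cashtag $TSCO (invented examples)
tweets = [
    "$TSCO shares up after strong grocery sales in the UK",
    "$TSCO tractor supply demand rising across US stores",
    "$TSCO Tesco announces new London store openings",
    "$TSCO NASDAQ gains on farm equipment earnings",
]
labels = [1, 0, 1, 0]  # 1 = London Stock Exchange company, 0 = not

# First pass: train on a plain count vectorisation of the tweet text
vec = CountVectorizer()
X = vec.fit_transform(tweets)
clf = LogisticRegression().fit(X, labels)

# Second pass: add a feature derived from a hypothetical company-specific
# corpus -- here, how many corpus terms appear in each tweet
corpus_terms = {"grocery", "tesco", "london", "uk"}
corpus_counts = np.array(
    [[sum(w in corpus_terms for w in t.lower().split())] for t in tweets]
)
X_aug = hstack([X, corpus_counts])
clf_aug = LogisticRegression().fit(X_aug, labels)
```

    Here the corpus feature is simply appended to the count-vectorised text, mirroring the idea of training once on the raw text and again with the assistance of corpus features.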

    A Smart Data Ecosystem for the Monitoring of Financial Market Irregularities

    Investments made on the stock market depend on timely and credible information being made available to investors. Such information can be sourced from online news articles, broker agencies, and discussion platforms such as financial discussion boards and Twitter. The monitoring of such discussion is a challenging yet necessary task to support the transparency of the financial market. Although financial discussion boards are typically monitored by administrators who respond to other users reporting posts for misconduct, actively monitoring social media such as Twitter remains a difficult task. Users sharing news about stock-listed companies on Twitter can embed cashtags in their tweets that mimic a company's stock ticker symbol (e.g. TSCO on the London Stock Exchange refers to Tesco PLC). A cashtag is simply the ticker characters prefixed with a '$' symbol, which then becomes a clickable hyperlink – similar to a hashtag. Twitter, however, does not distinguish between companies with identical ticker symbols that belong to different exchanges. TSCO, for example, refers to Tesco PLC on the London Stock Exchange but also to the Tractor Supply Company listed on the NASDAQ. This research refers to such scenarios as a 'cashtag collision'. Investors who wish to capitalise on the fast dissemination that Twitter provides may become susceptible to tweets containing colliding cashtags. Further exacerbating this issue is the presence of tweets referring to cryptocurrencies, which also feature cashtags that can be identical to those used for stock-listed companies. A system capable of identifying stock-specific tweets by resolving such collisions, and of assessing the credibility of such messages, would be of great benefit to a financial market monitoring system by filtering out non-significant messages.
This project has involved the design and development of a novel, multi-layered, smart data ecosystem to monitor potential irregularities within the financial market. This ecosystem is primarily concerned with the behaviour of participants' communicative practices on discussion platforms and the activity surrounding company events (e.g. a broker rating being issued for a company). A wide array of data sources – such as tweets, discussion board posts, broker ratings, and share prices – is collected to support this process. A novel data fusion model fuses these data sources together, synchronising the data and easing analysis by combining sources for a given time window (based on the company the data refers to and the date and time). This data fusion model, located within the data layer of the ecosystem, utilises supervised machine learning classifiers – chosen because of the domain expertise needed to accurately describe the origin of a tweet in a binary way – that are trained on a novel set of features to classify tweets as being related to a London Stock Exchange-listed company or not. Experiments involving the training of such classifiers have achieved accuracy scores of up to 94.9%. The ecosystem also adopts supervised learning to classify tweets according to their credibility. Credibility classifiers are trained on both general features found in all tweets and a novel set of features found only within financial stock tweets; the experiments in which these credibility classifiers were trained yielded AUC scores of up to 94.3. Once the data has been fused and irrelevant tweets have been identified, unsupervised clustering algorithms are used within the detection layer of the ecosystem to cluster tweets and posts for a specific time window or event and flag them as potentially irregular.
The results are then presented to the user within the presentation and decision layer, where the user may wish to perform further analysis or additional clustering.
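    The time-window fusion idea in the data layer can be illustrated with a small sketch. The sources, column names, and records below are invented, and an hourly window stands in for whatever window size the ecosystem actually uses.

```python
import pandas as pd

# Invented sample records from two heterogeneous sources
tweets = pd.DataFrame({
    "company": ["TSCO", "TSCO", "VOD"],
    "timestamp": pd.to_datetime(
        ["2020-01-06 09:15", "2020-01-06 09:45", "2020-01-06 10:20"]),
    "text": ["price moving", "unusual volume", "dividend chatter"],
})
ratings = pd.DataFrame({
    "company": ["TSCO", "VOD"],
    "timestamp": pd.to_datetime(["2020-01-06 09:30", "2020-01-06 10:05"]),
    "rating": ["Buy", "Hold"],
})

# Fuse on (company, hourly time window): each record is bucketed by the
# company it refers to and the hour it falls in, then the buckets are joined
for df in (tweets, ratings):
    df["window"] = df["timestamp"].dt.floor("h")

fused = tweets.merge(ratings, on=["company", "window"],
                     suffixes=("_tweet", "_rating"))
```

    Each tweet is thereby paired with any broker rating issued for the same company within the same window – the kind of combined view the detection layer can then cluster.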

    The Role of Geospatial Information and Effective Partnerships in the Implementation of the International Agenda for Sustainable Development

    The former United Nations Secretary-General Ban Ki-Moon (2014) repeated the core promise of the 1986 UN Declaration on the Right to Development, in which the General Assembly called for an approach guaranteeing meaningful participation of everyone in development and the fair distribution of the benefits of that development. To this end, partnerships are central and can lead to the dignity of the citizens involved as they participate in the development of their own communities. This dissertation research, conducted in Manyatta A and B in the port city of Kisumu, Kenya, sought to do just that. The purpose of this study is to demonstrate the role of participatory development planning and the collaborative technology platforms of geographic information systems (GIS) and GeoDesign in strengthening sustainable development and enhancing human dignity. The study used a multimethod design comprising participatory action research, situational analysis, problem tree analysis, and stakeholder analysis approaches in partnership with government, academia, business, civil society, and other stakeholders. The study shows how the newly formed government structure, post devolution, provides a functional framework to assist county and city governments to better determine and envision the future they want. This vision can be realized more rapidly through integrated planning to achieve poverty eradication and social, economic, and environmental sustainability, which are the three pillars of the 2030 Agenda for Sustainable Development. The citizens of informal settlements represent those who are farthest behind and who should be given priority. This study demonstrated the potential of inclusive and participatory development planning in restoring the dignity of those groups. This dissertation is available in open access at AURA: Antioch University Repository and Archive, http://aura.antioch.edu/, and the OhioLINK ETD Center, https://etd.ohiolink.ed

    Socio-Cognitive and Affective Computing

    Social cognition focuses on how people process, store, and apply information about other people and social situations, and on the role that cognitive processes play in social interactions. On the other hand, the term cognitive computing is generally used to refer to new hardware and/or software that mimics the functioning of the human brain and helps to improve human decision-making. In this sense, it is a type of computing with the goal of discovering more accurate models of how the human brain/mind senses, reasons, and responds to stimuli. Socio-Cognitive Computing should be understood as a set of theoretical interdisciplinary frameworks, methodologies, methods, and hardware/software tools to model how the human brain mediates social interactions. In addition, Affective Computing is the study and development of systems and devices that can recognize, interpret, process, and simulate human affects, a fundamental aspect of socio-cognitive neuroscience; it is an interdisciplinary field spanning computer science, electrical engineering, psychology, and cognitive science. Physiological Computing is a category of technology in which electrophysiological data recorded directly from human activity are used to interface with a computing device. This technology becomes even more relevant when computing can be integrated pervasively into everyday life environments. Thus, Socio-Cognitive and Affective Computing systems should be able to adapt their behavior according to the Physiological Computing paradigm. This book integrates proposals from researchers who use signals from the brain and/or body to infer people's intentions and psychological state in smart computing systems. The design of this kind of system combines knowledge and methods of ubiquitous and pervasive computing, as well as physiological data measurement and processing, with those of socio-cognitive and affective computing.

    Mesoscopic Physics of Quantum Systems and Neural Networks

    We study three different kinds of mesoscopic systems – systems in the intermediate region between macroscopic and microscopic scales, consisting of many interacting constituents. First, we consider particle entanglement in one-dimensional chains of interacting fermions. By employing a field-theoretical bosonization calculation, we obtain the one-particle entanglement entropy in the ground state and its time evolution after an interaction quantum quench, which causes relaxation towards non-equilibrium steady states. By pushing the boundaries of numerical exact diagonalization and density matrix renormalization group computations, we are able to accurately scale to the thermodynamic limit, where we make contact with the analytic field theory model. This allows us to fix an interaction cutoff required in the continuum bosonization calculation to account for the short-range interaction of the lattice model, such that the bosonization result provides accurate predictions for the one-body reduced density matrix in the Luttinger liquid phase. Establishing a better understanding of how to control entanglement in mesoscopic systems is also crucial for building qubits for a quantum computer. We further study a popular scalable qubit architecture based on Majorana zero modes (MZMs) in topological superconductors. The two major challenges in realizing Majorana qubits currently lie in trivial pseudo-Majorana states that mimic signatures of the topological bound states, and in strong disorder in the proposed topological hybrid systems that destroys the topological phase. We study coherent transport through interferometers with a Majorana wire embedded in one arm. By combining analytical and numerical considerations, we explain the occurrence of an amplitude maximum as a function of the Zeeman field at the onset of the topological phase – a signature unique to MZMs – which has recently been measured experimentally [Whiticar et al., Nature Communications, 11(1):3212, 2020]. 
By placing an array of gates in proximity to the nanowire, we make a fruitful connection to the field of Machine Learning by using the CMA-ES algorithm to tune the gate voltages in order to maximize the amplitude of coherent transmission. We find that the algorithm is capable of learning disorder profiles and even of restoring Majorana modes that were fully destroyed by strong disorder, by optimizing a feasible number of gates. Deep neural networks are another popular machine learning approach which not only has many direct applications to physical systems but also behaves similarly to physical mesoscopic systems. In order to comprehend the effects of the complex dynamics of training, we employ Random Matrix Theory (RMT) as a zero-information hypothesis: before training, the weights are randomly initialized and therefore perfectly described by RMT. After training, we attribute deviations from these predictions to learned information in the weight matrices. Conducting a careful numerical analysis, we verify that the spectra of weight matrices consist of a random bulk and a few important large singular values and corresponding vectors that carry almost all of the learned information. By further adding label noise to the training data, we find that more singular values in intermediate parts of the spectrum contribute by fitting the randomly labeled images. Based on these observations, we propose a noise-filtering algorithm that both removes the singular values storing the noise and reverts the level repulsion of the large singular values due to the random bulk.
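    The core of the noise-filtering idea – keeping the few large singular values that carry learned information and discarding the random bulk – can be sketched with a plain truncated SVD. The matrix, rank, and noise scale below are invented, and this sketch omits the additional step of reverting the level repulsion of the large singular values.

```python
import numpy as np

rng = np.random.default_rng(0)

# A weight-like matrix: a low-rank "learned" signal plus a random bulk
signal = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 80))
bulk = 0.1 * rng.normal(size=(100, 80))
W = signal + bulk

# Keep only the k largest singular values and their vectors, which are
# assumed to carry the learned information
U, s, Vt = np.linalg.svd(W, full_matrices=False)
k = 3
W_filtered = (U[:, :k] * s[:k]) @ Vt[:k]

# The filtered matrix should be closer to the clean signal than W is
err_raw = np.linalg.norm(W - signal)
err_filtered = np.linalg.norm(W_filtered - signal)
```

    On this toy example, truncating to the top-k singular values removes most of the random bulk while retaining the low-rank signal, which is the behaviour the filtering algorithm exploits for trained weight matrices.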

    Moot courts and mock trials in North American legal education : an appraisal, source book and annotated bibliography

    201 leaves; 28 cm. Includes bibliographies. Abstract unavailable.

    Traversing the Inner Seas: Contacts and Continuity in and around Scotland, the Hebrides, and the North of Ireland

    Throughout the medieval period, the ‘Inner Seas’ linking Scotland, the Hebrides, and the north of Ireland represented a confluence and crucible of identity. The region’s myriad islands served as stepping stones in a maritime network across which people, property, and perceptions travelled freely and purposefully. Encompassing three main themes, ten authors, and a multitude of interdisciplinary insights, this peer-reviewed volume represents some of the foremost research from the most recent residential conferences of the Scottish Society for Northern Studies, exploring the turbulent history and legacy of this interconnected seascape as both centre and periphery.